Michal Valko

Chief Models Officer at a startup, researcher at Inria, and a lecturer at MVA/ENS PS.

large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models

News

News: New game theory paper accepted to ICML 2025!
News: I became the Chief Models Officer at a stealth startup!
News: We released Llama 3, check it out!
News: Four LLM alignment papers accepted to ICML 2024!
News: I became a Principal Llama Engineer of Meta's GenAI in Paris!
News: IPO paper accepted to AISTATS 2024!
News: One RLHF theory and one exploration bonus paper at ICLR 2024!
News: Fast-forward ⏩ alignment research with Nash learning from human feedback!
News: Another RLHF result, this time including learning rates for RLHF!
News: A new RLHF paper from our group!
News: An RL paper on learning rate randomization accepted to NeurIPS 2023!
News: BIG NEWS: Outstanding paper award at ICML 2023 for our game theory work!
News: Nine papers including two orals accepted to ICML 2023!

News: BYOL-Hindsight led by Dan Jarrett accepted to NeurIPS 2022 DeepRL Workshop!
News: Apply for DeepMind Paris internships for Summer 2023.
News: Two papers on RL accepted to NeurIPS 2022!
News: Congrats to Jean Tarbouriech for defending his thesis on July 6th, 2022!
News: BYOL-Explore that performs representation learning and exploration together is out!
News: Our paper From Dirichlet to Rubin got a long oral at ICML 2022! (< 2%)
News: Three papers accepted to ICML 2022!
News: Congrats to Omar Darwiche Domingues for defending his thesis on March 18th, 2022!
News: BGRL accepted to ICLR 2022!
News: Two RL papers accepted to AISTATS 2022!
News: Apply for DeepMind Paris internships for Summer 2022.
News: Five papers accepted to NeurIPS 2021, including 1 oral and 2 spotlights!
News: Congrats to Xuedong Shang for defending his thesis on Sep 29th, 2021!
News: BraVe, self-supervised learning framework for video, accepted to ICCV 2021!
News: IXOMD invited to be presented as RL Theory seminar!
News: Our minimax SSP work invited to be presented as RL Theory seminar!
News: Six papers accepted to ICML 2021, including the long talk on UCBMQ!
News: KeRNS algorithm for non-stationary kernel RL accepted to AISTATS 2021!
News: Three papers on reinforcement learning theory accepted to ALT 2021!
News: After 4 years, the full Spectral Bandits paper with the lower bound and the comprehensive set of experiments is now online.
News: I will be giving 3 online talks on BYOL in November and December 2020.
News: Congrats to Julien Seznec for defending his thesis on Dec 15th, 2020!
News: Congrats to Pierre Perrault for defending his thesis on Nov 30th, 2020!
News: The Graphs in ML MVA course will start on January 5th, 2021 and will be taught by Daniele Calandriello.
News: Five papers accepted to NeurIPS 2020, including two oral talks for BYOL and DISCO and 1 spotlight!
News: I am serving as an area chair for ICLR 2021.
News: Very hot news: Yannic Kilcher made a YouTube video about our BYOL work!
News: Very hot news: Three months of lockdown led to our three months of intense self-supervised learning :-). BYOL is out!
News: Eight papers accepted to ICML 2020. "See" you in Vienna!
News: Covariance-adapting semi-bandits paper accepted to COLT 2020. "See" you in Graz!
News: Congrats to Guillaume Gautier for defending his thesis on May 19th, 2020!
News: I am serving as an area chair for NeurIPS 2020.
News: Four papers accepted to AISTATS 2020. "See" you in Palermo!
News: I am giving an invited course on reinforcement learning at Math of Machine Learning Winter School during February 19-22, 2020 in Sochi, Russia.
News: Congrats to Omar and Guillaume for their NeurIPS 2019 travel awards!
News: Four papers accepted to NeurIPS 2019. See you in Vancouver and Whistler!
News: I am serving as an area chair for NeurIPS 2019.
News: I am giving an invited talk during October 16-18th, 2019 at GIF 2019, Yerevan, Armenia.
News: I am giving a talk during September 25-26, 2019, at Recent developments in kernel methods, UCL, London, UK.
News: I am giving an invited talk during September 26-27, 2019 at Lancaster and DeepMind Bandit Workshop, London, UK.
News: I am giving an invited talk on July 23rd, 2019 for Cisco in Kraków, Poland.
News: BOLD (ANR) project accepted for 2019 - 2023 (PI: V. Perchet)
News: I am giving an invited talk on July 5th, 2019 at Yandex HQ, Moscow, Russia.
News: I am giving an invited talk during July 3-8th, 2019 at RAAI Summer School 2019, Moscow Institute of Physics and Technology.
News: We are organizing Reinforcement Learning Summer SCOOL on 1-12 July 2019 in Lille, France.
News: I give an invited talk on June 14-15th, 2019 at ICML workshop on negative dependence
News: On June 3-4th, 2019 we are organizing The power of graphs workshop with Laura Toni.
News: I give an invited talk on May 28th, 2019 at Theoretical Computer Science seminar at CU in Bratislava.
News: A GP-UCB sparsification paper accepted to COLT 2019. See you in Phoenix!
News: Two papers accepted to ICML 2019. See you in Long Beach!
News: We are organizing Optimizing Human Learning 2019 workshop.
News: Theoretical Computer Science seminar at CU in Bratislava.
News: I give an invited talk on February 20th, 2019 at Data Analytics Meetings at UPJŠ in Košice.
News: Pierre Ménard joins as a postdoc!
News: Three papers accepted to AISTATS 2019. See you in Okinawa!
News: I am giving an invited talk on January 25th, 2019 at Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, UK.
News: Two papers on black-box optimization accepted to ALT 2019!
News: I am giving two invited talks on January 7th and 8th, 2019 at Verimag, CNRS Grenoble, France.
News: Pierre Perrault gives an invited talk on Stochastic multi-arm bandit problem and some extensions on November 23rd, 2018 at Lambda seminar at Université de Bordeaux.
News: DPPy: Sampling determinantal point processes with Python released!
News: I am on the program committee for COLT 2019.
News: Brownian motion optimization accepted to NeurIPS 2018! See you in Montréal!
News: I am giving an invited talk on September 10-13th, 2018 at International Workshop on Optimization and Machine Learning at CIMI, Toulouse.
News: A paper on optimistic optimization accepted to EWRL 2018.
News: A paper on scattering for deep learning accepted to ECCV 2018.
News: Starting October 1st, 2018, I will be teaching a graduate course on Graphs in Machine Learning in MVA Master at ENS Paris-Saclay!
News: A paper on distributed graph sparsification accepted to ICML 2018. See you in Stockholm!
News: A bandit paper on best of both worlds accepted to COLT 2018. See you in Stockholm!
News: Received Inria award for scientific excellence for 2018 - 2021: Prime d'excellence scientifique
News: Congrats to Daniele Calandriello for winning the prize for the Best AI Thesis from France in 2018. Press: inriaCP, cnrs, lille1, actu, lavoixdunord, newstank
News: I am serving as an area chair for NIPS 2018.
News: We are organizing Optimizing Human Learning workshop.
News: Call for a PhD student.
News: I am co-organizing CNRS Summer school on Networks, Graphs, and Machine Learning (RESCOM 2018) in Porquerolles, June 18-22, 2018.
News: I give an invited talk for Workshop on Graph Learning on May 14th, 2018 at LINCS, Paris.
News: I give an invited talk at Journée Big data, Polytech'Lille on March 22nd, 2018, in Lille.
News: Adobe research highlights our work on online influence maximization presented at NIPS 2017.
News: I give an invited talk for GDR ISIS on February 8th, 2018 at Télécom ParisTech.
News: I give an invited talk on January 7th, 2018 at MIST 2018.
News: Daniele Calandriello defended his thesis on December 18th, 2017.
News: I give a talk on November 9th, 2017 at Plateau Inria Euratechnologies.
News: Two papers accepted to NIPS 2017. See you in California!
News: I give a talk on Sep 19th, 2017 at DeepMind, London.
News: Starting October 2nd, 2017, I will be teaching a graduate course on Graphs in Machine Learning in MVA Master at ENS Paris-Saclay!
News: Équipe associée Nord-Européenne accepted to work with Alexandra Carpentier on Adaptive allocation of resources for recommender systems.
News: I give a talk on September 18th, 2017 at Decision Theory and Network Science: Methods and Applications, Lancaster, UK (STOR-i 2017)
News: I give a talk on July 11th, 2017 at ICML 2017 workshop on Picky Learners
News: Congrats to Guillaume and Daniele for their travel grants to ICML 2017.
News: Two papers accepted to ICML 2017. See you in Australia!
News: I give a talk on June 28th, 2017 for L’Institut de Mathématiques de Toulouse.
News: I give a talk on June 14th, 2017 at Journées Scientifiques Inria 2017 in Nice, France.
News: I give a popularization talk on "Comment maximiser la détection des influenceurs sur les réseaux sociaux", May 30th, 2017 at Inria Lille - 13:45.
News: I give a talk on March 22nd, 2017 at Universität Potsdam at Amazon in Berlin.
News: Spectral Bandits accepted for publication to JMLR.
News: Congrats to Daniele for receiving a travel grant to AISTATS 2017.
News: Two papers accepted to AISTATS 2017. See you in Florida!
News: I give an invited talk on December 21st, 2016 at Textkernel talks series in Amsterdam.
News: Starting October 3rd, 2016, I will be teaching a graduate course on Graphs in Machine Learning in MVA Master at ENS Paris-Saclay!
News: I gave an invited talk on September 22nd, 2016 at Theoretical Computer Science seminar in Bratislava.
News: My habilitation thesis, Bandits on graphs and structures, is now online.
News: TrailBlazer paper on sample-efficient Monte-Carlo planning accepted as oral presentation to NIPS 2016!
News: I gave an invited talk on September 15-19th, 2016 at Information technologies - Applications and Theory 2016 conference at High Tatras, Slovakia.
News: I gave an invited talk on June 16th, 2016 at Graph-based Learning and Graph Mining in Lille.
News: On June 15th, 15h30 at ENS Paris-Saclay, I defended my HdR thesis on Bandits on Graphs and Structures! HdR Committee: Nicolas Vayatis, Gábor Lugosi, Aurélien Garivier, Vianney Perchet, Nicolò Cesa-Bianchi, Mark Herbster, Rémi Munos
News: One paper accepted to ICML 2016 and two to UAI 2016. See you in NYC!
News: I gave an invited talk on May 13th, 2016 at Network Science Thematic Semester, at ENS Lyon
News: Bayesian policy gradient and actor-critic algorithms accepted to JMLR.
News: Two graph bandit papers accepted to AISTATS 2016!
News: I gave an invited talk at Multi-armed Bandit Workshop in Lancaster, UK.
News: Starting September 28th, 2015, I will be teaching a graduate course on Graphs in Machine Learning in MVA Master at ENS Paris-Saclay!
News: Polymatroid Bandits accepted for publication to JMLR.
News: Parallel Optimistic Optimization paper accepted to NIPS 2015!
News: I became a reelected member of Inria Evaluation Committee for 2015-2019.
News: Two bandit papers accepted to ICML 2015 in Lille, France.
News: Inria's press interview with N. Vayatis and myself about MVA's Graphs in ML: french, english.
News: MESSI (MaxEnt SSIRL) paper accepted to IJCAI 2015 in Argentina.
News: Intel advertising face recognition.
News: EduBand, associated team project with Carnegie Mellon, accepted for 2015 - 2018 (with A. Lazaric and E. Brunskill)
News: Starting in January 2015, I will be teaching a graduate course on Graphs in Machine Learning in MVA Master at ENS Paris-Saclay!
News: Three papers on practical bandit settings accepted to NIPS 2014!
News: SequeL is hosting ICML 2015! Please submit a paper and come to Lille!
News: Extra-Learn (ANR) project accepted for 2014 - 2017 (PI: A. Lazaric)
News: Ford and Intel Mobii project using Face Recognition at engadget.com.
News: Ford prototype using Face Recognition at intel.com.
News: I was elected to be a member of Inria Evaluation Committee for 2014-2015.
News: Two Spectral Bandit papers accepted to ICML 2014 and AAAI 2014!
News: I organized a plenary Inria talk by Jennifer Healey on Transportation Futures: Gossiping Cars and Chatty Cities (50 attendees).
News: Received Inria award for scientific excellence for 2014 - 2017: Prime d'excellence scientifique
News: Bandits attack function optimization paper accepted to CEC 2014.
News: Call for a PhD student.
News: We established an Erasmus agreement between École Centrale, Lille and Comenius University, Bratislava for Computer Science.
News: I will become CR1, an experienced junior scientist, in 2014.
News: October 2013: INTEL/Inria collaboration: Signed INTEL funded project on Internet of Things research.
News: Recorded talk from ICML 2013 on StoSOO is online.
News: Kernelised contextual bandits paper accepted to UAI 2013, see you in Bellevue, WA!
News: Inria publishes an article about our work at INTEL on Face Recognition
News: StoSOO paper accepted to ICML 2013! Check back soon for the paper and the code.
News: Call for a PhD student.
News: I am on the organizing committee of JFPDA 2013.
News: FG 2013 paper accepted. See you in April in Shanghai!
News: Our article was accepted to the JMLR post-proceedings of EWRL 2012.
News: I changed my office to "Bureau A05."
News: I will be giving a talk at Large-Scale Online Learning and Decision-Making Workshop in Windsor, UK.
News: An article accepted to JBI 2012.
News: I became a junior researcher at Inria Lille - Nord Europe with SequeL team.
News: I will be visiting Inria Sophia-Antipolis in July 12-18.
News: I will be at ICML and EWRL 2012 in Edinburgh in July 2012.
News: EWRL 2012 paper accepted for presentation.
News: I will be the opening speaker at Slovak Oxford Science 2012.
News: An article accepted to International Journal of Dentistry.
News: ICDM 2011 paper accepted.
News: I received PhD in Machine Learning from University of Pittsburgh.
News: I submitted the final version of my dissertation on August 18th, 2011.
News: I defended my dissertation on August 1st, 2011.
News: I will be giving a talk at Microsoft Research on July 6th, 10:30am in Research Room A
News: I will be at ICML from June 24th till July 2nd.
News: ICML 2011 - Global paper accepted: Conditional Anomaly Detection Using Soft Harmonic Functions: An Application to Clinical Alerting
News: I received the award for the Runner-Up for Best Research Poster, Elevator Pitch and Scavenger Hunt Award on CS Day, 03/24/2011
News: I won the Computer Science Research Competition 2011.
News: Paper accepted to the Grad Expo Conference, 02/07/2011
News: I passed my proposal defense on December 20th, 2010, 2:30pm.
News: I received Academic Entrepreneurship Certificate.
News: My thesis committee: Miloš Hauskrecht, Liz Marai, Diane Litman, John Lafferty (CMU)
News: Homer Warner Best Paper Award at AMIA 2010.
News: AMIA 2010 paper accepted.
News: Google Best Paper Award at OLCV - CVPR 2010.
News: UAI 2010 paper accepted.
News: I will be in Texas, USA from May 2nd until May 10th on tour with Pitt Men's Glee Club.
News: Intel Labs Internship during Summer 2010
News: I am currently an intern at Intel Research, Santa Clara, CA.
News: 2008 awards updated: mellon, research competition, poster 1 and poster 2

Bio

Michal is the Chief Models Officer at a stealth startup, a tenured researcher at Inria, and a lecturer at MVA at ENS Paris-Saclay. Michal is primarily interested in designing algorithms that require as little human supervision as possible. He works on methods and settings that can deal with minimal feedback, such as deep reinforcement learning, bandit algorithms, self-supervised learning, and self-play. Michal has recently worked on representation learning, world models, and deep (reinforcement) learning algorithms that have some theoretical underpinning. In the past, he also worked on sequential algorithms with structured decisions, where exploiting the structure leads to provably faster learning. Michal is now working on a new generation of large language models (LLMs), in addition to providing algorithmic solutions for their scalable test-time inference, fine-tuning, and alignment. He received his Ph.D. in 2011 from the University of Pittsburgh, before getting tenure at Inria in 2012 and co-creating Google DeepMind Paris with Rémi Munos. In 2024, he became the principal Llama engineer at Meta, building the online reinforcement learning stack and research for Llama 3.

Students and postdocs

Giorgio Racca, 2025-2028, ETH, PhD student, with Amartya Sanyal
Côme Fiegel, 2022-2025, ENS Ulm, PhD student, with Pierre Ménard and Vianney Perchet

Daniil Tiapkin, 2023-2025, École Polytechnique/HSE, PhD student, with A. Naumov, D. Belomestny, É. Moulines, and P. Ménard
Lisa Bedin, 2021-2025, École Polytechnique, PhD student, with É. Moulines
Pierre Ménard, 2019 - 2020, ENS Rennes/U Toulouse, postdoc, with Emilie Kaufmann ↝ U. Magdeburg
Édouard Oyallon, 2017 - 2018, ENS Rennes/ENS Ulm, postdoc ↝ EC Paris ↝ CNRS

Dan Jarrett, 2022, University of Cambridge, visiting PhD student, with Corentin Tallec
Jean Tarbouriech, 2019 - 2022, X/MVA, PhD student, with Alessandro Lazaric
Omar Darwiche Domingues, 2018 - 2022, EC Paris/MVA, PhD student, with Emilie Kaufmann ↝ Owkin
Xuedong Shang, 2017 - 2021, ENS Rennes, PhD student, with Emilie Kaufmann ↝ Barclays
Pierre Perrault, 2017 - 2020, ENS Cachan/MVA, PhD student, with Vianney Perchet ↝ IDEMIA
Julien Seznec, 2017 - 2020, ENS Ulm/MVA, PhD student, with Alessandro Lazaric and ↝ Education Nationale
Guillaume Gautier, 2017 - 2020, EC Lille/MVA, PhD student, with Rémi Bardenet ↝ Grenoble ↝ CNRS
Jean-Bastien Grill, 2014 - 2019, ENS Ulm/MVA, PhD student, with Rémi Munos ↝ DeepMind
Yunhao Tang, 2019-2020 and 2021, Columbia University, visiting PhD student, with Rémi Munos ↝ DeepMind
Aadirupa Saha, 2019 - 2020, IIS Bangalore, visiting PhD student, with Pierre Gaillard ↝ Microsoft Research
Kaige Yang, 2019, University College London, visiting PhD student, with Pierre Ménard ↝ VU Amsterdam
Rianne de Heide, 2019, CWI/Leiden University, visiting PhD student, with Emilie Kaufmann
Tomáš Kocák, 2013 - 2016, Comenius University, PhD student, with Rémi Munos ↝ ENS Lyon ↝ U. Potsdam
Daniele Calandriello, 2014 - 2017, Polimi, PhD student, AFIA, 1st prize, with Alessandro Lazaric ↝ IIT ↝ DeepMind

Côme Fiegel, 2022, ENS Ulm, M2 student, with Pierre Ménard and Vianney Perchet
Daniil Tiapkin, 2021-2023, HSE, MSc. student, with Alexey Naumov, Denis Belomestny, Éric Moulines, and Pierre Ménard
David Cheikhi, 2020 - 2021, Columbia University, NYC/École Polytechnique, Paris, with Pierre Ménard
Robert Müller, 2020, Technical University of Munich, M2 student, with Pierre Ménard
Ahmed Choukarah, 2020, ENS Ulm, L3 student, with Pierre Ménard
Côme Fiegel, 2019, ENS Ulm, L3 student, with Victor Gabillon
Axel Elaldi, 2017-2018, master student, École Centrale de Lille ↝ ENS Paris-Saclay/MVA
Xuedong Shang, 2017, master student, ENS Rennes, with Emilie Kaufmann ↝ Inria
Guillaume Gautier, 2016, master student, École Normale Supérieure, Paris-Saclay, with Rémi Bardenet ↝ Inria/CNRS
Andrea Locatelli, 2015-2016, ENSAM/ENS Paris-Saclay, with Alexandra Carpentier ↝ Universität Potsdam
Souhail Toumdi, 2015 - 2016, master student, École Centrale de Lille, with Rémi Bardenet ↝ ENS Paris-Saclay/MVA
Akram Erraqabi, 2015, master student, École Polytechnique, Paris ↝ Université de Montréal
Mastane Achab, 2015, master student, École Polytechnique, Paris, with G. Neu ↝ l'ENS PS ↝ TPT ↝ UPF Barcelona
Jean-Bastien Grill, 2014, master student, École Normale Supérieure, Paris, with Rémi Munos ↝ Inria
Alexandre Dubus, 2012-2013, master student, Université Lille1 - Sciences et Technologies ↝ Inria
Karim Jedda, 2012-2013, master student, École Centrale de Lille ↝ ProSiebenSat.1
Alexis Wehrli, 2012-2013, master student, École Centrale de Lille ↝ ERDF

Contact

Stealth Startup
San Francisco, California, US
Paris, France

Inria Lille - Nord Europe, équipe Scool (bureau: A05)
Parc Scientifique de la Haute Borne
40 avenue Halley
59650 Villeneuve d'Ascq, France
office phone: +33 3 59 57 7801

Centre Borelli, ENS Paris-Saclay (bureau: vacataires)
4, avenue des Sciences
91190 Gif-sur-Yvette