Michal Valko : Research

Michal Valko, Chief Models Officer at a startup, researcher at Inria, and lecturer at MVA/ENS PS.

  large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models

News

News: Life update: I became the Chief Models Officer at a stealth startup!
News: We released Llama 3, check it out!
News: Four LLM alignment papers and two orals accepted to ICML 2024!
News: I became a Principal Llama Engineer at Meta's GenAI in Paris!
News: IPO paper accepted to AISTATS 2024!
News: One RLHF theory and one exploration bonus paper at ICLR 2024!
News: Fast-forward ⏩ alignment research with Nash learning from human feedback!
News: Another RLHF result, this time including learning rates for RLHF!
News: A new RLHF paper from our group!
News: An RL paper on learning rate randomization accepted to NeurIPS 2023!
News: BIG NEWS: Outstanding paper award at ICML 2023 for our game theory work!
News: Nine papers including two orals accepted to ICML 2023!

older news

Bio

Michal is the Chief Models Officer at a stealth startup, a tenured researcher at Inria, and a lecturer at MVA at ENS Paris-Saclay. Michal is primarily interested in designing algorithms that require as little human supervision as possible. He works on methods and settings that can deal with minimal feedback, such as deep reinforcement learning, bandit algorithms, self-supervised learning, and self-play. Michal has recently worked on representation learning, world models, and deep (reinforcement) learning algorithms that have some theoretical underpinning. In the past he also worked on sequential algorithms with structured decisions, where exploiting the structure leads to provably faster learning. Michal is now working on a new generation of large language models (LLMs), in addition to providing algorithmic solutions for their scalable test-time inference, fine-tuning, and alignment. He received his Ph.D. in 2011 from the University of Pittsburgh, before getting tenure at Inria in 2012 and co-creating Google DeepMind Paris with Rémi Munos. In 2024, he became the principal Llama engineer at Meta, building the online reinforcement learning stack and research for Llama 3 and 4.

Students and postdocs

  • Côme Fiegel, 2022, ENS Ulm, M2 student, with Pierre Ménard and Vianney Perchet
  • Daniil Tiapkin, 2021-2023, HSE, MSc. student, with Alexey Naumov, Denis Belomestny, Éric Moulines, and Pierre Ménard
  • David Cheikhi, 2020-2021, Columbia University, NYC/École Polytechnique, Paris, with Pierre Ménard
  • Robert Müller, 2020, Technical University of Munich, M2 student, with Pierre Ménard
  • Ahmed Choukarah, 2020, ENS Ulm, L3 student, with Pierre Ménard
  • Côme Fiegel, 2019, ENS Ulm, L3 student, with Victor Gabillon
  • Axel Elaldi, 2017-2018, master student, École Centrale de Lille ↝ ENS Paris-Saclay/MVA
  • Xuedong Shang, 2017, master student, ENS Rennes, with Emilie Kaufmann ↝ Inria
  • Guillaume Gautier, 2016, master student, École Normale Supérieure, Paris-Saclay, with Rémi Bardenet ↝ Inria/CNRS
  • Andrea Locatelli, 2015-2016, ENSAM/ENS Paris-Saclay, with Alexandra Carpentier ↝ Universität Potsdam
  • Souhail Toumdi, 2015-2016, master student, École Centrale de Lille, with Rémi Bardenet ↝ ENS Paris-Saclay/MVA
  • Akram Erraqabi, 2015, master student, École Polytechnique, Paris ↝ Université de Montréal
  • Mastane Achab, 2015, master student, École Polytechnique, Paris, with G. Neu ↝ l'ENS PS ↝ TPT ↝ UPF Barcelona
  • Jean-Bastien Grill, 2014, master student, École Normale Supérieure, Paris, with Rémi Munos ↝ Inria
  • Alexandre Dubus, 2012-2013, master student, Université Lille1 - Sciences et Technologies ↝ Inria
  • Karim Jedda, 2012-2013, master student, École Centrale de Lille ↝ ProSiebenSat.1
  • Alexis Wehrli, 2012-2013, master student, École Centrale de Lille ↝ ERDF

Contact

  • Stealth Startup
  • San Francisco, California, US
  • Paris, France
  • Inria Lille - Nord Europe, Scool team (office: A05)
  • Parc Scientifique de la Haute Borne
  • 40 avenue Halley
  • 59650 Villeneuve d'Ascq, France
  • office phone: +33 3 59 57 7801
  • Centre Borelli, ENS Paris-Saclay (bureau: vacataires)
  • 4, avenue des Sciences
  • 91190 Gif-sur-Yvette

