Michal Valko - Research

My Google Scholar profile, ArXiv profile, and HAL profile.

preprints

Yunhao Tang, Taco Cohen, David W. Zhang, Michal Valko, Rémi Munos: RL-finetuning LLMs from on- and off-policy data with a single algorithm, arXiv preprint
Chaoqi Wang, Zhuokai Zhao, Chen Zhu, Karthik Abinav Sankararaman, Michal Valko, Xuefei Cao, Zhaorun Chen, Madian Khabsa, Yuxin Chen, Hao Ma, Sinong Wang: Preference optimization with multi-sample comparisons, arXiv preprint
Antoine Scheid, Étienne Boursier, Alain Durmus, Michael I Jordan, Pierre Ménard, Éric Moulines, Michal Valko: Optimal design for reward modeling in RLHF, arXiv preprint
Pierre Perrault, Denis Belomestny, Pierre Ménard, Éric Moulines, Alexey Naumov, Daniil Tiapkin, Michal Valko: A new bound on the cumulant generating function of Dirichlet processes, arXiv preprint
Yunhao Tang, Daniel Zhaohan Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, Eugene Tarassov, Rémi Munos, Bernardo Ávila Pires, Michal Valko, Yong Cheng, Will Dabney: Understanding the performance gap between online and offline alignment algorithms, arXiv preprint
Denis Belomestny, Pierre Ménard, Alexey Naumov, Daniil Tiapkin, Michal Valko: Sharp deviations bounds for Dirichlet weighted sums with application to analysis of Bayesian algorithms, arXiv preprint
Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári: KL-entropy-regularized RL with a generative model is minimax optimal, arXiv preprint

2025

Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Michal Valko, Vianney Perchet: The Harder Path: Last iterate convergence for uncoupled Learning in zero-sum games with bandit feedback, in International Conference on Machine Learning (ICML 2025) or preprint
Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Éric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, Pierre Ménard: Accelerating Nash learning from human feedback via Mirror Prox, in COLT 2025 Workshop: Foundations of Post-training (COLT-FoPt 2025), arXiv preprint poster

2024

Llama Team: Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, ... Michal Valko ... Zef Rosnbrick, Zhaoduo Wen, Zhenyu Yang, Zhiwei Zhao, Zhiyu Ma: The Llama 3 herd of models, arxiv
Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora: Metacognitive capabilities of LLMs: An exploration in mathematical problem solving, in Neural Information Processing Systems (NeurIPS 2024) and (ICML 2024 - AI for Math) arXiv preprint poster
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko: Local and adaptive mirror descents in extensive-form games, in Neural Information Processing Systems (NeurIPS 2024), arXiv preprint
Rémi Munos*, Michal Valko, Daniele Calandriello*, Mohammad Gheshlaghi Azar*, Mark Rowland*, Daniel Guo*, Yunhao Tang*, Matthieu Geist*, Thomas Mésnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot: Nash learning from human feedback, in International Conference on Machine Learning (ICML 2024) arXiv preprint
Daniele Calandriello, Daniel Guo, Remi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot: Human alignment of large language models through online preference optimisation, in International Conference on Machine Learning (ICML 2024) [spotlight - 3.5% acceptance rate] arXiv preprint
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot: Generalized preference optimization: A unified approach to offline alignment, in International Conference on Machine Learning (ICML 2024) arXiv preprint
Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel: Decoding-time realignment of language models, Nash learning from human feedback, in International Conference on Machine Learning (ICML 2024) [spotlight - 3.5% acceptance rate] arXiv preprint
Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Zhaohan Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos: A general theoretical paradigm to understand learning from human preferences, in International Conference on Artificial Intelligence and Statistics , (AISTATS 2024) arXiv preprint
Alaa Saade, Steven Kapturowski, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot: Unlocking the power of representations in long-term novelty-based exploration, in International Conference on Learning Representations (ICLR 2024) [spotlight - 5% acceptance rate] (NeuriPS 2023 - ALOE) arXiv preprint
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Éric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard: Demonstration-regularized RL, (ICLR 2024) in International Conference on Learning Representations arXiv preprint

2023

Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Éric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard: Model-free posterior sampling via learning rate randomization, in Neural Information Processing Systems (NeurIPS 2023), arXiv preprint
Onno Eberhard, Thibaut Cuvelier, Michal Valko, Bruno Adrien De Backer: Middle-mile logistics through the lens of goal-conditioned reinforcement learning, in NeurIPS 2023 Workshop: Goal-conditioned RL, overleaf preprint
Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko: Curiosity in hindsight: Intrinsic exploration in stochastic environments, in International Conference on Machine Learning (ICML 2023) (NeurIPS 2022 - DeepRL), arXiv preprint
Yunhao Tang, Rémi Munos, Mark Rowland, Michal Valko: VA-learning as a more efficient alternative to Q-learning, in International Conference on Machine Learning (ICML 2023) arXiv preprint
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Éric Moulines, Rémi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Ménard: Fast rates for maximum entropy exploration, in International Conference on Machine Learning (ICML 2023) arXiv preprint
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko: Adapting to game trees in zero-sum imperfect information games, in International Conference on Machine Learning [outstanding papera ward] (ICML 2023) [oral - x% acceptance rate] arXiv preprint poster talk
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko: Understanding self-predictive learning for reinforcement learning, in International Conference on Machine Learning (ICML 2023) arXiv preprint
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko: DoMo-AC: Doubly multi-step off-policy actor-critic algorithm, in International Conference on Machine Learning (ICML 2023) arXiv preprint
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai: Regularization and variance-weighted regression achieves minimax optimality in linear MDPs: Theory and practice, in International Conference on Machine Learning (ICML 2023) arXiv preprint
Thomas Mesnard, Wenqi Chen, Alaa Saade, Yunhao Tang, Mark Rowland, Theophane Weber, Clare Lyle, Audrunas Gruslys, Michal Valko, Will Dabney, Georg Ostrovski, Éric Moulines, Rémi Munos: Quantile credit assignment, in International Conference on Machine Learning (ICML 2023) [oral - x% acceptance rate]
Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Veličković, Eva L Dyer: Half-Hop: A graph upsampling approach for slowing down message passing, in International Conference on Machine Learning (ICML 2023) poster code

2022

Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pîslar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot: BYOL-Explore: Exploration by bootstrapped prediction, in Neural Information Processing Systems (NeurIPS 2022) arXiv preprint poster talk
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Éric Moulines, Rémi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard: Optimistic posterior sampling for reinforcement learning with few samples and tight guarantees, in Neural Information Processing Systems (NeurIPS 2022), arXiv preprint poster talk
Daniil Tiapkin, Denis Belomestny, Éric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Ménard: From Dirichlet to Rubin: Optimistic exploration in RL without bonuses, in International Conference on Machine Learning (ICML 2022) [long talk - 2% acceptance rate] arXiv preprint, talk poster
Anirudh Goyal, Abram L Friesen, Theophane Weber, Andrea Banino, Nan Rosemary Ke, Adria Puigdomenech Badia, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy P Lillicrap, Nicolas Heess, Charles Blundell: Retrieval-augmented reinforcement learning, in International Conference on Machine Learning (ICML 2022) arXiv preprint
Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Scaling Gaussian process optimization by evaluating a few unique candidates multiple times, in International Conference on Machine Learning (ICML 2022) arXiv preprint, talk poster
Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Veličković, Michal Valko: Large-scale representation learning on graphs via bootstrapping, in International Conference on Learning Representations (ICLR 2022) (ICLR 2021 - GTRL) arXiv preprint poster
Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric: , in International Conference on Artificial Intelligence and Statistics Adaptive multi-goal exploration, (AISTATS 2022) arXiv preprint
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Marginalized operators for off-policy reinforcement learning, in International Conference on Artificial Intelligence and Statistics (AISTATS 2022) and (ICML 2021 - RL Theory) arXiv preprint

2021

Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith B. Hengen, Michal Valko, Eva L. Dyer: Drop, Swap, and Generate: A self-supervised approach for generating neural activity, in Neural Information Processing Systems (NeurIPS 2021) [oral - 1% acceptance rate] arXiv preprint
Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Stochastic shortest path: minimax, parameter-free and towards horizon-free regret, in Neural Information Processing Systems (NeurIPS 2021) [spotlight - 3% acceptance rate] and (ICML 2021 - RL Theory), arXiv preprint, talk video
Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: A provably efficient sample collection strategy for reinforcement learning, in Neural Information Processing Systems (NeurIPS 2021) spotlight - 3% acceptance rate] arXiv preprint, bibtex
Tadashi Kozuno*, Pierre Ménard*, Rémi Munos, Michal Valko: Model-free learning for two-player zero-sum partially observable Markov games with perfect recall, in Neural Information Processing Systems (NeurIPS 2021) arXiv preprint, talk poster
Yunhao Tang*, Tadashi Kozuno>, Mark Rowland, Rémi Munos, Michal Valko: Unifying gradient estimators for meta-reinforcement learning via off-policy evaluation, in Neural Information Processing Systems (NeurIPS 2021) arXiv preprint
Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Patraucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman: Broaden your views for self-supervised video learning, in International Conference on Computer Vision (ICCV 2021) arXiv preprint poster
Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko: UCB Momentum Q-learning: Correcting the bias without forgetting, in International Conference on Machine Learning (ICML 2021) [long talk - 3% acceptance rate] arXiv preprint poster talk
Pierre Ménard, Omar Darwiche Domingues, Emilie Kaufmann, Anders Jonsson, Edouard Leurent, Michal Valko: Fast active learning for pure exploration in reinforcement learning, in International Conference on Machine Learning (ICML 2021) arXiv preprint, bibtex
Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel: Revisiting Peng's Q(λ) for for modern reinforcement learning, in International Conference on Machine Learning (ICML 2021) arXiv preprint bibtex
Yunhao Tang, Mark Rowland, Rémi Munos, Michal Valko: Taylor expansion of discount factors, in International Conference on Machine Learning (ICML 2021) arXiv preprint
Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet: Online A-optimal design and active linear regression, in International Conference on Machine Learning (ICML 2021) arXiv preprint poster
Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: Kernel-based reinforcement Learning: A finite-time analysis, in International Conference on Machine Learning (ICML 2021) arXiv preprint bibtex
Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Perolat, Bart De Vylder, Ali Eslami, Mark Rowland, Andrew Jaegle, Remi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis: Game plan: What AI can do for football, and what football can do for AI, in Journal of Artificial Intelligence Research (JAIR 2021) arXiv preprint
Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko: A kernel-based approach to non-stationary reinforcement learning in metric spaces, in in International Conference on Artificial Intelligence and Statistics (AISTATS 2021) and (ICML 2020 - RL Theory) [oral - 6% acceptance rate] arXiv preprint video
Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko: Episodic reinforcement learning in finite MDPs: Minimax lower bounds revisited, in Algorithmic Learning Theory (ALT 2021) arXiv preprint video bibtex
Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Sample complexity bounds for stochastic shortest path with a generative model, in Algorithmic Learning Theory (ALT 2021) video
Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko: Adaptive reward-free exploration, in arXiv preprint, video 1 video 2 in Algorithmic Learning Theory (ALT 2021) and (ICML 2020 - RL Theory) bibtex
Guillaume Gautier, Rémi Bardenet, Michal Valko: Fast sampling from β-ensembles, Statistics and Computing (Statistics and Computing 2021) arXiv preprint, bibtex
Mehdi Azabou, Mohammad Gheshlaghi Azar, Ran Liu, Chi-Heng Lin, Erik C. Johnson, Kiran Bhaskaran-Nair, Max Dabagia, Bernardo Ávila Pires, Lindsey Kitchell, Keith B. Hengen, William Gray-Roncal, Michal Valko, Eva L. Dyer: Mine Your Own vieW: Self-supervised learning through across-sample prediction, in NeurIPS 2021 Workshop: Self-Supervised Learning - Theory and Practice, arXiv preprint
Omar Darwiche Domingues, Corentin Tallec, Rémi Munos, Michal Valko: Density-based bonuses on learned representations for reward-free exploration in deep reinforcement learning, in ICML 2021 Workshop: Unsupervised Reinforcement Learning, openreview

2020

Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko: Bootstrap Your Own Latent: A new approach to self-supervised learning, in Neural Information Processing Systems (NeurIPS 2020) [oral - 1% acceptance rate]
- arXiv preprint bibtex
- our twitter announcement and our code
- youtube video by Yannic, unoffical blog 1 and unoffical blog 2
Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko: BYOL works even without batch statistics, in NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice arXiv preprint bibtex
Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Improved sample complexity for incremental autonomous exploration in MDPs, in Neural Information Processing Systems (NeurIPS 2020) [oral - 1% acceptance rate] arXiv preprint poster bibtex
Daniele Calandriello*, Michał Dereziński*, Michal Valko: Sampling from a k-DPP without looking at all items, in Neural Information Processing Systems (NeurIPS 2020) spotlight - 3% acceptance rate] arXiv preprint bibtex
Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko: Statistical efficiency of Thompson sampling for combinatorial semi-bandits, in Neural Information Processing Systems (NeurIPS 2020) arXiv preprint bibtex
Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko: Planning in Markov decision processes with gap-dependent sample complexity, in Neural Information Processing Systems (NeurIPS 2020) arXiv preprint poster bibtex
Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos: Monte-Carlo tree search as regularized policy optimization, in International Conference on Machine Learning (ICML 2020) talk arXiv preprint video bibtex
Yunhao Tang, Michal Valko, Rémi Munos: Taylor expansion policy optimization, in International Conference on Machine Learning (ICML 2020) and (MS RL Day 2021) arXiv preprint talk bibtex
Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko: Gamification of pure exploration for linear bandits, in International Conference on Machine Learning (ICML 2020) arXiv preprint talk bibtex
Pierre Perrault, Zheng Wen, Jennifer Healey, Michal Valko: Budgeted online influence maximization, in International Conference on Machine Learning (ICML 2020) video bibtex
Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric: No-regret exploration in goal-oriented reinforcement learning, in International Conference on Machine Learning (ICML 2020) arXiv preprint talk bibtex
Aadirupa Saha, Pierre Gaillard, Michal Valko: Improved sleeping bandits with stochastic action sets and adversarial rewards, in International Conference on Machine Learning (ICML 2020) arXiv preprint talk bibtex
Anne Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko: Stochastic bandits with arm-dependent delays, in International Conference on Machine Learning (ICML 2020) and (GPSD 2020) and (WiML 2019) arXiv preprint video bibtex
Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Near-linear time Gaussian process optimization with adaptive batching and resparsification, in International Conference on Machine Learning (ICML 2020) and (OPT 2019) arXiv preprint talk bibtex
Pierre Perrault, Vianney Perchet, Michal Valko: Covariance-adapting algorithm for semi-bandits with application to sparse rewards, in Conference on Learning Theory (COLT 2020), video bibtex
Xuedong Shang, Rianne de Heide, Emilie Kaufmann, Pierre Ménard, Michal Valko: Fixed-confidence guarantees for Bayesian best-arm identification, in International Conference on Artificial Intelligence and Statistics (AISTATS 2020) talk bibtex arXiv preprint
Côme Fiegel, Victor Gabillon, Michal Valko: Adaptive multi-fidelity optimization with fast learning rates in International Conference on Artificial Intelligence and Statistics (AISTATS 2020) bibtex
Victor Gabillon, Rasul Tutunov, Michal Valko, Haitham Bou Ammar: Derivative-free & order-robust optimisation in International Conference on Artificial Intelligence and Statistics (AISTATS 2020) bibtex talk
Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko: A single algorithm for both restless and rested rotting bandits, in International Conference on Artificial Intelligence and Statistics (AISTATS 2020) bibtex talk
Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric: Reward-free exploration beyond finite-horizon, in Theoretical Foundations of RL Workshop @ ICML 2020 (ICML 2020 - RL Theory) video
Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko: Spectral Bandits, Journal of Machine Learning Research (JMLR 2020) bibtex

2019

Jean-Bastien Grill*, Omar Darwiche Domingues*, Pierre Ménard, Rémi Munos, Michal Valko: Planning in entropy-regularized Markov decision processes and games, in Neural Information Processing Systems (NeurIPS 2019) bibtex poster
Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos: Multiagent evaluation under incomplete information, in Neural Information Processing Systems (NeurIPS 2019) bibtex
Michał Dereziński*, Daniele Calandriello*, Michal Valko: Exact sampling of determinantal point processes with sublinear time preprocessing, in Neural Information Processing Systems (NeurIPS 2019) and (ICML 2019 - NEGDEP) bibtex video
Guillaume Gautier, Rémi Bardenet, Michal Valko: On two ways to use determinantal point processes for Monte Carlo integration, in Neural Information Processing Systems (NeurIPS 2019) and (ICML 2019 - NEGDEP) bibtex video
Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco: Gaussian process optimization with adaptive sketching: Scalable and no regret, in Conference on Learning Theory (COLT 2019) and (ICML 2019 - NEGDEP) and (SWSL 2019) bibtex video talk poster
Peter Bartlett, Victor Gabillon, Jennifer Healey, Michal Valko: Scale-free adaptive planning for deterministic dynamics & discounted rewards in International Conference on Machine Learning (ICML 2019) bibtex video talk poster
Pierre Perrault, Vianney Perchet, Michal Valko: Exploiting structure of uncertainty for efficient matroid semi-bandits, in International Conference on Machine Learning (ICML 2019) bibtex video talk poster
Xuedong Shang, Emilie Kaufmann, Michal Valko: A simple dynamic bandit-based algorithm for hyper-parameter tuning, in shop on Automated Machine Learning at International Conference on Machine Learning (ICML 2019 - AutoML) bibtex poster code
Guillaume Gautier, Rémi Bardenet, Michal Valko: DPPy: Sampling determinantal point processes with Python, Journal of Machine Learning Research (JMLR 2019) arXiv preprint bibtex
Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko: Rotting bandits are not harder than stochastic ones, in International Conference on Artificial Intelligence and Statistics (AISTATS 2019) bibtex talk poster [full oral presentation - 2.5% acceptance rate]
Andrea Locatelli, Alexandra Carpentier, Michal Valko: Active multiple matrix completion with adaptive confidence sets, in International Conference on Artificial Intelligence and Statistics (AISTATS 2019) bibtex talk poster
Pierre Perrault, Vianney Perchet, Michal Valko: Finding the bandit in a graph: Sequential search-and-stop, in International Conference on Artificial Intelligence and Statistics (AISTATS 2019) bibtex poster
Peter L. Bartlett, Victor Gabillon, Michal Valko: A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption, in Algorithmic Learning Theory (ALT 2019) bibtex talk 1 talk 2
Xuedong Shang, Emilie Kaufmann, Michal Valko: General parallel optimization without metric, in Algorithmic Learning Theory (ALT 2019) bibtex talk
Guillaume Gautier, Rémi Bardenet, Michal Valko: Les processus ponctuels déterminantaux en apprentissage automatique bibtex (Gretsi 2019)

2018

Jean-Bastien Grill, Michal Valko, Rémi Munos: Optimistic optimization of a Brownian, in Neural Information Processing Systems (NeurIPS 2018) bibtex poster
Xuedong Shang, Emilie Kaufmann, Michal Valko: Adaptive black-box optimization got easier: HCT needs only local smoothness, in European Workshop on Reinforcement Learning (EWRL 2018) bibtex poster
Edouard Oyallon, Eugene Belilovsky, Sergey Zagoruyko, Michal Valko: Compressing the input for CNNs with the first-order scattering transform, in European Conference on Computer Vision (ECCV 2018) bibtex poster
Daniele Calandriello, Ioannis Koutis, Alessandro Lazaric, Michal Valko: Improved large-scale graph learning through ridge spectral sparsification, in International Conference on Machine Learning (ICML 2018) bibtex talk poster
Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek, Michal Valko: Best of both worlds: Stochastic & adversarial best-arm identification, Conference on Learning Theory (COLT 2018) bibtex video talk poster

2017

Daniele Calandriello, Alessandro Lazaric, Michal Valko: Efficient second-order online kernel learning with adaptive embedding, in Neural Information Processing Systems (NeurIPS 2017) bibtex talk poster
Zheng Wen, Branislav Kveton , Michal Valko, Sharan Vaswani: Online influence maximization under independent cascade model with semi-bandit feedback, in Neural Information Processing Systems (NeurIPS 2017) bibtex
Daniele Calandriello, Alessandro Lazaric, Michal Valko: Second-order kernel online convex optimization with adaptive sketching, in International Conference on Machine Learning (ICML 2017) bibtex talk poster
Guillaume Gautier, Rémi Bardenet, Michal Valko: Zonotope hit-and-run for efficient sampling from projection DPPs, in International Conference on Machine Learning (ICML 2017) bibtex talk poster
Daniele Calandriello, Alessandro Lazaric, Michal Valko: Distributed adaptive sampling for kernel matrix approximation, in International Conference on Artificial Intelligence and Statistics (AISTATS 2017) and (ICML 2017 - LL) bibtex talk code poster
Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yu-En Liu: Trading off rewards and errors in multi-armed bandits, in International Conference on Artificial Intelligence and Statistics (AISTATS 2017) bibtex poster

2016

Michal Valko: Bandits on graphs and structures, habilitation thesis, École normale supérieure de Cachan (ENS Cachan 2016) bibtex
Jean-Bastien Grill, Michal Valko, Rémi Munos: Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, in Neural Information Processing Systems (NeurIPS 2016) bibtex talk poster [full oral presentation - 1.8% acceptance rate]
Daniele Calandriello, Alessandro Lazaric, Michal Valko: Pack only the essentials: Adaptive dictionary learning for kernel ridge regression, in Adaptive and Scalable Nonparametric Methods in Machine Learning at Neural Information Processing Systems (NeurIPS 2016 - ASNMML) bibtex poster
Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yu-En L@iu: Rewards and Errors in Multi-armed Bandit for Interactive Education, in Challenges in Machine Learning: Learning and Education workshop at Neural Information Processing Systems (NeurIPS 2016 - CIML) bibtex poster
Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric-Ambrym Maillard: Pliable rejection sampling, in International Conference on Machine Learning (ICML 2016) bibtex talk long talk poster
Daniele Calandriello, Alessandro Lazaric, Michal Valko: Analysis of Nyström method with sequential ridge leverage scores, in Uncertainty in Artificial Intelligence (UAI 2016) bibtex poster spotlight
Tomáš Kocák, Gergely Neu, Michal Valko: Online learning with Erdős-Rényi side-observation graphs, in Uncertainty in Artificial Intelligence (UAI 2016) bibtex poster spotlight
Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko: Bayesian policy gradient and actor-critic algorithms, Journal of Machine Learning Research (JMLR 2016) bibtex code
Tomáš Kocák, Gergely Neu, Michal Valko: Online learning with noisy side observations, in International Conference on Artificial Intelligence and Statistics (AISTATS 2016) bibtex talk poster [full oral presentation - 6% acceptance rate]
Alexandra Carpentier, Michal Valko: Revealing graph bandits for maximizing local influence, in International Conference on Artificial Intelligence and Statistics (AISTATS 2016) bibtex poster

2015

Jean-Bastien Grill, Michal Valko, Rémi Munos: Black-box optimization of noisy functions with unknown smoothness, in Neural Information Processing Systems (NeurIPS 2015) bibtex code, code in R poster
Alexandra Carpentier, Michal Valko: Simple regret for infinitely many armed bandits, in International Conference on Machine Learning (ICML 2015) bibtex talk poster arXiv
Manjesh Hanawal, Venkatesh Saligrama, Michal Valko, Rémi Munos: Cheap Bandits, in International Conference on Machine Learning (ICML 2015) bibtex talk poster
Daniele Calandriello, Alessandro Lazaric, Michal Valko: Large-scale semi-supervised learning with online spectral graph sparsification, in Resource-Efficient Machine Learning workshop at International Conference on Machine Learning (ICML 2015 - REML) bibtex poster
Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh: Maximum Entropy Semi-Supervised Inverse Reinforcement Learning, in International Joint Conferences on Artificial Intelligence (IJCAI 2015) bibtex talk poster

2014

Tomáš Kocák, Gergely Neu, Michal Valko, Rémi Munos: Efficient Learning by Implicit Exploration in Bandit Problems with Side Observations, in Neural Information Processing Systems (NeurIPS 2014) bibtex talk poster
Alexandra Carpentier, Michal Valko: Extreme Bandits, in Neural Information Processing Systems (NeurIPS 2014) bibtex poster
Gergely Neu, Michal Valko: Online Combinatorial Optimization with Stochastic Decision Sets and Adversarial Losses, in Neural Information Processing Systems (NeurIPS 2014) bibtex talk poster
Michal Valko, Rémi Munos, Branislav Kveton, Tomáš Kocák: Spectral Bandits for Smooth Graph Functions, in International Conference on Machine Learning (ICML 2014) bibtex slides poster
Tomáš Kocák, Michal Valko, Rémi Munos, Shipra Agrawal: Spectral Thompson Sampling, in AAAI Conference on Artificial Intelligence (AAAI 2014) bibtex slides poster
Philippe Preux, Rémi Munos, Michal Valko: Bandits attack function optimization, in IEEE Congress on Evolutionary Computation (CEC 2014) bibtex
Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh: MESSI: Maximum Entropy Semi-Supervised Inverse Reinforcement Learning, in NIPS Workshop on Novel Trends and Applications in Reinforcement Learning (NeurIPS 2014 - TCRL) bibtex
Tomáš Kocák, Michal Valko, Rémi Munos, Branislav Kveton, Shipra Agrawal: Spectral Bandits for Smooth Graph Functions with Applications in Recommender Systems, in AAAI Workshop on Sequential Decision-Making with Big Data (AAAI 2014 - SDMBD) bibtex

2013

Michal Valko, Alexandra Carpentier, Rémi Munos: Stochastic Simultaneous Optimistic Optimization, in International Conference on Machine Learning (ICML 2013) bibtex demo code, code in R slides poster talk
Michal Valko, Nathan Korda, Rémi Munos, Ilias Flaounas, Nello Cristianini: Finite-Time Analysis of Kernelised Contextual Bandits, in Uncertainty in Artificial Intelligence (UAI 2013) and (JFPDA 2013). bibtex poster spotlight code
Branislav Kveton, Michal Valko: Learning from a Single Labeled Face and a Stream of Unlabeled Data, in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2013) [spotlight] bibtex
Milos Hauskrecht, Iyad Batal, Michal Valko, Shyam Visweswaran, Gregory F. Cooper, Gilles Clermont: Outlier detection for patient monitoring and alerting, in Journal of Biomedical Informatics (JBI 2013) bibtex

2012

Michal Valko, Mohammad Ghavamzadeh, Alessandro Lazaric: Semi-supervised apprenticeship learning, in Journal of Machine Learning Research Workshop and Conference Proceedings: European Workshop on Reinforcement Learning (EWRL 2012) bibtex talk poster

2011

Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory F. Cooper, Milos Hauskrecht: Conditional Anomaly Detection with Soft Harmonic Functions, in International Conference on Data Mining (ICDM 2011) bibtex
Thomas C. Hart, Patricia M. Corby, Milos Hauskrecht, Ok Hee Ryu, Richard Pelikan, Michal Valko, Maria B. Oliveira, Gerald T. Hoehn, and Walter A. Bretz: Identification of Microbial and Proteomic Biomedicalkers in Early Childhood Caries in International Journal of Dentistry (IJD 2011) bibtex
Michal Valko: Adaptive Graph-Based Algorithms for Conditional Anomaly Detection and Semi-Supervised Learning, PhD thesis, University of Pittsburgh (PITT 2011) bibtex
Michal Valko, Hamed Valizadegan, Branislav Kveton, Gregory F. Cooper, Milos Hauskrecht: Conditional Anomaly Detection Using Soft Harmonic Functions: An Application to Clinical Alerting, Workshop on Machine Learning for Global Challenges in International Conference on Machine Learning (ICML 2011 - Global) bibtex poster spotlight

2010

Michal Valko, Branislav Kveton, Ling Huang, Daniel Ting: Online Semi-Supervised Learning on Quantized Graphs in Uncertainty in Artificial Intelligence (UAI 2010) bibtex Video: Adaptation, Video: OfficeSpace, spotlight poster
Branislav Kveton, Michal Valko, Ali Rahimi, Ling Huang: Semi-Supervised Learning with Max-Margin Graph Cuts in International Conference on Artificial Intelligence and Statistics (AISTATS 2010) bibtex
Milos Hauskrecht, Michal Valko, Shyam Visweswaram, Iyad Batal, Gilles Clermont, Gregory Cooper: Conditional Outlier Detection for Clinical Alerting in Annual American Medical Informatics Association conference (AMIA 2010) bibtex [Homer Warner Best Paper Award]
Branislav Kveton, Michal Valko, Matthai Phillipose, Ling Huang: Online Semi-Supervised Perception: Real-Time Learning without Explicit Feedback in IEEE Online Learning for Computer Vision Workshop in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010 - OLCV) [best paper Google Award] bibtex
Michal Valko, Milos Hauskrecht: Feature importance analysis for patient management decisions in International Congress on Medical Informatics (MEDINFO 2010) bibtex

2008

Michal Valko, Gregory Cooper, Amy Seybert, Shyam Visweswaran, Melissa Saul, Milos Hauskrecht: Conditional anomaly detection methods for patient-management alert systems, Workshop on Machine Learning in Health Care Applications in International Conference on Machine Learning (ICML-2008 - MLHealth) bibtex talk
Michal Valko, Milos Hauskrecht: Distance metric learning for conditional anomaly detection, International Florida AI Research Society Conference (FLAIRS 2008) bibtex
Michal Valko, Richard Pelikan, Milos Hauskrecht: Learning predictive models for combinations of heterogeneous proteomic data sources, AMIA Summit on Translational Bioinformatics (STB 2008) [outstanding paper award] bibtex talk

2007

Milos Hauskrecht, Michal Valko, Branislav Kveton, Shyam Visweswaram, Gregory Cooper: Evidence-based Anomaly Detection in Clinical Domains in Annual American Medical Informatics Association conference (AMIA 2007). [nominated for the best paper award] bibtex

2006

Wendy W. Chapman, John N. Dowling, Gregory F. Cooper, Milos Hauskrecht and Michal Valko: A Comparison of Chief Complaints and Emergency Department Reports for Identifying Patients with Acute Lower Respiratory Syndrome in Proceedings of the National Syndromic Surveillance Conference (ISDS 2006) bibtex
Miloš Hauskrecht, Richard Pelikan, Michal Valko, James Lyons-Weiler: Feature Selection and Dimensionality Reduction in Genomics and Proteomics. Fundamentals of Data Mining in Genomics and Proteomics, eds. Berrar, Dubitzky, Granzow. Springer (2006) bibtex

2005

Michal Valko, Nuno C. Marques, Marco Castelani: Evolutionary Feature Selection for Spiking Neural Network Pattern Classifiers in Proceedings of Portuguese Conference on Artificial Intelligence (EPIA 2005), eds. Bento et al., IEEE, pages 24-32. bibtex
Michal Valko Evolving Neural Networks for Statistical Decision Theory, Comenius University, Bratislava, 2005 (master thesis) (2005) Advisor: Radoslav Harman bibtex talk

older preprints

Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos: Geometric entropic exploration, arXiv preprint
Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko Michal Valko: On the approximation relationship between optimizing ratio of submodular (RS) and difference of submodular (DS) functions, arXiv preprint
Branislav Kveton, Zheng Wen, Azin Ashkan, Michal Valko: Learning to Act Greedily: Polymatroid Semi-Bandits, accepted for publication to Journal of Machine Learning Research (JMLR) bibtex arXiv preprint

Presentations

Michal Valko: Graph-Based Anomaly Detection with Soft Harmonic Functions: Presented at CS Department Research Competition (Research 2011) [#1st place] talk also at (Grad Expo 2011) and (CS DAY 2011) poster
Branislav Kveton, Michal Valko, Matthai Philiposse: Real-Time Adaptive Face Recognition, Presented at 23rd Neural Information Processing Systems conference (NeurIPS 2009), Video: Adaptation, Video: OfficeSpace, poster #1, poster #2
Michal Valko:, Branislav Kveton, Matthai Philiposse: Robust Face Recognition Using Online Learning, Presented at 9th University of Pittsburgh Science conference (SCIENCE 2009)Grad Expo 2010) talk and (CS Day 2010) poster
Michal Valko: Conditional anomaly detection with adaptive similarity metric: Presented at CS Department Research Competition (Research 2008) [#1st place] talk
Michal Valko, Milos Hauskrecht, G. Cooper, S. Visweswaran, M. Saul, A. Seybert, J. Harrison, A. Post: Conditional Anomaly Detection, Presented at (CS Day 2008) [#1st by people, #2nd by faculty] also at University of Pittsburgh, Arts & Sciences (Grad Expo 2008) poster

References

bibtex file with references I often use