Software & Datasets
Software Libraries
Google Vizier
Python-based service for black-box optimization and hyperparameter tuning, open-sourced by Google as OSS Vizier. Includes bandit-style optimization algorithms. Maintained by Google.
SMPyBandits
Python package for single- and multi-player multi-armed bandits. Implements numerous bandit algorithms including UCB, Thompson Sampling, KL-UCB, and many variants.
contextualbandits
Python library for contextual bandits. Implements LinUCB, Thompson Sampling, and other contextual algorithms, using scikit-learn-style classifiers as base learners.
EconML
Microsoft library for causal inference from observational data. Python implementation focused on estimating heterogeneous treatment effects and on policy learning, which makes it relevant to offline contextual bandit problems.
Vowpal Wabbit
Fast machine learning library with extensive support for contextual bandits. Industry-grade implementation used at scale by Microsoft and others.
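As a library-independent illustration of the UCB family that several of the packages above implement, here is a minimal sketch of UCB1 on simulated Bernoulli arms. The function name `ucb1`, the arm means, and the horizon are assumptions chosen for the example. It does not use or reproduce the API of any library listed here.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms; return pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull each arm once so every empirical mean is defined.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_a).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated Bernoulli reward (hypothetical environment).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts)
```

Under these assumptions the arm with mean 0.8 should receive the large majority of pulls. The libraries above provide tuned and extended variants (for example KL-UCB), which are preferable to this sketch for real experiments.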
Benchmark Datasets
Criteo Ad Placement Dataset
Large-scale display advertising dataset for contextual bandits. Contains real-world logged data from Criteo's ad placement system with features and rewards. Widely used for offline benchmarking of contextual bandit algorithms on production-scale data.
Yahoo Webscope Datasets
Historical collection including the R6 Front Page Today Module dataset (45M+ user visits). Classic benchmark for contextual bandits in news recommendation. Note: the Yahoo Webscope program has been discontinued, but the datasets may still be available through archives.
RecSim & OpenAI Gym Environments
Simulation environments for recommendation and bandit algorithms. Allow reproducible benchmarking without real user data. Scenarios range from simple multi-armed bandits to complex contextual settings.
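Logged datasets such as those above are typically used for offline policy evaluation. The sketch below shows one common approach, inverse propensity scoring (IPS), on a small synthetic log. It assumes each logged record carries the logging policy's probability (propensity) of the displayed action. The tuple layout, the `ips_value` name, and the click probabilities are hypothetical and do not reflect the actual schema of any dataset listed here.

```python
import random


def ips_value(logs, target_policy):
    """Inverse-propensity-scored estimate of a policy's average reward.

    logs: iterable of (context, action, reward, propensity) tuples, where
        propensity is the logging policy's probability of the logged action.
    target_policy: function mapping a context to a deterministic action.
    """
    total, n = 0.0, 0
    for context, action, reward, propensity in logs:
        n += 1
        # Only records where the target policy agrees with the log contribute,
        # reweighted by how unlikely the logging policy was to show that action.
        if target_policy(context) == action:
            total += reward / propensity
    return total / n


# Hypothetical synthetic log: 2 contexts, 2 actions, uniform logging policy.
rng = random.Random(0)
logs = []
for _ in range(20000):
    context = rng.randrange(2)
    action = rng.randrange(2)  # logged uniformly, so propensity is 0.5
    p_click = 0.7 if action == context else 0.2
    reward = 1.0 if rng.random() < p_click else 0.0
    logs.append((context, action, reward, 0.5))

matched = ips_value(logs, lambda c: c)         # true value is 0.7
mismatched = ips_value(logs, lambda c: 1 - c)  # true value is 0.2
print(round(matched, 3), round(mismatched, 3))
```

The estimate is unbiased only if the recorded propensities are correct and every action the target policy can choose had nonzero probability under the logging policy. With real logs, variance can be high when propensities are small, so clipped or self-normalized variants are often used instead.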

















