My Google Scholar page should list all of these as well.

2025

The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training.

Fabian Schaipp, Alexander Hägele, Adrien Taylor, Umut Simsekli, Francis Bach.
Forty-second International Conference on Machine Learning (ICML 2025).
[PDF | bibtex | Code]

2024

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations.

Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben Allal, Leandro Von Werra and Martin Jaggi.
Advances in Neural Information Processing Systems (NeurIPS 2024). Spotlight Award at NeurIPS’24.
Also Spotlight Presentation at the Workshop on Next Generation of Sequence Models (NGSM) and Best Poster Award at the Workshop on Efficient Systems for Foundation Models (ES-FOMO) at ICML’24.
[PDF | bibtex | Code | Slides]

2023

BaCaDI: Bayesian Causal Discovery with Unknown Interventions.

Alexander Hägele, Jonas Rothfuss, Lars Lorch, Vignesh Ram Somnath, Bernhard Schölkopf and Andreas Krause.
Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023).
Also presented at the 1st Workshop on Causal Representation Learning at UAI 2022 (Link).
Oral presentation at AISTATS 2023 (notable paper award), ranked among the top 32 of 1689 submissions (top 1.9%).
[PDF | bibtex | Details | Code | Slides]

2021

Robustness Certification with Generative Models.

Matthew Mirman, Alexander Hägele, Pavol Bielik, Timon Gehr and Martin Vechev.
Proceedings of the 42nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2021).
[PDF | bibtex | DOI]

Publications