Theorem Proving and Machine Learning in the age of LLMs: SoA and Future Perspectives
will take place in Edinburgh, Scotland, UK, April 7th-8th 2025, organised by Ekaterina Komendantskaya, Elizabeth Polgreen, Christian Saemann, Kathrin Stark, and Michael Rawson. The event is supported by the Cost Action CA20111 - European Research Network on Formal Proofs.
There will be no remote participation, but recordings will be uploaded where possible.
Aims
Machine learning (ML) has been shown to be very successful in programming and translation tasks, and creates new opportunities combining AI with proofs. Recently, various claims have been made that large language models (LLMs) will revolutionise these areas. However, many questions about the details of the applications of LLMs and their impact on theorem proving and mathematics remain open. At the workshop, we want to bring together researchers from a wide range of communities: mathematics, automated and interactive theorem proving, machine learning, natural language processing, and formal methods, in order to discuss the state-of-the-art and future directions for this new area of research.
Examples of topics that we intend to discuss include, but are not limited to:
-
ML/LLM for Advancing Proof Techniques, e.g., tailoring LLMs to theorem-proving datasets and benchmarks, combining neural networks and symbolic reasoning for robust theorem proving, enabling LLMs to learn proof techniques from minimal data or prior examples.
-
Applications of ML/LLM in Theorem Proving, e.g., designing co-pilot systems for theorem provers like Coq, Lean, or Isabelle, auto-formalising mathematical concepts and proofs from textbooks or research papers, human-machine collaboration workflows in proof construction.
-
Challenges and Limitations of ML/LLM in Theorem Proving, e.g. addressing hallucinations and errors in proofs generated by LLMs, handling large and complex proof spaces with LLM-guided tools, mitigating biases introduced during LLM training on specific mathematical domains.
-
Benchmarks and Evaluation for ML/LLM in Theorem Proving, e.g., creating datasets specifically for evaluating LLM-based theorem proving systems, assessing the interpretability and trustworthiness of LLM-generated proofs, defining success metrics for proof assistance beyond correctness, such as creativity and accessibility.
-
Interdisciplinary Impact of ML/LLM, e.g., leveraging LLMs to teach formal methods, logic, and theorem proving to students, using LLMs to explore conjectures and new areas of mathematical research, applications in formal verification for software, hardware, and systems engineering, including industrial applications.
-
Future Directions for ML/LLM in theorem proving: e.g., the implications of relying on AI systems for critical mathematical proofs, setting open standards for the integration of LLMs in the theorem-proving workflows, speculating on the evolution of LLMs and their roles in formal reasoning.
-
Cross-Domain Connections: e.g., developing user-friendly natural language interfaces for proof assistants, adapting LLM capabilities from general domains to formal logic and proofs.
Schedule
The workshop features 27 contributions, a panel discussion and a poster session. We are proud to have three invited speakers:
- Formal Mathematical Reasoning: A New Frontier for Large Language Models: Swarat Chaudhuri (UT Austin and Google Deepmind)
- Machine Learning for Mathematics Research: Sergei Gukov (Caltech & Dublin Inst. Adv. Studies)
- Machine Learning in Theorem Proving: Josef Urban (CIIRC/CVUT)
A detailed schedule is available. For the panel session on the 8th, we invite all participants to propose suitable questions to the panelists. We will choose 5 questions to discuss in detail.
Registration
We are expecting a large number of participants, and therefore registration is mandatory. Registration is online, and will close on the 12th March 2025 at 1600 UK time.
Workshop Dinner
There is an optional workshop dinner at the Surgeon’s Hall at a cost of £63 per person. You may, if you like, invite an additional guest at extra cost. These options will be available at the time of registration.
Location
Postgraduate Centre Heriot-Watt, Heriot-Watt University, Edinburgh, EH14 4AL, UK. More information about the campus from Heriot-Watt.
Accommodation
- The only hotel on the HWU campus is the Marriott Courtyard Hotel.
- If booking accommodation in town, expect travel times of at least 25 minutes.
Dinner
To get to Surgeon’s Hall from Heriot-Watt University, the easiest possibility is to take the 45 or 35 bus.
UK Electronic Travel Authorisation
The UK has introduced an Electronic Travel Authorisation system. You may need one to enter the UK. From 2nd April, European visitors will also need an ETA. They can apply from the 5th March.
Publication plans
The chairs are investigating the possibility of organising a special journal issue dedicated to the topics of this workshop, at the J. of Annals of Mathematics and AI. Further details will be discussed at the time of the workshop.
Travel Grants
The travel and accommodation of a number of participants (approximately 12) will be supported by the Cost Action CA20111 - European Research Network on Formal Proofs. Applications have now closed.
Abstracts
Steven Obua: Abstraction Logic Is All You Need
Alberto Gandolfi: How LLMs could Fool a Proof Checker: The Risks of Inconsistent Assumptions in Automated Proof Systems
Markus Pantsar: The need for ethical guidelines in mathematical research in the time of generative AI
Tim Fernando: Between min cost search and least action
Konstantinos Kogkalidis, Orestis Melkonian and Jean-Philippe Bernardy: Structure-Aware Neural Representations of Agda Programs
Huajian Xin: Frontier of Formal Theorem Proving with Large Language Models: Insights from the DeepSeek-Prover
Samuel Teuber and Bernhard Beckert: Towards LLM-support for Deductive Verification of Java Programs
Jonathan Julian Huerta Y Munive: DeepIsaHOL progress report: current machine learning for the Isabelle proof assistant
Job Petrovčič, Sebastian Mežnar and Ljupčo Todorovski: Kernel-level expression generator
Robin Rawiel and Lukas Niehaus: Reinforcement Learning for Term Rewrite Systems
Laetitia Teodorescu, Guillaume Baudart, Emilio Jesús Gallego Arias and Marc Lelarge: NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving
Aleksandr Shefer, Igor Engel, Stanislav Alekseev, Daniil Berezun, Ekaterina Verbitskaia and Anton Podkopaev: Are LLMs Ready for Software Verification?
Alexandra Fikiori, Asterios Gkantzounis, Foivos Skarpelos and Petros Stefaneas: On Abstract Logics for Interacting Provers
Guy Axelrod, Moa Johansson and Andrea Silvi: Intrinsic Motivation To Construct Terms In A Dependent Type Theory
Gleb Solovev, Nikita Khramov, Andrei Kozyrev and Anton Podkopaev: CoqPilot benchmarking framework
Jules Viennot, Guillaume Baudart, Emilio Jesús Gallego Arias and Marc Lelarge: MiniF2F in Rocq: Automatic Translation Between Proof Assistants — A Case Study
Rashid Barket, Uzma Shafiq, Matthew England and Jürgen Gerhard: Transformers to Predict the Applicability of Symbolic Integration Routines
Artjoms Sinkarovs: Correct by Construction Machine Learning
Irina Starikova: Mathematics and Creative AI
Luca Pasetto and Christoph Benzmüller: The Fatio Protocol for Formal Dialogue in Isabelle/HOL
Maxim Zyskin: Chessboard covers and AI.
Richard Thompson, Adam Pease, Angelos Toutsios, Roberto Milanese Jr. and Jarrad Singley: Formalizing Natural Language: Cultivating LLM Translations Using Automated Theorem Proving
Cameron Freer, Alexander Lew, Timothy O’Donnell and Vikash Mansinghka: Autoformalization and autoinformalization via probabilistic reasoning
Yutaka Nagashima and Daniel Sebastian Goc: Proof By Abduction in Isabelle/HOL
Stephan Schulz: Automated Theorem Provers as the Hub of the AI Math Ecosystem
Yousef Alhessi, Sólrún Halla Einarsdóttir, Emily First, George Granberry, Moa Johansson and Nicholas Smallbone: Neuro-Symbolic Lemma Conjecturing
Alexei Lisitsa: In search of simplicity: a case study in ML and LLM assisted applications of ATP in mathematics
Bartosz Naskręcki: From Mathematical Theory to Machine Learning: Detecting Symmetry Groups in Crystallographic Tilings
Photos
Further Acknowledgements
We are grateful for support from the following organisations in addition to EuroProofNet.