Multi-Modal Foundation Models for Particle Physics under Model Misspecification (MMFM)
In this project, we will develop novel training strategies for scalable foundation models for multi-modal particle physics data and investigate learned domain adaptation (DA) methods to mitigate domain shifts between simulation and real-world data. We combine self-supervised learning (SSL) strategies for the individual modalities and CLIP-like alignment of multi-modal embeddings with a flexible transformer-based architecture to create a versatile backbone for a wide range of particle physics use cases, such as classification, particle identification, and generative surrogate modeling. We aim to show that combining general training principles, such as masked modeling (SSL) and approximate invariance through adversarial training (DA), is a promising route towards building general-purpose models, which do not yet exist in the field.
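To make the CLIP-like alignment idea concrete, the following is a minimal NumPy sketch of a symmetric contrastive (InfoNCE-style) loss between paired embeddings from two modalities. The function name, the `temperature` value, and the toy embedding shapes are illustrative assumptions, not part of the project description; a real implementation would sit inside the transformer training loop of an autodiff framework.

```python
import numpy as np

def clip_alignment_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric CLIP-style contrastive loss (illustrative sketch).

    emb_a, emb_b: (N, d) arrays of paired embeddings; row i of each
    is the same physics event seen through a different modality.
    """
    # L2-normalise so the dot product becomes a cosine similarity
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (matched pairs) as positives
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # symmetric: match modality a -> b and b -> a
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimising this loss pulls embeddings of the same event together across modalities while pushing apart embeddings of different events within a batch, which is what makes the shared backbone reusable across downstream tasks.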
Prof. Dr. Lukas Heinrich
Data Science in Physics