Feasibility and Theoretical Background of Generative AI Models
- Sponsored by: TUM Data Science in Earth Observation & German Aerospace Center (DLR)
- Project lead: Dr. Ricardo Acevedo Cabra
- Scientific lead: TUM PhD Candidate Jakob Gawlikowski
- TUM co-mentor: TBA
- Term: Winter semester 2025
- Application deadline: Sunday 20.07.2025
Apply to this project here

MOTIVATION
Earth Observation (EO) data, acquired through satellite missions such as Sentinel-2, plays a vital role in monitoring environmental changes and supporting disaster response. In such applications, generative models are increasingly used for tasks like modality translation (e.g., generating synthetic optical imagery from radar), cloud removal, or resolution enhancement. However, these tasks are prone to hallucinations: deceptive artifacts in an AI-based prediction that resemble highly plausible details or features.

On the one hand, these hallucinations are rooted in the task itself. The input data often lacks the information necessary to establish a true, injective mapping from the input to the output. For instance, in super-resolution, small details of the high-resolution image are not visible in its low-resolution counterpart. On the other hand, data-driven approaches are sensitive to changes in the data distribution, and hallucinations also arise when the input data differs from the data used to train the model. In such cases, the network extracts features from the input and maps them to an output space that was optimized on the training data, often producing outputs that appear realistic but are actually inaccurate. These hallucinations are caused by model-dependent (epistemic) uncertainty. Common examples include the occurrence of clouds, or changes in the landscape caused by natural disasters such as wildfires, floods, or earthquakes, which are typically not represented in the training data.
GOAL
The primary objective of this project is to systematically investigate hallucinations caused by shifts in the data distribution, with a focus on super-resolution tasks. The project will build upon existing pre-trained neural networks for both spatial and spectral super-resolution. The four key objectives are:
- Working out characteristics of different distribution shifts and collecting (additional) test data for pre-trained generative models.
- Collecting, designing, and evaluating metrics to identify distribution shifts at the input, feature, and output levels.
- Utilizing and adapting existing techniques for detecting distribution shifts, as well as developing novel methods to quantify these shifts.
- Providing a framework that helps to identify distribution shifts and corresponding hallucinations in generative models.
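A simple instance of the second objective, a metric detecting distribution shifts at the feature level, is the Mahalanobis distance of a test input's features from the training-set feature statistics. The sketch below is a minimal illustration under assumptions: the feature extractor, feature dimensionality, and the regularization constant are hypothetical and not specified by the project description.

```python
import numpy as np

def fit_reference(features):
    """Estimate mean and (regularized) inverse covariance of
    in-distribution feature vectors, shape (n_samples, n_dims)."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    # Small ridge term keeps the inverse numerically stable (an assumption).
    cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return mu, cov_inv

def mahalanobis_score(x, mu, cov_inv):
    """Distance of one feature vector from the training distribution;
    larger values suggest a distribution shift."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))
```

In practice the feature vectors would come from an intermediate layer of the pre-trained super-resolution network, and the threshold separating in- from out-of-distribution inputs would be calibrated on held-out data.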
MAIN METHODS AND DATA
The team will work with satellite data, such as from Sentinel-2, and with pre-trained generative models for evaluation purposes. To handle the large volume of data and to perform network training and evaluation, access to an HPC cluster will be provided. The team will have access to existing implementations and a toolbox for uncertainty quantification in deep learning [1], which can be utilized and extended to additional generative approaches. Moreover, the integration of existing forward operators for inverse problems, particularly those addressed by generative models for super-resolution, offers a potential direction for detecting distribution shifts. Furthermore, theoretical works on performance bounds and AI hallucinations for inverse problems, developed in our lab [2], [3] and in [4], can be incorporated into the proposed approaches with our guidance.
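The forward-operator direction can be illustrated with a basic data-consistency check for super-resolution: re-apply the (known or assumed) degradation operator A to the generated high-resolution image and compare against the observed low-resolution input. A large residual ||A(x_hat) - y|| flags predictions inconsistent with the measurement, one symptom of hallucination. The block-averaging operator below is an assumption for illustration; a real sensor model would replace it.

```python
import numpy as np

def downsample(hr, factor):
    """Block-averaging forward operator A (assumed degradation model)."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def consistency_residual(hr_pred, lr_obs, factor):
    """||A(x_hat) - y||_2: large values indicate a super-resolved image
    that does not agree with the low-resolution observation."""
    return float(np.linalg.norm(downsample(hr_pred, factor) - lr_obs))
```

Note that such a residual cannot detect hallucinations lying in the null space of A, i.e., plausible details invisible at low resolution; detecting those is what the distribution-shift metrics in the project objectives aim at.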
REQUIREMENTS FOR STUDENTS
- Background in remote sensing, machine learning (theoretical and/or practical), or related fields.
- Experience in data processing and an understanding of deep learning frameworks.
- Experience in Python or similar languages.
- Experience working with an HPC cluster is helpful.
REFERENCES
[1] N. Lehmann, N. M. Gottschling, J. Gawlikowski, A. J. Stewart, S. Depeweg, and E. Nalisnick, “Lightning uq box: Uncertainty quantification for neural networks,” Journal of Machine Learning Research, vol. 26, no. 54, pp. 1–7, 2025.
[2] N. M. Gottschling, V. Antun, A. C. Hansen, and B. Adcock, “The troublesome kernel: On hallucinations, no free lunches, and the accuracy-stability tradeoff in inverse problems,” SIAM Review, vol. 67, no. 1, pp. 73–104, 2025.
[3] N. M. Gottschling, P. Campodonico, V. Antun, and A. C. Hansen, “On the existence of optimal multi-valued decoders and their accuracy bounds for undersampled inverse problems,” arXiv preprint arXiv:2311.16898, 2023.
[4] S. Bhadra, V. A. Kelkar, F. J. Brooks, and M. A. Anastasio, “On hallucinations in tomographic image reconstruction,” IEEE Transactions on Medical Imaging, vol. 40, no. 11, pp. 3249–3260, 2021.
Apply to this project here