Practical Machine Learning Solutions for Uncertainty Quantification in Regression Problems

This project took place in the winter term 2022. You can no longer apply to it!

Results of this project will be uploaded here in May 2023.

Who are we?
At Lidl Analytics we pursue a global mission: generating business value through data-driven decision support. To this end, we combine the expertise of our stakeholders with Analytics competencies, developing algorithms and statistical models for recognizing complex patterns in Big Data. To leverage the full potential of Analytics throughout the company, we act as a global Analytics network of 100+ members across all Lidl countries.

You will work in a cross-national team together with Slava (Senior Data Scientist) and Dennis (Data Scientist) from Lidl International and Paul (Head of Analytics) from Lidl UK.

What is our project?
We want to compare and evaluate different machine learning and deep learning models for uncertainty estimation in regression problems. The goal is to identify which models perform well in which context and use case, so that we can go beyond a single point estimate and specifically estimate the uncertainty around that point using probabilistic methods.

Why this project?
Prediction and forecasting are two central elements in our data science work at Lidl. In particular, when working with our business units, we require (un-)certainty estimates for our predictions to support their decision-making processes. For instance, it is important to know in advance how likely it is that we will reach a certain KPI, so that business decisions can be adjusted accordingly. In this situation, the point estimate by itself can only serve as a first guideline; knowing how likely the KPI is to be reached and in what range it will probably end up allows for more detailed and informed decision-making. We at Lidl Analytics assist our business units in these kinds of decisions along the entire value chain in all our business sectors and therefore require a good understanding of how and to what degree different machine learning and deep learning models can be used for uncertainty estimation.
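
To make the KPI example concrete, here is a minimal sketch of how a predictive distribution turns into a probability and a range. It assumes, purely for illustration, that the forecast uncertainty is Gaussian; all numbers and variable names are hypothetical and are not Lidl data or part of the project code.

```python
from statistics import NormalDist

# Hypothetical numbers for illustration only (not real data).
point_forecast = 100.0  # predicted KPI value (the point estimate)
forecast_std = 8.0      # assumed standard deviation of the predictive distribution
kpi_target = 105.0      # threshold the business would like to reach

# Assumption: the predictive distribution is Gaussian around the point estimate.
predictive = NormalDist(mu=point_forecast, sigma=forecast_std)

# Probability that the KPI reaches or exceeds the target.
prob_reach_target = 1.0 - predictive.cdf(kpi_target)

# Central 90% prediction interval around the point estimate.
lower = predictive.inv_cdf(0.05)
upper = predictive.inv_cdf(0.95)

print(f"P(KPI >= {kpi_target}) = {prob_reach_target:.3f}")
print(f"90% prediction interval: [{lower:.2f}, {upper:.2f}]")
```

The point forecast alone says "about 100"; the distribution additionally says how likely the target is to be met and which range is plausible. In practice, the Gaussian assumption would have to be checked or replaced by whatever distribution the chosen model provides.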

What do we want to achieve with the project?
We want to achieve two important goals with our project:

  • First, by comparing different machine learning and deep learning models and their ability to estimate uncertainty in regression problems across different data sources and use cases, we want to develop guidelines and best practices. The goal is to identify and compare the usefulness of different approaches to provide guidance for their applicability across use cases in our 30+ Lidl countries.
  • Second, the code we develop together should be integrated into our existing modelling pipelines, making it an integral part of our data science workflow, for instance for turnover or sales predictions. It therefore requires careful analysis and testing as well as the ability to scale the developed code to our 11,900 stores.

What are we going to do during the project?
The project consists of three main parts.

  • First, we need to carefully review, select, and evaluate candidate models that appear suitable for probabilistic uncertainty estimation.
  • Second, these models will then be trained on various internal and external datasets (e.g., Lidl data from our stores or sales data as well as open-source data from Kaggle or other sources). This process further involves defining suitable evaluation metrics to compare model quality and applicability across use cases as well as highlighting and explaining model performance in detail.
  • Third, the code should be applied to different use cases within the retail context, such as turnover or sales prediction, and be prepared to scale well to run for all our stores across the globe.
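
As an illustration of what evaluation metrics for uncertainty estimates could look like, the sketch below implements three common choices in plain Python: the pinball (quantile) loss, the empirical coverage of a prediction interval, and the mean interval width. The project description does not say which metrics will be used, so these are assumptions, and the function names and toy numbers are hypothetical.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_quantile: Sequence[float], q: float) -> float:
    """Average pinball (quantile) loss for predictions of the q-quantile."""
    total = 0.0
    for y, pred in zip(y_true, y_quantile):
        diff = y - pred
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def interval_coverage(y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def mean_interval_width(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Average width of the prediction intervals (narrower is sharper)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Toy example with made-up values: observed targets and predicted interval bounds.
y_true = [10.0, 12.0, 9.0, 15.0]
lower = [8.0, 11.0, 10.0, 12.0]   # e.g. predicted 5% quantiles
upper = [12.0, 14.0, 13.0, 16.0]  # e.g. predicted 95% quantiles

coverage = interval_coverage(y_true, lower, upper)
width = mean_interval_width(lower, upper)
loss_lower = pinball_loss(y_true, lower, q=0.05)

print(f"coverage={coverage:.2f}, mean width={width:.2f}, pinball(q=0.05)={loss_lower:.4f}")
```

Coverage and width are best read together: a model can reach high coverage trivially with very wide intervals, so a useful comparison looks for intervals that are as narrow as possible while still covering close to the nominal share of observations.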

Whom are we looking for?
We are seeking curious and enthusiastic students who are eager to tackle a problem with a data-driven approach. You should be interested in understanding and investigating the fundamentals that underlie common machine learning approaches and in putting these insights into a business context. We welcome everyone who has a solid understanding of data analysis, statistics, and machine learning and knows how to implement all this in Python.

What are the benefits of joining our project?
You will be a fully integrated team member from day one and work closely with your project team as well as our global analytics departments and business units. We provide all necessary IT equipment as well as access to and insights into a multitude of data sources on various areas in the retail sector. On top of that, you will be employed as a working student at Lidl for the duration of the project and enjoy all benefits we offer as a large and global company.

Sounds interesting? Then apply to our project here! We are very much looking forward to working with you during the winter term and can’t wait to meet you!

Students accepted to this project should attend online workshops at the LRZ from 10.10. to 14.10.22, unless they have proven knowledge. More information will be provided to the accepted students.