De novo machine learning solution for pairing enzyme and substrate
- Sponsored by: TUM Biotechnology of Natural Products and TUM Chair for Bioinformatics.
- Project lead: Dr. Ricardo Acevedo Cabra
- Scientific lead: Prof. Burkhard Rost, Prof. Wilfried Schwab, Dr. Tobias Senoner, Dr. Ivan Koludarov, Dr. Thomas Hoffmann
- TUM co-mentor: TBA
- Term: Winter semester 2025
- Application deadline: Sunday 20.07.2025
Apply to this project here

Glycosylation, the transfer of a simple sugar to another metabolite (the acceptor), is a fundamental chemical reaction in living organisms: all the building blocks of life, such as DNA, RNA, proteins, polysaccharides, and even many lipids, are linked to sugars. In nature, glycosylation is catalyzed by glycosyltransferases (GTs), which require a sugar donor substrate, in many cases UDP-glucose, and an acceptor. The donor-substrate binding site is well conserved in GT enzymes and is a useful feature for identifying the relevant GT genes in sequenced genomes, but the acceptor-binding pocket is highly variable. This variability makes it difficult to identify, in silico, functional enzymes for a specific acceptor. The prediction methods developed so far use sequence alignments and physico-chemical properties of the acceptors to predict enzyme-substrate pairs, but they do not take 3D features into account. AlphaFold2 makes it possible to predict 3D protein structures with high accuracy. From these structures, the amino acids that frame the binding pockets of the donor and acceptor substrates can be identified and correlated with the functional substrates.
Our goal is to develop an improved prediction method by predicting the 3D structures of 120 GT proteins with AlphaFold2 and training on an in-house enzyme activity database.
Key milestones are:
(i) Cluster GTs by sequence, structure, and embeddings to reveal correlations with function,
(ii) Test docking for GT-substrate pairs using predicted structures, and
(iii) Develop a de novo machine learning solution for pairing GT and substrate.
The aim is to find GTs that glycosylate a specific substrate, so that pharmaceuticals and other bioactive metabolites can be modified in a targeted manner.
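As a rough illustration of milestone (i), the sketch below groups proteins by the cosine similarity of their embedding vectors. It is not the project's method: the GT names, the three-dimensional toy vectors, the 0.9 threshold, and the helper `cluster_by_threshold` are all hypothetical, and simple single-linkage grouping stands in for whatever clustering approach is eventually chosen. Real protein embeddings would have hundreds or thousands of dimensions.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_by_threshold(embeddings, threshold=0.9):
    """Single-linkage grouping (hypothetical helper).

    Two proteins are linked when their cosine similarity is at least
    `threshold`; clusters are the connected components of those links.
    """
    names = list(embeddings)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent pointers to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine_similarity(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for name in names:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy data only: invented names and vectors, not real GT embeddings.
toy_embeddings = {
    "GT_A": [1.0, 0.1, 0.0],
    "GT_B": [0.9, 0.2, 0.0],
    "GT_C": [0.0, 0.1, 1.0],
    "GT_D": [0.1, 0.0, 0.9],
}

clusters = cluster_by_threshold(toy_embeddings, threshold=0.9)
print(clusters)
```

The resulting clusters could then be compared against the acceptor substrates recorded for each enzyme to see whether proteins that group together also share function.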