Efficient AI Integration: Leveraging Small Language Models for Self-Hosted Applications
- Sponsored by: Data Reply GmbH
- Project lead: Dr. Ricardo Acevedo Cabra
- Scientific lead: Antonio Di Turi, Alessandro Palladini, and Ihab Bouaffar
- TUM co-mentor: Dr. Alessandro Scagliotti
- Term: Summer semester 2025
- Application deadline: Sunday 19.01.2025
Apply to this project here

Motivation and Values
In an era where artificial intelligence is reshaping industries, the ability to deploy language models efficiently and securely is paramount. Small Language Models (SLMs) offer a promising solution by providing powerful language processing capabilities without the extensive computational demands of larger models. By focusing on SLMs, organizations can integrate AI technologies seamlessly into their existing infrastructure, enhancing productivity and innovation while retaining control and privacy over their data and reducing dependency on external APIs.
Questions such as the following will no longer be obstacles to innovation:
- “How can we implement advanced AI solutions within our existing infrastructure?”
- “Is it feasible to deploy effective language models without specialized hardware such as GPUs?”
- “Can we tailor AI models to meet our specific organizational needs without relying on third-party services?”
- “How do we use AI models without compromising user privacy and data security?”
By focusing on SLMs, we aim to integrate AI capabilities seamlessly into organizations' existing systems. This approach enhances data sovereignty, reduces operational costs, and eliminates reliance on external services, paving the way for more secure and efficient AI adoption.
Goals
- Investigate Small Language Models: Explore the capabilities of SLMs for various organizational applications.
- Balance Quality and Efficiency: Research strategies to optimize the trade-off between model performance and computational resources.
- Deploy on Standard CPUs: Demonstrate the feasibility of running SLMs effectively on common CPU infrastructure.
- Reduce Dependency on External APIs: Develop strategies to implement AI solutions internally without relying on external providers.
Outstanding goals (optional)
- Explore deployments on the edge: Investigate the potential of deploying SLMs on edge devices for enhanced performance and accessibility.
Methods
- Exploration of Small Language Models:
  - Model Survey: Review existing SLM architectures to assess their suitability for different applications.
  - Performance Evaluation: Analyze the capabilities of SLMs in understanding and generating language within specific contexts.
- Deployment Strategies:
  - Infrastructure Assessment: Examine existing hardware capabilities to support SLM deployment.
  - CPU Deployment: Develop methods for running SLMs on CPUs, ensuring they operate efficiently within resource constraints.
  - Performance Testing: Compare CPU deployments to traditional GPU-based implementations to assess viability.
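Weight quantization is one of the main techniques that makes CPU deployment of SLMs practical (it underlies CPU-friendly formats such as llama.cpp's GGUF). The sketch below illustrates the idea with symmetric int8 quantization over a toy weight list; the function names and values are illustrative, not part of any project deliverable.

```python
# Minimal sketch of symmetric int8 weight quantization: floats are
# mapped to integer codes in [-127, 127] plus one scale factor,
# shrinking storage roughly 4x versus float32 at a small accuracy cost.

def quantize_int8(weights):
    """Return (int8 codes, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from codes and scale."""
    return [c * scale for c in codes]

weights = [0.12, -0.98, 0.45, 0.03, -0.27]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"codes: {codes}, max reconstruction error: {max_err:.4f}")
```

The reconstruction error is bounded by half the scale factor, which is why quantized SLMs typically lose little output quality while fitting comfortably in CPU memory.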
- Performance Optimization:
  - Balancing Quality and Efficiency: Research methods to optimize models for high-quality outputs with minimal computational overhead.
  - Resource Management: Explore techniques to ensure models run efficiently on different hardware without compromising output quality.
  - Benchmarking: Evaluate models to find the optimal balance between efficiency and effectiveness.
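The benchmarking step above can be sketched as a small harness that measures mean latency and token throughput for any text-generation backend. The backend here is a hypothetical stub (it just sleeps); in the project it would wrap a real CPU or GPU deployment of the same SLM so the two can be compared on identical prompts.

```python
# Hypothetical benchmark harness: times a generate() callable over
# several runs and reports mean latency and tokens per second.
import time

def benchmark(generate, prompt, runs=5):
    """Return (mean latency in seconds, tokens/sec) for a backend."""
    latencies, token_count = [], 0
    for _ in range(runs):
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        token_count = len(output.split())  # crude whitespace token count
    mean_latency = sum(latencies) / len(latencies)
    return mean_latency, token_count / mean_latency

def stub_cpu_backend(prompt):
    time.sleep(0.01)  # stand-in for CPU inference cost
    return "generated token stream " * 5

latency, tps = benchmark(stub_cpu_backend, "Summarize this document.")
print(f"mean latency: {latency * 1000:.1f} ms, throughput: {tps:.0f} tokens/s")
```

Running the same harness against CPU and GPU backends of one model gives the latency/throughput pairs needed to judge whether CPU-only deployment is viable for a given workload.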
Requirements and Opportunities
We invite participants who are motivated to learn and eager to contribute to all phases of the project. No specific technical expertise is required; enthusiasm and a proactive mindset are the most valuable assets.
Engaging in this project offers:
- Insight into AI Integration: Gain understanding of how AI models can be adapted and deployed within an organization.
- Skill Development: Learn about model optimization, deployment strategies, and performance evaluation.
- Impactful Experience: Contribute to innovative solutions that can transform organizational processes.
- Collaborative Exploration: Work alongside like-minded individuals in navigating the frontiers of AI technology.
We are excited to embark on this journey together and look forward to the innovative solutions we will create. Let's explore how to make AI more accessible, efficient, and tailored to organizational needs without relying on external dependencies.
Apply to this project here