Unlocking Rare Diseases: An AI-Powered Diagnostic Revolution

Kamilia Zaripova

Rare diseases affect over 300 million people worldwide, yet most patients wait 5-7 years for a correct diagnosis. While over 7,000 rare diseases exist, 80% are genetic and nearly half lack identified causative genes. This diagnostic challenge—called the “diagnostic odyssey”—is being transformed by artificial intelligence. This blog explores how AI tools like knowledge graphs and machine learning are revolutionizing rare disease diagnosis, making it faster and more accurate for patients who need answers most.

Introduction

Despite major advances in DNA technology, half of all patients with suspected genetic diseases still go undiagnosed even after comprehensive genetic testing. Rare diseases affect fewer than 1 in 2,000 people individually, but together they impact hundreds of millions globally—creating a massive collective health challenge [van Karnebeek et al., 2024].

The “diagnostic odyssey” describes what many patients experience: years of inconclusive tests, visits to multiple specialists, and mounting medical bills without clear answers. This journey causes significant stress for families and delays treatments that could prevent disease progression or improve quality of life.

Traditional diagnosis relies on doctors manually connecting clinical symptoms with complex genetic data—a process that becomes increasingly difficult as genetic databases grow larger and more complex. Artificial intelligence offers a solution by automatically integrating patient symptoms with vast medical knowledge bases, potentially speeding up diagnosis and improving outcomes for patients with rare genetic conditions.

Key Biological Concepts

Understanding AI’s role in rare disease diagnosis requires familiarity with several important concepts:

DNA, Genome, and Exome:

The human genome contains about 3 billion DNA “letters” that serve as instructions for human development. The exome, the part that codes for proteins, makes up only about 2% of the genome but contains roughly 85% of known disease-causing mutations. Because so much diagnostic signal sits in so little sequence, exome sequencing is a cost-effective choice for diagnosis.

Phenotype vs. Genotype:

Phenotype refers to observable characteristics like physical features, symptoms, and test results. Genotype refers to a person’s complete genetic makeup. The relationship between them is complex—similar symptoms can have different genetic causes, and the same genetic change can sometimes cause different symptoms.

Biomedical Knowledge Graphs (KG):

These organize medical information as interconnected networks, connecting genes, diseases, symptoms, and treatments through defined relationships. They help computers understand complex biological connections and make inferences that might not be obvious when studying components separately. An example of a patient-specific knowledge graph—the portion of the overall KG mapped with patient-specific data—is shown in Fig. 1.

Gene Prioritization:

Computational methods that rank genes based on how likely they are to cause a patient’s condition. These algorithms combine multiple types of evidence to guide diagnostic decisions.
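
As a rough illustration of what "combining multiple types of evidence" can look like, the sketch below ranks genes by a weighted sum of three evidence scores. The gene names, the score fields, and the weights are all hypothetical placeholders chosen for this example; real tools typically learn how to combine evidence from data rather than using fixed weights.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Hypothetical evidence scores for one gene, each scaled to [0, 1]."""
    gene: str
    phenotype_match: float     # how well the gene's known phenotypes fit the patient
    variant_severity: float    # predicted damage of the patient's variant in this gene
    literature_support: float  # strength of published gene-disease links


# Illustrative weights only (assumption); not taken from any published tool.
WEIGHTS = {"phenotype_match": 0.5, "variant_severity": 0.3, "literature_support": 0.2}


def combined_score(ev: GeneEvidence) -> float:
    """Weighted sum of the three evidence scores for one gene."""
    return (
        WEIGHTS["phenotype_match"] * ev.phenotype_match
        + WEIGHTS["variant_severity"] * ev.variant_severity
        + WEIGHTS["literature_support"] * ev.literature_support
    )


def prioritize(evidence: list[GeneEvidence]) -> list[tuple[str, float]]:
    """Return (gene, score) pairs sorted from most to least likely."""
    scored = [(ev.gene, round(combined_score(ev), 3)) for ev in evidence]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up candidates for demonstration.
candidates = [
    GeneEvidence("GENE_A", 0.9, 0.4, 0.7),
    GeneEvidence("GENE_B", 0.3, 0.9, 0.2),
    GeneEvidence("GENE_C", 0.6, 0.6, 0.6),
]
ranking = prioritize(candidates)
print(ranking)
```

In this toy setup GENE_A ranks first because phenotype match carries the largest weight, even though GENE_B has the most damaging variant.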

Figure 1: An example of a simplified patient-specific one-hop subgraph and its neighborhood, illustrating the relationships between the POLR3A gene and associated genes, diseases, and phenotypes. The real patient subgraph maintains a similar structure but is substantially more complex.
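
To make the idea of a one-hop subgraph concrete, here is a minimal sketch that stores a knowledge graph as (head, relation, tail) triples and extracts everything directly connected to POLR3A. The triples and relation names are a small illustrative toy set written for this example, not an excerpt from a curated knowledge graph, and the helper names are hypothetical.

```python
# Toy triples (head, relation, tail). Illustrative only, not curated data.
TRIPLES = [
    ("POLR3A", "associated_with", "Hypomyelinating leukodystrophy"),
    ("POLR3A", "interacts_with", "POLR3B"),
    ("POLR3B", "associated_with", "Hypomyelinating leukodystrophy"),
    ("Hypomyelinating leukodystrophy", "has_phenotype", "Ataxia"),
    ("Hypomyelinating leukodystrophy", "has_phenotype", "Hypodontia"),
    ("GENE_X", "associated_with", "Disease_Y"),  # unrelated placeholder edge
]


def one_hop_subgraph(triples, center):
    """Return every triple in which `center` is the head or the tail."""
    return [t for t in triples if center in (t[0], t[2])]


def neighbors(triples, center):
    """Return the set of nodes exactly one edge away from `center`."""
    found = set()
    for head, _relation, tail in one_hop_subgraph(triples, center):
        found.add(tail if head == center else head)
    return found


subgraph = one_hop_subgraph(TRIPLES, "POLR3A")
polr3a_neighbors = neighbors(TRIPLES, "POLR3A")
print(subgraph)
print(sorted(polr3a_neighbors))
```

A patient-specific subgraph can be thought of as the union of such neighborhoods around the patient's phenotype terms, which is why real subgraphs grow far larger than this example.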

Major Diagnostic Challenges

Several interconnected problems make rare disease diagnosis particularly difficult:

Scale and Knowledge Gaps:

There are over 7,000 rare diseases but fewer than 500 approved treatments [van Karnebeek et al., 2024], and most doctors encounter only a small fraction of these conditions during their careers. For nearly half of genetic diseases, we still don’t know which genes are responsible.

Long Diagnostic Delays:

Patients typically wait 5-7 years for diagnosis [van Karnebeek et al., 2024], during which they may receive wrong diagnoses, undergo unnecessary procedures, and miss critical treatment windows. This creates “diagnostic trauma”—psychological stress from prolonged medical uncertainty.

Population Bias:

Genetic research databases are 78% European [van Karnebeek et al., 2024], limiting diagnostic accuracy for patients from other ethnic backgrounds. This perpetuates health disparities and reduces the effectiveness of genetic medicine for underrepresented populations.

Uncertain Genetic Variants:

Modern genetic testing finds millions of genetic changes, many classified as “variants of uncertain significance”—changes where we don’t know if they cause disease. This uncertainty complicates diagnosis because doctors can’t determine which variants are actually problematic.

Figure 2: Clinical workflow transformation: Traditional vs. AI-enhanced approach. (A) Traditional physician workflow involves four sequential steps with significant challenges at each stage, requiring 5-7 years for diagnosis with 50-60% success rate. Physicians face pattern recognition bias during assessment, time-intensive literature searches, limited gene panel coverage (20-50 genes), and complex variant interpretation. (B) AI-enhanced workflow streamlines the process into three integrated steps using standardized Human Phenotype Ontology (HPO) terminology, comprehensive genome-wide analysis of 8,000+ genes, and clinical decision support, reducing diagnosis time to 3-6 months with 70-85% success rate. The impact demonstrates substantial improvements across time efficiency, gene coverage, diagnostic quality, and physician experience, representing a paradigm shift toward more systematic and comprehensive rare disease diagnosis.

Most of the 5-7 years in a traditional diagnostic odyssey is not spent on direct physician work. The delay arises from fragmented care pathways: patients visit multiple specialists, undergo sequential tests, and wait months between appointments. Each negative result restarts the process, while physicians manually review literature and re-evaluate variants as new discoveries appear. AI-enhanced workflows shorten this dramatically by integrating genetic and phenotypic data within hours, but clinical confirmation, genetic counseling, and ethical reporting still require human oversight, which is why the total time remains several months rather than days.

Bridging the Gap: Where AI Fits In

From a clinician’s perspective, diagnosing rare diseases often involves integrating many layers of information—clinical observations, laboratory data, imaging findings, and, when available, genetic sequencing results. Each case is unique, and while genetic technologies have advanced rapidly, interpreting results in the right clinical context remains complex. Artificial intelligence can support this process by helping physicians navigate the growing body of genomic and phenotypic knowledge more efficiently.

In practice, AI tools act as an extension of clinical reasoning. They aggregate information from published studies, variant databases, and phenotype ontologies, highlighting potential links between a patient’s findings and known genetic mechanisms. Instead of replacing medical judgment, they provide structured, evidence-based suggestions that help guide further investigation. For example, an AI system might propose a previously unconsidered gene–disease connection consistent with the patient’s phenotype, prompting a focused follow-up rather than broad, exploratory testing.

Recent advances in AI-assisted diagnostics have begun to shift the analytical focus from genomic sequencing data toward clinical phenotypes. Unlike earlier systems that rely primarily on genetic information or predefined candidate gene lists, these emerging approaches can operate directly from phenotypic descriptions available during patient evaluation. This enables the generation of preliminary diagnostic hypotheses prior to the return of laboratory results. By integrating observable clinical features with structured biomedical knowledge graphs, such methods can prioritize genes most consistent with the observed presentation. In clinical practice, this facilitates earlier hypothesis formulation, more targeted testing strategies, and improved communication with patients and families regarding potential diagnostic trajectories.

Ultimately, these developments align closely with established patterns of clinical reasoning—beginning with symptom characterization, followed by differential diagnosis and iterative refinement as new data emerge. Embedding AI within this workflow has the potential to accelerate diagnostic processes, enhance transparency, and broaden the scope of analysis, while maintaining the clinician’s central role in interpretation and decision-making.

How AI is Changing the Game

Researchers have developed three main AI approaches to tackle diagnostic challenges, each addressing different aspects of the rare disease puzzle:

DNA-Based Methods:

What they do: These tools analyze genetic sequencing data to identify disease-causing DNA variants.
How they work: Tools like MutationTaster [Steinhaus et al., 2021], CADD [Rentzsch et al., 2019], and M-CAP [Jagadeesh et al., 2016] use machine learning algorithms trained on millions of known genetic variants. They assess factors like how often variants appear in healthy populations, how they affect protein structure, and whether similar variants have been linked to disease before.
Why this approach: If you can identify the exact genetic “typo” causing a patient’s condition, you can provide a definitive diagnosis and targeted treatment. However, these tools have two limitations: they depend on comprehensive databases of well-characterized variants to make accurate predictions, and they can only be applied after the patient has already had a genetic test.
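
The kind of evidence these tools weigh can be illustrated with a much simpler rule-based filter. The sketch below keeps variants that are both rare in the reference population and predicted to be damaging. It is not how MutationTaster, CADD, or M-CAP work internally (those use trained models); the field names, thresholds, and variants are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_frequency: float  # fraction of reference individuals carrying the variant
    deleteriousness: float       # predicted damage score; higher means more damaging


# Hypothetical cutoffs for this example, not clinical recommendations.
MAX_FREQUENCY = 0.001        # keep variants seen in fewer than 0.1% of people
MIN_DELETERIOUSNESS = 20.0   # keep variants scored at or above this value


def shortlist(variants: list[Variant]) -> list[Variant]:
    """Keep rare, damaging variants and sort them from most to least damaging."""
    kept = [
        v for v in variants
        if v.population_frequency < MAX_FREQUENCY
        and v.deleteriousness >= MIN_DELETERIOUSNESS
    ]
    return sorted(kept, key=lambda v: v.deleteriousness, reverse=True)


# Made-up variants for demonstration.
patient_variants = [
    Variant("GENE_A", 0.15, 25.0),    # damaging score but common: filtered out
    Variant("GENE_B", 0.0001, 31.0),  # rare and damaging: kept
    Variant("GENE_C", 0.0005, 8.0),   # rare but likely benign: filtered out
    Variant("GENE_D", 0.0, 22.5),     # never seen in the reference set: kept
]
suspects = shortlist(patient_variants)
print([v.gene for v in suspects])
```

Even this crude filter shows why population databases matter: a variant can only be called "rare" relative to the populations that have actually been sequenced.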

Symptom-Based Tools:

What they do: These focus on observable patient characteristics—symptoms, physical features, and test results—rather than genetic data.
How they work: Phenolyzer [Yang et al., 2015, Köhler et al., 2009] mines medical literature to build connections between symptoms and relevant genes, essentially asking “which genes are most often mentioned alongside these symptoms in research papers?” Facial analysis tools like DeepGestalt [Gurovich et al., 2019] and GestaltMatcher [Hsieh et al., 2022] use computer vision to identify subtle facial features that human doctors might miss—analyzing factors like eye spacing, nose shape, and facial proportions that can indicate specific genetic syndromes.
Why this approach: Many genetic conditions have distinctive patterns of symptoms that appear before genetic testing, and some patients can’t access expensive genetic testing. However, these tools struggle when patients have unusual combinations of symptoms or when multiple conditions cause similar features.
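
A minimal sketch of phenotype matching is shown below. It ranks diseases by how much their known symptom sets overlap with the patient's symptoms, using Jaccard similarity. This is a deliberately simple stand-in: ontology-based methods such as the semantic similarity search of Köhler et al. also account for how closely related and how specific the terms are, which this example ignores. The disease names and symptom sets are invented placeholders.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the overlap divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical disease-to-phenotype annotations for illustration.
DISEASE_PHENOTYPES = {
    "Disease_1": {"Seizure", "Ataxia", "Microcephaly"},
    "Disease_2": {"Ataxia", "Hypodontia", "Short stature"},
    "Disease_3": {"Hearing impairment", "Short stature"},
}


def rank_diseases(patient_terms: set[str],
                  disease_phenotypes: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Return (disease, similarity) pairs sorted from best to worst match."""
    scores = {
        name: jaccard(patient_terms, terms)
        for name, terms in disease_phenotypes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


patient = {"Ataxia", "Hypodontia"}
disease_ranking = rank_diseases(patient, DISEASE_PHENOTYPES)
print(disease_ranking)
```

The example also hints at the weakness noted above: a patient with an unusual combination of symptoms may overlap only weakly with every annotated disease, leaving no clear top match.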

Combined Approaches:

What they do: Hybrid methods integrate both genetic and clinical information to improve diagnostic accuracy.
How they work: AI-MARRVEL (AIM) [Mao et al., 2024] uses a random forest, an ensemble of decision trees that works like a panel of experts: each tree votes on which genetic variants are most suspicious, and the votes are then combined. It analyzes both DNA sequencing files and standardized symptom descriptions simultaneously. Knowledge graph methods like CADA [Peng et al., 2021] and SHEPHERD [Alsentzer et al., 2025] create interconnected maps of medical knowledge, linking symptoms to diseases to genes through established relationships, then use AI to find the shortest “path” from a patient’s symptoms to potential genetic causes. Two other widely used tools are Amelie [Birgmeier et al., 2020] and PhenoApt [Chen et al., 2023], which use clinical phenotypes to assist gene prioritization.
Amelie focuses on literature-based gene discovery, ranking candidate genes based on the similarity between a patient’s symptoms (encoded in Human Phenotype Ontology terms) and phenotype–gene associations mined from millions of scientific papers. It cannot work without a candidate gene list. PhenoApt, in contrast, operates even without prior genetic data, directly linking patient phenotypes to gene-disease knowledge graphs. Both approaches paved the way for newer methods like PhenoKG, which extend these ideas by scaling to the entire genome and removing the dependency on prior filtering.
Why this approach: Combining multiple types of evidence should theoretically provide more accurate diagnoses than using genetic or clinical data alone. However, SHEPHERD’s limitation is that it relies on doctors providing pre-selected lists of candidate genes, which limits its usefulness precisely when doctors do not already know which genes to suspect.
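
The "shortest path" intuition behind knowledge graph methods can be sketched with a plain breadth-first search over a toy graph. This is only an analogy: CADA and SHEPHERD rely on learned graph representations rather than literal hop counting, and every node name below is a made-up placeholder.

```python
from collections import deque

# Toy undirected edges linking phenotypes, diseases, a pathway, and genes.
EDGES = [
    ("Ataxia", "Disease_1"),
    ("Hypodontia", "Disease_1"),
    ("Disease_1", "GENE_A"),
    ("Ataxia", "Disease_2"),
    ("Disease_2", "Pathway_P"),
    ("Pathway_P", "GENE_B"),
]
GENES = {"GENE_A", "GENE_B"}


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def hop_distances(graph, start):
    """Breadth-first search: number of edges from `start` to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in graph.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def rank_genes_by_distance(graph, phenotypes, genes):
    """Rank genes by total hop distance to all patient phenotypes (smaller is better)."""
    per_phenotype = [hop_distances(graph, p) for p in phenotypes]
    totals = {}
    for gene in genes:
        # Skip genes that cannot be reached from every phenotype.
        if all(gene in d for d in per_phenotype):
            totals[gene] = sum(d[gene] for d in per_phenotype)
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


graph = build_graph(EDGES)
gene_ranking = rank_genes_by_distance(graph, ["Ataxia", "Hypodontia"], GENES)
print(gene_ranking)
```

Here GENE_A comes out ahead because both of the patient's phenotypes reach it through a single shared disease, whereas GENE_B is only reachable through a longer chain.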

PhenoKG:

We recently proposed a new approach, PhenoKG [Zaripova et al., 2025].
What it does: It works entirely from patient symptoms without requiring any genetic testing.
How it works: The system takes a patient’s symptom description and enriches it using biomedical knowledge graphs that capture complex relationships between symptoms, diseases, and genes. Instead of starting with a small list of 15-20 expert-suspected genes, PhenoKG ranks approximately 8,000 genes across the entire human genome by their probability of causing the patient’s condition. It uses graph-based networks to model how symptoms connect to diseases and how diseases connect to genes, capturing patient-specific patterns that might be missed by traditional approaches.
Why this matters: This approach addresses three critical limitations: it doesn’t require expensive genetic testing upfront, it doesn’t depend on doctors already knowing which genes to suspect, and it can potentially discover novel disease-gene associations that wouldn’t appear on conventional expert-curated lists. The system achieves 22.5 ± 2.5% accuracy without any prior filtering, meaning that in roughly 22.5% of cases the correct disease-causing gene appears among the top-ranked suggestions. With optional expert input, accuracy jumps to 83.9 ± 1.2%, demonstrating effectiveness both as a standalone tool and as a complement to traditional clinical workflows. Compared to existing systems, PhenoKG substantially outperforms SHEPHERD (10.4% without prior filtering, 65.9% with prior filtering), Amelie (51.8% with candidate gene lists), and PhenoApt (16.7% without candidate genes). This could be particularly valuable for patients with unusual presentations or those from underrepresented populations, where existing knowledge may be incomplete. A natural next step is to generate the candidate gene list automatically from phenotypes alone.
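
To show how figures like these can be computed, and why a candidate gene list raises them, the sketch below evaluates a "hit rate at k": the fraction of patients whose true gene lands in the top k of the ranking, with and without restricting the ranking to a candidate list. The exact metric and cutoff behind the reported numbers are not specified here, so the choice of k, the function names, and the toy cases are all assumptions made for illustration.

```python
def rank_of_true_gene(scores, true_gene, candidates=None):
    """1-based rank of `true_gene`; if `candidates` is given, rank only among them."""
    if candidates is None:
        pool = scores
    else:
        pool = {g: s for g, s in scores.items() if g in candidates}
    if true_gene not in pool:
        return None
    ordered = sorted(pool, key=pool.get, reverse=True)
    return ordered.index(true_gene) + 1


def hit_rate_at_k(cases, k, use_candidates):
    """Fraction of cases whose true gene is ranked within the top k."""
    hits = 0
    for case in cases:
        candidates = case["candidates"] if use_candidates else None
        rank = rank_of_true_gene(case["scores"], case["true_gene"], candidates)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(cases)


# Two made-up patients with model scores over a four-gene "genome".
CASES = [
    {
        "scores": {"G1": 0.9, "G2": 0.8, "G3": 0.1, "G4": 0.05},
        "true_gene": "G1",
        "candidates": {"G1", "G3"},
    },
    {
        "scores": {"G1": 0.7, "G2": 0.6, "G3": 0.5, "G4": 0.4},
        "true_gene": "G3",
        "candidates": {"G3", "G4"},
    },
]

genome_wide = hit_rate_at_k(CASES, k=1, use_candidates=False)
with_candidates = hit_rate_at_k(CASES, k=1, use_candidates=True)
print(genome_wide, with_candidates)
```

In the second toy case the true gene is only third genome-wide but first once the ranking is restricted to the expert's candidates, which mirrors why results with prior filtering are so much higher than genome-wide results.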

Looking Forward

The convergence of emerging computational paradigms presents unprecedented opportunities for breakthrough innovations in the diagnosis of rare diseases. Future systems could leverage federated learning architectures with differential privacy for edge-deployed diagnostic models on mobile devices, enabling real-time acoustic biomarker detection and computer vision phenotyping without data sharing. Graph-based causal inference models could generate patient-specific digital twins using hybrid neural-symbolic architectures that simulate counterfactual genetic scenarios across multi-scale biological processes. Blockchain-enabled diagnostic networks might implement multi-agent consensus algorithms where specialized AI components collaboratively solve cases through attention-based neural architectures with uncertainty quantification. Large-scale multimodal foundation models trained on genomic sequences, medical imaging, and sensor data could enable few-shot learning for novel conditions through cross-modal attention mechanisms and contrastive learning objectives. Perhaps most intriguingly, temporal graph neural networks with dynamic knowledge graph embeddings could predict future gene-disease associations using variational autoencoders before experimental validation. The rare disease domain serves as an ideal testbed for these advanced approaches, where high-dimensional data, complex biological relationships, and urgent clinical need create the perfect environment for pushing AI research toward more robust, interpretable, and clinically actionable systems.

References

Emily Alsentzer, Michelle M Li, Shilpa N Kobren, Ayush Noori, Undiagnosed Diseases Network, Isaac S Kohane, and Marinka Zitnik. Few shot learning for phenotype-driven diagnosis of patients with rare genetic diseases. npj Digital Medicine, 8:380, 2025.

Yaron Gurovich, Yair Hanani, Omri Bar, Guy Nadav, Nicole Fleischer, Dekel Gelbman, Lina Basel-Salmon, Peter M Krawitz, Susanne B Kamphausen, Martin Zenker, et al. Identifying facial phenotypes of genetic disorders using deep learning. Nature medicine, 25(1):60–64, 2019.

Tzung-Chien Hsieh, Aviram Bar-Haim, Shahida Moosa, Nadja Ehmke, Karen W Gripp, Jean Tori Pantel, Magdalena Danyel, Martin Atta Mensah, Denise Horn, Stanislav Rosnev, et al. GestaltMatcher facilitates rare disease matching using facial phenotype descriptors. Nature genetics, 54(3):349–357, 2022.

Karthik A Jagadeesh, Aaron M Wenger, Mark J Berger, Harendra Guturu, Peter D Stenson, David N Cooper, Jonathan A Bernstein, and Gill Bejerano. M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nature genetics, 48(12):1581–1586, 2016.

Sebastian Köhler, Marcel H Schulz, Peter Krawitz, Sebastian Bauer, Sandra Dölken, Claus E Ott, Christine Mundlos, Denise Horn, Stefan Mundlos, and Peter N Robinson. Clinical diagnostics in human genetics with semantic similarity searches in ontologies. The American Journal of Human Genetics, 85(4):457–464, 2009.

Dongxue Mao, Chaozhong Liu, Linhua Wang, Rami Al-Ouran, Cole Deisseroth, Sasidhar Pasupuleti, Seon Young Kim, Lucian Li, Jill A Rosenfeld, Linyan Meng, et al. AI-MARRVEL: a knowledge-driven AI system for diagnosing Mendelian disorders. NEJM AI, 1(5):AIoa2300009, 2024.

Chengyao Peng, Simon Dieck, Alexander Schmid, Ashar Ahmad, Alexej Knaus, Maren Wenzel, Laura Mehnert, Birgit Zirn, Tobias Haack, Stephan Ossowski, et al. CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph. NAR Genomics and Bioinformatics, 3(3):lqab078, 2021.

Philipp Rentzsch, Daniela Witten, Gregory M Cooper, Jay Shendure, and Martin Kircher. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic acids research, 47(D1):D886–D894, 2019.

Robin Steinhaus, Sebastian Proft, Markus Schuelke, David N Cooper, Jana Marie Schwarz, and Dominik Seelow. MutationTaster2021. Nucleic Acids Research, 49(W1):W446–W451, 2021.

Clara DM van Karnebeek, Anne O’Donnell-Luria, Gareth Baynam, Anaïs Baudot, Tudor Groza, Judith JM Jans, Timo Lassmann, Mary Catherine V Letinturier, Stephen B Montgomery, Peter N Robinson, et al. Leaving no patient behind! Expert recommendation in the use of innovative technologies for diagnosing rare diseases. Orphanet Journal of Rare Diseases, 19(1):357, 2024.

Hui Yang, Peter N Robinson, and Kai Wang. Phenolyzer: phenotype-based prioritization of candidate genes for human diseases. Nature methods, 12(9):841–843, 2015.

Kamilia Zaripova, Ege Özsoy, Nassir Navab, and Azade Farshad. PhenoKG: Knowledge graph-driven gene discovery and patient insights from phenotypes alone, 2025. https://doi.org/10.48550/arXiv.2506.13119.