We are CausalAI4Health (Causal Artificial Intelligence for Health), a research group pushing the frontiers of computational and statistical methods to uncover and quantify causal mechanisms in complex biomedical systems.
The group is led by Dr. Adèle Helena Ribeiro and is hosted by the Institute of Medical Informatics at the University of Münster, with support from the German Federal Ministry of Research, Technology and Space (BMFTR).
Our mission is to advance both the theoretical foundations and translational research of explainable AI and causal inference, enabling deeper scientific insight and more effective data-driven decisions in the life and health sciences. We aim to uncover why biological and health systems function as they do and how targeted interventions can influence their behavior.
Real-world biomedical datasets rarely satisfy standard methodological assumptions: they are often high-dimensional, heterogeneous, and multimodal, and can be affected by latent confounding, selection bias, privacy constraints, and limited sample sizes. If these challenges are not rigorously addressed, causal analyses risk producing invalid, non-reproducible, or non-generalizable results.
Our research tackles these challenges by developing methods that are both rigorous and effective in real-world settings, with a focus on:
Learn reliable causal relationships despite hidden confounders and limited samples.
Provide statistically valid and trustworthy conclusions with uncertainty estimates.
Integrate expert knowledge while accounting for uncertainty and conflicting information.
Enable multi-institution research without sharing sensitive data.
Efficiently handle complex, multimodal datasets at scale.
Our ongoing applications include public health and clinical research in malaria, mental health, cardiovascular diseases, post-acute infection syndromes (including long COVID), and cancer. In each of these domains, causal insights have the potential to improve mechanistic understanding and inform targeted prevention, diagnosis, and treatment strategies.
By addressing core challenges in causal inference such as reliability, heterogeneity, privacy, knowledge integration, and scalability, our research contributes to the development of explainable, trustworthy, and actionable AI for biomedical discovery and precision health.
Group Leader
Dr. Adèle Helena Ribeiro
Research focus: Advancing Causal and Explainable AI for Health and Life Sciences.
Ph.D. Student
Maximilian Hahn, M.Sc.
Research focus: Advancing Privacy-Preserving, Collaborative Causal AI in Real-World Settings.
Co-advised with Prof. Dr. Dominik Heider, Institute of Medical Informatics, Universität Münster, Münster, Germany.
(External) Ph.D. Student
Azlaan Mustafa Samad, M.Sc.
Research focus: Advancing Causal Representation Learning and Causal Abstraction in the Health Sciences.
Co-advised with Prof. Dr. Wolfgang Nejdl, L3S Research Center,
CAIMed (Lower Saxony Research Center for AI and Causal Methods in Medicine) &
Leibniz Universität Hannover, Hanover, Germany.
(External) Ph.D. Student
Matheus Becali Rocha, M.Sc.
Research focus: Causal Discovery for More Interpretable and Generalizable Predictions.
Co-advised with Prof. Dr. Renato Kroling, Department of Informatics, Federal University of Espírito Santo, Brazil.
(External) Master's Student
Aaron Zumdick
Research focus: Enabling Causal Discovery from Complex, Heterogeneous, and Non-IID Data.
Co-advised with Prof. Dr. Peter Florian Stadler, Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Leipzig, Germany.
(External) Master's Student
Bárbara Alexandra Alves Sequeira
Research focus: Exploring Causal Discovery in Catalysis under Limited and Confounded Datasets.
Co-advised with Prof. Dr. Pedro Freitas Mendes, Department of Chemical Engineering, Instituto Superior Técnico, Lisbon, Portugal.
We develop robust causal discovery methods that address latent confounding,
mixed variable types, and potential faithfulness violations,
while quantifying structural uncertainty to support reliable
downstream analyses such as effect estimation and intervention planning.
dcFCI (Ribeiro & Heider, forthcoming) is a robust causal discovery algorithm that combines FCI-guided search with score-based evaluation. It rigorously handles latent confounders and heterogeneous variable types, resolves conflicts in edge orientation, and guarantees that only valid PAGs are produced. Additionally, it ranks alternative causal structures to provide a measure of structural uncertainty. The dcFCI R package is publicly available on GitHub.
We develop methods that integrate expert knowledge into data-driven causal discovery,
enhancing the reliability and validity of learned models.
By incorporating expert insight directly into the learning process,
our algorithms extract patterns from data while remaining anchored in domain expertise.
anchorFCI (Ribeiro et al., 2024) extends the conservative FCI algorithm by selecting reliable anchor variables (those known not to be caused by others) and encoding their non-ancestral relationships. This is particularly useful for uncovering causal relationships among demographic or clinical traits using genetic variants as anchors. The anchorFCI R package is available on GitHub.
AGFN (da Silva et al., arXiv, forthcoming) is a probabilistic, uncertainty-aware framework for causal discovery, equipped with an optimal elicitation strategy to guide expert interaction. It infers a data-driven distribution over ancestral graphs and iteratively refines it through (potentially noisy) expert feedback. The AGFN Python package is available on GitHub.
In high-dimensional and complex settings, it is often more informative to represent causal models at a higher level of abstraction than the individual variables themselves.
By grouping related variables into clusters, we can simplify complex systems, highlight key causal relationships, and make inference more tractable.
C-DAGs (Anand et al., AAAI 2023) is a graphical framework for causal reasoning at a higher level of abstraction, enabling reliable causal inferences between clusters of variables without specifying the relationships within each cluster. Beyond formally defining C-DAGs, we have developed sound and complete methods for causal inference in this framework, supporting both interventional and counterfactual queries. Recently (Yvernes et al., NeurIPS 2025), we extended the C-DAG framework to support arbitrary variable clusterings by relaxing the partition admissibility constraint, thereby allowing cyclic C-DAG representations.
CLOC (Anand et al., NeurIPS 2025) is a causal discovery algorithm designed to uncover relationships between clusters of variables in high-dimensional systems. It leverages a novel graphical framework to encode and learn cluster-level dependencies and independencies in Markovian causal systems. The algorithm is sound and complete, providing a reliable representation of learnable causal relationships between clusters. The CLOC implementation in R is available on GitHub.
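To make the clustering idea concrete, the following minimal Python sketch collapses a variable-level DAG into a cluster-level graph and checks whether that graph is acyclic, which is the admissibility condition of the original C-DAG formulation. It is an illustration only, not code from our packages: the function names `cluster_edges` and `is_acyclic`, the variable names, and the toy graph are all hypothetical, and bidirected edges arising from latent confounders are left out for brevity.

```python
def cluster_edges(dag_edges, clusters):
    """Collapse variable-level directed edges into cluster-level edges.

    An edge Ci -> Cj is kept whenever some variable in Ci points to
    some variable in Cj; edges inside a cluster are dropped.
    """
    owner = {v: name for name, members in clusters.items() for v in members}
    return {(owner[a], owner[b]) for a, b in dag_edges if owner[a] != owner[b]}


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    for _, target in edges:
        indegree[target] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return visited == len(indegree)


# Hypothetical toy DAG over five variables.
dag = [("age", "bmi"), ("sex", "bmi"), ("bmi", "sbp"),
       ("bmi", "dbp"), ("age", "sbp")]

# A clustering whose cluster-level graph stays acyclic.
good = {"Demographics": ["age", "sex"],
        "Anthropometry": ["bmi"],
        "BloodPressure": ["sbp", "dbp"]}
good_edges = cluster_edges(dag, good)
good_ok = is_acyclic(good, good_edges)

# A clustering that induces a cycle between the two clusters.
bad = {"A": ["age", "sbp"], "B": ["sex", "bmi", "dbp"]}
bad_edges = cluster_edges(dag, bad)
bad_ok = is_acyclic(bad, bad_edges)

print(sorted(good_edges), good_ok)
print(sorted(bad_edges), bad_ok)
```

In this toy example the second clustering places a cause and an effect of `bmi` in the same cluster, so the cluster-level graph contains both A → B and B → A. Clusterings of this kind are the ones that require the cyclic representation.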
In general, a causal model cannot be uniquely determined from observational data alone.
However, causal discovery algorithms can partially recover it, defining a
set of plausible models known as the Markov Equivalence Class (MEC).
In semi-Markovian systems, where hidden confounders may exist, such a class
is typically represented as a Partial Ancestral Graph (PAG).
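As a small illustration of Markov equivalence, the Python sketch below covers the simpler case without hidden confounders, where two DAGs over the same variables are Markov equivalent exactly when they share the same skeleton and the same unshielded colliders (v-structures). It does not handle PAGs or latent variables, and the function names are hypothetical, chosen only for this example.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected adjacencies of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Unshielded colliders p -> child <- q with p and q non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in set(edges):
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for p, q in combinations(sorted(pars), 2):
            if frozenset((p, q)) not in skel:
                found.add((p, child, q))
    return found


def markov_equivalent(edges_a, edges_b):
    """Compare two DAGs over the same variables (no latent confounders)."""
    return (skeleton(edges_a) == skeleton(edges_b)
            and v_structures(edges_a) == v_structures(edges_b))


chain = [("X", "Y"), ("Y", "Z")]      # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]       # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]   # X -> Y <- Z

print(markov_equivalent(chain, fork))
print(markov_equivalent(chain, collider))
```

The chain and the fork imply the same conditional independencies and therefore cannot be told apart from observational data, whereas the collider can. This is why discovery algorithms return an equivalence class rather than a single graph.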
We developed the first sound and complete calculus and algorithms for identifying (conditional) causal effects from PAGs (Jaber et al., NeurIPS 2022). Combined with causal discovery, this yields the first fully data-driven approach to causal inference, uncovering all identifiable effects directly from observational data. The CIDP algorithm is implemented in the PAGId R package on GitHub.
Malaria remains a major health challenge, particularly in regions facing poverty, limited healthcare access, and harsh environments, such as the Amazon rainforest.
Combining AI and causal inference can advance malaria research by identifying context-specific risk factors, uncovering underlying causal mechanisms, and guiding more effective, targeted interventions (Ribeiro et al., 2025, Front. Genet.).
Supported by BMFTR and carried out in collaboration with the University of São Paulo, this project leverages the Mâncio Lima cohort (Johansen et al., 2021), which includes rich demographic, phenotypic, and genetic data from about 20% of households in Brazil’s main urban malaria hotspot, together with national surveillance data from SIVEP-Malaria, which covers most symptomatic cases across the country.
Understanding the causal influence of the gut microbiome on mental health is crucial for uncovering the biological pathways that connect the gut and the brain. By identifying potential causal taxa from observational data, researchers can better prioritize targets for experimental validation and develop more effective, personalized mental health interventions.
Using data-driven causal inference, we identified Eggerthella and Hungatella as causal contributors to major depressive disorder (MDD), acting through two distinct gut–brain pathways independent of body mass index (Fehse et al., forthcoming).
A follow-up study (Thanarajah et al., 2025, JAMA Psychiatry) shows that soft drink consumption may contribute to MDD through gut microbiota alterations, notably involving Eggerthella.
| Ribeiro, A. H., Heider, D. (2025). dcFCI: Robust Causal Discovery Under Latent Confounding, Unfaithfulness, and Mixed Data. arXiv preprint arXiv:2505.06542. doi: 10.48550/arXiv.2505.06542. (Link) |
| Anand, T. V., Ribeiro, A. H., Tian, J., Hripcsak, G., and Bareinboim, E. (2025). Causal discovery over clusters of variables in Markovian systems. Advances in Neural Information Processing Systems (NeurIPS 2025). (Link) |
| Thanarajah, S. E., Ribeiro, A. H., …, Heider, D, Dannlowski, U., Hahn, T. (2025). Soft drink consumption and depression mediated by gut microbiome alterations. JAMA Psychiatry. (Link) |
| Ribeiro, A. H.*, Soler, J. M. P., Corder, R. M., Ferreira, M. U., Heider, D. (2025). From Bites to Bytes: Understanding How and Why Individual Malaria Risk Varies Using Artificial Intelligence and Causal Inference. Frontiers in Genetics. doi: 10.3389/fgene.2025.1599826. (Link) |
| Fehse L.*, Ribeiro, A. H.*, Winter, N. R., Thanarajah, S.E., … , Heider, D., Hahn, T. (2024). From Gut to Brain: Evidence for a Causal Contribution of Gut-Microbiota to Major Depressive Disorder in Humans. MedRxiv preprint. doi: 10.1101/2024.12.05.24318549. (Link) |
| da Silva, T., Silva, E., Góis, A., Heider, D., Kaski, S., Mesquita, D., and Ribeiro, A. H. (2024). Human-in-the-Loop Causal Discovery under Latent Confounding using Ancestral GFlowNets. arXiv preprint arXiv:2309.12032. (Link) |
| da Silva, T., Silva, E., Góis, A., Heider, D., Kaski, S., Mesquita, D., and Ribeiro, A. H. (2024). Human-Aided Discovery of Ancestral Graphs. LXAI Workshop at Neural Information Processing Systems (NeurIPS 2024). (Link) |
| Ribeiro, A. H., Crnkovic, M., Pereira, J. L., Fisberg, R. M., Sarti, F. M., Rogero, M. M., Heider, D., and Cerqueira, A. (2024). AnchorFCI: Harnessing Genetic Anchors for Enhanced Causal Discovery of Cardiometabolic Disease Pathways. Frontiers in Genetics. 15:1436947. DOI: 10.3389/fgene.2024.1436947. (Link) |
| Leite, J. M. R., Ribeiro, A. H., Pereira, J. L., de Souza, C. A., Heider, D., ... & Sarti, F. M. (2024). Missense genetic variants in major bitter taste receptors are associated with diet quality and food intake in a highly admixed underrepresented population. Clinical Nutrition ESPEN. https://doi.org/10.1016/j.clnesp.2024.06.045 |
| Meneguitti Dias, F., Ribeiro, E., Ribeiro, A. H., Krieger, J., Antonio Gutierrez, M. (2023). Artificial Intelligence-Driven Screening System for Rapid Image-Based Classification of 12-Lead ECG Exams: A Promising Solution for Emergency Room Prioritization. IEEE Access https://doi.org/10.1109/ACCESS.2023.3328538 |
| Tajabadi, M, Grabenhenrich, L., Ribeiro, A. H., Leyer, M., Heider D. (2023). Sharing Data With Shared Benefits: Artificial Intelligence Perspective. J Med Internet Res 2023;25:e47540. (Link) |
| Anand, T. V.*, Ribeiro, A. H.*, Tian, J. , Bareinboim, E. (2023). Causal Effect Identification in Cluster DAGs. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-23) , 37(10), 12172-12179. https://doi.org/10.1609/aaai.v37i10.26435 -- Selected for Oral Presentation |
| Mundt, M., Cooper, K.W., Dhami, D.S., Ribeiro, A. H., Smith, J.S., Bellot A., Hayes, T. (2023) Continual Causality: A Retrospective of the Inaugural AAAI-23 Bridge Program. Proceedings of The First AAAI Bridge Program on Continual Causality, PMLR 208:1-10 . (Link) |
| Jaber, A.*, Ribeiro, A. H.*, Zhang, J., Bareinboim, E. (2022). Causal Identification under Markov equivalence: Calculus, Algorithm, and Completeness. In Advances in Neural Information Processing Systems, 35, 3679-3690. (NeurIPS-22). (Link) -- Highlighted Paper (<2%, out of 10,411). |
| Ribeiro, A. H., Bareinboim, E. (2022). Causal Inference and Data Fusion: Towards an Accelerated Process of Scientific Discovery. OECD-22. Organisation for Economic Co-operation and Development, Volume “AI and the productivity of science” (Link) |
| Dias, F. M., Samesima, N., Ribeiro, A. H., Moreno, R. A., Pastore, C. A., Krieger, J. E., and Gutierrez,M. A. (2021). 2D Image-Based Atrial Fibrillation Classification. In 2021 Computing in Cardiology (CinC), volume 48, pages 1–4. IEEE. https://doi.org/10.23919/CinC53138.2021.9662735 |
| Ribeiro, A. H., Vidal, M. C., Sato, J. R., Fujita, A. (2021). Granger Causality among Graphs and Application to Functional Brain Connectivity in Autism Spectrum Disorder. Entropy. 23(9):1204. https://doi.org/10.3390/e23091204 |
| Ribeiro, A. H., Soler, J. M. P. (2020). Learning Genetic and Environmental Graphical Models from Gaussian Family Data. Statistics in Medicine. 39: 2403– 2422. https://doi.org/10.1002/sim.8545 |
| Ribeiro, A. H., Soler, J. M. P., Hirata Jr., R. (2019). Variance-Preserving Estimation of Intensity Values Obtained from Omics Experiments. Frontiers in Genetics. 10:855. https://doi.org/10.3389/fgene.2019.00855 |
| Ribeiro, A. H. (2018). Identification of Causality in Genetics and Neuroscience. Doctoral dissertation, Department of Computer Science, University of Sao Paulo. https://doi.org/10.11606/T.45.2019.tde-15032019-190109 |
| Ribeiro, A. H., Lotufo, P., Fujita, A, Goulart, A., Chor, D., Mill, J. G., Bensenor, I., Santos, I. S. (2017). Association Between Short-Term Systolic Blood Pressure Variability and Carotid Intima-Media Thickness in ELSA-Brasil Baseline. American Journal of Hypertension. 30:954–960. https://doi.org/10.1093/ajh/hpx076 |
| Ribeiro, A. H., Soler, J. M. P., Neto, E. C., Fujita, A. (2016). Causal Inference and Structure Learning of Genotype-Phenotype Networks Using Genetic Variation. In Big Data Analytics in Genomics. Springer International Publishing, New York, p. 89-143. https://doi.org/10.1007/978-3-319-41279-5_3 |
| Ribeiro, A. H. (2014). Gene expression analysis taking into account measurement errors and application to real data. Master thesis, Department of Computer Science, University of Sao Paulo. https://doi.org/10.11606/D.45.2014.tde-04082014-163616 |
| Ribeiro, A. H. (2025). dcFCI: Robust causal discovery under latent confounding, unfaithfulness, and mixed data. https://github.com/adele/dcFCI |
| Ribeiro, A. H. (2024). anchorFCI: an extension of the FCI algorithm designed to improve robustness and discovery power in causal discovery by strategically selecting reliable anchors, while leveraging their known non-ancestral relationships. https://github.com/adele/anchorFCI |
| Ribeiro, A. H. (2022). PAGId: an R package for causal effect identification in Partial Ancestral Graphs. https://github.com/adele/PAGId |
| Ribeiro, A. H. (2020). FamilyBasedPGMs: an R package for learning genetic and environmental graphical models from family data. https://github.com/adele/FamilyBasedPGMs |
| Ribeiro, A. H. (2019). OmicsMA: An R Package for Variance-Preserving Estimation and Normalization of M-A Values from Omics Experiments. https://github.com/adele/omicsMA |