Adèle H. Ribeiro

About Me

Adèle Helena Ribeiro

Postdoctoral Researcher
Institute of Medical Informatics
University of Münster
adele.ribeiro@uni-muenster.de

I am currently a postdoctoral researcher at the Institute of Medical Informatics, University of Münster, Germany, within Prof. Dr. Dominik Heider's group. I joined his lab in October 2022, when it was still affiliated with the Department of Mathematics and Computer Science at Philipps University of Marburg. Prior to this, I was a postdoctoral researcher in the Causal Artificial Intelligence Lab at Columbia University, USA, working with Prof. Dr. Elias Bareinboim. Before joining Columbia University, I completed a doctoral research internship at the Neuroscience Institute, Princeton University, USA, and worked as a postdoctoral researcher at the Laboratory of Genetics and Molecular Cardiology, Heart Institute, USP, Brazil. My research focuses on addressing critical challenges in causal and counterfactual inference in real-world domains, such as the health sciences, helping bridge the gap between theory and practical applications. This includes developing more robust, scalable, and privacy-preserving causal discovery and effect identification tools, with a focus on quantifying uncertainty, integrating background knowledge, and employing human-in-the-loop approaches to support personalized healthcare and advance scientific knowledge and decision-making.

Research Interests

Causal Inference
Explainable AI
Structure Learning
Deep Learning
Statistical Genetics
Multi-Omics Analysis
Computational Neuroscience
Health and Medical Research

Curriculum vitae

Resume

Education and Professional Preparation

Oct 2024

Postdoctoral Scholar
Institute of Medical Informatics
University of Münster
Münster, Germany
Project: Causal Data Science and Machine Learning in Biomedicine.
Advisor: Prof. Dominik Heider

Oct 2022

Postdoctoral Scholar
Laboratory of Data Science in Biomedicine
Philipps-Universität Marburg
Marburg, Hesse, Germany
Project: Causal Data Science and Machine Learning in Biomedicine.
Advisor: Prof. Dominik Heider

Sep 2019

Postdoctoral Scholar
Causal Artificial Intelligence Laboratory
Data Science / Computer Science Institutes
Columbia University
New York, NY, USA
Project: Causal Inference in the Health Sciences: from Biased and Heterogeneous Data Collections to Personalized and Improved Patient Outcomes.
Advisor: Prof. Elias Bareinboim

Feb 2019

Postdoctoral Scholar
Laboratory of Genetics and Molecular Cardiology
Heart Institute (InCor)
University of Sao Paulo
Sao Paulo, SP, Brazil
Project: Deep Learning for 12-lead ECG Classification.
Advisor: Prof. José Eduardo Krieger

Nov 2018

Doctor of Philosophy in
Computer Science
Institute of Mathematics and Statistics
University of Sao Paulo (IME-USP)
Sao Paulo, SP, Brazil
PhD dissertation: Identification of Causality in Genetics and Neuroscience.
Advisor: Prof. André Fujita
Co-Advisor: Prof. Júlia Maria Pavan Soler

Fall 2017

Doctoral Research Internship
Neuroscience Institute
Princeton University
Princeton, NJ, USA
Project: Deep learning-based pose representation and dynamics modeling of marmoset monkeys.
Advisor: Prof. Asif A. Ghazanfar

Jun 2014

Master of Science in
Computer Science
Institute of Mathematics and Statistics
University of Sao Paulo (IME-USP)
Sao Paulo, SP, Brazil
Master's thesis: Gene expression analysis taking into account measurement errors and application to real data.
Advisor: Prof. Roberto Hirata Jr.

Dec 2011

Bachelor of Science in Computational
and Applied Mathematics
Institute of Mathematics and Statistics
University of Sao Paulo (IME-USP)
Sao Paulo, SP, Brazil
Senior thesis: Analysis of Pyroelectric Infrared (PIR) sensor output signals.
Advisor: Prof. Roberto Hirata Jr.

Fellowships and Scholarships

Sep 2021	DAAD Postdoc-NeT-AI Fellowship, DAAD Artificial Intelligence Networking (AInet) Fellowship, Federal Ministry of Education and Research, Germany
Sep 2019 - Aug 2022	Postdoctoral Research Fellowship, Causal Artificial Intelligence Lab, Department of Computer Science & Data Science Institute, Columbia University, New York, NY, USA
Jan 2019 - Aug 2019	Postdoctoral Research Fellowship, Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Sep 2017 - Dec 2017	PhD Visiting Student at Princeton University, Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Aug 2014 - Jul 2018	PhD Graduate Research Scholarship, Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Mar 2012 - Feb 2014	MSc Graduate Research Scholarship, Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil

Publications

Fehse, L.*, Ribeiro, A. H.*, Winter, N. R., Thanarajah, S. E., … , Heider, D., Hahn, T. (2024). From Gut to Brain: Evidence for a Causal Contribution of Gut-Microbiota to Major Depressive Disorder in Humans. MedRxiv preprint. doi: 10.1101/2024.12.05.24318549. (Link)

Major Depressive Disorder (MDD) is a highly prevalent, severe mental health condition that constitutes one of the leading causes of disability worldwide. While recent animal studies suggest a causal role of the gut microbiome in the pathophysiology of MDD models, evidence in humans is still unclear due to small sample sizes, inconsistent clinical assessment of MDD diagnosis, and methodological limitations regarding causal inference in cross-sectional data. Here, we explicitly address these shortcomings to investigate the potential causal link between the gut microbiome and MDD: First, we replicate previous findings using one of the largest multicenter MDD cohorts for which microbiome data and in-depth diagnostic assessment are available (N=1,269 MDD patients and controls). We find a significant difference between healthy controls and MDD patients for the relative abundance of the four taxa Eggerthella, Hungatella, Coprobacillus, and Lachnospiraceae FCS020. Second, we employ state-of-the-art, fully data-driven causal inference tools within Judea Pearl’s framework, allowing us to derive model constraints from the data rather than relying on potentially strong, unrealistic assumptions. Using this approach, we found data-driven evidence for Eggerthella and Hungatella as causal contributors to MDD. Furthermore, we show that Eggerthella and Hungatella abundances are associated with MDD beyond the influence of body mass index, identifying two distinct pathways linking MDD to the gut microbiome. Finally, the difference in relative abundance of these taxa between healthy and MDD patients was independent of antidepressant medication. Our study provides the first evidence for a potential causal role of gut-microbiota in the pathophysiology of depression in humans.

da Silva, T., Silva, E., Góis, A., Heider, D., Kaski, S., Mesquita, D., and Ribeiro, A. H. (2024). Human-in-the-Loop Causal Discovery under Latent Confounding using Ancestral GFlowNets. arXiv preprint arXiv:2309.12032. (Link)

Structure learning is the crux of causal inference. Notably, causal discovery (CD) algorithms are brittle when data is scarce, possibly inferring imprecise causal relations that contradict expert knowledge -- especially when considering latent confounders. To aggravate the issue, most CD methods do not provide uncertainty estimates, making it hard for users to interpret results and improve the inference process. Surprisingly, while CD is a human-centered affair, no works have focused on building methods that both 1) output uncertainty estimates that can be verified by experts and 2) interact with those experts to iteratively refine CD. To solve these issues, we start by proposing to sample (causal) ancestral graphs proportionally to a belief distribution based on a score function, such as the Bayesian information criterion (BIC), using generative flow networks. Then, we leverage the diversity in candidate graphs and introduce an optimal experimental design to iteratively probe the expert about the relations among variables, effectively reducing the uncertainty of our belief over ancestral graphs. Finally, we update our samples to incorporate human feedback via importance sampling. Importantly, our method does not require causal sufficiency (i.e., unobserved confounders may exist). Experiments with synthetic observational data show that our method can accurately sample from distributions over ancestral graphs and that we can greatly improve inference quality with human aid.

da Silva, T., Silva, E., Góis, A., Heider, D., Kaski, S., Mesquita, D., and Ribeiro, A. H. (2024). Human-Aided Discovery of Ancestral Graphs. LXAI Workshop at Neural Information Processing Systems (NeurIPS 2024). (Link)

In data-scarce situations, causal discovery (CD) algorithms often produce unreliable causal relationships that may conflict with expert knowledge, especially in the presence of latent confounders. Additionally, most CD methods lack adequate uncertainty quantification, hindering users' ability to evaluate and refine results. To address these issues, we present a fully probabilistic CD method referred to as Ancestral GFlowNets (AGFNs). In a nutshell, AGFNs sample ancestral graphs (AGs) proportionally to a score-based belief distribution, allowing users to assess and propagate the uncertainty of the discovered causal relationships. On top of that, we design an elicitation framework that enables the incorporation of human knowledge into the inference process via importance sampling. Notably, our approach naturally accommodates CD on data sets with latent confounding and potentially heterogeneous data types, a setting that has received little attention from the literature. Finally, experimental results with observational data show that our method effectively samples from distributions over AGs and significantly enhances inference quality with human aid.

Ribeiro, A. H., Crnkovic, M., Pereira, J. L., Fisberg, R. M., Sarti, F. M., Rogero, M. M., Heider, D., and Cerqueira, A. (2024). AnchorFCI: Harnessing Genetic Anchors for Enhanced Causal Discovery of Cardiometabolic Disease Pathways. Frontiers in Genetics. 15:1436947. DOI: 10.3389/fgene.2024.1436947. (Link)

Cardiometabolic diseases, a leading global health concern, arise from a complex interplay of lifestyle choices, genetic predispositions, and biochemical markers. Although extensive research has uncovered strong associations among various risk factors and these diseases, grasping their causal relationships is vital for gaining deeper mechanistic insights and designing effective prevention and intervention strategies. We address this gap by introducing anchorFCI, a novel adaptation of the conservative Really Fast Causal Inference (RFCI) algorithm designed to enhance the discovery of causal relationships by strategically selecting and integrating reliable anchor variables from an additional set known not to be caused by the variables of interest. This approach is particularly well-suited for learning causal networks involving phenotypic, clinical, and sociodemographic factors, leveraging genetic variables recognized as not being influenced by these factors. By integrating these anchor variables along with knowledge of their non-ancestral relationships, anchorFCI effectively handles latent confounding while enhancing both robustness and discovery power. We demonstrate its effectiveness using data from the 2015 ISA-Nutrition study in São Paulo, Brazil, and further estimate the effect sizes of the uncovered causal relationships with state-of-the-art tools from Judea Pearl's framework, presenting a fully data-driven causal inference pipeline. The results not only support many established causal relationships but also elucidate their interconnections within a complex network, enhancing our understanding of the broader dynamics and the multifaceted nature of cardiometabolic risk.

Leite, J. M. R., Ribeiro, A. H., Pereira, J. L., de Souza, C. A., Heider, D., ... & Sarti, F. M. (2024). Missense genetic variants in major bitter taste receptors are associated with diet quality and food intake in a highly admixed underrepresented population. Clinical Nutrition ESPEN. https://doi.org/10.1016/j.clnesp.2024.06.045

We investigated associations between Single Nucleotide Polymorphisms (SNPs) in the TAS1R and TAS2R taste receptors and diet quality and intake of alcohol, added sugar, and fat, using linear regression and machine learning techniques in a highly admixed population. In the ISA-Capital health survey, 901 individuals were interviewed and had socioeconomic, demographic, and health characteristics recorded, along with dietary information obtained through two 24-h recalls. Data on 12 components related to food groups, nutrients, and calories were combined into a diet quality score (BHEI-R). BHEI-R, SoFAAs (calories from added sugar, saturated fat, and alcohol) and Alcohol use were tested for associations with 255 TAS2R SNPs and 73 TAS1R SNPs for 637 individuals with regression analysis and Random Forest. Significant SNPs were combined into Genetic taste scores (GTSs). Among 23 SNPs significantly associated either by stepwise linear/logistic regression or random forest with any possible biological functionality, the missense variants rs149217752 in TAS2R40, for SoFAAs, and rs2233997 in TAS2R4, were associated with both BHEI-R (under 4% increase in Mean Squared Error) and SoFAAs. GTSs increased the variance explanation of quantitative phenotypes and there was a moderately high AUC for alcohol use. The study provides insights into the genetic basis of human taste perception through the identification of missense variants in the TAS2R gene family. These findings may contribute to future strategies in precision nutrition aimed at improving food quality by reducing added sugar, saturated fat, and alcohol intake.

Meneguitti Dias, F., Ribeiro, E., Ribeiro, A. H., Krieger, J., Antonio Gutierrez, M. (2023). Artificial Intelligence-Driven Screening System for Rapid Image-Based Classification of 12-Lead ECG Exams: A Promising Solution for Emergency Room Prioritization. IEEE Access. https://doi.org/10.1109/ACCESS.2023.3328538

The electrocardiogram (ECG) serves as a valuable diagnostic tool, providing crucial information about life-threatening cardiac conditions such as atrial fibrillation and myocardial infarction. A prompt and efficient assessment of ECG exams in environments such as Emergency Rooms (ERs) can significantly enhance the chances of survival for high-risk patients. Despite the presence of numerous works on ECG classification, most of these studies have concentrated on one-dimensional ECG signals, which are commonly found in publicly available ECG datasets. Nevertheless, the practical relevance of such methods is limited in hospital settings, where ECG exams are usually stored as images. In this study, we have developed an artificial intelligence-driven screening system specifically designed to analyze 12-lead ECG images. Our proposed method has been trained on an extensive dataset comprising 99,746 12-lead ECG exams collected from the ambulatory section of a tertiary hospital. The primary goal was to precisely classify the exams into three classes: Normal (N), Atrial Fibrillation (AFib), and Other (O). The evaluation of our approach yielded AUROC scores of 93.2%, 99.2%, and 93.1% for N, AFib, and O, respectively. To further validate our approach, we conducted evaluations using the 2018 China Physiological Signal Challenge (CPSC) database. In this evaluation, we achieved AUROC scores of 91.8%, 97.5%, and 70.4% for the classes N, AFib, and O, respectively. Additionally, we assessed our method using 1,074 exams acquired in the ER and obtained AUROC values of 98.3%, 98.0%, and 97.7% for the classes N, AFib, and O, respectively. Furthermore, we developed and deployed a system with a trained model within the ER of a tertiary hospital for research purposes. This system automatically retrieves newly captured ECG chart images from the Picture Archiving and Communication System (PACS) within the ER. These images undergo necessary preprocessing steps and serve as input for our proposed classification method. This comprehensive approach established an efficient and versatile end-to-end framework for ECG classification. The results of our study highlight the potential of leveraging artificial intelligence in the screening of ECG exams, offering a promising solution for the rapid assessment and prioritization of patients in the ER.

Tajabadi, M., Grabenhenrich, L., Ribeiro, A. H., Leyer, M., Heider, D. (2023). Sharing Data With Shared Benefits: Artificial Intelligence Perspective. J Med Internet Res 2023;25:e47540. (Link)

Artificial intelligence (AI) and data sharing go hand in hand. In order to develop powerful AI models for medical and health applications, data need to be collected and brought together over multiple centers. However, due to various reasons, including data privacy, not all data can be made publicly available or shared with other parties. Federated and swarm learning can help in these scenarios. However, in the private sector, such as between companies, the incentive is limited, as the resulting AI models would be available for all partners irrespective of their individual contribution, including the amount of data provided by each party. Here, we explore a potential solution to this challenge as a viewpoint, aiming to establish a fairer approach that encourages companies to engage in collaborative data analysis and AI modeling. Within the proposed approach, each individual participant could gain a model commensurate with their respective data contribution, ultimately leading to better diagnostic tools for all participants in a fair manner.

Anand, T. V.*, Ribeiro, A. H.*, Tian, J., Bareinboim, E. (2023). Causal Effect Identification in Cluster DAGs. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-23), 37(10), 12172-12179. https://doi.org/10.1609/aaai.v37i10.26435 -- Selected for Oral Presentation

Reasoning about the effect of interventions and counterfactuals is a fundamental task found throughout the data sciences. A collection of principles, algorithms, and tools has been developed for performing such tasks in the last decades (Pearl, 2000). One of the pervasive requirements found throughout this literature is the articulation of assumptions, which commonly appear in the form of causal diagrams. Despite the power of this approach, there are significant settings where the knowledge necessary to specify a causal diagram over all variables is not available, particularly in complex, high-dimensional domains. In this paper, we introduce a new graphical modeling tool called cluster DAGs (for short, C-DAGs) that allows for the partial specification of relationships among variables based on limited prior knowledge, alleviating the stringent requirement of specifying a full causal diagram. A C-DAG specifies relationships between clusters of variables, while the relationships between the variables within a cluster are left unspecified, and can be seen as a graphical representation of an equivalence class of causal diagrams that share the relationships among the clusters. We develop the foundations and machinery for valid inferences over C-DAGs about the clusters of variables at each layer of Pearl’s Causal Hierarchy (Pearl and Mackenzie 2018; Bareinboim et al. 2020) - L1 (probabilistic), L2 (interventional), and L3 (counterfactual). In particular, we prove the soundness and completeness of d-separation for probabilistic inference in C-DAGs. Further, we demonstrate the validity of Pearl’s do-calculus rules over C-DAGs and show that the standard ID identification algorithm is sound and complete to systematically compute causal effects from observational data given a C-DAG. Finally, we show that C-DAGs are valid for performing counterfactual inferences about clusters of variables.

Mundt, M., Cooper, K. W., Dhami, D. S., Ribeiro, A. H., Smith, J. S., Bellot, A., Hayes, T. (2023). Continual Causality: A Retrospective of the Inaugural AAAI-23 Bridge Program. Proceedings of The First AAAI Bridge Program on Continual Causality, PMLR 208:1-10. (Link)

The fields of continual learning and causality investigate complementary aspects of human cognition and are fundamental components of artificial intelligence if it is to reason and generalize in complex environments. Despite the burgeoning interest in investigating the intersection of the two fields, it is currently unclear how causal models may describe continuous streams of data and, vice versa, how continual learning may exploit learned causal structure. We proposed to bridge this gap through the inaugural AAAI-23 “Continual Causality” bridge program, where our aim was to take the initial steps towards a unified treatment of these fields by providing a space for learning and discussion and by building a diverse community to connect researchers. The activities ranged from traditional tutorials and software labs, through invited vision talks and contributed talks based on submitted position papers, to a panel and breakout discussions. While the materials are publicly disseminated as a foundation for the community at https://www.continualcausality.org, the discussed ideas, challenges, and prospects beyond the inaugural bridge are summarized in this retrospective paper.

Jaber, A.*, Ribeiro, A. H.*, Zhang, J., Bareinboim, E. (2022). Causal Identification under Markov equivalence: Calculus, Algorithm, and Completeness. In Advances in Neural Information Processing Systems, 35, 3679-3690. (NeurIPS-22). (Link) -- Highlighted Paper (<2%, out of 10,411).

One common task in many data science applications is to answer questions about the effect of new interventions, like: what would happen to Y if we make X equal to x while observing covariates Z = z? Formally, this is known as conditional effect identification, where the goal is to determine whether a post-interventional distribution is computable from the combination of an observational distribution and assumptions about the underlying domain represented by a causal diagram. A plethora of methods was developed for solving this problem, including the celebrated do-calculus [Pearl, 1995]. In practice, these results are not always applicable since they require a fully specified causal diagram as input, which is usually not available. In this paper, we assume as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, learnable from observational data. We make the following contributions under this relaxed setting. First, we introduce a new causal calculus, which subsumes the current state-of-the-art, PAG-calculus. Second, we develop an algorithm for conditional effect identification given a PAG and prove it to be both sound and complete. In other words, failure of the algorithm to identify a certain effect implies that this effect is not identifiable by any method. Third, we prove the proposed calculus to be complete for the same task.

Ribeiro, A. H., Bareinboim, E. (2022). Causal Inference and Data Fusion: Towards an Accelerated Process of Scientific Discovery. OECD-22. Organisation for Economic Co-operation and Development, Volume “AI and the productivity of science” (Link)

Dias, F. M., Samesima, N., Ribeiro, A. H., Moreno, R. A., Pastore, C. A., Krieger, J. E., and Gutierrez, M. A. (2021). 2D Image-Based Atrial Fibrillation Classification. In 2021 Computing in Cardiology (CinC), volume 48, pages 1–4. IEEE. https://doi.org/10.23919/CinC53138.2021.9662735

Atrial fibrillation (AF) is a common arrhythmia (0.5% worldwide prevalence) associated with an increased risk of various cardiovascular disorders, including stroke. Automated routine AF detection by Electrocardiogram (ECG) is based on the analysis of one-dimensional ECG signals and requires dedicated software for each type of device, limiting its wide use, especially with the rapid incorporation of telemedicine into the healthcare system. Here, we implement a machine learning method for AF classification using the region of interest (ROI) corresponding to the long DII lead automatically extracted from DICOM 12-lead ECG images. We observed 94.3%, 98.9%, 99.1%, and 92.2% for sensitivity, specificity, AUC, and F1 score, respectively. These results indicate that the proposed methodology performs similarly to methods that take one-dimensional ECG signals as input, but does not require dedicated software, facilitating the integration into clinical practice, as ECGs are typically stored in PACS as 2D images.

Ribeiro, A. H., Vidal, M. C., Sato, J. R., Fujita, A. (2021). Granger Causality among Graphs and Application to Functional Brain Connectivity in Autism Spectrum Disorder. Entropy. 23(9):1204. https://doi.org/10.3390/e23091204

Graphs/networks have become a powerful analytical approach for data modeling. Besides, with the advances in sensor technology, dynamic time-evolving data have become more common. In this context, one point of interest is a better understanding of the information flow within and between networks. Thus, we aim to infer Granger causality (G-causality) between networks' time series. In this case, the straightforward application of the well-established vector autoregressive model is not feasible. Consequently, we require a theoretical framework for modeling time-varying graphs. One possibility would be to consider a mathematical graph model with time-varying parameters (assumed to be random variables) that generates the network. Suppose we identify G-causality between the graph models' parameters. In that case, we could use it to define a G-causality between graphs. Here, we show that even if the model is unknown, the spectral radius is a reasonable estimate of some random graph model parameters. We illustrate our proposal's application to study the relationship between brain hemispheres of controls and children diagnosed with Autism Spectrum Disorder (ASD). We show that the G-causality intensity from the brain's right to the left hemisphere is different between ASD and controls.
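As a rough illustration of the claim that the spectral radius tracks a random graph model parameter, the sketch below uses an Erdős–Rényi graph, which is an assumption chosen here for simplicity and not necessarily the model studied in the paper. In that model the largest adjacency eigenvalue is close to n*p, so dividing it by n gives an approximate estimate of the edge probability p. The function names are hypothetical.

```python
import numpy as np

def spectral_radius(adj):
    """Largest absolute eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.abs(np.linalg.eigvalsh(adj))))

def erdos_renyi(n, p, rng):
    """Sample a symmetric 0/1 adjacency matrix with edge probability p and no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
n = 200
for p in (0.1, 0.3, 0.5):
    rho = spectral_radius(erdos_renyi(n, p, rng))
    # For this model the spectral radius is approximately n * p.
    print(f"p={p:.1f}  spectral radius={rho:.1f}  n*p={n * p:.1f}  rho/n={rho / n:.3f}")
```

Computing such a summary for each graph in a time series yields an ordinary numeric time series, to which standard Granger causality tests could then be applied.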

Ribeiro, A. H., Soler, J. M. P. (2020). Learning Genetic and Environmental Graphical Models from Gaussian Family Data. Statistics in Medicine. 39:2403–2422. https://doi.org/10.1002/sim.8545

Many challenging problems in biomedical research rely on understanding how variables are associated with each other and influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely acknowledged as a very natural and formal language to describe relationships among variables and have been extensively used for studying complex diseases and traits. In this work, we propose methods that leverage observational Gaussian family data for learning a decomposition of undirected and directed acyclic PGMs according to the influence of genetic and environmental factors. Many structure learning algorithms are strongly based on a conditional independence test. For independent measurements of normally distributed variables, conditional independence can be tested through standard tests for zero partial correlation. In family data, the assumption of independent measurements does not hold since related individuals are correlated due to mainly genetic factors. Based on univariate polygenic linear mixed models, we propose tests that account for the familial dependence structure and allow us to assess the significance of the partial correlation due to genetic (between-family) factors and due to other factors, denoted here as environmental (within-family) factors, separately. Then, we extend standard structure learning algorithms, including the IC/PC and the really fast causal inference (RFCI) algorithms, to Gaussian family data. The algorithms learn the most likely PGM and its decomposition into two components, one explained by genetic factors and the other by environmental factors. The proposed methods are evaluated by simulation studies and applied to the Genetic Analysis Workshop 13 simulated dataset, which captures significant features of the Framingham Heart Study.
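For reference, the standard test for zero partial correlation mentioned above (the independent-measurements case, not the family-data tests proposed in the paper) can be sketched with Fisher's z-transform. The simulated variables and function names below are illustrative assumptions.

```python
import math
import numpy as np

def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond, via the precision matrix."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: partial correlation = 0, with n samples and k conditioning variables."""
    z = math.sqrt(n - k - 3) * math.atanh(r)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Toy data: X and Y share the common cause Z and are otherwise unrelated.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)
data = np.column_stack([x, y, z])

r_marg = partial_corr(data, 0, 1, [])   # marginal correlation of X and Y
r_cond = partial_corr(data, 0, 1, [2])  # partial correlation given Z
print("marginal:", r_marg, fisher_z_pvalue(r_marg, n, 0))
print("given Z: ", r_cond, fisher_z_pvalue(r_cond, n, 1))
```

Constraint-based algorithms such as PC call a test of this kind repeatedly. With family data the independence assumption behind it fails, which is the gap the proposed mixed-model tests address.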

Ribeiro, A. H., Soler, J. M. P., Hirata Jr., R. (2019). Variance-Preserving Estimation of Intensity Values Obtained from Omics Experiments. Frontiers in Genetics. 10:855. https://doi.org/10.3389/fgene.2019.00855

Faced with the lack of reliability and reproducibility in omics studies, more careful and robust methods are needed to overcome the existing challenges in the multi-omics analysis. In conventional omics data analysis, signal intensity values (denoted by M and A values) are estimated neglecting pixel-level uncertainties, which may reflect noise and systematic artifacts. For example, intensity values from two-color microarray data are estimated by taking the mean or median of the pixel intensities within the spot and then subjected to a within-slide normalization by LOWESS. Thus, focusing on estimation and normalization of gene expression profiles, we propose a spot quantification method that takes into account pixel-level variability. Also, to preserve relevant variation that may be removed in LOWESS normalization with poorly chosen parameters, we propose a parameter selection method that is parsimonious and considers intrinsic characteristics of microarray data, such as heteroskedasticity. The usefulness of the proposed methods is illustrated by an application to real intestinal metaplasia data. Compared with the conventional approaches, the analysis is more robust and conservative, identifying fewer but more reliable differentially expressed genes. Also, the variability preservation allowed the identification of new differentially expressed genes. Using the proposed approach, we have identified differentially expressed genes involved in pathways in cancer and confirmed some molecular markers already reported in the literature.

Ribeiro, A. H. (2018). Identification of Causality in Genetics and Neuroscience. Doctoral dissertation, Department of Computer Science, University of Sao Paulo. https://doi.org/10.11606/T.45.2019.tde-15032019-190109

Causal inference may help us to understand the underlying mechanisms and the risk factors of diseases. In Genetics, it is crucial to understand how the connectivity among variables is influenced by genetic and environmental factors. Family data have proven to be useful in elucidating genetic and environmental influences; however, few existing approaches are capable of addressing structure learning of probabilistic graphical models (PGMs) and family data analysis jointly. We propose methodologies for learning, from observational Gaussian family data, the most likely PGM and its decomposition into genetic and environmental components. They were evaluated by a simulation study and applied to the Genetic Analysis Workshop 13 simulated data, which mimic the real Framingham Heart Study data, and to the metabolic syndrome phenotypes from the Baependi Heart Study. In neuroscience, one challenge consists in identifying interactions between functional brain networks (FBNs) - graphs. We propose a method to identify Granger causality among FBNs. We show the statistical power of the proposed method by simulations and its usefulness by two applications: the identification of Granger causality between the FBNs of two musicians playing a violin duo, and the identification of a differential connectivity from the right to the left brain hemispheres of autistic subjects.

Ribeiro, A. H., Lotufo, P., Fujita, A., Goulart, A., Chor, D., Mill, J. G., Bensenor, I., Santos, I. S. (2017). Association Between Short-Term Systolic Blood Pressure Variability and Carotid Intima-Media Thickness in ELSA-Brasil Baseline. American Journal of Hypertension. 30:954–960. https://doi.org/10.1093/ajh/hpx076

Blood pressure (BP) is associated with carotid intima-media thickness (CIMT), but few studies have explored the association between BP variability and CIMT. We aimed to investigate this association in the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) baseline. We found a small but significant association between SBP variability and CIMT values. This was additive to the association between SBP central tendency and CIMT values, supporting a role for high short-term SBP variability in atherosclerosis.

Ribeiro, A. H., Soler, J. M. P., Neto, E. C., Fujita, A. (2016). Causal Inference and Structure Learning of Genotype-Phenotype Networks Using Genetic Variation. In Big Data Analytics in Genomics. Springer International Publishing, New York, p. 89-143. https://doi.org/10.1007/978-3-319-41279-5_3

A major challenge in biomedical research is to identify causal relationships among genotypes, phenotypes, and clinical outcomes from high-dimensional measurements. Causal networks have been widely used in systems genetics for modeling gene regulatory systems and for identifying causes and risk factors of diseases. In this chapter, we describe fundamental concepts and algorithms for constructing causal networks from observational data. In a biological context, causal inferences can be drawn from the natural experimental setting provided by Mendelian randomization, a term that refers to the random assignment of genotypes at meiosis. We show that genetic variants may serve as instrumental variables, improving the estimation accuracy of causal effects. In addition, identifiability issues that commonly arise when learning network structures may be overcome by using prior information on genotype–phenotype relations.
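
As a rough illustration of why a genetic variant can act as an instrumental variable, the Python sketch below simulates a confounded exposure-outcome pair and compares a naive regression slope with the ratio (Wald) estimator. The data-generating model, effect sizes, and variable names are assumptions made for this illustration; they are not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
true_effect = 0.7  # assumed causal effect of exposure X on outcome Y

# Hypothetical linear model: genotype G -> X -> Y, with an unobserved confounder U.
g = rng.binomial(2, 0.3, size=n).astype(float)      # allele count, assigned at meiosis
u = rng.normal(size=n)                              # unobserved confounder of X and Y
x = 0.5 * g + u + rng.normal(size=n)                # exposure (phenotype)
y = true_effect * x + 2.0 * u + rng.normal(size=n)  # outcome

def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return float(np.cov(a, b)[0, 1])

# The naive slope of Y on X is biased because U affects both X and Y.
ols_estimate = cov(x, y) / cov(x, x)

# Ratio (Wald) estimator: valid here because G is independent of U
# and affects Y only through X.
iv_estimate = cov(g, y) / cov(g, x)

print(f"naive: {ols_estimate:.2f}, IV: {iv_estimate:.2f}, truth: {true_effect}")
```

In this simulated setting the instrument-based estimate should land much closer to the assumed effect than the naive slope; with real data the instrument assumptions would need to be justified separately.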

Ribeiro, A. H. (2014). Gene expression analysis taking into account measurement errors and application to real data. Master's thesis, Department of Computer Science, University of Sao Paulo. https://doi.org/10.11606/D.45.2014.tde-04082014-163616

Any measurement, since it is made by a real instrument, has an uncertainty associated with it. In the present work, we address this uncertainty in two-channel cDNA microarray experiments, a technology that has been widely used in recent years and is still an important tool for gene expression studies. Tens of thousands of gene representatives are printed onto a glass slide and hybridized simultaneously with mRNA from two different cell samples. Different fluorescent dyes are used to label the two samples. After hybridization, the glass slide is scanned, yielding two images. Image processing and analysis programs are used for spot segmentation and for computing pixel statistics, for instance the mean, median, and variance of the pixel intensities of each spot. The same statistics are computed for the pixel intensities in the background region. Statistical estimators such as the variance give us an estimate of the accuracy of a measurement. Based on the intensity estimates for each spot, data transformations are applied to eliminate systematic variability so that the effective gene expression can be obtained. This work shows how to analyze gene expression measurements with an estimated error. We present an estimate of this uncertainty and study, in terms of error propagation, the effects of some data transformations. An example of such a transformation is the correction of the bias estimated by a robust local regression method, also known as lowess. We also show how to use the propagated errors to detect differentially expressed genes between conditions. Finally, we compare the results with those obtained by classical analysis methods, in which the measurement errors are disregarded. We conclude that modeling the measurement uncertainties can improve the analysis, since the results obtained on a real gene expression database were consistent with the literature.

Open-Source Libraries

Ribeiro, A. H. (2024). anchorFCI: an extension of the FCI algorithm designed to improve robustness and discovery power in causal discovery by strategically selecting reliable anchors, while leveraging their known non-ancestral relationships. https://github.com/adele/anchorFCI

This package provides an implementation of the anchorFCI algorithm. Technical details are provided in the paper by Ribeiro et al. (2024), entitled "AnchorFCI: harnessing genetic anchors for enhanced causal discovery of cardiometabolic disease pathways", available at doi: 10.3389/fgene.2024.1436947. AnchorFCI is an extension of the FCI algorithm designed to improve robustness and discovery power in causal discovery by strategically selecting reliable anchors, while leveraging their known non-ancestral relationships. It operates on two sets of variables: the first set contains the variables of interest, while the second comprises variables that are not caused by any variable in the first. While this structure is beneficial for various applications, it is especially well-suited for datasets involving phenotypic, clinical, and sociodemographic variables (the first set), alongside genetic variables, such as SNPs (the second set), which are recognized as not being caused by the first set.

Ribeiro, A. H. (2022). PAGId: an R package for causal effect identification in Partial Ancestral Graphs. https://github.com/adele/PAGId

This package implements the CIDP and IDP algorithms for identifying (conditional) causal effects from a Partial Ancestral Graph (PAG). Technical details are provided in the NeurIPS 2022 paper by Jaber, A., Ribeiro, A. H., Zhang, J., and Bareinboim, E. (2022), entitled "Causal Identification under Markov equivalence: Calculus, Algorithm, and Completeness".

Ribeiro, A. H. (2020). FamilyBasedPGMs: an R package for learning genetic and environmental graphical models from family data. https://github.com/adele/FamilyBasedPGMs

This package provides methods for learning, from observational Gaussian family data (i.e., Gaussian data clustered in families), Gaussian undirected and directed acyclic PGMs describing linear relationships among multiple phenotypes, together with a decomposition of the learned PGM into unconfounded genetic and environmental PGMs. The methods are based on the zero partial correlation tests derived in the work by Ribeiro and Soler (2020).

Ribeiro, A. H. (2019). OmicsMA: An R Package for Variance-Preserving Estimation and Normalization of M-A Values from Omics Experiments. https://github.com/adele/omicsMA

This package provides methods for estimating and normalizing the M (intensity log-ratio) and A (mean log intensity) values from two-channel (or two-color) microarrays. Unlike conventional estimation methods, which take into account only measures of location (e.g., mean and median) of the pixel intensities of each channel, the provided estimation method takes into account pixel-level variability, which may reflect uncertainties due to noise and systematic artifacts.
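
For readers unfamiliar with the notation, the sketch below computes M and A values in the conventional, location-only way from per-spot intensity estimates. It is written in Python for illustration and does not reproduce the variance-preserving estimator of the R package; the function name `ma_values` and the example intensities are hypothetical.

```python
import numpy as np

def ma_values(red, green):
    """Return (M, A) from per-spot intensity estimates of the two channels."""
    log_r = np.log2(np.asarray(red, dtype=float))
    log_g = np.log2(np.asarray(green, dtype=float))
    m = log_r - log_g          # M: intensity log-ratio
    a = 0.5 * (log_r + log_g)  # A: mean log intensity
    return m, a

# Two hypothetical spots: one with a four-fold difference, one with none.
m, a = ma_values([8.0, 4.0], [2.0, 4.0])
print(m, a)
```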

Participation in Conferences

da Silva, T., Silva, E., Góis, A., Heider, D., Kaski, S. and Mesquita, D.*, Ribeiro, A. H. (2024). Human-Aided Discovery of Ancestral Graphs. LXAI Workshop at NeurIPS 2024, Vancouver Convention Centre, Vancouver, British Columbia, Canada. (Poster Presentation)

We introduce Ancestral GFlowNets (AGFNs) as a new amortized inference method for sampling from a belief distribution on the space of ancestral graphs. We develop the first human-in-the-loop framework for ancestral causal discovery (CD). We design an optimal strategy for eliciting an expert's feedback regarding the nature of a specific causal relationship among the observed variables. We demonstrate that our human-aided CD method drastically outperforms traditional CD algorithms after just a few expert interactions.

Ribeiro, A. H., Fehse, L., Winter, N., Welzel, M., Kircher, T., Thanarajah, S. E., Dannlowski, U., Heider, D., Hahn, T. (2024). Uncovering Gut Microbiota's Causal Role in Major Depressive Disorder. 13th Sino-German Frontiers of Science Symposium (SINOGFOS), Shanghai, China. Supported by the Chinese Academy of Sciences and Humboldt Foundation. (Poster Presentation)

Major Depressive Disorder (MDD) is a multifaceted mental health condition. Despite numerous studies highlighting a significant association between MDD and the gut microbiome, it remains unclear whether these associations play a causal role in MDD development. In this study, we conducted a differential abundance analysis (DAA) followed by a causal analysis of the DFG FOR2107 dataset (https://for2107.de/), which includes 1,269 patients. We highlight two important contributions: (1) Through a meticulous application of the FCI algorithm, we identified that Eggerthella and Hungatella causally contribute to MDD, while Coprobacillus indirectly causes MDD via Eggerthella. (2) Obesity not only affects MDD but also confounds the relationship between taxa variables and MDD. Using effect identification tools, we show that the interventional probability of MDD increases with the abundance of Eggerthella and Hungatella.

Levshina, N., Ribeiro, A. H. (2023). Who did What to Whom: Measuring and explaining cross-linguistic differences. 10th International Contrastive Linguistics Conference (ICLC-10), Mannheim, Germany, July 2023. (Conference Abstract)

Ribeiro, A. H., Sato, J. R., Fujita, A. (2018). Granger Causality Between Graphs and Applications in Functional Brain Networks. X-Meeting - 14th International Conference of the AB3C, October 24th - 26th, 2018, São Pedro, SP, Brazil. (Poster Presentation) - Best Poster Award

Networks are everywhere, from social to biological sciences. Usually these networks are represented by graphs, i.e., mathematical objects composed of a set of vertices and a set of edges. However, a vast number of natural networks are dynamic and current methods typically ignore a third key component: time. This fact requires statistical approaches to analyze them appropriately.

In this context, we propose a methodology to identify Granger causality among graphs. By assuming that graphs are generated by models whose parameters are random variables, we define that a time series of graphs y_{i,t} does not Granger cause another time series of graphs y_{j,t} if the parameters of the model for y_{i,t} do not Granger cause the parameters of the model for y_{j,t}. The problem is that the models that generate the graphs are usually unknown and consequently the parameters cannot be estimated. However, for some random graph models, such as Erdős-Rényi, geometric, regular, Watts-Strogatz, and Barabási-Albert, it is known that the spectral radius (the largest eigenvalue of the adjacency matrix of the graph) is a function of the model parameters. For example, for the Erdős-Rényi random graph model, which is defined by the parameters n, the number of vertices, and p, the probability that two random vertices are connected, the spectral radius is known to be approximately np.

Based on this idea, we propose to identify Granger causality between time series of graphs by fitting a vector autoregressive model (VAR) to the time series of spectral radii. Through an extensive simulation study, we show that the methodology has good accuracy, particularly for large graphs and long time series. In addition, we show that the spectral radius performed better than other centrality measures, such as degree, eigenvector, betweenness, and closeness centralities. Finally, we applied the methodology to identify Granger causality between brain networks.
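
The idea can be sketched in a few lines of Python. The example below is a simplified stand-in rather than the authors' implementation: it simulates two series of Erdős-Rényi graphs in which the edge probability of the second series depends on the previous value of the first, summarizes each graph by its spectral radius, and compares lag-1 Granger F statistics in both directions (a single-equation test, equivalent to testing one coefficient of a bivariate VAR(1)). The helper names, graph size, and dependence structure are assumptions chosen for illustration.

```python
import numpy as np

def er_spectral_radius(n, p, rng):
    """Spectral radius of one undirected Erdős-Rényi graph G(n, p)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    adj = (upper | upper.T).astype(float)
    return float(np.max(np.abs(np.linalg.eigvalsh(adj))))

def granger_f_stat(cause, effect, lag=1):
    """F statistic comparing an autoregression of `effect` on its own lags
    with one that also includes lags of `cause`."""
    T = len(effect)
    y = effect[lag:]
    own = np.column_stack([effect[lag - k:T - k] for k in range(1, lag + 1)])
    other = np.column_stack([cause[lag - k:T - k] for k in range(1, lag + 1)])
    ones = np.ones((T - lag, 1))

    def rss(X):
        coef = np.linalg.lstsq(X, y, rcond=None)[0]
        return float(np.sum((y - X @ coef) ** 2))

    X_restricted = np.hstack([ones, own])
    X_full = np.hstack([ones, own, other])
    rss_r, rss_f = rss(X_restricted), rss(X_full)
    dof = (T - lag) - X_full.shape[1]
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

rng = np.random.default_rng(1)
n, T = 100, 150
p_i = rng.uniform(0.2, 0.6, size=T)  # parameter series of graph series i
p_j = np.empty(T)
p_j[0] = 0.3
p_j[1:] = 0.1 + 0.5 * p_i[:-1]       # series j depends on the past of series i

rho_i = np.array([er_spectral_radius(n, p, rng) for p in p_i])
rho_j = np.array([er_spectral_radius(n, p, rng) for p in p_j])

f_ij = granger_f_stat(rho_i, rho_j)  # does i help predict j?
f_ji = granger_f_stat(rho_j, rho_i)  # does j help predict i?
print(f"F(i -> j) = {f_ij:.1f}, F(j -> i) = {f_ji:.1f}")
```

Because the spectral radius tracks np, the dependence planted in the edge probabilities should show up as a much larger F statistic in the i to j direction than in the reverse direction.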

Ribeiro, A. H., Soler, J. M. P., Fujita, A. (2018). Learning Genetic and Environmental Causal Graphical Models in Family-Based Studies. Abstracts for the XIXth International Biometric Conference, July 8-13, 2018, Barcelona, Spain, International Biometric Society. Conference Abstract. (Contributed Talk)

To unravel the biological mechanism underlying complex traits and diseases, it is crucial to understand how the related phenotypes are associated with each other and how they are influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely used to describe relationships among variables (phenotypes) in a very intuitive and mathematically rigorous way. On the other hand, family-based studies are usually conducted to assess the influence of genetic and environmental factors on phenotypes. In this case, the polygenic model can be used to decompose the phenotypic variability into two variance components: one polygenic, for capturing the variability across families, and one environmental, for capturing the residual variability. Some algorithms for learning PGMs from observational data, known as structure learning algorithms, are strongly based on a conditional independence test. When the observations are independent and normally distributed, the null hypothesis of conditional independence can be tested using classical tests for zero partial correlation, and PGMs can be learned under Markov-properties equivalence. However, in family-based studies, measurements of related individuals are correlated, and this dependence structure must be taken into account to obtain appropriate test statistics.

Based on the Gaussian univariate polygenic model, we derived an estimator for the partial correlation coefficient that takes into account the family dependence structure, and we present a decomposition of the partial correlation coefficient according to the contribution of the genetic and environmental effects. We also derived zero partial correlation tests for these coefficients and extended two approaches to learn genetic and environmental PGMs from observational family data: the approach of Meinshausen and Bühlmann (2006), which learns undirected PGMs from vertex neighborhoods, and the IC (Pearl, 2000) / PC (Spirtes et al., 2000) algorithm, which learns directed PGMs. The performance of the proposed methodologies was assessed using 100 replicates of simulated data, based on the Framingham Heart Study, provided by the Genetic Analysis Workshop (GAW) 13 in problem 2.
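
For context, the classical test for independent Gaussian observations mentioned above can be sketched as follows. This Python example uses the standard Fisher z-transform of the sample partial correlation and does not implement the family-based estimator or tests derived in this work; the function name and the simulated chain X -> Z -> Y are assumptions made for illustration.

```python
import math
import numpy as np

def partial_corr_test(data, i, j):
    """Test zero partial correlation of columns i and j given all other columns,
    assuming independent, jointly Gaussian rows."""
    n, d = data.shape
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    r = -precision[i, j] / math.sqrt(precision[i, i] * precision[j, j])
    fisher_z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - (d - 2) - 3) * abs(fisher_z)  # d - 2 conditioning variables
    p_value = math.erfc(stat / math.sqrt(2))           # two-sided normal p-value
    return r, p_value

# Simulated chain X -> Z -> Y: X and Y are independent given Z.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)
data = np.column_stack([x, z, y])

r_xy, p_xy = partial_corr_test(data, 0, 2)  # X vs Y given Z
r_xz, p_xz = partial_corr_test(data, 0, 1)  # X vs Z given Y
print(f"X,Y|Z: r={r_xy:.3f}  X,Z|Y: r={r_xz:.3f}")
```

With correlated relatives, the independence assumption behind this statistic fails, which is the motivation for the family-aware tests described in the abstract.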

Soler, J. M. P., Ribeiro, A. H., Jahnke, M. R. (2017). A produção da cerveja produzindo conhecimento [Beer production producing knowledge]. 3o Congresso de Graduação da USP, São Paulo, SP, Brazil. (Poster Presentation)

Beer is part of human history, with a legacy going back to the ancient Sumerians, Egyptians, Mesopotamians, and Iberians since at least 6000 BC. Even so, far from being a settled process, beer production keeps evolving and improving, to the point that it now drives a rapidly expanding craft industry which, owing to its many intrinsic sources of variability, fuels the curious and creative spirit of the alchemist and the sensory refinement of individuals, regardless of age, gender, social status, etc.

We saw in this universe a wide window for sparking enthusiasm for learning among third-year undergraduate Statistics students in the Design of Experiments course (MAE 0317), who immediately and vigorously embraced the proposal to brew beer as a cross-cutting illustrative vehicle for the concepts and tools covered in the course. We therefore formalized a joint project, to be planned and carried out during the first semester of 2017, in the classroom and in the field, involving the professor, the students, the teaching assistants, and beer-brewing specialists.

The idea is to statistically relate responses that measure beer quality, such as density, foam stability, and sensory experience (body, bitterness, sweetness, aroma, clarity, etc.), to sources of variability that can be controlled experimentally, such as cooking temperature, carbonation, and maturation. Considering the preliminary results obtained so far and the prospects they suggest, we believe the project helps shape how students perceive the course content, turning the learning of dense theoretical concepts into an enjoyable, stimulating, and interactive experience.

Ribeiro, A. H., Soler, J. M. P., Fujita, A. (2016). A Comparative Study of Algorithms for Learning Causal Genotype–Phenotype Networks. Abstracts for the XXVIIIth International Biometric Conference, July 10-15, 2016, Victoria, British Columbia, Canada, International Biometric Society. ISBN 978-0-9821919-4-1. (Poster Presentation)

A challenging task in biomedical research is to understand precisely the complex network of causal associations among phenotypes and outcomes. Experimental studies such as clinical trials are the most trustworthy method of causality assessment. However, it may be unfeasible to carry out randomized experiments to discover all possible causal relationships when the number of variables is large. In systems genetics, causal inference is supported by Mendelian randomization, which provides a natural randomization process where genotypes, rather than treatments, are randomly allocated to individuals. Furthermore, genetic variants robustly associated with phenotypes can be seen as instrumental variables, allowing inferences on the causal relation between phenotypes and outcomes.

In this work, we compared four recent algorithms that use genetic variants as instrumental variables for learning the structure of a genotype-phenotype network, namely, (i) QTL-directed Dependency Graph (QDG), (ii) QTL-driven phenotype network (QTLnet), (iii) Sparsity-aware Maximum Likelihood (SML), and (iv) QTL+Phenotype Supervised Orientation (QPSO). These algorithms are similar in that they use QTL information to determine the causal direction among phenotypes. However, they were designed under different assumptions, and therefore some may be more suitable than others for a particular biological application. Through simulation studies, we investigated the advantages and limitations of these methodologies under different configurations. Finally, we applied the algorithms to real data involving cardiovascular phenotypes of F2 rats and compared the inferred causal networks.

Swinka, B. B., Carvalho, C. M., Weihermann, A., Schuck, D. C., Boldrini, N., Silva, V. V., Costa, M. T., Ribeiro, A. H., Fujita, A., Brohem, C. A., and Lorencini, M. (2015). Analysis of extracellular-matrix and cell-adhesion genes modulated by mechanical massage applied in combination with a cosmetic emulsion. Supplement issue of the Journal of Investigative Dermatology, Epidermal Structure & Barrier Function, v. 135, p. S58-S69, 74th Annual Meeting of the Society for Investigative Dermatology (SID 2015), Atlanta, GA, USA. Conference Abstract: https://doi.org/10.1038/jid.2015.71

Massage therapies are associated with pathological improvements, and have also been extensively used for esthetic purposes. This study aimed to evaluate part of the molecular mechanisms involved in massage by investigating modulation of gene expression associated with cell adhesion and the ECM (extracellular matrix) induced by esthetic massage combined with a cosmetic emulsion. Thirteen female volunteers clinically characterized as having grade II or III cellulite were recruited and were subjected to skin biopsies in the gluteofemoral region before and after treatment. Each volunteer’s leg was considered an experimental unit to reduce individual variability. The study population was divided into: (1) legs treated with a cosmetic emulsion and (2) legs treated with a cosmetic emulsion and massage. Examination of 84 genes analyzed by qPCR revealed a predominance of up-regulation in individuals treated with the emulsion and massage in comparison to individuals treated only with the emulsion (fold change > 1.5, and p < 0.05). The main genes modulated were: ECM proteases (ADAMTS8, MMP1, MMP3, MMP9 and MMP11), transmembrane molecules (HAS1, ITGAL), adhesion molecules (COL8A1 and LAMA1) and cell-matrix adhesion molecules (ADAMTS13). In conclusion, combining the cosmetic emulsion with massage is recommended to increase the effectiveness of a product and obtain the desired benefits in the treatment of skin disorders such as cellulite. The lack of scientific data on this technique can very often lead to skepticism among health professionals and even patients or consumers of cosmetic treatments. This study helps to elucidate some of the molecular phenomena associated with this therapy.

Ribeiro, A. H., Hirata Jr., R., Soler, J. M. P. (2014). Two-color microarray data analysis taking into account probe-level inaccuracies. ISCB-Latin America X-Meeting on Bioinformatics with BSB and SoiBio, Belo Horizonte, MG, Brazil. (Poster Presentation)

Most analyses of two-color microarray data are based on point estimation of the log-ratio of the two channel intensities. These estimates, commonly named M values, are conventionally obtained from some location measure of the pixel intensities of each channel, ignoring any imprecision. It is well known that the microarray technology is associated with many noise sources, and it has been shown that improved inferences can be obtained by including the inaccuracies involved and propagating them to downstream analysis. Using the multivariate delta method, we propose new estimators for the mean and the variance of the M values that take into account the probe-level inaccuracies in the analysis.
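
To make the error-propagation idea concrete, the Python sketch below applies the textbook first-order delta method to M = log2(R / G), given the channel intensity estimates and their variances. It is only an illustration of the general technique and is not claimed to match the estimators proposed in the abstract; the function name and the numerical inputs are hypothetical.

```python
import math

LN2 = math.log(2.0)

def m_value_with_variance(mean_r, mean_g, var_r, var_g, cov_rg=0.0):
    """First-order delta-method approximation for M = log2(R / G).

    var_r, var_g, and cov_rg describe the uncertainty of the two channel
    intensity estimates (assumed inputs for this sketch).
    """
    m = math.log2(mean_r / mean_g)
    # The gradient of log2(R / G) at the means is (1 / (R ln 2), -1 / (G ln 2)).
    var_m = (var_r / mean_r**2 + var_g / mean_g**2
             - 2.0 * cov_rg / (mean_r * mean_g)) / LN2**2
    return m, var_m

m, var_m = m_value_with_variance(8.0, 2.0, var_r=0.64, var_g=0.04)
print(f"M = {m:.2f}, approximate Var(M) = {var_m:.4f}")
```

Under this approximation the variance of M grows with the relative (not absolute) uncertainty of each channel, which is why low-intensity spots tend to carry noisier log-ratios.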

Invited Talks

Dec 2024	L3S Research Center, Leibniz University, and CAIMed L3S Research Center, Leibniz University, and Lower Saxony Research Center for Artificial Intelligence and Causal Methods in Medicine (CAIMed), Hannover, Germany Ribeiro, A. H. From Theory to Practice: Advancing Causal Inference for Real-World Applications in Health Sciences.
Oct 2024	Seminar at Université Grenoble Alpes Institut d'Informatique et Mathématiques Appliquées de Grenoble (IMAG), France Ribeiro, A. H. Recent Advances in Causal Inference under Limited Domain Knowledge.
Jun 2024	TUM Seminar on Statistics and Data Science Department of Mathematics, Technical University of Munich (TUM), Germany Ribeiro, A. H. Recent Advances in Causal Inference under Limited Domain Knowledge
May 2024	68th Annual Meeting of RBras Brazilian Region of the International Biometrics Society (RBras), ESALQ/USP, in Piracicaba, SP, Brazil Ribeiro, A. H. From Observations to Causality: Recent Advances and Ongoing Challenges
Aug 2023	Seminar at FGV EMAp School of Applied Mathematics of Getulio Vargas Foundation (FGV EMAp), Rio de Janeiro, Brazil. Ribeiro, A. H. Recent Advances in Causal Inference under Limited Domain Knowledge.
Apr 2023	Workshop on Causal Representation Learning Max Planck Institute for Intelligent Systems, Tübingen, Germany Ribeiro, A. H. Effect Identification in Cluster Causal Diagrams.
Aug 2022	DAAD Postdoc-NeT-AI Tour, Germany Institute of Information Systems & Institute for Medical Biometrics and Statistics at the University of Lübeck; Institute for Computational Systems Biology at the University of Hamburg; Centre for Cognitive Science at TU Darmstadt; Center for Systems Biology and Department of Computer Science at TU Dresden; and Helmholtz Center Munich Ribeiro, A. H. Causal Inference from Observational Data in Partially Understood Domains.
Aug 2022	Future Bioinformatics Workshop, Germany Ribeiro, A. H. Causal AI: Towards Explainable, Generalizable, and Trustworthy Decision-Making.
Jun 2022	Columbia DSI Scholars - Summer Research Bootcamp 2022 Data Science Institute, Columbia University Ribeiro, A. H. An Overview on Causal Data Science.
May 2022	Interinstitutional Graduate Program in Statistics (PIPGES) Federal University of São Carlos and University of São Paulo Ribeiro, A. H. Causal Effect Identification in Partially Understood Domains. (Talk on Youtube)
Mar 2022	Voices of Data Science at UMass Amherst Manning College of Information & Computer Sciences, University of Massachusetts Amherst Ribeiro, A. H. On the Importance of Causal Inference in the Next Generation of Artificial Intelligence. (Talk on Youtube)
Mar 2022	Causal Inference Learning Group (CILG) Biostatistics Department, Mailman School of Public Health, Columbia University Ribeiro, A. H. Effect Identification in Cluster Causal Diagrams.
Dec 2021	WHY-21 at NeurIPS 2021 - Causal Inference & Machine Learning: Why now? Ribeiro, A. H. Effect Identification in Cluster Causal Diagrams.
Nov 2021	Laboratory of Epidemiology & Population Science (LEPS) at the National Institute on Aging (NIA) Ribeiro, A. H. Causal Inference and the Data-Fusion Problem.
Nov 2021	OECD workshop on AI and the productivity of science. Ribeiro, A. H., Bareinboim, E. Developing causal AI: its importance and an overview. (Talk on Youtube)
May 2019	Graduate Seminars Series - Statistics Federal University of Sao Carlos and University of Sao Paulo (UFSCar - USP), Sao Carlos, SP, Brazil Ribeiro, A. H. Learning Genetic and Environmental Probabilistic Graphical Models from Gaussian Family Data.

Appearances in Popular Media

Oct 2021	“How I would like to continue my research... ” Interview by Klaus Rathje on the DAAD Postdoctoral Networking Tour "AI in Medicine".
May 2021	Developing and Applying Causal Inference Methods in Public Health Interview by Karina Alexanyan, Ph.D., for the Data Science Institute at Columbia University.

Invited Lectures and Short-Courses

Jul 2024	2nd European Summer School on Artificial Intelligence - ESSAI 2024 5-day Course Department of Informatics and Telecommunications National and Kapodistrian University of Athens, Athens, Greece. Ribeiro, A. H., Dhami, D., and Zecevic, M. Machines Climbing Pearl's Ladder of Causation. (Lectures on Youtube)
Jul 2024	14th Lisbon Machine Learning School - LxMLS 2024 3-hour Tutorial Instituto Superior Técnico, Lisbon, Portugal Ribeiro, A. H. Introduction to Causal Inference. (Lecture on Youtube)
Jun 2024	Nordic Probabilistic AI School - ProbAI 2024 3-hour Tutorial Frederiksberg Campus of University of Copenhagen, Copenhagen, Denmark Ribeiro, A. H. Introduction to Causal Inference. (Lecture on Youtube)
Jan 2024	Tropical Probabilistic AI School - Tropical ProbAI 2024 3-hour Tutorial Hosted with the EMAp FGV Summer School on Data Science 2024, Rio de Janeiro, Brazil Tutorial on GitHub. Ribeiro, A. H. Introduction to Causal Inference.
Jul 2023	European Summer School on Artificial Intelligence - ESSAI 2023 5-day Course Faculty of Computer and Information Science, University of Ljubljana, Slovenia Ribeiro, A. H., Dhami, D., and Zecevic, M. Machines Climbing Pearl's Ladder of Causation.
Jul 2023	13th Lisbon Machine Learning School - LxMLS 2023 3-hour Tutorial Instituto Superior Técnico, Lisbon, Portugal Ribeiro, A. H. Causality and its Role in Reasoning, Explainability, and Generalizability. (Lecture on Youtube)
Jun 2023	Nordic Probabilistic AI School - ProbAI 2023 3-hour Tutorial Norwegian University of Science and Technology (NTNU), Trondheim, Norway Tutorial on GitHub. Ribeiro, A. H. Causal Inference: Towards Explainable, Generalizable, and Trustworthy AI. (Lecture on Youtube)
Feb 2023	Continual Causality - Bridge Program at AAAI-2023 90-min Tutorial Walter E. Washington Convention Center, Washington DC, USA Ribeiro, A. H. Putting the Causality in Continual Causality.
Jul 2022	12th Lisbon Machine Learning Summer School (LxMLS - 2022) Invited 3-hour Tutorial Ribeiro, A. H., Bareinboim, E. Causal Data Science. (Lecture on Youtube)
Sep 2021	Graduate Seminars Series - Statistics Statistics Department, University of Brasilia - UnB, Brasilia, Brazil Invited Lecture Ribeiro, A. H. Causal Inference and Data-Fusion.
Jul 2021	11th Lisbon Machine Learning Summer School (LxMLS - 2021) Invited 3-hour Tutorial Ribeiro, A. H., Bareinboim, E. Causal Data Science: An Introduction to Causal Inference and Data-Fusion. (Lecture on Youtube)
Jun 2021	Perspectives in Statistics Statistics Department, University of Sao Paulo (IME - USP), Sao Paulo, SP, Brazil Invited Lecture Ribeiro, A. H. Causal Inference from Observational Studies.
Dec 2020	Seventy-Sixth (76th) Annual Deming Conference on Applied Statistics. Invited 3-hour Tutorial Ribeiro, A. H., Adibuzzaman, M., Bareinboim, E. Causal Inference in the Health Sciences.
Nov 2020	American Medical Informatics Association (AMIA 2020) Virtual Annual Symposium. Contributed 3.5-hour Tutorial Ribeiro, A. H., Adibuzzaman, M., Bareinboim, E. Causal Inference in the Health Sciences.
Oct 2020	Graduate Seminars Series - Biostatistics and Biometrics Sao Paulo State University - UNESP, Botucatu, SP, Brazil Invited Lecture Ribeiro, A. H. Causal Inference from Observational Studies.
Jan 2017	Graduate Summer School - Sao Paulo State University - UNESP, Presidente Prudente, SP, Brazil 9-hour Short Course Ribeiro, A. H., Soler, J. M. P. Dimensionality Reduction and Structure Learning with Applications to Genomics.
May 2016	61a Reunião Anual da Região Brasileira da Sociedade Internacional de Biometria (RBras), Salvador, BA, Brazil 4-hour Short Course Ribeiro, A. H., Soler, J. M. P. Dimensionality Reduction Applied to Genomics.

Teaching

Lecturer

Oct 2023 - Sep 2024	Department of Computer Science, Heinrich Heine University of Düsseldorf, Germany Courses: Causal Data Science (Course on Youtube); Topics in Causality.
Mar 2023 - Oct 2023	Department of Mathematics and Computer Science, Philipps University of Marburg, Germany Course: Causal Data Science: Theoretical Foundations and Algorithms.

Assistant Professor

Feb 2018 - Jul 2018

Computer Engineering Department - Institute of Education and Research (Insper), Sao Paulo, SP, Brazil.
Course: Software Design using Python

Teaching Assistant

Mar 2017 - Jul 2017	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Statistical Design of Experiments
Aug 2016 - Dec 2016	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Multivariate Data Analysis
Mar 2016 - Jul 2016	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Statistical Methods for Genetics and Genomics
Aug 2015 - Dec 2015	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Multivariate Data Analysis
Mar 2015 - Jul 2015	Architecture and Urbanism College - University of Sao Paulo (FAU-USP), Sao Paulo, SP, Brazil. Mathematics, Architecture and Design
Aug 2014 - Dec 2014	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Statistical Techniques, Programming and Simulation
Mar 2014 - Jul 2014	Institute of Astronomy, Geophysics and Atmospheric Sciences - University of Sao Paulo (IAG-USP), Sao Paulo, SP, Brazil. Numerical Calculus with Applications in Physics
Aug 2013 - Dec 2013	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Mathematical Modeling
Mar 2013 - Jul 2013	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Introduction to Computer Programming
Aug 2012 - Dec 2012	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Linear Programming
Mar 2012 - Jul 2012	Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil. Numerical Methods for Linear Algebra
