About Me

Adèle Helena Ribeiro

Postdoctoral Researcher
Causal AI Lab, Columbia University

I am a Postdoctoral Research Scientist in the Causal Artificial Intelligence (Causal AI) Laboratory, where I work with Professor Elias Bareinboim. My research lies at the intersection of Computer Science, Statistics, and Artificial Intelligence in Healthcare. My efforts are focused on advancing the theory of causal inference and learning for discovering, generalizing, and personalizing cause-effect relationships from multiple observational and experimental data collections. I am also interested in the development and application of machine learning and AI tools equipped with causal and counterfactual reasoning for more fair, explainable, scalable, reliable, and personalized decision-making. I have a particular interest in applications in the Health Sciences and have directed my research towards addressing challenges that emerge in such domains to help bridge the gap between theory and practical applications.

Research Interests

  • Causal Inference
  • Explainable AI
  • Structure Learning
  • Deep Learning
  • Statistical Genetics
  • Multi-Omics Analysis
  • Computational Neuroscience
  • Health and Medical Research


Education and Professional Preparation

  • Sep 2019

    Postdoctoral Scholar

    Causal Artificial Intelligence Laboratory
    Data Science / Computer Science Institutes
    Columbia University
    New York, NY, USA
    Project: Causal Inference in the Health Sciences: from Biased and Heterogeneous Data Collections to Personalized and Improved Patient Outcomes.
    Advisor: Prof. Elias Bareinboim
  • Feb 2019

    Postdoctoral Scholar

    Laboratory of Genetics and Molecular Cardiology
    Heart Institute (InCor)
    University of Sao Paulo
    Sao Paulo, SP, Brazil
    Project: Deep Learning for 12-lead ECG Classification.
    Advisor: Prof. José Eduardo Krieger
  • Nov 2018

    Doctor of Philosophy in
    Computer Science

    Institute of Mathematics and Statistics
    University of Sao Paulo (IME-USP)
    Sao Paulo, SP, Brazil
    PhD dissertation: Identification of Causality in Genetics and Neuroscience
    Advisor: Prof. André Fujita
    Co-Advisor: Prof. Júlia Maria Pavan Soler
  • Fall 2017

    Doctoral Research Internship

    Neuroscience Institute
    Princeton University
    Princeton, NJ, USA
    Project: Deep learning-based pose representation and dynamics modeling of marmoset monkeys.
    Advisor: Prof. Asif A. Ghazanfar
  • Jun 2014

    Master of Science in
    Computer Science

    Institute of Mathematics and Statistics
    University of Sao Paulo (IME-USP)
    Sao Paulo, SP, Brazil
  • Dec 2011

    Bachelor of Science in Computational
    and Applied Mathematics

    Institute of Mathematics and Statistics
    University of Sao Paulo (IME-USP)
    Sao Paulo, SP, Brazil
    Senior thesis: Analysis of Pyroelectric Infrared (PIR) sensor output signals.
    Advisor: Prof. Roberto Hirata Jr.

Fellowships and Scholarships

Sep 2021 DAAD Postdoc-NeT-AI Fellowship
DAAD Artificial Intelligence Networking (AInet) Fellowship
Federal Ministry of Education and Research, Germany
Sep 2019 - Aug 2020 Postdoctoral Research Fellowship
Causal Artificial Intelligence Lab
Department of Computer Science & Data Science Institute, Columbia University, New York, NY, USA
Jan 2019 - Aug 2019 Postdoctoral Research Fellowship
Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Sep 2017 - Dec 2017 PhD Visiting Student at Princeton University
Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Aug 2014 - Jul 2018 PhD Graduate Research Scholarship
Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil
Mar 2012 - Feb 2014 MSc Graduate Research Scholarship
Coordination for the Improvement of Higher Education Personnel (CAPES), Brazil


One common task in many data science applications is to answer questions about the effect of new interventions, like: what would happen to Y if we make X equal to x while observing covariates Z = z? Formally, this is known as conditional effect identification, where the goal is to determine whether a post-interventional distribution is computable from the combination of an observational distribution and assumptions about the underlying domain represented by a causal diagram. A plethora of methods has been developed for solving this problem, including the celebrated do-calculus [Pearl, 1995]. In practice, these results are not always applicable since they require a fully specified causal diagram as input, which is usually not available. In this paper, we assume as the input of the task a less informative structure known as a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, learnable from observational data. We make the following contributions under this relaxed setting. First, we introduce a new causal calculus, which subsumes the current state-of-the-art, PAG-calculus. Second, we develop an algorithm for conditional effect identification given a PAG and prove it to be both sound and complete. In other words, failure of the algorithm to identify a certain effect implies that this effect is not identifiable by any method. Third, we prove the proposed calculus to be complete for the same task.
One pervasive task found throughout the empirical sciences is to determine the effect of interventions from non-experimental data. It is well-understood that assumptions are necessary to perform such causal inferences, an idea popularized through Cartwright's motto: "no causes-in, no causes-out." One way of articulating these assumptions is through the use of causal diagrams, which are a special type of graphical model with causal semantics [Pearl, 2000]. The graphical approach has been applied successfully in many settings, but there are still challenges to its use, particularly in complex, high-dimensional domains. In medicine, for example, background knowledge may exist about the relationships among a subset of variables, but usually not all of them. In this paper, we introduce cluster causal diagrams (for short, C-DAGs) to allow for the representation of partial understanding of the relationships among variables, which has the potential of alleviating DAGs' somewhat stringent requirements. C-DAGs provide a simple yet effective way to partially abstract a grouping of variables among which causal relationships are not fully understood. Our goal is to develop machinery to reason on top of C-DAG's new representation. In particular, we first define a new version of the d-separation criterion, and prove its soundness and completeness. Second, we extend these new separation rules and prove the validity of the corresponding do-calculus. Lastly, we show that a standard identification algorithm can systematically compute causal effects from observational data with cluster causal diagrams.
Atrial fibrillation (AF) is a common arrhythmia (0.5% worldwide prevalence) associated with an increased risk of various cardiovascular disorders, including stroke. Automated routine AF detection by Electrocardiogram (ECG) is based on the analysis of one-dimensional ECG signals and requires dedicated software for each type of device, limiting its wide use, especially with the rapid incorporation of telemedicine into the healthcare system. Here, we implement a machine learning method for AF classification using the region of interest (ROI) corresponding to the long DII lead automatically extracted from DICOM 12-lead ECG images. We observed 94.3%, 98.9%, 99.1%, and 92.2% for sensitivity, specificity, AUC, and F1 score, respectively. These results indicate that the proposed methodology performs similarly to methods that take one-dimensional ECG signals as input but does not require dedicated software, which facilitates integration into clinical practice, as ECGs are typically stored in PACS as 2D images.
Graphs/networks have become a powerful analytical approach for data modeling. Besides, with the advances in sensor technology, dynamic time-evolving data have become more common. In this context, one point of interest is a better understanding of the information flow within and between networks. Thus, we aim to infer Granger causality (G-causality) between networks' time series. In this case, the straightforward application of the well-established vector autoregressive model is not feasible. Consequently, we require a theoretical framework for modeling time-varying graphs. One possibility would be to consider a mathematical graph model with time-varying parameters (assumed to be random variables) that generates the network. Suppose we identify G-causality between the graph models' parameters. In that case, we could use it to define a G-causality between graphs. Here, we show that even if the model is unknown, the spectral radius is a reasonable estimate of some random graph model parameters. We illustrate our proposal's application to study the relationship between brain hemispheres of controls and children diagnosed with Autism Spectrum Disorder (ASD). We show that the G-causality intensity from the brain's right to the left hemisphere is different between ASD and controls.
Many challenging problems in biomedical research rely on understanding how variables are associated with each other and influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely acknowledged as a very natural and formal language to describe relationships among variables and have been extensively used for studying complex diseases and traits. In this work, we propose methods that leverage observational Gaussian family data for learning a decomposition of undirected and directed acyclic PGMs according to the influence of genetic and environmental factors. Many structure learning algorithms are strongly based on a conditional independence test. For independent measurements of normally distributed variables, conditional independence can be tested through standard tests for zero partial correlation. In family data, the assumption of independent measurements does not hold since related individuals are correlated due to mainly genetic factors. Based on univariate polygenic linear mixed models, we propose tests that account for the familial dependence structure and allow us to assess the significance of the partial correlation due to genetic (between-family) factors and due to other factors, denoted here as environmental (within-family) factors, separately. Then, we extend standard structure learning algorithms, including the IC/PC and the really fast causal inference (RFCI) algorithms, to Gaussian family data. The algorithms learn the most likely PGM and its decomposition into two components, one explained by genetic factors and the other by environmental factors. The proposed methods are evaluated by simulation studies and applied to the Genetic Analysis Workshop 13 simulated dataset, which captures significant features of the Framingham Heart Study.
Faced with the lack of reliability and reproducibility in omics studies, more careful and robust methods are needed to overcome the existing challenges in multi-omics analysis. In conventional omics data analysis, signal intensity values (denoted by M and A values) are estimated neglecting pixel-level uncertainties, which may reflect noise and systematic artifacts. For example, intensity values from two-color microarray data are estimated by taking the mean or median of the pixel intensities within the spot and then subjected to a within-slide normalization by LOWESS. Thus, focusing on estimation and normalization of gene expression profiles, we propose a spot quantification method that takes into account pixel-level variability. Also, to preserve relevant variation that may be removed in LOWESS normalization with poorly chosen parameters, we propose a parameter selection method that is parsimonious and considers intrinsic characteristics of microarray data, such as heteroskedasticity. The usefulness of the proposed methods is illustrated by an application to real intestinal metaplasia data. Compared with the conventional approaches, the analysis is more robust and conservative, identifying fewer but more reliable differentially expressed genes. Also, the variability preservation allowed the identification of new differentially expressed genes. Using the proposed approach, we have identified differentially expressed genes involved in pathways in cancer and confirmed some molecular markers already reported in the literature.
Causal inference may help us to understand the underlying mechanisms and the risk factors of diseases. In Genetics, it is crucial to understand how the connectivity among variables is influenced by genetic and environmental factors. Family data have proven to be useful in elucidating genetic and environmental influences; however, few existing approaches are capable of addressing structure learning of probabilistic graphical models (PGMs) and family data analysis jointly. We propose methodologies for learning, from observational Gaussian family data, the most likely PGM and its decomposition into genetic and environmental components. They were evaluated by a simulation study and applied to the Genetic Analysis Workshop 13 simulated data, which mimic the real Framingham Heart Study data, and to the metabolic syndrome phenotypes from the Baependi Heart Study. In neuroscience, one challenge consists in identifying interactions between functional brain networks (FBNs), which are represented as graphs. We propose a method to identify Granger causality among FBNs. We show the statistical power of the proposed method by simulations and its usefulness by two applications: the identification of Granger causality between the FBNs of two musicians playing a violin duo, and the identification of a differential connectivity from the right to the left brain hemispheres of autistic subjects.
Blood pressure (BP) is associated with carotid intima-media thickness (CIMT), but few studies have explored the association between BP variability and CIMT. We aimed to investigate this association in the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) baseline. We found a small but significant association between SBP variability and CIMT values. This was additive to the association between SBP central tendency and CIMT values, supporting a role for high short-term SBP variability in atherosclerosis.
A major challenge in biomedical research is to identify causal relationships among genotypes, phenotypes, and clinical outcomes from high-dimensional measurements. Causal networks have been widely used in systems genetics for modeling gene regulatory systems and for identifying causes and risk factors of diseases. In this chapter, we describe fundamental concepts and algorithms for constructing causal networks from observational data. In a biological context, causal inferences can be drawn from the natural experimental setting provided by Mendelian randomization, a term that refers to the random assignment of genotypes at meiosis. We show that genetic variants may serve as instrumental variables, improving the estimation accuracy of causal effects. In addition, identifiability issues that commonly arise when learning network structures may be overcome by using prior information on genotype–phenotype relations.
Any measurement, since it is made by a real instrument, has an uncertainty associated with it. In the present work, we address this issue of uncertainty in two-channel cDNA microarray experiments, a technology that has been widely used in recent years and is still an important tool for gene expression studies. Tens of thousands of gene representatives are printed onto a glass slide and hybridized simultaneously with mRNA from two different cell samples. Different fluorescent dyes are used for labeling the two samples. After hybridization, the glass slide is scanned, yielding two images. Image processing and analysis programs are used for spot segmentation and pixel statistics computation, for instance, the mean, median, and variance of pixel intensities for each spot. The same statistics are computed for the pixel intensities in the background region. Statistical estimators such as the variance give us an estimate of the accuracy of a measurement. Based on the intensity estimates for each spot, some data transformations are applied in order to eliminate systematic variability so we can obtain the effective gene expression. This paper shows how to analyze gene expression measurements with an estimated error. We present an estimate of this uncertainty and study, in terms of error propagation, the effects of some data transformations. An example of data transformation is the correction of the bias estimated by a robust local regression method, also known as lowess. With the propagated errors obtained, we also show how to use them for detecting differentially expressed genes between different conditions. Finally, we compare the results with those obtained by classical analysis methods, in which the measurement errors are disregarded. We conclude that modeling the measurement uncertainties can improve the analysis, since the results obtained on a real gene expression database were consistent with the literature.

Open-Source Libraries

This package provides methods for learning, from observational Gaussian family data (i.e., Gaussian data clustered in families), Gaussian undirected and directed acyclic PGMs describing linear relationships among multiple phenotypes and a decomposition of the learned PGM into unconfounded genetic and environmental PGMs. Methods are based on zero partial correlation tests derived in the work by Ribeiro and Soler (2020).
This package provides methods for estimating and normalizing the M (intensity log-ratio) and A (mean log intensity) values from two-channel (or two-color) microarrays. Unlike conventional estimation methods, which take into account only measures of location (e.g., mean and median) of the pixel intensities of each channel, the provided estimation method takes into account pixel-level variability, which may reflect uncertainties due to noise and systematic artifacts.

Participation in Conferences

Networks are everywhere, from social to biological sciences. Usually these networks are represented by graphs, i.e., mathematical objects composed of a set of vertices and a set of edges. However, a vast number of natural networks are dynamic and current methods typically ignore a third key component: time. This fact requires statistical approaches to analyze them appropriately.

In this context, we propose a methodology to identify Granger causality among graphs. By assuming that graphs are generated by models whose parameters are random variables, we define that a time series of graphs y_{i,t} does not Granger cause another time series of graphs y_{j,t} if the parameters of the model for y_{i,t} do not Granger cause the parameters of the model for y_{j,t}. The problem is that the models that generate the graphs are usually unknown and consequently the parameters cannot be estimated. However, for some random graph models, such as Erdős-Rényi, geometric, regular, Watts-Strogatz, and Barabási-Albert, it is known that the spectral radius (the largest eigenvalue of the adjacency matrix of the graph) is a function of the model parameters. For example, for the Erdős-Rényi random graph model, which is defined by the parameters n, the number of vertices, and p, the probability that two random vertices are connected, the spectral radius is known to be approximately np.

Based on this idea, we propose to identify Granger causality between time series of graphs by fitting a vector autoregressive (VAR) model to the time series of spectral radii. By an extensive simulation study, we show that the methodology has good accuracy, particularly for large graphs and long time series. In addition, we show that the spectral radius performed better than other centrality measures, such as degree, eigenvector, betweenness, and closeness centralities. Finally, we applied the methodology to identify Granger causality between brain networks.
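The relationship between the spectral radius and the Erdős-Rényi parameters mentioned above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names, graph size, and edge probabilities are illustrative choices and are not taken from the original work. It only illustrates that the spectral radius tracks np, and does not implement the VAR-based Granger test itself.

```python
import numpy as np


def erdos_renyi_adjacency(n, p, rng):
    """Sample the symmetric 0/1 adjacency matrix of an Erdos-Renyi graph G(n, p)."""
    # Draw independent edges strictly above the diagonal, then mirror them.
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


def spectral_radius(adj):
    """Largest absolute eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.abs(np.linalg.eigvalsh(adj))))


def spectral_radius_series(n, probs, seed=0):
    """Spectral radius of one sampled graph per edge probability in probs."""
    rng = np.random.default_rng(seed)
    return np.array(
        [spectral_radius(erdos_renyi_adjacency(n, p, rng)) for p in probs]
    )


# Hypothetical settings chosen only for illustration.
n = 300
probs = np.array([0.1, 0.2, 0.3, 0.4])
radii = spectral_radius_series(n, probs)
for p, r in zip(probs, radii):
    print(f"p={p:.1f}  n*p={n * p:.1f}  spectral radius={r:.2f}")
```

Because the spectral radius moves with p, a time series of spectral radii can stand in for the unobserved time-varying parameter when the generating model is unknown.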
To unravel the biological mechanism underlying complex traits and diseases, it is crucial to understand how the related phenotypes are associated with each other and how they are influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely used to describe relationships among variables (phenotypes) in a very intuitive and mathematically rigorous way. On the other hand, family-based studies are usually conducted to assess the influence of genetic and environmental factors on phenotypes. In this case, the polygenic model can be used to decompose the phenotypic variability into two variance components: one polygenic, for capturing the variability across families, and one environmental, for capturing the residual variability. Some algorithms for learning PGMs from observational data, known as structure learning algorithms, are strongly based on a conditional independence test. Considering the case where the observations are independent and normally distributed, the null hypothesis of conditional independence can be tested using classical tests for zero partial correlation and PGMs can be learned up to Markov equivalence. However, in family-based studies, measurements of related individuals are correlated and such dependence structure must be taken into account to obtain appropriate test statistics.
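For reference, the classical test for zero partial correlation under independent, normally distributed observations can be sketched as follows. This is a minimal illustration using Fisher's z-transformation and assuming NumPy is available; it is the standard independent-data test, not the family-adjusted test derived in this work, and the function names and simulated chain X -> Z -> Y are hypothetical.

```python
import math

import numpy as np


def partial_correlation(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond.

    Computed from the inverse of the correlation matrix of the selected columns.
    """
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)
    return float(-prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1]))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: partial correlation = 0.

    n is the number of independent observations and k the number of
    conditioning variables; the statistic is approximately standard normal.
    """
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))


# Simulated chain X -> Z -> Y: X and Y are dependent, but independent given Z.
rng = np.random.default_rng(0)
n_obs = 2000
x = rng.normal(size=n_obs)
z = x + rng.normal(size=n_obs)
y = z + rng.normal(size=n_obs)
data = np.column_stack([x, y, z])

r_marginal = partial_correlation(data, 0, 1, [])
r_given_z = partial_correlation(data, 0, 1, [2])
print("corr(X, Y)     =", round(r_marginal, 3),
      " p =", fisher_z_pvalue(r_marginal, n_obs, 0))
print("corr(X, Y | Z) =", round(r_given_z, 3),
      " p =", fisher_z_pvalue(r_given_z, n_obs, 1))
```

Constraint-based structure learning algorithms call a test of this kind repeatedly, which is why replacing it with a test that respects familial dependence changes the learned graph.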

Based on the Gaussian univariate polygenic model, we derived an estimator for the partial correlation coefficient that takes into account the family dependence structure and present a decomposition of the partial correlation coefficient according to the contribution of the genetic and environmental effects. Also, we derived zero partial correlation tests for these coefficients and extended the approach of Meinshausen and Bühlmann (2006), which learns undirected PGMs from vertex neighborhoods, and the IC (Pearl, 2000) / PC (Spirtes et al., 2000) algorithm, which learns directed PGMs, for learning genetic and environmental PGMs from observational family data. The performance of the proposed methodologies was assessed by using 100 replicates of simulated data, based on the Framingham Heart Study, provided by the Genetic Analysis Workshop (GAW) 13 in problem 2.
Beer is part of human history, dating back to the legacies left by the ancient Sumerians, Egyptians, Mesopotamians, and Iberians since at least 6000 BC. Even so, far from being a settled process, beer production constantly evolves and improves, to the point that it now drives a rapidly expanding craft industry which, owing to its many intrinsic sources of variability, fuels the curious and creative spirit of the alchemist and the sensory refinement of individuals, regardless of age, gender, social status, etc.

We saw in this universe a wide window for awakening enthusiasm for learning among third-year undergraduate Statistics students in the Design of Experiments course (MAE 0317), who immediately and vigorously embraced the proposal to brew beer as a cross-cutting illustrative vehicle for the concepts and tools covered in the course. We therefore formalized a joint project, to be planned and carried out during the first semester of 2017, in the classroom and in the field, involving the professor, the students, the teaching assistants, and specialists in beer production.

The idea is to statistically relate responses that measure beer quality, such as density, foam stability, and sensory experience (body, bitterness, sweetness, aroma, clarity, etc.), to sources of variability that can be controlled experimentally, such as boiling temperature, carbonation, and maturation. Considering the preliminary results obtained so far and the prospects ahead, we believe the project makes it possible to work on students' perception of the course content in a way that turns the learning of dense theoretical concepts into a pleasant, stimulating, and interactive experience.
A challenging task in biomedical research is to understand precisely the complex network of causal associations among phenotypes and outcomes. Experimental studies such as clinical trials are the most trustworthy method of causality assessment. However, it may be unfeasible to carry out randomized experiments to discover all possible causal relationships when the number of variables is large. In systems genetics, causal inference is supported by Mendelian randomization, which provides a natural randomization process where genotypes, rather than treatments, are randomly allocated to individuals. Furthermore, genetic variants robustly associated with phenotypes can be seen as instrumental variables, allowing inferences on the causal relation between phenotypes and outcomes.

In this work, we made a comparative study among four recent algorithms that use genetic variants as instrumental variables for learning the structure of a genotype-phenotype network, namely, (i) QTL-directed Dependency Graph (QDG), (ii) QTL-driven phenotype network (QTLnet), (iii) Sparsity-aware Maximum Likelihood (SML), and (iv) QTL+Phenotype Supervised Orientation (QPSO). These algorithms are similar in the sense that they use QTL information to determine the causal direction among phenotypes. However, they were designed under different assumptions and therefore some may be more suitable than others for a particular biological application. By simulation studies, we investigated advantages and limitations of these methodologies, under different configurations. Finally, we applied the algorithms to real data involving cardiovascular phenotypes of F2 rats and compared the inferred causal networks.
Massage therapies are associated with pathological improvements, and have also been extensively used for esthetic purposes. This study aimed to evaluate part of the molecular mechanisms involved in massage by investigating modulation of gene expression associated with cell adhesion and the ECM (extracellular matrix) induced by esthetic massage combined with a cosmetic emulsion. Thirteen female volunteers clinically characterized as having grade II or III cellulite were recruited and were subjected to skin biopsies in the gluteofemoral region before and after treatment. Each volunteer’s leg was considered an experimental unit to reduce individual variability. The study population was divided into: (1) legs treated with a cosmetic emulsion and (2) legs treated with a cosmetic emulsion and massage. Examination of 84 genes analyzed by qPCR revealed a predominance of up-regulation in individuals treated with the emulsion and massage in comparison to individuals treated only with the emulsion (fold change > 1.5, and p < 0.05). The main genes modulated were: ECM proteases (ADAMTS8, MMP1, MMP3, MMP9 and MMP11), transmembrane molecules (HAS1, ITGAL), adhesion molecules (COL8A1 and LAMA1) and cell-matrix adhesion molecules (ADAMTS13). In conclusion, the combination of cosmetic emulsion and massage is recommended to increase the effectiveness of a product and obtain the desired benefits in the treatment of skin disorders such as cellulite. The lack of scientific data on this technique can very often lead to skepticism among health professionals and even patients or consumers of cosmetic treatments. This study helps to elucidate some of the molecular phenomena associated with this therapy.
Most analyses of two-color microarray data are based on point estimation of the log-ratio of the two channel intensities. These estimates, commonly named M values, are conventionally obtained from some location measure of the pixel intensities of each channel, ignoring any imprecision. It is well known that the microarray technology is associated with many noise sources, and it has been shown that improved inferences can be obtained by including the inaccuracies involved and propagating them to downstream analysis. Using the multivariate delta method, we propose new estimators for the mean and the variance of the M values that take into account the probe-level inaccuracies in the analysis.
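To make the idea concrete, a first-order multivariate delta-method approximation for the variance of an M value can be sketched as below. This is only an illustration under stated assumptions: it uses the sample mean of each channel as the location measure and treats the pixels of a spot as paired observations; the function name and the toy pixel values are hypothetical, and the estimators proposed in the work may differ.

```python
import numpy as np


def m_value_with_variance(red, green):
    """M = log2(mean(red) / mean(green)) and its delta-method variance.

    red and green are paired pixel intensities of one spot (positive values).
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    n = red.size
    mu_r, mu_g = red.mean(), green.mean()

    # Covariance matrix of the two sample means (pixel covariance divided by n).
    cov_means = np.cov(red, green, ddof=1) / n

    # Gradient of log2(mu_r / mu_g) with respect to (mu_r, mu_g).
    grad = np.array([1.0 / mu_r, -1.0 / mu_g]) / np.log(2.0)

    m = float(np.log2(mu_r / mu_g))
    var_m = float(grad @ cov_means @ grad)
    return m, var_m


# Toy spot with ten pixels per channel (illustrative numbers only).
red_pixels = [820, 790, 860, 805, 840, 815, 830, 795, 850, 810]
green_pixels = [410, 400, 455, 390, 430, 405, 420, 385, 440, 395]
m, var_m = m_value_with_variance(red_pixels, green_pixels)
print(f"M = {m:.3f}, approximate standard error = {var_m ** 0.5:.4f}")
```

The resulting variance can then be carried into downstream steps instead of treating the M value as an exact point estimate.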

Invited Talks and Tutorials

Jul 2022 12th Lisbon Machine Learning Summer School (LxMLS - 2022)
Invited 3-hour Tutorial
Ribeiro, A. H., Bareinboim, E.. Causal Data Science
Jun 2022 Columbia DSI Scholars - Summer Research Bootcamp 2022
Data Science Institute, Columbia University
Invited Talk
Ribeiro, A. H. An Overview on Causal Data Science.
May 2022 Interinstitutional Graduate Program in Statistics (PIPGES)
Federal University of São Carlos and University of São Paulo
Invited Talk
Ribeiro, A. H. Causal Effect Identification in Partially Understood Domains.
Mar 2022 Voices of Data Science at UMass Amherst
Manning College of Information & Computer Sciences, University of Massachusetts Amherst
Invited Talk
Ribeiro, A. H.. On the Importance of Causal Inference in the Next Generation of Artificial Intelligence.
Mar 2022 Causal Inference Learning Group
Biostatistics Department, Mailman School of Public Health, Columbia University
Invited Talk
Ribeiro, A. H.. Effect Identification in Cluster Causal Diagrams.
Dec 2021 WHY-21 at NeurIPS 2021 - Causal Inference & Machine Learning: Why now?
Invited Talk
Ribeiro, A. H.. Effect Identification in Cluster Causal Diagrams.
Nov 2021 Laboratory of Epidemiology & Population Science (LEPS) at the National Institute on Aging (NIA)
Invited Talk
Ribeiro, A. H.. Causal Inference and the Data-Fusion Problem
Nov 2021 OECD workshop on AI and the productivity of science.
Invited Talk
Ribeiro, A. H., Bareinboim, E.. Developing causal AI: its importance and an overview.
Sep 2021 Graduate Seminars Series - Statistics
Statistics Department, University of Brasilia - UnB, Brasilia, Brazil
Invited Lecture
Ribeiro, A. H.. Causal Inference and Data-Fusion.
Jul 2021 11th Lisbon Machine Learning Summer School (LxMLS - 2021)
Invited 3-hour Tutorial
Ribeiro, A. H., Bareinboim, E.. Causal Data Science: An Introduction to Causal Inference and Data-Fusion.
Jun 2021 Perspectives in Statistics
Statistics Department, University of Sao Paulo (IME - USP), Sao Paulo, SP, Brazil
Invited Lecture
Ribeiro, A. H.. Causal Inference from Observational Studies
Dec 2020 Seventy-Sixth (76th) Annual Deming Conference on Applied Statistics.
Invited 3-hour Tutorial
Ribeiro, A. H., Adibuzzaman, M., Bareinboim, E.. Causal Inference in the Health Sciences.
Nov 2020 American Medical Informatics Association (AMIA 2020) Virtual Annual Symposium.
Contributed 3.5-hour Tutorial
Ribeiro, A. H., Adibuzzaman, M., Bareinboim, E.. Causal Inference in the Health Sciences.
Oct 2020 Graduate Seminars Series - Biostatistics and Biometrics
Sao Paulo State University - UNESP, Botucatu, SP, Brazil
Invited Lecture
Ribeiro, A. H.. Causal Inference from Observational Studies
May 2019 Graduate Seminars Series - Statistics
Federal University of Sao Carlos and University of Sao Paulo (UFSCar - USP), Sao Carlos, SP, Brazil
Invited Lecture
Ribeiro, A. H.. Learning Genetic and Environmental Probabilistic Graphical Models from Gaussian Family Data.
Jan 2017 Graduate Summer School - Sao Paulo State University - UNESP, Presidente Prudente, SP, Brazil
9-hour Short Course
Ribeiro, A. H., Soler, J.M.P.. Dimensionality Reduction and Structure Learning with Applications to Genomics
May 2016 61a Reunião Anual da Região Brasileira da Sociedade Internacional de Biometria (RBras), Salvador, BA, Brazil
4-hour Short Course
Ribeiro, A. H., Soler, J.M.P.. Dimensionality Reduction Applied to Genomics

Appearances in Popular Media

Oct 2021 “How I would like to continue my research... ”
Interview by Klaus Rathje on the DAAD Postdoctoral Networking Tour "AI in Medicine".
May 2021 Developing and Applying Causal Inference Methods in Public Health
Interview by Karina Alexanyan, Ph.D., for the Data Science Institute at Columbia University.


Assistant Professor

Feb 2018 - Jul 2018 Computer Engineering Department - Institute of Education and Research (Insper), Sao Paulo, SP, Brazil.
Software Design using Python

Teaching Assistant

Mar 2017 - Jul 2017 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Statistical Design of Experiments
Aug 2016 - Dec 2016 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Multivariate Data Analysis
Mar 2016 - Jul 2016 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Statistical Methods for Genetics and Genomics
Aug 2015 - Dec 2015 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Multivariate Data Analysis
Mar 2015 - Jul 2015 Architecture and Urbanism College - University of Sao Paulo (FAU-USP), Sao Paulo, SP, Brazil.
Mathematics, Architecture and Design
Aug 2014 - Dec 2014 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Statistical Techniques, Programming and Simulation
Mar 2014 - Jul 2014 Institute of Astronomy, Geophysics and Atmospheric Sciences - University of Sao Paulo (IAG-USP), Sao Paulo, SP, Brazil.
Numerical Calculus with Applications in Physics
Aug 2013 - Dec 2013 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Mathematical Modeling
Mar 2013 - Jul 2013 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Introduction to Computer Programming
Aug 2012 - Dec 2012 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Linear Programming
Mar 2012 - Jul 2012 Institute of Mathematics and Statistics - University of Sao Paulo (IME-USP), Sao Paulo, SP, Brazil.
Numerical Methods for Linear Algebra