Roni Kobrosly: Introduction to Causal Inference | PDNYC 2022


This tutorial session is intended to give attendees a gentle introduction to applying causal thinking and causal inference to data using Python. Causal data analysis is very common in many academic domains (e.g. social psychology, epidemiology, macroeconomics, public policy research, sociology, and more) as well as in industry (all of the largest Silicon Valley tech companies employ teams of scientists who answer business questions purely with causal inference methods).

The tutorial will involve a combination of a presentation with open Q&A and group exercises contained in Jupyter notebooks. This session will cover the difference between correlation and causation, the pitfalls of conducting an analysis using observational data, how causal inference can help get around these pitfalls, and two examples of common, modern modeling approaches used to conduct causal inference (g-computation and estimating causal curves). After the tutorial, the attendees should have a good foundational understanding of causality and the ability to confidently explore the topic on their own. Causal inference can be a very theory-heavy topic, making it impenetrable to novices. In this tutorial, I'll aim to take a more practical perspective on causal inference, while still occasionally touching on the theory.

Tutorial Outline:

Introduction (15 min):
-"By the end of the tutorial you should be able to..."
-Motivating problem: vitamin D and COVID severity
-How causal inference questions differ from standard machine learning questions
-Experiments vs causal inference

Causal graphs and the four types of relationships to know (30 min):
-What is a "confounder"
-What is a "collider"
-What is a "mediator"
-What are "unrelated predictors"
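
To make the first two terms concrete, here is a minimal simulation sketch. It is not taken from the tutorial notebooks; the variable names, effect sizes, and the selection threshold are all illustrative assumptions. It shows a confounder creating a correlation between two variables that do not affect each other, and a collider creating one when we select on it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder: z causes both x and y; x has NO effect on y (assumed setup).
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)
corr_confounded = np.corrcoef(x, y)[0, 1]  # clearly non-zero despite no causal link

# Adjusting for the confounder: regress y on x and z, then read the x coefficient.
design = np.column_stack([np.ones(n), x, z])
coef_adjusted = np.linalg.lstsq(design, y, rcond=None)[0][1]  # close to zero

# Collider: a and b are independent, and both cause c.
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + b + rng.normal(size=n)
corr_marginal = np.corrcoef(a, b)[0, 1]  # close to zero

# Selecting on the collider (keeping only rows with large c) induces a correlation.
selected = c > 1.0
corr_selected = np.corrcoef(a[selected], b[selected])[0, 1]  # negative

print(f"confounded corr(x, y):        {corr_confounded:.2f}")
print(f"x coefficient adjusted for z: {coef_adjusted:.2f}")
print(f"corr(a, b), everyone:         {corr_marginal:.2f}")
print(f"corr(a, b), selected on c:    {corr_selected:.2f}")
```

The practical reading: adjust for confounders, but do not adjust for (or select on) colliders.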

Hands-on exercise 1: G-computation (20 min)
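
For readers who have not met g-computation before, the following is a rough sketch of the idea on simulated data, assuming a binary treatment, a single measured confounder, and a linear outcome model. It is not the exercise notebook itself; the data-generating process and the helper `design` are hypothetical. The recipe is: fit an outcome model, predict everyone's outcome under "treated" and under "untreated", and average the difference.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Simulated data (assumed): the confounder drives both treatment and outcome.
confounder = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-1.5 * confounder))
treatment = rng.binomial(1, p_treat)
true_effect = 2.0
outcome = true_effect * treatment + 3.0 * confounder + rng.normal(size=n)

# Naive comparison of group means is biased by the confounder.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()


def design(t, z):
    """Hypothetical helper: intercept, treatment, confounder, and interaction."""
    return np.column_stack([np.ones_like(z), t, z, t * z])


# Step 1: fit an outcome model on the observed data.
beta, *_ = np.linalg.lstsq(design(treatment, confounder), outcome, rcond=None)

# Step 2: predict each person's outcome with treatment set to 1, then to 0.
y_if_treated = design(np.ones(n), confounder) @ beta
y_if_untreated = design(np.zeros(n), confounder) @ beta

# Step 3: average the difference to estimate the average treatment effect.
ate = (y_if_treated - y_if_untreated).mean()

print(f"naive difference: {naive:.2f}")
print(f"g-computation:    {ate:.2f} (simulated truth: {true_effect})")
```

The estimate is only as good as the outcome model and the assumption that all confounders were measured, which is why the simulated truth is printed alongside it.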

Hands-on exercise 2: Causal curves (20 min)
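
A causal curve extends the same question to a continuous treatment: what would the average outcome be if everyone received dose d, for a range of d? The sketch below is one simple way to approximate such a curve, by repeating g-computation over a grid of doses with plain NumPy. It is not the causal-curve package's API or the method used in the exercise; the simulated data, the quadratic outcome model, and the helper `features` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Simulated data (assumed): the confounder affects both the dose and the outcome.
confounder = rng.normal(size=n)
dose = 0.8 * confounder + rng.normal(size=n)  # continuous treatment
outcome = dose - 0.5 * dose**2 + 2.0 * confounder + rng.normal(size=n)


def features(d, z):
    """Hypothetical helper: intercept, dose, dose squared, and confounder."""
    return np.column_stack([np.ones_like(z), d, d**2, z])


# Fit an outcome model that is flexible in the dose and adjusts for the confounder.
beta, *_ = np.linalg.lstsq(features(dose, confounder), outcome, rcond=None)

# For each dose on the grid, set everyone's dose to that value and average
# the predicted outcomes over the observed confounder distribution.
grid = np.linspace(-2.0, 2.0, 9)
curve = np.array(
    [(features(np.full(n, d), confounder) @ beta).mean() for d in grid]
)

# In this simulation the true curve is known, because the confounder has mean zero.
true_curve = grid - 0.5 * grid**2

for d, est, truth in zip(grid, curve, true_curve):
    print(f"dose {d:+.1f}: estimated {est:+.2f}, simulated truth {truth:+.2f}")
```

Estimates near the edges of the observed dose range lean most heavily on the model, so a real analysis would restrict the grid to doses that are well represented in the data.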

Closing thoughts (5 min):
-Tips for troubleshooting your own analyses
-Avoid multiple testing!
-Be humble. It is likely your research or business idea doesn’t work 🤷🏻

Bio:
Roni Kobrosly
I am a former epidemiology researcher who has spent approximately a decade employing causal modeling and inference. The bulk of my academic career was spent conducting data analyses to estimate the population-level effects of harmful environmental exposures, when traditional randomized experiments were infeasible or unethical.

Since leaving the academic world, I've been loving my second life in the tech industry as a data scientist, ML engineer, and more recently as the Head of Data Science at a medium-sized health tech company based in Washington, DC. I love mentoring junior data folks and explaining the magic of data analysis and modeling to non-technical audiences.

I am also a member of the open-source community, being the author and maintainer of the causal-curve Python package. This package provides a set of tools for estimating the causal impact of continuous/non-binary treatments (e.g. estimating the causal impact of a neighborhood's income inequality on local crime, or understanding the causal effect of increasing a product's price on conversion rates).

===

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...
