Skip to content
View skhan890's full-sized avatar

Block or report skhan890

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
skhan890/README.md

Sara Khan — Data Analyst Portfolio

Public Health · Clinical Analytics · Federal Contracting · R · SQL · Python

LinkedIn   GitHub   Portfolio


👋 Welcome

Data analyst with 9+ years of experience in public health surveillance, federal contracting, and nonprofit analytics — building CDC data pipelines, teaching R at Emory and the CDC, and finding the story hidden in messy data.

  • 🏥 CDC Global Immunization Division — Sr. Data Analyst (2016–2019)
  • 🎓 Emory Rollins School of Public Health — Adjunct Instructor, Biostatistics
  • 📊 CDC R Users Group — President & Trainer
  • 🌐 Dharma Relief — Founding Operations Lead; $649K raised, 1.2M masks delivered to 173 hospitals

🛠️ Tech Stack

Languages & Analysis

R Python SQL SAS Stata

Data Engineering & Infrastructure

ETL Pipelines  ·  AWS Docker GitHub REDCap

Visualization & Reporting

R Shiny ggplot2 Tableau Power BI ArcGIS

AI & Machine Learning

scikit-learn NLP Pipelines  ·  Anomaly Detection


📊 Project Portfolio

🦠 Global Polio Surveillance Pipeline

Production ETL pipeline for WHO/IMB across 79 countries. Invented a Spatial Binning method to fix a root data quality problem — one-size-fits-all metrics were hiding virus circulation in small provinces. Added 4 automated surveillance flags. Maintained on GitLab for 2 years. Published Vaccine X 2020.

Tools: R · SQL · WHO POLIS · WorldPop · GitLab  |  Code · Paper

Polio Surveillance Map


🏀 NBA × WNBA PER Dashboard

Interactive Shiny app comparing Player Efficiency Ratings across leagues. No public WNBA dataset existed — hand-scraped player data from Wikipedia + scripted Basketball-Reference.com. Presented at R-Ladies Atlanta × Atlanta Hawks.

Tools: R Shiny · ggplot2 · tidyverse · Web Scraping  |  Code · Demo

Hawks Dashboard


💛 Donor Revenue Analysis — Chan Center

Integrated 3 disconnected data sources (Donorbox, Luma, Excel) in R. Deduplicated donor identities, tracked payment history and tenure across records. Finding: only ~100 active donors. Directly drove new registration policy + membership survey.

Tools: R · Data Integration · Excel


💼 Experience

Role Organization Period
Analytics & Operations Lead Tallahassee Chan Center Feb 2021 – Mar 2026
Associate Consultant (Data Analyst) Gorman Consulting / Gates Foundation Oct 2019 – Sep 2021
Research Scientist University of Washington May – Sep 2019
Adjunct Faculty, Biostatistics Emory University, Rollins SPH Jan 2018 – Jan 2022
Senior Data Analyst (Federal Contractor) CDC, SciMetrika Jun 2017 – May 2019
Data Analyst (Federal Contractor) CDC, Carter Consulting Aug 2016 – Jun 2017
Statistical Programmer Alimera Sciences Jul – Jan 2017
Independent Data Analyst ThoughtBridge · BNL Consulting · FAMU 2019 – 2021

Highlights:

  • Chaired CDC R Users Group (1,000+ members across HHS); coordinated first agency-wide R training with 130 enrollees spanning all CDC campuses
  • Co-developed the Zika Pregnancy Registry database management system for national multi-site birth defect surveillance
  • Modernized a legacy SAS pipeline for Zika surveillance into R + SQL Server; built longitudinal newborn growth analysis in R Shiny
  • Conducted large-scale quantitative simulation modeling on ~800,000-entity network data for NIH R21 epidemiological modeling project (UW)
  • Sole instructor for BIOS 544: Introduction to R Programming at Emory; lab instructor for biostatistics courses serving ~650 incoming MPH/MSPH students annually

📚 Publications & Awards

Publications

  • VanderEnde KV, Voorman A, Khan S, Anand A, Snider C, Goel A, Wassilak S. New analytic approaches for analyzing and presenting polio surveillance data. Vaccine X. 2020. [PMC7090369]
  • Lind JN, Interrante JD, Ailes EC, Gilboa SM, Khan S et al. Maternal Use of Opioids During Pregnancy and Congenital Malformations: A Systematic Review. Pediatrics. 139(6), 2017.
  • Zaman K, Sack DA, Yunus M, Khan S et al. Immunogenicity of inactivated and live attenuated poliovirus vaccines in a birth cohort in Bangladesh. Vaccine. 2021.

Awards & Certifications

  • 🏅 2017 CDC Award for Science and Program Excellence in Emergency Response — Domestic Zika Pregnancy and Birth Defects Surveillance
  • 📋 Excellence in Volunteer Management — Florida Association for Volunteer Resource Management (FAVRM), 2025

🎓 Education

Degree Institution Year
MSPH, Public Health Informatics Emory University, Rollins School of Public Health 2016
BS, Public Health (cum laude) Rutgers University, Edward J. Bloustein School 2014

Capstone: Development of a Time Monitoring Application for Birth Defect Registries


🎤 Speaking

Talk Venue Year
Practicing Chan While Working Tallahassee Chan Center 2023
Sports Analytics & R Shiny R-Ladies Atlanta × Atlanta Hawks 2019
NBA/WNBA PER Methodology Emory Sports Analytics Team 2019

🏆 Volunteer & Leadership

  • Dharma Relief — Founding Operations Lead (2020–2026): Led COVID-19 PPE supply chain — raised $649K in 7 weeks, delivered 1.2 million masks to 173 hospitals across North America; secured transportation grant
  • IRC Refugee Resettlement Volunteer (2022–Present): Housing setup, tutoring, transportation, and community integration for refugee families

💡 Interests

Technology × mindfulness  ·  AI side projects  ·  video games  ·  meditation  ·  yoga  ·  hiking


GitHub Stats

Pinned Loading

  1. portfolio portfolio Public

    A portfolio of various data science/informatics ventures in R and beyond..

    HTML

  2. RLadies_Hawks RLadies_Hawks Public

    https://skhan22.shinyapps.io/RLadies_Hawks/

    R

  3. shamp-pass shamp-pass Public

    https://skhan890.github.io/shamp-pass/, forked from https://github.com/Athanasiamo/bookdown_password

    CSS

  4. surv_manuscript surv_manuscript Public

    R 1