Big Data for Reproductive Health (2021-2022)


Meeting the reproductive health needs of individuals and families – including access to family planning, prevention of cervical cancer and sexually transmitted infections, and safe motherhood – is a global health priority. Addressing the complex barriers to these needs will require a nuanced understanding of the target population. Data science methods may be an effective way to build understanding and identify solutions. While data science is increasingly used in global health, its application to reproductive health remains nascent.

The Big Data for Reproductive Health project began in 2018-2019, when the team developed a user-friendly app to display retrospective data from the Demographic and Health Surveys and engaged stakeholders in its design and implementation. In 2019-2020, the team partnered with stakeholders to build a precision medicine tool based on these data to support women in their contraceptive choices in low- and middle-income countries. The 2020-2021 team examined contemporary trends in contraceptive use as revealed by social media data.

Project Description 

In 2021-2022, this project team will apply data science to reproductive health issues through a variety of research projects in partnership with IntraHealth International’s digital health team and the Center for Global Reproductive Health’s Kenya-based research team. Research projects include:

  • Conflict and Contraception: This project investigates the impact of armed conflict on women’s contraceptive use, including onset, method switching and discontinuation. Using big data sources, we explore trends in women’s contraceptive use before, during and after the conflict period.
  • Natural Language Processing (NLP): Two projects apply natural language processing to qualitative research already conducted by the Center for Global Reproductive Health on 1) stigma associated with HPV and cervical cancer in Kenya; and 2) the impact of the U.S. global gag rule on nongovernmental organizations in Kenya. We will use quantitative methods such as topic modeling to determine whether NLP can complement qualitative coding in reproductive health research. By conducting text mining and sentiment analysis and then comparing those results with the qualitative coding, we may be able to extract more information from these interviews and better understand how accurate topic modeling is in reproductive health research.
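
As an illustration of how NLP output might be compared with qualitative coding, the sketch below computes Cohen's kappa, a chance-corrected agreement score, between human-assigned codes and model-assigned topics. The project description does not say which agreement measure the team uses; the choice of kappa, the function name, and the labels are all hypothetical and made up for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Share of items on which the two sources assign the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each source's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both sources used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six interview excerpts (not real project data).
human_codes = ["stigma", "access", "stigma", "funding", "access", "stigma"]
model_topics = ["stigma", "access", "access", "funding", "access", "stigma"]

kappa = cohens_kappa(human_codes, model_topics)
print(f"kappa = {kappa:.2f}")
```

A value near 1 would suggest the model's topics track the human codes closely, while a value near 0 would suggest agreement no better than chance. In practice, topic model output would first need to be mapped onto the qualitative codebook, a step this sketch assumes has already been done.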

In addition to these research projects, students will have opportunities to work with subject matter and methods experts to deepen their skills and identify new projects based on their own research interests.

Collectively, team members will explore data science methods, discuss new research ideas and consider how they can apply big data techniques and sources to existing research projects. They will also meet in small groups to work on tutorials, run code, scrape data and collaborate to develop and carry out their projects. The team will review previous Big Data for Reproductive Health projects, undergo advocacy training and learn how to use Demographic and Health Survey calendar data and the data visualization tool for research.

Throughout the year, the team will host data science incubator events to foster a community of practice and bridge the big data and reproductive health worlds. Team members will collaborate with invited speakers on reproductive health and will host a year-end workshop on using data science for reproductive health research, advocacy and policymaking. 

Learn more about this project team by viewing the team's video.

Anticipated Outputs

Novel models for evaluation of reproductive health challenges using data science techniques; two manuscripts for academic journals; two presentations at academic conferences


Fall 2021 – Spring 2022  

  • Fall 2021: Determine research workstreams by student interest, abilities and available opportunities; conduct literature review; host invited reproductive health and data science speakers; gain data science skills; apply data science methods to reproductive health research projects; present at International Conference on Family Planning
  • Spring 2022: Finalize analyses and submit abstracts; write manuscripts; host invited reproductive health speakers for feedback sessions; plan and cosponsor one-day workshop on using data science for reproductive health research, advocacy and policymaking; present at Women Deliver Conference

See earlier related team, Big Data for Reproductive Health (2020-2021).


Image: Reproductive Health in Burkina Faso, by Nairobi Summit on ICPD25, licensed under CC BY-NC-ND 2.0


Team Leaders

  • Amy Finnegan, IntraHealth International
  • Megan Huchko, School of Medicine-Obstetrics and Gynecology
  • Kelly Hunter, Sanford School of Public Policy–Ph.D. Student

Undergraduate Team Members

  • Sunrita Gupta, Economics (BS)
  • Foxx Hart
  • Alexandra Lawrence, Statistical Science (BS)
  • Payton Little, Public Policy Studies (AB), Global Health (AB2)
  • Lauren Mitchell, Neuroscience (BS), Global Health (AB2)
  • Neha Shrishail, Neuroscience (BS)
  • Saisahana Subburaj, Program II (AB)
  • Linda Tang, Biology (BS), Statistical Science (BS2)
  • Shari Tian, Statistical Science (BS)
  • Bhamini Vellanki, Public Policy Studies (AB)
  • Aarushi Venkatakrishnan, Biology (BS), Computer Science (BS2)
  • Lynne Wang, Computer Science (BS)

Community Team Members

  • IntraHealth International