Data and Technology for Fact-checking (2019-2020)


Our society is struggling with an unprecedented volume of falsehoods, hyperbole and half-truths that harm democracy, health, the economy and national security. Fact-checking is a vital defense against this onslaught. Despite the rise of fact-checking efforts globally, fact-checkers find themselves increasingly overwhelmed, and their messages struggle to reach some segments of the public.

Building on a collaboration between Public Policy and Computer Science at Duke on computational journalism and fact-checking, this project seeks to leverage the power of data and computing to help make fact-checking and dissemination of fact-checks to the public more effective, scalable and sustainable.

Project Description

This Bass Connections project aims to build databases, systems and apps to achieve the following goals:

  1. Make fact-checkers more effective. By monitoring media and data sources and aggregating public interest, the project team hopes to identify important, check-worthy claims automatically and in real time. This feed will decrease fact-checkers’ response time and guard against any potential bias (or perception thereof) in selecting what to fact-check.
  2. Help media consumers identify misinformation and disinformation faster, and make them feel like stakeholders in fact-checking. The team aims to make it easier for people to search for claims, and better yet, get alerted automatically as soon as they are exposed to misinformation. Usage data and feedback will in turn help identify check-worthy claims and diversify the coverage of fact-checking.
  3. Gain experience and learn lessons on building a sustainable, collaborative and inclusive ecosystem for fact-checking in the long run. The team will design an open data and system infrastructure, smart algorithms and best practices that will continue after the project to facilitate sharing and reuse of fact-checking efforts in the future.
  4. Make students aware of the grave new challenges of misinformation faced by our society today, and train them to become next-generation journalists and computer scientists to tackle these challenges.

Anticipated Outputs

A service that provides fact-checkers with live feeds of check-worthy claims automatically mined from various sources as well as public interest; apps and websites to help fact-checks reach the public; a system with open API access for technologists and journalists to collaborate on fact-checking; a new Bass Connections course on computational fact-checking.

Student Opportunities

Ideally, the project team will include 9-12 undergraduates and 3 graduate students with backgrounds in Computer Science, Electrical & Computer Engineering, Statistics, Public Policy or Political Science.

The majority of the team should have technical skills in computing and data analysis, but students with experience in journalism, policy and design are also encouraged to apply. Leadership skills and the ability to work with others who have very different backgrounds and views would be a plus. The graduate students who serve as project group managers should have sufficient technical background in computing and data analysis.

Students will work in subgroups by tasks:

  • Data and infrastructure: data wrangling, speech-to-text, audio fingerprinting, information extraction, system scalability
  • Search and matching: matching fact-checks, drawing techniques from natural language processing and machine learning
  • Human-in-the-loop: innovative ideas for encouraging human participation and input, as well as apps and websites for fact-checkers and end-users

Each group will have 3-5 undergraduates and will be managed by a graduate student. The entire team will meet weekly; each group will meet at least once more per week.

Faculty leaders plan to offer a new Bass Connections course (likely in Spring 2020) on computational fact-checking, cross-listed between PJMS and COMPSCI. If this course can be offered in the spring, students are expected to be enrolled in it. The class meetings may subsume the weekly whole-team meetings, but will not replace the group meetings.

Undergraduates will work closely with faculty and graduate student mentors, and interact with collaborators at other universities and Google. They will acquire relevant computational and data analysis techniques, and apply them to building databases and software in a highly collaborative setting. Besides honing computing skills, they will interact with researchers with diverse backgrounds and learn to appreciate the wide gamut of knowledge, skills and efforts required of fact-checkers and journalists. Students will also practice writing and presentation, targeting academic research venues as well as the public.

Like the undergraduates, graduate students will have an opportunity to collaborate with researchers in very different disciplines and to apply their research results and skills in a setting with immediate societal benefits. They will also gain experience in managing team projects.

Student travel opportunities are to be determined.

There is an optional Summer 2019 component through the new Code+/CSURF summer program. Participating students will work full time over 10 weeks. There is no requirement that students participating in the Bass Connections project team take part in this summer program.


Summer 2019 – Spring 2020

  • Summer 2019 (Optional): Through a Code+/CSURF group, tackle the challenge of scaling up the system to bring live “pop-up” fact-checking to a large number of users simultaneously
  • Fall 2019: Expand data collection efforts to improve accuracy of claim-matching algorithms, deploy end-to-end system and make API publicly available; release apps and website to public; make enhancements to system
  • Spring 2020: Evaluate effectiveness of project’s services, apps and websites in the field; make additional adjustments as needed; offer new Bass Connections course (tentative); disseminate results in academic research venues and to broader public


Independent study credit available for fall and spring semesters; summer funding available

See earlier related team, Data and Technology for Fact-checking (2018-2019), and a Data+ summer project, Data and Technology for Fact-checking (2018).


Image: Video still from Automated Fact-Checking App, by Duke University


Faculty/Staff Team Members

  • William Adair, Sanford School of Public Policy-DeWitt Wallace Center for Media and Democracy*
  • Pankaj Agarwal, Arts & Sciences-Computer Science*
  • Jun Yang, Arts & Sciences-Computer Science*