SSNAP: Scientific Social Network Analysis Project (2016-2017)


Crane’s seminal work on Invisible Colleges (1972) made it clear that modern scientific production is a collective affair: scientists build on each other’s work and collaborate to form latent communities around methods, topics and ideas. This collective activity forms a network of scientists that captures the social substrate of scientific productivity. The collective nature of scientific production has grown over time, with larger scientific teams, greater interdisciplinary production and rapid expansion of electronic collaboration as well as the development of collective knowledge products such as Wikipedia.

Project Description

This project’s goal is to understand the social dynamics of scientific production by modeling and mapping the social and topical structure of scientific production across the social and natural sciences. We aim to accomplish this goal by building representative corpora for multiple broad topical areas (“fields”) and constructing the dynamic collaboration and bibliographic networks within each field.
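To illustrate what constructing a collaboration network from a corpus can look like, here is a minimal Python sketch. It is not the project's actual pipeline: the records, author names, field layout and the `through_year` parameter are all hypothetical, and it assumes author names have already been disambiguated.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy corpus: each record lists a paper's (already disambiguated) authors.
papers = [
    {"id": "p1", "year": 2015, "authors": ["A. Rivera", "B. Chen"]},
    {"id": "p2", "year": 2016, "authors": ["B. Chen", "C. Okafor", "A. Rivera"]},
    {"id": "p3", "year": 2016, "authors": ["C. Okafor", "D. Silva"]},
]

def build_coauthor_network(records, through_year=None):
    """Return {(author_a, author_b): number of co-written papers}.

    Pairs are stored in sorted order so each undirected tie appears once.
    If through_year is given, only papers published up to that year count,
    which yields one snapshot of the network as it changes over time.
    """
    weights = defaultdict(int)
    for rec in records:
        if through_year is not None and rec["year"] > through_year:
            continue
        for a, b in combinations(sorted(set(rec["authors"])), 2):
            weights[(a, b)] += 1
    return dict(weights)

def degree(edges):
    """Count each author's distinct collaborators."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)

network = build_coauthor_network(papers)
degrees = degree(network)
snapshot_2015 = build_coauthor_network(papers, through_year=2015)
```

Comparing `snapshot_2015` with `network` shows how ties accumulate between years; a bibliographic (citation) network could be built the same way from each paper's reference list instead of its author list.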

Team members will come to understand the collaborative and organic nature of scientific production and become better informed about how scientific progress and field debates unfold. They will learn methods for large-scale data collection and for managing large free-form data files (electronic publication and grant records), including techniques for text parsing and disambiguation. They will learn social network analysis, including how to construct networks, build measures on networks and statistically model network dynamics. As collaborators on papers or proposals coming out of the project, they will gain hands-on experience in research production and writing.

All teams will combine during the first semester for a set of lectures that will lay out basic theory and computation for network models, including statistical and sociological perspectives that shape this effort. Additionally, there will be focused training in programming tools needed for this research.

Anticipated Outcomes

Anticipated outcomes include manuscripts for publication, an NSF grant application and production of a collective data resource.

We will provide a shared data resource for affiliated faculty and students interested in using these data for general network modeling and research. We anticipate multiple papers on the scientific integration of fields and debates (topics will depend on student interest and the fields ultimately selected, but will likely include scientific integration in the social sciences, modeling innovation and recognition, race/gender integration in science communities and descriptive portraits aimed at sociology of science outlets).


Summer 2016 – Spring 2017

We anticipate being able to start data collection and curation over the summer of 2016, build the collaboration and citation networks within fields in the fall and continue with statistical modeling of these networks in the spring. Our milestones will include presenting descriptive portraits of the disciplinary fields at DNAC in the fall and submitting an NSF proposal to extend this work to a wider field with broader temporal coverage by the spring NSF deadline.


Independent study credit available during the fall and spring semesters; summer stipend available

This Team in the News

Webs of Minds and Ideas Bind Duke’s Campus

Faculty/Staff Team Members

Lawrence Appelbaum, School of Medicine - Psychiatry & Behavioral Sciences - Brain Stimulation & Neurophysiology*
Deborah Attix, School of Medicine - Neurology, Psychiatry & Behavioral Science; DGNN*
Christopher Bail, Trinity - Sociology*
David Banks, Trinity - Statistical Science*
Scott Huettel, Trinity - Psychology and Neuroscience*
Katharina Koelle, Trinity - Biology*
James Moody, Trinity - Sociology*
Seth Sanders, Trinity - Economics*
Laura Sheble, Duke Network Analysis Center (DNAC)
Angela Zoss, Duke Libraries

Graduate Team Members

Taylor Brown, Sociology
Marcus Mann, Sociology
Jonathan Morgan, Sociology

Undergraduate Team Members

Magdalena Dakeva, Linguistics (AB), Computer Science (BS2)
Evan Donahue
Anne Driscoll, Economics (BS)
Mike Gao, Mathematics (BS), Economics (BS2)
Benjamin Clay McMullen
Muhammad Mubin, Computer Science (BS), Economics (BS2)
Madhavi Rajiv, Electrical & Computer Engineering, Philosophy (AB2)
Devesh Sharma, Computer Science (BS), Statistical Science (BS2)
Yueqi (Angie) Shen, Statistical Science (AB), Literature (AB2)
Thamina Stoll, Political Science (AB)
Rafael Ventura, Philosophy (AB)
Arthur Kwan Hung Wu, Statistical Science (BS), Computer Science (AB2)
Steven Yang, Computer Science (AB), Statistical Science (AB2)

Community Team Members

Stuart Borrett, UNCW - Ecology

* denotes team leader