Big Talk about Big Data: Discourses of ‘Evidence’ and Data in British Civil Society 2013


This seed-funded pilot project looked to identify how a small group of civil society organisations (CSOs) use the terms ‘data’ and ‘evidence’ in their public materials in order to critically examine the values that inform how they use social research. Specifically, it aimed to document whether there were any perceived advantages of data ‘bigness’ (volume, variety, and velocity) for these organisations’ work.

The research had two objectives: (1) to examine the discourses around ‘evidence’ and ‘data’ in CSOs, particularly those working in issue areas related to migration or social welfare; and (2) to relate these discourses to perceptions about what social research accomplishes in civil society or voluntary sector contexts. By linking textual analysis with semi-structured interviewing, this project aimed to reveal the different ways that these concepts are discursively employed and perceived.

Principal Investigator

Will Allen


Engineering and Physical Sciences Research Council
Communities and Culture Network+


Civil Society




The research for this project drew upon two textual datasets (plural ‘corpora’, singular ‘corpus’) and a set of 11 qualitative semi-structured interviews. Interviews provided valuable windows into the ways that key members of CSOs perceive data and evidence in the course of completing their research, policy, and advocacy work. Snowball sampling was used to identify relevant UK CSOs that were broadly operating within migration or social welfare topics, or actively facilitating public discussion about these topics, as in the case of The Conversation UK. Then, key staff members of those organisations whose job titles indicated involvement in research, policy, senior management, or communications activities were contacted for interview. Interviews were transcribed, then analysed using NVivo software.

Other important sources of CSO discourse about data and evidence include their published materials such as research reports, briefings, and press releases. These kinds of outputs are also valuable for this study because they represent a large part of the outward-facing profile of an organisation, particularly if other members of the public are accessing these resources. Study of these documents using a computer-assisted corpus linguistics approach can reveal patterns of language around mentions of ‘data’ and ‘evidence’ that may not be apparent from a surface, selective reading of a limited number of documents. This study collected, as far as possible, all of the main documents published online by the eight organisations that participated in the study from 1 January 2007 to 15 August 2014. These were manually downloaded following a specification that captured all main publication types available on the respective CSOs’ websites, excluding blogs. The resulting corpus of 2,704 items totalling 9,589,892 words was then analysed using the Sketch Engine, a web-based piece of lexicography software that was designed for handling large corpora.
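To illustrate the kind of pattern a corpus approach looks for, the following is a minimal Python sketch of collocation counting: tallying the words that appear within a few tokens of a search term such as ‘evidence’. It is not the Sketch Engine's method or interface, and the function name `collocates`, the `window` parameter, the simple tokenisation rule, and the sample sentences are all assumptions made for illustration rather than material from the study.

```python
import re
from collections import Counter


def collocates(text, node, window=4):
    """Count words occurring within `window` tokens either side of `node`.

    Hypothetical helper for illustration only: tokenisation is a crude
    lowercase split on letters and apostrophes, with no lemmatisation.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            start = max(0, i - window)
            # Words before and after the node word, excluding the node itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Invented sample text, not drawn from the project's corpus.
sample = (
    "The evidence base is robust. Robust evidence supports the policy. "
    "Big data offers new evidence, but the data are incomplete."
)

if __name__ == "__main__":
    for word, freq in collocates(sample, "evidence", window=2).most_common(5):
        print(word, freq)
```

Applied across millions of words, counts of this kind (usually weighted by a statistical association measure rather than raw frequency) are what allow recurring descriptions of ‘data’ and ‘evidence’ to surface that a selective reading might miss.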


For the findings, please see the full report in the outputs.