Innovations in Digital Research Methods. Group of authors
1 www.epsrc.ac.uk/about/progs/rii/escience/Pages/intro.aspx. (All URLs were accessed on 17 Dec 2014.) Terascale computing achieves speeds of teraflops, where a teraflop is a trillion floating point operations per second.
2 www.dames.org.uk
3 http://thedrs.sourceforge.net
4 As the recent controversy over the Facebook experiment conducted by researchers at Cornell and the University of California shows, it is essential to think very carefully about the ethical implications of conducting such studies. See www.theguardian.com/technology/2014/jul/02/facebook-apologizes-psychological-experiments-on-users
5 www.maptube.org
6 For example, the Center for Urban Science and Progress (CUSP). See cusp.nyu.edu
7 www.openstreetmap.org
8 See, for example, the ‘reading the riots’ project, Lewis et al. (2011).
9 This has given rise to the new specialism of ‘data journalism’. News media organizations have also been at the forefront of experiments in citizen journalism and crowdsourcing data analysis. For an example of the latter, see www.theguardian.com/news/datablog/2009/jun/18/mps-expenses-houseofcommons
10 www.dropbox.com
11 See, for example, the Extreme Science and Engineering Discovery Environment (XSEDE) www.xsede.org/web/guest/gateways-listing
12 https://study.sagepub.com/halfpennyprocter
2 The Changing Social Science Data Landscape
Kingsley Purdam
Mark Elliot
2.1 Introduction
2.1.1 The Age of Data
More than a century since the ground-breaking social surveys of Booth in London 1 and Rowntree in York in the UK, and the subsequent development of mass observation methods in the 1930s, we are now in an age of almost overwhelming volumes of data about many people’s attitudes, circumstances and behaviour. Such data extends from people’s views to images of them, their locations and movements, and their communications. The data is very diverse; it includes lifelong health and prescription records, genetic biomarker profiles and family histories, satellite images, digital passports and their use, databases from product warranty forms, consumption transactions, online browsing records, email and web communications, social media, and mobile phone use. As Berners-Lee and Shadbolt (2011:1) highlight, ‘data is the new raw material of the 21st Century’.
Social science and the societies that it studies have entered the age of data, though not necessarily the age of data access. Nevertheless, access is increasing; for example, access to administrative record data held by public bodies, including government departments, is being widened.2, 3 The term ‘big data’ has been much used to describe the data revolution. Whilst a little simplistic as a concept, it moves us forward from Sweeney’s (2001) discussion of the ‘information explosion’, insofar as it captures the growth in the collection and availability of information (for discussions see boyd and Crawford, 2012; Mayer-Schönberger and Cukier, 2013; O’Reilly Radar Team, 2011). Big data denotes volumes of data so large that they are kept in so-called data warehouses: digital data storage facilities that often cut across national borders and data regulation regimes. Three features open up new opportunities for research and methodological innovation: the volume of data (when information about all, or nearly all, of a particular population is potentially included, as opposed to a sample), the variety of the variables, and the speed with which the data can be discovered and accessed (Mayer-Schönberger and Cukier, 2013; IBM, 2013).
The term ‘big data’ is used differently by different authors, with some including orthodox or well-established forms of social science data, such as survey responses and focus group transcripts (Elliot et al., 2013).4 The new types of data can have very different origins and structures. Some might be collected primarily for research use, whilst other data might be produced as a by-product of another activity, for example, buying a product online or posting views on a blog. Some of this new data has existed in some form and quantity for some time, but its use in social science research has been limited, perhaps because of access and infrastructural constraints, methodological uncertainties and a lack of interest in, or opportunity for, social research use (Elliot et al., 2013).
Conceptually, the term ‘big data’ in many ways fails to capture the all-encompassing nature of the socio-technical transformation that is upon us. Many people who use the term qualify it by stating that big data is not just about volume but also about other features: that data can be captured, updated and analysed in (almost) real time and that it can be linked through multiple data capture points and processes. However, such characterizations are not sufficient; they still express the notion of data as something we have, whereas the reality and