What is ‘Datafication’?

Our research is focused on what is increasingly referred to as the ‘datafication’ of school. This term was popularised by Mayer-Schoenberger & Cukier’s (2013) early descriptions of ‘big data’, and has since been taken up by all manner of researchers to refer to the transformation of digital interactions into records that can be collected and commodified. In acting as shorthand for the increased prevalence of digital data in informing what takes place in society, this neologism encompasses a number of different issues that we need to think carefully about.


#1. The massive production of data tracking human activity and behaviour

One foundational aspect of datafication is the vast number of digital systems and digital devices that are now located in and around schools. These devices and systems are legion – from a school’s ‘Learning Management System’ through to the hundreds of smartphones and other personal devices being brought in each day by students, parents and staff. Every keystroke, click, swipe and tap results in data trails. This data is sometimes attributable to the individual who is doing the typing, clicking, swiping and tapping; at other times it can be used to place individuals into different categories, groups and profiles. Often these are data ‘traces’ that individuals are unaware of – such as the continuous GPS tracking of a phone, the monitoring of WiFi activity, or even the absence of action (for example, a learning management system will generate data on what a user is not doing as well as what they do).

Good examples of this latter type of data can be found in so-called ‘smart’ environments. Indeed, the notion of the ‘smart school’ is based around the placement of hundreds of different sensors, monitors and indicators – all generating data relating to how people are interacting with the physical environment. These data therefore derive from many seemingly innocuous aspects of school life – from the opening of doors to rising noise levels. This is set to continue with the growth of network-connected sensors (what is sometimes referred to as the ‘Internet of Things’).

All told, any school is now a place where massive amounts of data are being generated by the people within it and the things that they are doing, and then systematically collected. In this sense, datafication refers to the contemporary drive to turn vast amounts of activity and human behaviour into data points that can be tracked, collected and analysed.
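The earlier point that a learning management system generates data on what a user is not doing, as well as what they do, can be made concrete with a small sketch. This is only an illustration under our own assumptions: the student names, task names and the `untouched_tasks` helper are all hypothetical, and no real learning management system is being described.

```python
# Hypothetical example data: none of these names come from a real LMS.
expected_tasks = {"quiz_1", "reading_2", "forum_post_3"}

# Click-stream style log: one (student, task) pair each time a task is opened.
activity_log = [
    ("student_a", "quiz_1"),
    ("student_a", "reading_2"),
    ("student_b", "quiz_1"),
]


def untouched_tasks(log, students, tasks):
    """Return, for each student, the set of tasks with no logged activity."""
    seen = {student: set() for student in students}
    for student, task in log:
        if student in seen:
            seen[student].add(task)
    # The 'absence' record is the expected tasks minus the observed ones.
    return {student: tasks - done for student, done in seen.items()}


students = ["student_a", "student_b", "student_c"]
missing = untouched_tasks(activity_log, students, expected_tasks)

for student in students:
    print(student, sorted(missing[student]))
```

Note that the hypothetical `student_c` never appears in the log at all, yet the system can still produce a data record about them. Inaction is datafied just as readily as action.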


#2. Rapid advances in data processing

Of course, the idea of ‘datafication’ refers to more than the mere existence of this data. A second element of datafication is the rapid advances made over the past 10 years in how this data can be collated and analysed. As implied above, the digital devices and systems now found in the contemporary school result in the voluminous production of data – what is often referred to in terms of ‘massive’ data-sets and the idea of ‘Big Data’. Thus, digital devices and systems are now resulting in millions of data-points being tabulated and analysed through increasingly complex analytic technologies. Crucially, these data-sets are being continuously generated, and often combined with other data-sets to form new data-sets (what is referred to as the ‘recombinatorial’ quality of digital data).

These advances arise from a number of developments in computing – not least the massive advances in computational power, the vast storage capacity of cloud computing, and the reduced cost of powerful processing units. Alongside this are much-publicised advances in machine learning and other aspects of AI. At the same time, more mundane programming is now supported by the development of relatively simple programming languages and libraries (such as R and Python), which have changed programming into a more collaborative and co-constructed process. Now we have large software ecosystems and communities of developers – working together in online spaces such as GitHub and StackOverflow, and taking code from software libraries rather than building programs from scratch. Meanwhile, the realm of online information visualisation is booming – leading to a diversity of ways that data is presented and fed back to users. All told, these analytic techniques (and the outputs and artefacts they produce) are just as significant as the rise of smartphones and apps (which are what usually wow us!).


#3. The social, economic, cultural and political consequences of data

Of course, the analysis of data is always carried out for a purpose! Third, then, are the consequences of these technical and analytical advances when applied in a social context such as a school – what might be seen as the ‘-fication’ suffix of the term.

Here, the notion of datafication conveys a sense of the social, economic, cultural and political consequences of what might otherwise appear to be technical and analytic developments. In short, these data analyses now play a key role in how society is ordered. For example, social relations are increasingly imagined and enacted in terms of data. Many different types of decision are now made on the basis of this data, some of which have life-changing consequences. All manner of predictions are made using these data, as individuals are profiled, sorted and categorised on the basis of ‘their’ data trails and traces.

Key here, then, are the ways in which ‘who’ an individual is (more accurately, who an individual is understood to be by others) is not constituted through their actions, or knowingly self-defined. Instead, an individual’s identity is increasingly constructed for them through the analysis of ‘their’ data by a number of usually unseen third parties – such as platform companies, data brokers, businesses, government, public institutions and the state. In this sense it is now increasingly important what one’s data says about oneself – as Cheney-Lippold puts it, “We are data”. More specifically, rather than being perfect ‘data-doubles’ as is sometimes claimed, our data-identities can be highly partial, transient and temporary – aligning us with all manner of emergent categories that we are not aware of, but which have significant immediate and longer-term implications.

This aspect of datafication therefore raises questions over how individuals are able to negotiate their interactions and entanglements with data. As Hintz et al. (2019, p.2) put it, “The concept of datafication … reflects a shift which is not merely technological, but also social and political, and it therefore confronts us with questions of power, agency and control”.



There is a lot to think about here. Of course, it is important to recognise these shifts as not simply an inevitable consequence of the development of networked digital technologies. The technical ability to ‘datafy’ schools needs to be accompanied by the human desire, willingness and/or compliance to do so. In this sense, institutional imperatives and personal priorities are an equally important aspect of understanding this shift in the ways in which contemporary society is organised and arranged.

At the same time, it is worth thinking further about how some of our concerns over fairness and social justice might be worked into the first two (ostensibly technical) elements of datafication. For example, how might less discriminatory algorithms be developed by inserting our own code and sub-routines into software libraries and repositories? How might programmers and developers be engaged in ethical discussions in forums such as GitHub and StackOverflow?

In summary, then, the three elements of datafication just outlined provide a neat rubric for framing the key questions that we want to explore throughout our project. We shall use these distinctions throughout our subsequent work.