TY - JOUR
T1 - Machine Learning, Synthetic Data, and the Politics of Difference
AU - Jacobsen, Benjamin
N1 - Benjamin N. Jacobsen is a Lecturer in Sociology at the University of York as well as a Visiting Fellow on Professor Louise Amoore’s ‘Algorithmic Societies’ project at Durham University. His current research explores the ethico-political implications of generative modelling and synthetic data on society and culture.
PY - 2025/1/16
Y1 - 2025/1/16
AB - What is the relationship between ideas of sameness and difference for machine learning and AI? Algorithms are often understood to participate in the continual displacement of the different and heterogeneous in society in favour of sameness, of that which is socio-politically similar and proximate. In contrast to this prevalent emphasis on sameness, however, this paper argues that there is a nascent heterophilic logic underpinning the intersection of synthetic data and machine learning, a move towards actively generating differences and heterogeneous data attributes to train, fine-tune, and optimize algorithms. Yet, these synthetic attribute data are nonetheless always machine compatible, devoid of their socio-cultural dynamics and tensions. As such, through a critical examination of three core dimensions of this emergent politics of difference of synthetic data – disentanglement, compositionality, and normativity – the paper argues that this has the potential to ultimately undercut a politics of intervention that seeks to foreground the systemic unfairness and violence of machine learning.
DO - 10.1177/02632764241304687
M3 - Article
SN - 0263-2764
JO - Theory, Culture and Society
JF - Theory, Culture and Society
ER -