Data is about relationships, not individuals

Miranda Marcus
6 min read · Jul 29, 2021

A post about pluralism in data and AI, and about considering non-Western ethical frameworks like Ubuntu.

In a recent Pew Research Center survey of 602 technology innovators, developers, business and policy leaders, researchers and activists, 68% indicated that they don’t think we’re heading towards AI designed for the public good within the next decade. The main reason the respondents gave was that developers and deployers of AI are focused on profit-seeking and social control. Which is a bit of a bummer.

Now, I’m not pretending that there’s an easy fix here, but something that is seriously important if we want to address this in the long term is finding ways to hop on the pluralism train. Pluralism is the recognition that a multiplicity of voices, rather than one dominant or technical one, results in a more complete picture of the issue at hand.

In this post I’d like to suggest that alongside, you know, actually giving a shit and taking more perspectives into account, we also need quite a major shift in our understanding of what data is and who it represents.

It’s serious (gif)

Pluralism in data

Pluralism is defined rather succinctly by The Data Feminism Principles:

“Principle #5: Embrace Pluralism. The most complete knowledge comes from synthesizing multiple perspectives, with priority given to local, Indigenous, and experiential ways of knowing… From a gender perspective, [this] means beginning with the perspectives of women and nonbinary people. On a project that involves international development data, [this] means beginning not with institutional goals but with Indigenous standpoints.”

The Design Justice Network has also transformed this thinking into one of its design principles, stating:

“We center the voices of those who are directly impacted by the outcomes of the design process.”

Philosophers like Donna Haraway and a whole load of other smart cool people (mainly feminists) have demonstrated over and over again that all knowledge is partial, and all knowledge is situated. Data isn’t neutral, for crying out loud! There is no ‘god’s eye view’ from which one can see everything objectively. Like a landscape seen from different viewpoints, the different ways a question can be asked and the choices made in categorisation capture the same scene from different perspectives. Data is constructed; it is situated in the context of where it is collected and the perspectives of those who collected it. Or, in other words, he (it’s mainly a he) who holds the pen decides who gets counted and what they count for.
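
To make the categorisation point concrete, here’s a minimal sketch (in Python, with entirely made-up survey records) of how the same people produce different data depending on the categories the collector chooses:

```python
from collections import Counter

# Five hypothetical survey respondents, described two ways by two
# collectors who chose different category schemes for the same scene.
respondents = [
    {"gender": "woman", "employment": "freelance"},
    {"gender": "non-binary", "employment": "part-time"},
    {"gender": "woman", "employment": "full-time"},
    {"gender": "man", "employment": "freelance"},
    {"gender": "non-binary", "employment": "carer"},
]

# Collector A's form only offers "man"/"woman", so everyone else is
# squashed into "other" -- a choice made by whoever holds the pen.
scheme_a = Counter(
    r["gender"] if r["gender"] in ("man", "woman") else "other"
    for r in respondents
)

# Collector B's form records whatever people actually said.
scheme_b = Counter(r["gender"] for r in respondents)

print(scheme_a)  # Counter({'woman': 2, 'other': 2, 'man': 1})
print(scheme_b)  # Counter({'woman': 2, 'non-binary': 2, 'man': 1})
```

Same people, two different pictures. Neither count is ‘wrong’, but only one of them lets non-binary people be counted at all.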

There is an inexcusable amount of evidence that AI systems perpetuate systemic injustice, an issue which often stems not only from non-representative training sets, but also from the fact that there isn’t much data about marginalised communities in the first place.
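
A toy sketch of why that under-representation matters (hypothetical numbers, not any real model): a system judged only on overall accuracy can look great on paper while failing every member of an underrepresented group.

```python
# Hypothetical training population: 95 records from a majority group,
# 5 from a minority group, with different correct outcomes per group.
data = [("majority", "approve")] * 95 + [("minority", "deny")] * 5

def predict(group: str) -> str:
    # A naive model that has only learned the dominant pattern.
    return "approve"

overall = sum(predict(g) == outcome for g, outcome in data) / len(data)

minority = [(g, o) for g, o in data if g == "minority"]
minority_acc = sum(predict(g) == o for g, o in minority) / len(minority)

print(overall)       # 0.95 -- looks great in aggregate
print(minority_acc)  # 0.0  -- wrong for every member of the minority group
```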

Which isn’t to say that data isn’t unbelievably useful. It’s just that it’s not a one-way ticket to truth. To make it even more useful you might want to cross-reference it with some data gathered in a different way, from a different perspective. Most scientists worth their salt know this, but it doesn’t seem to be a massive part of tech culture.

As the data activist organisation Data 4 Black Lives puts it:

“…new data systems have tremendous potential to empower communities of color. Tools like statistical modeling, data visualization, and crowd-sourcing, in the right hands, are powerful instruments for fighting bias, building progressive movements, and promoting civic engagement.

But history tells a different story, one in which data is too often wielded as an instrument of oppression, reinforcing inequality and perpetuating injustice. Redlining was a data-driven enterprise that resulted in the systematic exclusion of Black communities from key financial services. More recent trends like predictive policing, risk-based sentencing, and predatory lending are troubling variations on the same theme. Today, discrimination is a high-tech enterprise.”

There are a lot of brilliant organisations and toolkits that give practical ways for teams to engage more broadly with communities when collecting and using data, Data 4 Black Lives and the Data Feminism Principles being just two. But I think there’s another cultural shift that’s needed to enable us to fully embrace pluralism… we need to stop thinking that data is only about individuals!

It’s not just about you (gif)

Data is about relationships, not individuals

Our Data Bodies is another data activist organisation that does fantastic work in marginalised neighbourhoods in Charlotte, North Carolina; Detroit, Michigan; and Los Angeles, California. They look at digital data collection and human rights, work with local communities, community organisations, and social support networks, and show how different data systems impact re-entry, fair housing, public assistance, and community development. They define a ‘data body’ as “a manifestation of our relationships with our communities and institutions, including institutions of privilege, oppression, and domination.”

Perhaps, in order to embrace pluralism, we need to move away from the idea that data is about individuals in the first place: it’s always about multiple people, with multiple perspectives.

In 2020, Projects by IF released the Society Centered Design manifesto, calling for designers to prioritise societal needs over those of the individual when building technology. They make the point that “data protection frameworks like GDPR or CCPA express our rights only as individuals. This individualistic lens has shaped how we now design for digital rights. But data rarely represents a single person — it usually describes many people.” Take, for example, your phone bill, your social media data, or your weekly grocery bill (if you purchase food for more people than yourself). All of these datasets, while fundamentally associated with you, are also representations of other people. Even your pay cheque represents your relationship with your employer.
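
As an illustrative sketch (hypothetical records, not any real billing schema), here’s what that looks like in code: a single ‘individual’ phone bill is really a list of relationships between people.

```python
from dataclasses import dataclass

@dataclass
class CallRecord:
    # One line of "my" phone bill: it names me, but it equally
    # describes whoever was on the other end of the call.
    caller: str
    callee: str
    duration_minutes: int

my_bill = [
    CallRecord("me", "mum", 42),
    CallRecord("me", "landlord", 3),
    CallRecord("flatmate", "me", 11),
]

# Everyone this "individual" dataset actually describes:
people = {p for record in my_bill for p in (record.caller, record.callee)}
print(people)  # e.g. {'me', 'mum', 'landlord', 'flatmate'} (set order varies)
```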

This individualistic lens stems from the Western-centric approach we have towards ethics for data and AI. Whilst of course there is huge value in the body of work produced by the AI ethics community, of the 160 sets of guidelines in the AI Ethics Guidelines Global Inventory, the majority come from Europe and the USA. Most of these are based on what John Tasioulas, the Director of the Institute for Ethics in AI, University of Oxford (aka someone who knows a lot more about philosophy than I do), has described as a ‘crude, preference-based utilitarianism’ which conceives of people exclusively as individuals.

The focus on the individual is problematic because it doesn’t seem to handle the intersection between individual actions and impacts on groups: for example, the amalgamation of individual instances of racism, which are then amplified back across the network.
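
A toy model of that amplification (entirely hypothetical numbers, not any real platform): when each individual is only slightly biased, a system that reflects back the aggregate favourite turns that slight tilt into a uniform group-level outcome.

```python
# 1,000 simulated users, each only mildly tilted: 520 lean one way,
# 480 the other -- a 52/48 split at the individual level.
votes = ["A"] * 520 + ["B"] * 480

# A naive "show everyone the most popular option" system collapses
# that mild tilt into a single answer pushed back at the whole network.
recommended = max(set(votes), key=votes.count)

print(votes.count("A") / len(votes))  # 0.52 -- individuals are barely tilted
print(recommended)                    # 'A' -- 100% of recommendations go one way
```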

There are, however, many philosophical cultures that take a different approach to personhood, which could help us reframe our thinking. Sabelo Mhlambi from Harvard has published a discussion paper that advocates for a fundamental shift in how we conceive of personhood in AI ethics. He points to the fact that the digital colonialism and surveillance capitalism enabled by artificial intelligence will not preserve the human dignity of all, just that of those whose perspectives are foregrounded. In contrast, Mhlambi suggests that:

“The relational Sub-Saharan African philosophy of ubuntu reconciles the ethical limitations of rationality as personhood by linking one’s personhood to the personhood of others.”

In opposition to the deeply held Western idea of ‘I think therefore I am’, Ubuntu can be expressed in the phrases “I am because you are” and “a person is a person through other persons.” He uses this framework to show that the harms caused by artificial intelligence, with a particular focus on automated decision-making systems (ADMS), are in essence violations of Ubuntu’s relational personhood and relational model of the universe. It’s well worth a read.

So, if we want to move beyond our existing hierarchical and exclusionary approaches to building systems, perhaps we need to rethink our approach to data and understand it as a set of relationships. That may seem abstract and esoteric, but it could fundamentally open up more space for more perspectives.

Perspectives (gif)
