Playing with data

July 17, 2020

Content warning: this blog post contains mentions of white supremacy, slavery, colonisation, and genocide.

One way patrons can explore GLAM collections is through the digital content that institutions make available as data. I have often heard (white, men) data scientists speak of this type of exploration in terms such as “playing with the data.” It is in this context that GLAM institutions must be particularly mindful that not all of the material we hold is appropriate for being converted into data to be played with. We must be aware of the responsibilities we have to treat the information in our custody with the respect it deserves, which may include, in some cases, not allowing this information to be used as a plaything.

GLAM institutions, especially archives and museums, hold material that relates to both individuals and communities that is personally and culturally sensitive. Much of this material was acquired through or documents ongoing practices of white supremacy and colonisation. If we as a sector are not careful, the decisions that are made if this material is datafied – i.e. digitised and structured in such a way as to enable it to be made accessible as data – may continue those practices. If we are to make appropriate decisions about whether and how to make this material available as data, those decisions must be guided by the expertise of the people and communities the material relates to. In a position statement titled “Computing in the Dark: Spreadsheets, Data Collection and DH’s Racist Inheritance”, P. Gabrielle Foreman and Labanya Mookerjee point out that, specifically in the US context as well as elsewhere, white people have historically treated Black people as commodities to be listed as data on manifests and in ledgers. They give examples of problems that can arise, such as gaps in data, or data that is seen as less reliable and authentic, when data curation practices unthinkingly centre whiteness and white institutional power. Meanwhile, in Aotearoa New Zealand, Karaitiana Taiuru has developed a guideline with frameworks for handling Māori data, recognising it as a taonga. This includes a proposed test that was developed by Hirini Moko Mead in Tikanga Māori: Living by Māori values, which recognises the mauri and tapu aspects of Māori data.

Taiuru states that:

A common argument against tikanga and customary rights are that they are no longer relevant in modern society (Archie, 1995). The same is often said of the Holy Bible and other religious literature.

The Hebrew Bible is still considered very relevant within my own culture. If I think about material relating to my own community that might be held by archives and museums, the concept that comes to mind is כבוד (cavod). This is generally translated as “honour” and is the same word that is used in the ten commandments to command us to honour our parents. It is an attitude we are expected to have towards the divine and the sacred, recognising those qualities in the people around us and in the objects we use to record sacred information and perform sacred acts. Community leaders have already started to ask questions about whether ethical principles of הלכה (halakhah, Jewish law) should be integrated into artificial intelligence and other automated systems, though to my knowledge they have not publicly addressed the question of ethical halakhic frameworks for handling information relating to Jewish individuals and communities as data. This may begin to change, however, as more cultural institutions that hold Jewish material, such as the United States Holocaust Memorial Museum, start to investigate computational methods of exploring their collections.

While I acknowledge that I am no community leader or halakhic expert, and so do not have the authority to speak on behalf of anyone but myself, my own gut feeling is that I would want to see GLAM institutions handling collections relating to Jewish communities with a great deal of cavod. Bearing in mind the context in scripture in which this term is used, to me this would mean asking questions about the items in the collection such as “If this was my parents’ information, or my parents’ community’s information, would I still think it appropriate to digitise it and make it available as part of a dataset?”

In Jewish religious tradition there are strict prohibitions against counting (i.e. datafying) people. Different Jewish communities will observe these prohibitions in slightly different ways and to greater or lesser extents, but the taboo against counting people will likely exist in almost all Jewish communities. The best approach would be to seek advice from a religious leader from the community the collection relates to on this matter.

Jewish communities also have strong historical reasons to want to prevent government institutions acquiring data about us. Jews have been religious and cultural minorities in the countries we’ve lived in throughout the global diaspora and have been marginalised and persecuted as a result. There is a long historical record, in our religious texts, of being wary of trusting dominant secular authorities with our information. The fact that Nazi authorities used data about the German and Eastern European Jewish populations to perpetrate genocide against us during the שואה (shoah, devastating storm, the Hebrew word for the Holocaust) looms large in our cultural memory, making many Jews reluctant to disclose their personal information to government authorities, for fear of how it could be used.

Different Jewish communities will also have different cultural taboos around the use of technology and the internet, which may affect their willingness for their information to be shared online. Communities that are more insular and make little use of technology may have less familiarity with the internet and be more fearful about the implications of sharing their information, whereas communities that are very digitally connected and do a lot of outreach work online may be a lot more willing to have their information shared.

I suspect these issues come up relatively infrequently with regard to Jewish cultural collections, because the Jewish community is a relatively privileged community in many ways, and tends to have the financial resources and skills to manage our own information. Jewish archives and museums tend to be under Jewish control and therefore Jews are in fact the ones making decisions about how our information gets used. This is in contrast to Indigenous communities, whose cultural artefacts and information were stolen by colonisers and are still held hostage by colonial institutions. Indeed, in the case of Israel, the Jewish state is the invading colonial power, from the perspective of the Palestinian people. It’s one thing to be in full control of your own information and to choose to make it available as data, knowing others might play with it. It’s another thing entirely for the people who stole your land and your cultural information to make it available as data for others to play with, or even to come to you to ask if they can do so. If you as a community don’t yet have full control and possession of your information, how is it truly possible to give full, free, informed consent for an institution to make that information available in a form that can be played with?

I’m not saying that play is always and forever an inappropriate approach or attitude to take towards cultural collections in the form of data. There are some datasets or projects that may specifically invite or require playfulness in engaging with the material. However, given that these are cultural treasures we are talking about, any play should be done with conscious care, delicacy, and humility. It’s like playing with your friend’s grandmother’s jewellery: it might be good for imaginary play, but it’s old and delicate and precious and it’s not yours, so please make sure you have permission before you play with it and be very careful.