Our interest in algorithms is primarily shaped by a concern about what happens when the rigid, quantitative logic of computation tangles with the fuzzy, qualitative logics of human life. – Nick Seaver in Knowing Algorithms, 2014
Just after I wrote my previous post, I discovered this fascinating 2014 article by Rob Kitchin, ‘Thinking critically about and researching algorithms’. The first 15 pages or so are exactly what I was looking for, hence this follow-up post. You’d really be much better off just reading Kitchin’s article but, in case you need some convincing, I’ll give you a bit of a primer.
Can you just go through that again for me…what’s an algorithm?
Citing John MacCormick’s 2013 book, Nine Algorithms That Changed the Future, Kitchin writes that algorithms now shape a startling array of “everyday practices and tasks, including those that perform search, secure encrypted exchange, recommendation, pattern recognition, data compression, auto-correction, routing, predicting, profiling, simulation and optimisation”. In response, “the field of software studies has emerged over the past decade taking software as its object of critical analysis, considering how it is produced and deployed, and how it does work in the world”. However, such analysis, according to Kitchin, has tended to look more closely at the ‘compiled code’ than at the ‘raw’ algorithms from which it is composed.
To explain how algorithms are produced and deployed, Kitchin describes how this would occur in the case of calculating “the number of ghost estates [unfinished or unoccupied housing estates] in Ireland using a database of all the properties in the country that details their occupancy and construction”.
There is no readily defined algorithm for such a calculation, so one needs to be created. First, we need to define what a ghost estate is in terms of (a) how many houses grouped together constitute an estate (e.g., 5, 10, 20, etc.) and (b) what proportion of these houses have to be empty or under construction for that estate to be labelled a ghost estate (e.g., 10%, 20%, 50%, etc.). We can then combine these rules into a simple formula: “a ghost estate is an estate of 10 or more houses where over 50% of houses are vacant or under-construction”. Next we can write a program that searches and sifts the property database to find estates that meet our criteria and totals up the overall number. We could extend the algorithm to also record the coordinates of each qualifying estate and use another set of algorithms to plot them onto a digital map. In this way lots of relatively simple algorithms are structured together to form large, often complex, recursive decision trees (Steiner 2012; Neyland 2014). The methods of guiding and calculating decisions are largely based on Boolean logic (e.g., if this, then that) and the mathematical formulae and equations of calculus, graph theory, and probability theory.
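To make the example concrete, here is a minimal sketch of how Kitchin’s ghost-estate rules might look once translated into code. The database layout, field names and sample records are all invented for illustration; Kitchin’s article describes the logic, not an implementation.

```python
# Rule (a): 10 or more houses grouped together constitute an estate.
# Rule (b): over 50% of those houses must be vacant or under construction.
MIN_ESTATE_SIZE = 10
VACANCY_THRESHOLD = 0.5

def is_ghost_estate(houses):
    """Apply the combined rule to one estate (a list of house records)."""
    if len(houses) < MIN_ESTATE_SIZE:
        return False
    empty = sum(1 for h in houses
                if h["status"] in ("vacant", "under_construction"))
    return empty / len(houses) > VACANCY_THRESHOLD

def count_ghost_estates(estates):
    """Sift the property database and total up qualifying estates."""
    return sum(1 for houses in estates.values() if is_ghost_estate(houses))

# Tiny invented sample: one qualifying estate, one too small to count
# as an estate at all, and one that is mostly occupied.
db = {
    "estate_a": [{"status": "vacant"}] * 6 + [{"status": "occupied"}] * 4,
    "estate_b": [{"status": "vacant"}] * 3,
    "estate_c": [{"status": "occupied"}] * 9 + [{"status": "vacant"}],
}
print(count_ghost_estates(db))  # prints 1
```

Notice how much of the “fuzziness” has already been decided away before a single line runs: the thresholds 10 and 50% are human judgement calls baked in as constants, which is exactly Kitchin’s point about where the real decisions live.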
Coding thus consists of two key translation challenges centred on producing algorithms. First, translating a task or problem into a structured formula with an appropriate rule set (what is sometimes called pseudo-code). Second, translating this recipe into code that when compiled will perform the task or solve the problem.
I think it’s important to understand this process and these two ‘translation challenges’ if we are to make sense of algorithms’ impact on our lives.
Purely formal beings of reason
In terms of the first ‘translation challenge’, there is obviously a great need to understand the situation to be ‘algo-ised’ so perspicuously that “you can explain it to something as stonily stupid as a computer” (Fuller 2008: 10, in Kitchin). Kitchin makes the key point that many situations and processes – such as those that dominate education – cannot be algo-ised, at least not without massively oversimplifying them. Philip Kerr and Scott Thornbury have both written fairly recently about adaptive learning company Knewton and the problems with applying oversimplified algorithms to (language) learning.*
Nonetheless, while programmers might vary in how they formulate code, the process of translation is often portrayed as technical, benign and commonsensical. This is how algorithms are mostly presented by computer scientists and technology companies: that they are “purely formal beings of reason” (Goffey 2008: 16).
This New York Times article from last week perfectly illustrates this portrayal of algorithms. The article by Quentin Hardy is titled ‘Using algorithms to determine character’ and describes the work of Paul Gu and his company Upstart. Gu uses data such as people’s “SAT scores, what colleges they attended, their majors and their grade-point averages” in order to determine whether or not to lend them money.** The article tells us about Douglas Merrill, “a former Google executive whose company writes loans to subprime borrowers through nonstandard data signals.” ‘Nonstandard data signals’? Weasel words if ever I’ve heard any.
One signal is whether someone has ever given up a prepaid wireless phone number. Where housing is often uncertain, those numbers are a more reliable way to find you than addresses; giving one up may indicate you are willing (or have been forced) to disappear from family or potential employers. That is a bad sign.
As Merrill goes about his merry work identifying new people to sell money to, he does so under the apparent illusion that algorithms are ‘technical, benign and commonsensical’: “We’re always judging people in all sorts of ways, but without data we do it with a selection bias,” Hardy quotes him as saying.
Finally, Hardy introduces us to Jure Leskovec, “a professor of computer science at Stanford”, who “is finishing up a study comparing the predictions of data analysis against those of judges at bail hearings, who have just a few minutes to size up prisoners and decide if they could be risks to society.”
“Algorithms aren’t subjective,” he said. “Bias comes from people.”
Hardy immediately makes the obvious point: where do you suppose the algorithms come from, Mr Stanford Professor?? This suggests the mindset described by Seaver, in which algorithms are seen as “strictly rational concerns, marrying the certainties of mathematics with the objectivity of technology.” As Kitchin points out, what is certainly not taken into account here is
the complex set of decision making processes and practices, and the wider assemblage of systems of thought, finance, politics, legal codes and regulations, materialities and infrastructures, institutions, inter-personal relations, that shape their production
Although there’s much more to Kitchin’s article (which is why you should really just have read it in the first place but anyway), I can’t let this post drag on too much longer so I will finish with another pithy quote:
For all the benefits the logic of computation asserts, questions remain as to what is lost in a society that whole-heartedly embraces the logic of computation? The notion that nearly everything we do can be broken down into and processed through algorithms is inherently highly reductionist. It assumes that complex, often fuzzy, relational, and contextual social and economic interactions and modes of being can be logically disassembled, modelled and translated, whilst only losing a minimum amount of tacit knowledge and situational contingencies.
* Incidentally, it was Thornbury’s ‘Mouse That Roared’ post that led me to Kerr’s blog, which then led me to Selwyn’s work that started me down this track.
** How would Upstart even get hold of this data in the first place? The article doesn’t say, but what if it were through a partnership with a schooling system or university? What if the university was sharing data from its LMS? Even if this sort of partnership is not already occurring, Evgeny Morozov and George Siemens have both published articles in the last week or so suggesting that it is quite possible in the foreseeable future.