Big Data: Surveillance Capitalism and Our Digital Selves

Ibad Kureshi, Senior Research Scientist, Inlecom Systems, UK
Volume 15 Issue 1 June 2019

In a presentation given at CBEC on 22nd December 2017, entitled “Big Data: Losing Control of your Digital-Self”¹, I lamented the ease with which companies have surreptitiously amassed a wealth of knowledge about us – us as individuals, you or I, not an aggregate sum. This data is being used in a multitude of ways, and even where its use is not malicious it may cause harm.

Our world is changing. Behind all the apps, all the smart devices, and all modern digital comforts there is one impetus: collect all the data, all the time. The most valuable commodity in the digital world is not a crypto-currency but our data; even in the physical world, data is now more valuable than oil [1]. Everything we do leaves a digital footprint. Landing on a webpage creates a trail of evidence of our activities, both on our own devices (in the form of cookies) and on the servers running the website (in the form of access logs). The advertising eco-system that now drives the Web 2.0 and e-commerce world exposes our data to hundreds of other entities, without us knowing the extent or giving explicit consent.
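That server-side trail is concrete enough to sketch. Each page view typically appends one line per request to the web server’s access log; the snippet below (a minimal illustration, with an invented log line in the Combined Log Format commonly used by Apache and Nginx) shows how much a single entry reveals:

```python
import re

# Combined Log Format: a de-facto default for Apache/Nginx access logs.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# An invented example: one page view, one log entry.
line = ('203.0.113.7 - - [22/Dec/2017:10:15:32 +0500] '
        '"GET /article.html HTTP/1.1" 200 5120 '
        '"https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0)"')

def parse(line):
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

record = parse(line)
# Even one entry reveals the visitor's IP address, the exact time, the
# page read, the search engine that referred them, and their device.
print(record['ip'], record['path'], record['referrer'])
```

One entry is harmless; millions of them, joined across sites by the advertising eco-system, are a behavioural profile.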

While we may consider this a necessary evil of the digital world, the ubiquity of digital devices means that data harvesting extends to the physical world as well. As we walk through a public place – or any place – we leave traces of our presence. The signals (Wi-Fi, GPRS, Bluetooth) emitted by our devices are detected and logged. Should you be so inclined, your home router can be converted to spy on the comings and goings of your neighbour¹, or to estimate their income level by counting their smart devices. Linking these detector systems with CCTV, loyalty cards, and other smart devices (bulbs, home assistants, device finders) allows organisations to create rich models of ‘us’.

These rich models are the new commodity of the surveillance capitalism era. A term coined by Shoshana Zuboff in 2015 [2], surveillance capitalism is a new economic order that claims human experience as free raw material for hidden commercial and security practices [3]. The addictive nature and reward schemes of cyber (e.g. Snapchat) and cyber-physical (e.g. Pokémon Go) apps have led experts to estimate that we touch our mobile devices anywhere between 80 and 2,000 times a day [4-5]. Through this constant use of our devices, phone manufacturers and app designers are able to collect data on us passively. Sensors and signals within the device – the accelerometer, GPS, the census and usage of installed apps, 3G/4G signal strength, available Wi-Fi connections, and device-specific information [6] – allow the data collectors to infer¹ our age, gender, income, level of education, sexual identity, activity and preferences, political leanings, eating habits, friendship groups, and health [7]. The common retort on learning of the nature and scale of this data acquisition is, “What’s the harm? So what if they personalise my ads?” However, the full context, circumstances and extent of the data’s use are not fully understood.
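To illustrate how mundane sensor streams become inferences, here is a deliberately crude sketch: classifying whether a phone’s owner is moving from the variance of accelerometer readings. (The samples and threshold are invented for illustration; commercial profilers are vastly more elaborate, but the principle – raw signal in, personal inference out – is the same.)

```python
import math

# Invented 3-axis accelerometer samples, in units of g. Sustained variance
# in the magnitude signal is a crude proxy for walking vs. sitting still.
still   = [(0.01, 0.02, 0.99), (0.00, 0.01, 1.00), (0.02, 0.00, 0.98)]
walking = [(0.30, 0.10, 1.20), (-0.25, 0.05, 0.70), (0.40, -0.15, 1.30)]

def variance(samples):
    mags = [math.sqrt(x*x + y*y + z*z) for x, y, z in samples]
    mean = sum(mags) / len(mags)
    return sum((m - mean) ** 2 for m in mags) / len(mags)

def infer_activity(samples, threshold=0.01):
    # Low variance in total acceleration -> device (and owner) at rest.
    return "moving" if variance(samples) > threshold else "still"

print(infer_activity(still))    # still
print(infer_activity(walking))  # moving
```

Chain enough of these small inferences together – movement, location, app usage, contact graphs – and the rich model of ‘us’ described above emerges.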

Understanding the problem through a Nicomachean lens [8], we can interrogate it using the five W’s. Why is our data being collected? This is possibly the easiest of the five questions to answer: our data is being collected to feed a process known as data-driven development. Computer scientists, engineers and domain experts the world over are building wonderful futuristic things – medical diagnostic tools, transport and logistics solutions, new business models, tools and services, and revolutionary urban infrastructure planning, to name a few. These developments have led to new commercial opportunities and a whole sector of pay-as-you-use services. This ‘servitisation’, first seen in the computer infrastructure world through cloud computing, has spread to vehicle ownership (through ride-sharing apps), to books, films and music (through streaming services), and to tourism (through accommodation-sharing apps). The provision of these services, and their entire business models, both relies on our data and generates further data about the human experience.

What data is collected and what is it used for? While the first half of this question was answered in the preceding paragraphs, finding a complete answer to the second half is problematic. At face value, our data is used by those we give it to: to provide us a service, and to identify new products, services, or marketing opportunities. While that sounds benign, such opportunities can span everything from the design of a new screwdriver [9] to a targeted campaign to influence elections [10]. Further, as we see in the next questions, when and where the data enters the security apparatus is completely obscured from us – the data subjects.

When was the data collected and when will it be used? Rightly or wrongly, many a famous personality has found themselves in trouble for comments made 10-15 years ago because, in some archive, there is an errant tweet or post. While we may believe we have deleted an ill-judged tweet as soon as humanly possible, data aggregators are automatically harvesting our activities in real time. Nor is this limited to large organisations; anyone with a Twitter account can collect and store Twitter activity using the public interfaces [12]. Posts and tweets are not necessarily deleted from these archives. This information stream (known as a firehose [11]) is then sold on to anyone with a credit card. It is foreseeable that an Equifax-LinkedIn hybrid emerges that allows employers to obtain a moral, ethical or expected-performance score of existing or potential employees, based on their historical data footprint. The young adults (Gen-Z) of today (ages 20 and below) have lived their entire lives under the auspices of surveillance capitalism. The full impact of the data that their parents, and they themselves, have shared about them is yet to be seen.
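The archiving dynamic can be sketched in a few lines: a third-party aggregator mirrors public posts as they stream past, and a later deletion at the source never reaches the copy. (A toy model only; real collection would go through a platform’s public streaming interfaces.)

```python
# A toy aggregator: it copies public posts in real time; deleting the
# original later does not remove the aggregator's copy.

platform = {}   # post_id -> text, as held by the social network
archive = {}    # the aggregator's independent copy

def post(post_id, text):
    platform[post_id] = text
    archive[post_id] = text          # aggregator mirrors the firehose

def delete(post_id):
    platform.pop(post_id, None)      # deletion only reaches the platform

post(1, "an ill-judged remark")
delete(1)

print(1 in platform)  # False: gone from the platform
print(1 in archive)   # True: the copy survives
```

The asymmetry is the whole point: the data subject controls only one of the two copies, and usually does not know the other exists.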

Where is our data being kept? This is where things become murkier. Our data has been collected over the last two decades, through different online and physical services, by organisations that have changed names and owners a hundred times over. Technology evolves roughly every 18 months, and companies are constantly cycling out deprecated equipment (equipment in the process of being replaced by newer technology). So what happened to the hard drive holding the biodata we supplied when registering with a website, hotel, or conference in 2009? Is the hard drive still floating between offices? Was it dumped in the trash when the computer stopped working? Did someone else recover that information? Was the data sold on? Is the data still with the organisation? Do they keep it in the cloud? Is it secure? Think of all the email accounts we created in the nineties and noughties, before Hotmail/Outlook and Gmail cornered the email market. Did we delete all the emails, pictures and information from our Supernet or Cybernet accounts? Did they disappear from those providers’ backups? Did we delete all our information and pictures from early social media, e.g. Orkut and MySpace?

Who has our data? This is the final question, and one to which no one can realistically give a complete answer. As is already clear from the other four W’s, we do not know the full extent of why our data was collected, what was collected (beyond what we typed into a web form), what it will be used for, how far back the data collection goes, or where the data now resides. The European Union’s Regulation on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (better known as the General Data Protection Regulation: GDPR) made a first stab at solving the Who problem [13]. On the 25th of May 2018, when the regulation went into effect, we got a brief glimpse of the scale, as many responsible organisations informed data subjects that their data was being held and what it was being used for [14]. However, data subjects either blindly clicked accept on the new terms and conditions or ignored the emails completely [15].

General attitudes in Pakistan tend to be either that Pakistan and Pakistani society are so far behind Silicon Valley technologically that the implications of these technologies are inconsequential, or that it does not matter if the pictures posted on Facebook or Instagram are processed by some algorithm. But the pervasiveness of digital technologies should not be underestimated. A look at Google Play Store [16] and Apple App Store [17] usage shows that the vast majority of applications downloaded and used by Pakistanis are made and designed by non-Pakistani entities. We are inadvertently surrendering our digital identities to foreign companies. The models these companies generate to represent us may have inherent biases, of the kind already seen affecting people of colour in the West [18-22]. The new tools and services built on these models will inevitably find their way into Pakistan (banking KYC and loan-assessment software, student performance evaluation software, etc.).

We may never know the full extent of who has our data, or whether it will come back to bite us in an Orwellian, Huxleyan, or Gassetian dystopia. The general consensus is that it will be a dystopia.

¹ A demonstration of these techniques and their effectiveness was covered in a seminar given by the author, a recording of which can be found on the CBEC Facebook pages.

References:

  1. https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data
  2. Zuboff, S. (2015). Big other: surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 30(1), 75-89.
  3. Zuboff, S. (2019). The age of surveillance capitalism: the fight for the future at the new frontier of power. Profile Books.
  4. http://time.com/4147614/smartphone-usage-us-2015/
  5. https://blog.dscout.com/mobile-touches
  6. http://www.surveyswipe.com/passive-data-collection.html
  7. https://theglassroom.org/glassroomlondon/exhibits
  8. Sloan, M.C. (2010). “Aristotle’s Nicomachean Ethics as the Original Locus for the Septem Circumstantiae”. Classical Philology. 105: 236–251. doi:10.1086/656196
  9. Chandra A. and Chandna, P. (2011) “Ergonomic design of hand tool (screwdriver) for Indian worker using comfort predictors: a case study” International Journal of Advanced Engineering Technology, vol. 2, no. 4, pp. 231-238
  10. Cadwalladr, C., & Graham-Harrison, E. (2018). The Cambridge analytica files. The Guardian, 21, 6-7.
  11. http://support.gnip.com/apis/firehose/overview.html
  12. https://dev.twitter.com/
  13. https://eugdpr.org/
  14. https://www.bloomberg.com/news/articles/2018-05-25/blocking-500-million-users-is-easier-than-complying-with-gdpr
  15. https://thenextweb.com/eu/2018/12/27/gdprs-impact-was-too-soft-in-2018-but-next-year-will-be-different/
  16. http://xyologic.com/
  17. http://appannie.com/
  18. https://atlantablackstar.com/2016/01/31/study-racial-discrimination-in-mortgage-lending-continues-to-impact-african-americans-with-a-black-name-lowering-ones-credit-score-by-71-points/
  19. https://www.theguardian.com/inequality/2017/aug/08/rise-of-the-racist-robots-how-ai-is-learning-all-our-worst-impulses
  20. https://www.forbes.com/sites/bernardmarr/2019/01/29/3-steps-to-tackle-the-problem-of-bias-in-artificial-intelligence/#51fc08297a12
  21. https://www.mortgagebrokernews.ca/news/technology/ai-needs-a-lot-more-work-before-it-can-be-safely-used-in-mortgage-253820.aspx
  22. Morstatter, F., Pfeffer, J., & Liu, H. (2014, April). When is it biased?: assessing the representativeness of twitter’s streaming API. In Proceedings of the 23rd international conference on world wide web (pp. 555-556). ACM.