The Ethics Of Data Science
The twin motors of data and information technology are driving innovation forward in almost every aspect of human enterprise. In a similar fashion, Data Science today profoundly influences how business is done in fields as diverse as the life sciences, smart cities, and transportation. As cogent as these directions have become, the dangers of data science without ethical considerations are equally apparent, whether it be the protection of personally identifiable data, implicit bias in automated decision-making, the illusion of free choice in psychographics, the social impacts of automation, or the apparent divorce of truth and trust in virtual communication. Justifying the need for focus on the ethics of data science goes beyond a balance sheet of these opportunities and challenges, for the practice of data science challenges our perceptions of what it means to be human.
If ethics is defined as shared values that help humanity differentiate right from wrong, the increasing digitalization of human activity shapes the very definitions of how we evaluate the world around us.
Margo Boenig-Liptsin points out that our ever-increasing reliance on information technology has fundamentally transformed traditional concepts of “privacy”, “fairness” and “representation”, not to mention “free choice”, “truth” and “trust”.[i] These mutations underline the increasing footprint and responsibilities of data science – beyond the bits and bytes, data science shakes the perceptual foundations of value, community, and equity. While academia has been quick to establish Data Science programs around statistics, computation and software engineering, few programs address the larger societal concerns of data science, and fewer still analyze how responsible data practices can be conditioned and even encouraged.[ii] Let’s sketch out the contours of this challenge.
Perhaps no area of ethics in data science has received more attention today than the protection of personal data. The digital transformation of our interactions with social and economic communities reveals who we are, what we think, and what we do. The recent legislation introduced in Europe (GDPR), India (the Personal Data Protection Bill, 2018), and California (the California Consumer Privacy Act of 2018) specifically recognizes the rights of digital citizens, and implicitly addresses the dangers of the commercial use of personal and personally identifiable data. These legal frameworks try to rebalance the inequitable relationships of power and influence between organizations and individuals by codifying ethical benchmarks including the right to be informed, the right to object, the right of access, the right to rectification, and the right to be forgotten.
The attention given to this legislation extends far beyond concerns for data protection. These attempts to define proper and illicit data practices respond to a number of ethical questions. As data become the new currency of the world economy, the lines between public and private, between individuals and society, and between the resource rich and the resource poor are being redrawn. Who owns personal data, and which rights can be assigned with explicit or implicit consent? To what extent should public and private organizations be able to collect and control the vast records of our human interaction? To what extent should these data controllers and data processors be held accountable for the loss or misuse of our data?
The ability to consciously make decisions among alternative possibilities has long been viewed as a condition that separates man (or at least the living) from machines. As innovations in data science progress in algorithmic trading, self-driving cars, and robotics, the distinction between human and artificial intelligence is becoming increasingly difficult to draw. Current applications of machine learning cross the threshold of decision support systems and enter the realm of artificial intelligence, where sophisticated algorithms are designed to replace human decision-making.
Crossing this threshold introduces several ethical considerations. Can economic and/or social organizations rely on increasingly complex methodologies when many people understand neither the assumptions nor the limits of the underlying models? Are we willing to accept that these applications, which by their very nature learn from our experience, make us prisoners of our past and limit our potential for growth and diversity? Do we understand that the inherent logic of these platforms can be gamed, which creates opportunities to “cheat” the system? Last but not least, who is legally responsible for the implicit bias inherent in automated decision-making?
John Battelle suggested years ago that our digital footprints provide indelible roadmaps through the database of our intentions.[iii] The leitmotif of Data Science has been to help organizations understand the objectives, motivations and actions of both individuals and communities. The work of Michal Kosinski and David Stillwell has gone even further in suggesting that the pertinence of prescriptive analytics can be greatly enhanced by focusing on patterns of behavior (personality traits, beliefs, values, attitudes, interests, or lifestyles) rather than clusters of demographic data.[iv]
Applications of micro-targeting have since been pitched as powerful tools of influence in the fields of marketing, politics and economics. Even if an individual’s ability to exercise “free choice” has long been a subject of debate, the practice of feeding consumers only information that they will agree with binds rationality even further. Moreover, micro-targeting techniques allow researchers to extrapolate sensitive information and personal preferences of individuals even when such data is not specifically captured. Finally, as the “client becomes the product”, there is a real danger that data science is used less to improve an organization’s product or service offering than to turn consumers into objects of manipulation.
The goal of information technology has long been to provide a single version of the truth to facilitate exchanges of products, services and ideas. For a number of reasons tied to the evolution of both the global economy and national markets, a perceptual gap has grown between this “ground truth” and the trust consumers have in the intermediaries (like the State, the banks, and the corporations) that capture, collect, and monetize this data. The social mechanics of the World Wide Web distort the relationship even further, putting fact and fiction on equal footing and favoring the extremes over the banality of normality.
Distributed ledger technologies in general, and blockchain technologies in particular, offer their share of hope for a source of information that is both more transparent and more traceable. Yet this vision of an Internet of Value is partially clouded by the potential societal challenges of relatively untested technologies. Can technology in and of itself be the standard for both truth and trust? To what degree will people and organizations accept the primacy of transparency? On what basis can social values like the right to be forgotten be reconciled with the technical requirements of public ledgers? Freed from the conventions and the logic of financial capitalism, can human nature accept a radically different basis for the distribution of wealth?
Human and machine intelligence
Although the impact of information technology on the organization of private and public enterprise has been widely debated over the last four decades, the impact of data science on the function of management has received considerably less attention. In the trade press, technology is often seen as ethically neutral, providing a digital mirror of the dominant managerial paradigms at any point in time. In academia the relationship is subject to closer scrutiny: authors like Latour, Callon and Law have demonstrated how different forms of technology influence the way that managers, employees and customers perceive the reality of social and economic exchanges, markets and industries.
When focusing on data science, these concerns for context bring to light their own set of ethical considerations. If data is never objective, to what extent must management understand the context in which the data was collected, analyzed and transmitted? Similarly, as algorithms have become more pervasive and complex, to what extent do managers need to understand their assumptions and their limits? As applications assume larger and larger roles in key business processes, to what extent should management be defined around the coordination of human and software agents? Seen from another perspective, as artificial intelligence matures, which functions of management should be delegated to bots and which should be reserved for humanity?
The Ethics of Data Science will be the theme of BAI’s short program on September 6-15 at the SDM Institute for Management Development in Mysore, India. The Fall Session will explore the opportunities and challenges of digital citizens, automated decision-making, micro-targeting, distributed ledgers, and artificial intelligence using corporate testimony and cases in both the public and private sectors. The session is open to management and engineering students, as well as working professionals. For details and registration, please see http://baifall.com
Lee Schlenker, March 21, 2019
* This article originally appeared on Visionary Marketing's Marketing & Innovation Blog
Lee Schlenker is a Professor of Business Analytics and Community Management, and a Principal in the Business Analytics Institute http://baieurope.com. His LinkedIn profile can be viewed at www.linkedin.com/in/leeschlenker. You can follow the BAI on Twitter at https://twitter.com/DSign4Analytics
[i] Ericson, Lucy (2018), It’s Time for Data Ethics Conversations at your Dinner Table
[ii] Fiesler, Casey (2017), More than 50 Tech Ethics Courses, with links to syllabi
[iv] Kosinski, Michal et al. (2013), Private traits and attributes are predictable from digital records of human behavior