
Big data raises ethical questions

Volume X  Issue 2

Published October 2018

        Privacy rights are an afterthought for many people, the meaningless consequence of a meaningless checkbox beneath a privacy notice. But after two decades of modern technology, companies have accumulated vast stores of information, the analysis of which may be society’s deliverance or its devastation. Academics have aptly named this phenomenon big data.

​

        Big data commonly refers to the collection, storage, and analysis of copious amounts of data; in other words, a lot of information. This excess of information has become typical in fields such as federal censuses, medical records, and online shopping, where modern computer infrastructure has permitted access to a broad range of people.

 

        Because no technological development is complete without the proper ethical skepticism, a number of debates have arisen questioning the morality of using big data for research, policymaking, or commercial ventures. Such philosophical quandaries regarding big data are so numerous that they have spawned their own category of ethics: data ethics.

 

        “The ethics of data focuses on ethical problems posed by the collection and analysis of large datasets and on issues ranging from the use of big data in biomedical research and social sciences, to profiling, advertising and data philanthropy as well as open data,” according to Luciano Floridi and Mariarosaria Taddeo’s article “What is data ethics?”

        To massively simplify the field, the central theme of data ethics is whether the methodology used in big data protects the privacy rights of the individuals who make up the database. Given this, many will instinctively point toward the need for transparency, which, in an ideal world, would manifest as restrictions on the people working with the data and notifications to the people whose data is being used. Unfortunately, this would be followed by questions about which data and which people such transparency should cover, ultimately stymieing our ability to study and benefit from big data.

​

        Take the example of researchers wishing to use big data in an epidemiological retrospective study of syphilis by parsing through medical records. In this retrospective, the data will come from surveys conducted 10 years ago as well as current medical records. Of course, information such as the individuals’ names is unnecessary to the study, but some sensitive information (gender, race, past diseases, etc.) may point to key risk factors associated with syphilis. At the same time, this information could be used to discriminate against certain demographics.

 

        If researchers were required to get the permission of every individual, the resulting sample would be skewed toward people whose information is not vulnerable to misuse.

​

        Without analyzing all of the big data, the potential benefits of such studies would be lost. Ideally, any situation involving big data maintains transparency in how the data is handled, not transparency in what the data actually is; otherwise, we might as well take a hammer to the government’s server room.


        Where this ideal breaks down is in the private use of big data. Companies such as Amazon store the data of their customers, and there is nothing, aside from Jeff Bezos’s morality, to impede the malicious use of that data.

 

        At the same time, Amazon is doing much to improve quality of life and is at the forefront of many technological developments, constantly rolling out new products like the Amazon Echo or Amazon Web Services. But here again, the reactionary ethical outburst takes effect. What’s to stop the Echo or AWS from accumulating information about your day-to-day life and funneling it into the engine of Amazon? (See Sam Hosmer’s article for the answer.)

 

        Currently, the pathways through which big data flows are as opaque as they are promising, but no progress can occur in the presence of controversy. In the end, a basic transparency about how personal data will be used in the world of big data needs to be established, not to hinder big data to the point of bureaucratic insanity, but because there is currently no limit to what big data can be used for, good or evil.
