Big Data: ethical implications & example


#1

Discuss the ethical implications of Big Data. Provide an example of one of these implications.


#2

Big data refers to the collection and analysis of large, complex data sets. These data sets are so big and varied, and change so fast (with records added or changed every instant), that traditional techniques cannot analyze them (John Walker, 2014). Instead, data analytics tools are used to process the variety of stored data sets. Big data is generated all the time by almost everything around us. For example, we generate data when we use social media, when we check out or buy a product online, and even when we browse the web.

With so much data collected and almost everything stored, big data certainly brings ethical implications with it. Privacy, security of personal information, control over data, and identity are some of the ethical implications of big data. They are explained briefly, with examples, in the paragraphs that follow.

Everyone looks for privacy, be it in the real or the digital world. But with big data, privacy is a big question mark. We use social media and explore websites pretending that no one knows us, but that isn’t possible (Jain, Gyanchandani & Khare, 2016). The pages you like, the interactions and posts you make, and your browsing activity can easily give anyone looking at those details an idea of your personality and activities. The Cambridge Analytica data scandal is an example. Starting in 2014, the business collected details of around 87 million Facebook users. The collected data were then used in an attempt to influence voter opinion in favor of the politicians it worked for. This caused a big uproar: Facebook had to pay more than half a million dollars over the case, and attention turned to the EU’s General Data Protection Regulation (GDPR), which took effect in May 2018, while other countries are considering similar rules.

Personal information can still be shared while ensuring its confidentiality, integrity, and security, provided proper security rules are in place (Ring, 2016). For example, to ensure confidentiality in online banking, people are given security codes that they must enter to access their account and transfer money to another account. If that factor were removed, anyone with a little information about you or your banking details could easily empty your account.

Transparency is important; without it, uncertainty and questions arise. People need and want to know how their data is being used. If the data is used for anything you’re not comfortable with, you can avoid the activities that allow it to be collected or take steps to prevent the collection. Knowing who the users of the data are is also necessary.

Big data closes off the option to exit and get a fresh start, because big data never forgets anything and is universal. If bad news about you or your business is published, it will still exist even after the page is removed from the website that published it, because web archive sites, such as the Internet Archive’s Wayback Machine, can still load that page. The same applies to business websites and reviews. Even a person’s criminal record is cleared after a certain term, but no such term is clearly set for search engines and data.

Identity can be compromised with big data. In the United States, just three fields (zip code, birth date, and sex) are enough to uniquely identify around 87% of people. Those three fields aren’t considered personally identifiable information and are commonly shared. A person’s identity can also be compromised through a log of their web searches, which can identify the person or at least give a good idea of their personality traits.
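
To make the re-identification risk concrete, here is a minimal sketch of how one might measure how many records are unique on the combination of zip code, birth date, and sex. The records are invented toy data and the function name `unique_fraction` is hypothetical; this is an illustration, not the method used in any cited study.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birth_date, sex). No names are stored.
records = [
    ("02139", "1985-03-14", "F"),
    ("02139", "1985-03-14", "F"),
    ("02139", "1990-07-02", "M"),
    ("10001", "1972-11-30", "F"),
    ("10001", "1988-01-09", "M"),
]


def unique_fraction(rows):
    """Return the share of rows whose field combination appears exactly once."""
    counts = Counter(rows)
    unique_rows = sum(1 for row in rows if counts[row] == 1)
    return unique_rows / len(rows)


fraction = unique_fraction(records)
print(f"{fraction:.0%} of records are unique on (zip, birth date, sex)")
```

Any record that is unique on these three fields can be linked to a named person by anyone who holds another data set containing the same fields plus a name, even though no single field is sensitive on its own.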


References

Jain, P., Gyanchandani, M., & Khare, N. (2016). Big data privacy: a technological perspective and review. Journal of Big Data, 3(1), 1-2.

John Walker, S. (2014). Big Data: A Revolution That Will Transform How We Live, Work, and Think. International Journal of Advertising, 33(1), 181-183.

Ring, T. (2016). Your data in their hands: big data, mass surveillance and privacy. Computer Fraud & Security, 2016(8), 5-10.


#3

Database System:

A database is a collection of related data necessary to manage an organization. It excludes transient data such as input documents, reports, and intermediate results obtained during processing. A database is a collection of logically interrelated data, together with a description of this data, designed to meet the information needs of an organization. A database system is an integrated collection of related files along with details about their definition, interpretation, manipulation, and maintenance. A database management system is a set of procedures that manage the database and provide access to it in the form required by any application program. It ensures that the necessary data is available in the designed form for the diverse applications of different organizations.

Characteristics of a Database

  • Structure: Data types, data behavior

  • Persistence: Store data on secondary storage

  • Retrieval: a declarative query language and a procedural database programming language

  • Performance: Retrieve and store data quickly and correctly

  • Sharing: concurrency

  • Reliability and resilience

  • Large Volumes

Data Warehouse:

A data warehouse is a single, complete, and consistent store of data obtained from a variety of different sources and made available to end users in a form they can understand and use in a business context. A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process (Wallance, 2015).

Subject - Oriented:

  • Data is arranged and optimized to provide answer to questions from diverse functional areas.

  • Data is organized and summarized by topic.
    – Sales/ Marketing / Finance/ Distribution/ Etc.

  • It focuses on modeling and analysis of data for decision makers.

  • Excludes data not useful in decision support process.

Integrated:

  • Data Warehouse is constructed by integrating multiple heterogeneous sources.

  • Data preprocessing is applied to ensure consistency.

  • The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization.
    – Multiple Sources
    – Diverse Sources
    – Diverse Formats
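
As a rough illustration of the integration step, the sketch below merges rows from two sources that use different field names, date formats, and currency representations into one consistent layout. Both source systems, their field names, and the helper functions `from_crm` and `from_billing` are hypothetical and exist only for this example.

```python
from datetime import date

# Hypothetical source A: a CRM export with US-style dates and dollar strings.
crm_rows = [{"customer": "Acme", "signup": "03/15/2021", "revenue": "$1,200.50"}]
# Hypothetical source B: a billing system with ISO dates and integer cents.
billing_rows = [{"cust_name": "Globex", "created": "2021-04-02", "revenue_cents": 98000}]


def from_crm(row):
    """Convert a CRM row to the common warehouse layout."""
    month, day, year = (int(part) for part in row["signup"].split("/"))
    return {
        "customer": row["customer"],
        "signup_date": date(year, month, day),
        "revenue": float(row["revenue"].replace("$", "").replace(",", "")),
    }


def from_billing(row):
    """Convert a billing row to the common warehouse layout."""
    return {
        "customer": row["cust_name"],
        "signup_date": date.fromisoformat(row["created"]),
        "revenue": row["revenue_cents"] / 100,
    }


# Preprocessing each source separately yields one consistent set of rows.
warehouse = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
for row in warehouse:
    print(row)
```

After preprocessing, every row has the same keys, a real date value, and revenue in the same unit, which is what lets later queries treat the sources as one.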

Time - Variant

  • The Data Warehouse represents the flow of data through time.

  • Can contain projected data from statistical models

  • Data is periodically uploaded then time - dependent data is recomputed

  • Provides information from historical perspective e.g. past 5 - 10 years.

  • Every key structure contains, either implicitly or explicitly, an element of time

Non volatile

  • Once entered, data is NEVER removed

  • Represents the company’s entire history
    – Near term history is continually added to it.
    – Always Growing
    – Must support terabyte databases and multiprocessors

  • Read - Only database for data analysis and query processing

  • Data Warehouse requires two operations in data accessing
    – Initial loading of data
    – Accessing of data
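
The two operations above (loading and accessing) and the time element in every key can be sketched with Python's built-in `sqlite3` module. The table `sales_fact`, its columns, and the helper functions are assumptions made for illustration; a real warehouse would use a dedicated platform rather than an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Every row carries a load date, so the key always includes an element of time.
conn.execute("CREATE TABLE sales_fact (load_date TEXT, region TEXT, amount REAL)")


def load_batch(connection, load_date, rows):
    """Append one periodic batch; existing rows are never updated or deleted."""
    connection.executemany(
        "INSERT INTO sales_fact (load_date, region, amount) VALUES (?, ?, ?)",
        [(load_date, region, amount) for region, amount in rows],
    )
    connection.commit()


def totals_by_load_date(connection):
    """Read-only access: summarize the accumulated history by load date."""
    return connection.execute(
        "SELECT load_date, SUM(amount) FROM sales_fact "
        "GROUP BY load_date ORDER BY load_date"
    ).fetchall()


load_batch(conn, "2023-01-31", [("North", 100.0), ("South", 80.0)])
load_batch(conn, "2023-02-28", [("North", 120.0), ("South", 90.0)])

history = totals_by_load_date(conn)
print(history)
```

Because the code only ever inserts and selects, the January figures remain available after the February load, which is the nonvolatile property in miniature.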

How Are Databases Related to Data Warehouses?

“All data warehouses are databases, not all databases are data warehouses” by ANSI/X3/SPARC Database System Study Group (2014). A data warehouse is a specially set up database designed to hold large amounts of data for reporting purposes, while a normal database is a structured collection of records or data stored in a computer system. A normal database is optimized for transactional activity (while keeping a small amount of history), whereas a data warehouse is optimized for large-scale reporting. Within a data warehouse, data from several systems is typically merged to present a global enterprise view. Data warehouses also typically keep a very long history, from several years to the entire life of the company, so that very long-term trends can be viewed. Finally, we can say that a number of characteristics differentiate data warehouses and data marts from conventional operational databases.

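The contrast between transactional activity and reporting can be sketched as follows, again with Python's built-in `sqlite3` module. The tables `account` and `txn_history` and the function `withdraw` are hypothetical names chosen for this example, and keeping both tables in one database is a simplification; in practice the history would be copied into a separate warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Operational table: one current row per account, updated in place.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 500.0)")
# Warehouse-style table: one row per event, kept for long-term reporting.
conn.execute(
    "CREATE TABLE txn_history (account_id INTEGER, year INTEGER, amount REAL)"
)


def withdraw(connection, account_id, year, amount):
    """Transactional activity: change the balance and record the event."""
    with connection:  # commits both statements together, or rolls both back
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        connection.execute(
            "INSERT INTO txn_history VALUES (?, ?, ?)",
            (account_id, year, -amount),
        )


withdraw(conn, 1, 2021, 50.0)
withdraw(conn, 1, 2022, 70.0)

# Operational question: what is the balance right now?
current = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
# Reporting question: how did activity develop over the years?
trend = conn.execute(
    "SELECT year, SUM(amount) FROM txn_history GROUP BY year ORDER BY year"
).fetchall()
print(current, trend)
```

The operational table answers only the current-state question, while the history table supports the long-term trend query that a warehouse is optimized for.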

References

ANSI/X3/SPARC Database System Study Group. (2014). Reference Model for DBMS Standardisation. ACM SIGMOD Record, Vol. 15, No. 1, 15-17.

Wallance, P. (2015). Introduction to Database Management System. UK: PEARSON.


#4

Big data are massive amounts of data that are most clearly described by the three Vs: volume, velocity, and variety. These data are huge in size, accumulate continuously, are varied in content, and are very difficult to analyze (Wallace, 2015, p. 117). For example, the data collected from social media, web blogs, web searches, photo sharing, GPS, etc. are known as big data. The place where big data is stored is known as a data warehouse, and analyzing the data in this warehouse to find patterns is known as data mining.

Now, why do big companies and governments spend so much on collecting data? Any company or government that can collect relevant data will be the most powerful among its competitors. In the modern day, power has shifted from having weapons to having information, as better information gives greater decision-making power and impact.

There is a saying that with great power comes great responsibility, and it is rightly said. Big data holds hidden patterns that are a huge source of information on customer behavior, likes, and dislikes, and that are highly lucrative to whoever can get their hands on them.

Data ownership here does not mean that individuals own the data; it is a loose interpretation of ownership. Big data is collected from the various organizations and locations that receive the data, and it is assumed to be acquired through consent. Having an option to secure data does not necessarily mean that it will be respected. In most cases an individual does not own the data. Consent is but a small check before the data are transported along the data transmission highway known as big data (White, Ariyachandra, & White, 2017).

In recent news that rocked the world, we can look at Facebook, a social site that has over 2 billion users worldwide. The Facebook data breach, which mostly affected American citizens, has been blamed for the unethical use of data in a political campaign during the presidential election. As reported by The Guardian, “The data analytics firm that worked with Donald Trump’s election team and the winning Brexit campaign harvested millions of Facebook profiles of US voters, in one of the tech giant’s biggest ever data breaches, and used them to build a powerful software program to predict and influence choices at the ballot box” (Cadwalladr & Graham-Harrison, 2018).

This shows how the use of big data can raise ethical issues that have a direct impact on society and bring changes to it.


References

Cadwalladr, C., & Graham-Harrison, E. (2018, March 17). Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. The Guardian.

Wallace, P. (2015). Information System in Action. In P. Wallace, Introduction to Information System (pp. 4-9). New Jersey: Pearson Education, Inc.

White, G., Ariyachandra, T., & White, D. (2017). Big Data, Ethics, and Social Impact Theory - A Conceptual Framework. Journal of Management & Engineering Integration; Turlock, 22-28. Retrieved from https://proxy.lirn.net/MuseProxyID=mp02/MuseSessionID=co10e271d/MuseProtocol=https/MuseHost=search.proquest.com/MusePath/central/docview/2044314128/EFA3837CE0D14410PQ/3?accountid=158986


#5

As the name suggests, big data refers to the collection of data that are enormous in size, varied in content and fast to accumulate. Big data is generally defined using three characteristics of volume, variety and velocity. Volume refers to the increasing amount of data that are captured, variety implies that the data are captured and combined from various sources while velocity implies that data is generated with increasing speed (Someh, Breidbach, Davern, & Shanks, 2016).

Companies can use big data to find patterns and preferences that will help them develop effective marketing programs. But with big data come big problems, namely ethical problems (Wallace, 2015). One of the pressing challenges is the ethical implications of big data. The diverse ethical issues associated with big data are:

Privacy

Any data on human subjects raises privacy issues. Ensuring the privacy of data involves defining and enforcing rules about the collection, use, and retention of personal information. Despite contributing to big data, individuals do not have ownership rights over their data. The notion of privacy implies that data owners should have control over their personal data, yet individuals cannot manage the flow of their information among third parties over the internet (King & Richards, 2014). Thus it is important to ensure that the privacy of those contributing to big data is not dead.

Confidentiality

Privacy and secrecy are two different things. Maintaining the confidentiality of data shared over the internet is another ethical implication of big data. Data may be sold or shared with third parties without individuals’ knowledge, which makes them reluctant to trust organizations with their personal information. The ubiquitous nature of big data services is making individuals more anxious about the misuse of their personal data. So big data services need to maintain the confidentiality of information to align with ethical values.

Transparency

Individuals cannot keep track of their data once it is provided. The data may be sold or shared with third parties without the knowledge or consent of individuals, possibly in ways they do not want or expect. That is why it is important for big data services to let data owners know where and how their data is being used.

Identity

Big data can compromise the identity of data owners. An institution can determine who we are on the basis of the information provided, and that determination may not always be true. Our identity can be falsified over the internet. Because of this, data owners may be exposed to things they do not want.

For example, promotional messages intended for pregnant women were sent to a teenage girl in high school because the big data service provider identified her as a pregnant woman by analyzing data about her (Wallace, 2015).

So it is important for big data service providers to consider these ethical issues and to maintain the privacy, confidentiality, and transparency of information without compromising the identity of data owners.

References

King, J. H., & Richards, N. M. (2014, March 28). What’s Up With Big Data Ethics? Retrieved from Forbes: https://www.forbes.com/sites/oreillymedia/2014/03/28/whats-up-with-big-data-ethics/#64c097d83591

Someh, I. A., Breidbach, C. F., Davern, M. J., & Shanks, J. (2016). Ethical Implications of Big Data Analytics. Research in Progress, Istanbul, Turkey. Retrieved from http://aisel.aisnet.org/ecis2016_rip/24

Wallace, P. (2015). Introduction to Information Systems (2nd ed.). New Jersey: Pearson Education, Inc.


#6

According to Wallace (2015), in the book Introduction to Information Systems, big data refers to extremely large collections of data that are diverse in content, are acquired quickly, and are difficult to store and interpret using traditional approaches. Organizations that can collect, store, and process increasingly large and complex data sets from a variety of sources can convert them into competitive advantage.

Today, the tremendous growth of social media and consumer-generated content on the Internet has inspired the development of so-called big data analytics to understand and solve real-life problems. For example, Xiang, Schwartz, Gerdes Jr, and Uysal (2015) explore the utility of big data analytics to better understand important hospitality issues, namely the relationship between hotel guest experience and satisfaction. In the same way, governmental organizations, banks, corporations, etc. hold big data to retrieve details of their citizens and customers.

The big data that we generate through websites, mobile applications, and social media is accessible for strategic planning and opens up a wealth of opportunities for managers seeking insights about their markets, customers, industry, and more. It often becomes the main source of business intelligence that managers tap to understand their customers and markets and to make strategic plans.

As for the ethical concerns, big data changes the definition of the business morals, rules, codes of conduct, and principles that provide guidelines for right and truthful behaviour in specific situations.

Security is one of the important challenges that people all over the world face today in every aspect of their lives (Basharat, Azam, & Muzaffar, 2012). It is true that data and its accumulation help improve customer care in many ways. However, such huge amounts of data can also bring forth many privacy issues, making big data security a prime concern for any organization. So far, organizations are working on securing computations and other digital assets, but distributed frameworks such as the MapReduce function of Hadoop mostly lack security protections.

Transparency is another concern. Today’s consumers have access to a wealth of free tools and resources. From email to social media, some of the most important parts of our lives happen on platforms that have never once requested our credit card information. At the same time, the data collection and advertising models behind these platforms highlight a greater need for transparency in the tech space. This is why it is said that consumers should expect to see companies falling along a spectrum of transparency; the situation may not be one or the other.

References

Basharat, I., Azam, F., & Muzaffar, A. W. (2012). Database security and encryption: A survey study. International Journal of Computer Applications, 47 (12).

Xiang, Z., Schwartz, Z., Gerdes Jr, J. H., & Uysal, M. (2015). What can big data and text analytics tell us about hotel guest experience and satisfaction? International Journal of Hospitality Management, 44, 120-130.

Wallace, P. (2015). Introduction to information systems. Pearson Higher Ed.