Discuss what a database is and how databases relate to data warehouses.
A database can be described as a space for holding data. Through a database, we can store, organize, protect, and deliver data in a computerized structure. A relational database holds information in a series of two-dimensional tables (Cuzzocrea, 2011). Similarly, a data warehouse supports the analysis and decision-making process by providing easy access to data drawn from databases. A data warehouse can also be described as a collection of databases that work together. This helps an organization gain new, deeper insights into the data it holds in multiple databases.
A database consists of one or more files that need to be stored on a server or a computer. Most large organizations do not store all their records on one individual’s computer but rather on a centrally hosted system (Wang, Shen & Sun, 2013). This system consists of a server made up of one or more devices, which provides its service over a network. Such servers are protected from human-related disasters and other threats through controlled access, power backup, security, and other precautions that restrict physical access to the facility.
A collection of databases working together can be considered a data warehouse. It makes it possible to integrate data from multiple databases, giving a more focused view of the data for smarter, better business decisions. A data warehouse provides a set of tools and an architecture to systematically organize and understand data from multiple databases.
Let’s consider an example to better understand databases and data warehouses and how they are linked. If you are given a login to a server holding two databases, there is no easy way to tell which is a database and which is a data warehouse unless they are specifically labeled. However, there is a distinct difference, which a second example makes clear. Consider Bhat-Bhateni Supermarket, which has stores in multiple locations; for this example, assume the business has eight branches. Bhat-Bhateni has a database for each store to keep track of sales, profit, stock, and the like. On top of that, the manager of each store provides a report. However, those details do not give the overall picture of how well the entire business is doing. To obtain that picture, each of the eight sets of records must be tallied and computed separately, because each store’s database is different. If the business wants a detailed, company-wide look, it can make use of a data warehouse. All eight store databases then transfer their data to a central repository, which can be used to perform analysis and get a bird’s-eye view of the entire business operation.
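The consolidation described in the supermarket example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names (`sales`, `all_sales`), the columns, and the sales figures are hypothetical, and in-memory SQLite databases stand in for the eight store systems and the central warehouse.

```python
import sqlite3

def make_store_db(sales):
    """Create an in-memory database for one store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
    conn.commit()
    return conn

def load_warehouse(store_dbs):
    """Copy every store's sales rows into one central table, tagged by store."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE all_sales (store_id INTEGER, item TEXT, amount REAL)"
    )
    for store_id, conn in store_dbs.items():
        rows = conn.execute("SELECT item, amount FROM sales").fetchall()
        warehouse.executemany(
            "INSERT INTO all_sales VALUES (?, ?, ?)",
            [(store_id, item, amount) for item, amount in rows],
        )
    warehouse.commit()
    return warehouse

def total_sales(warehouse):
    """Company-wide total, answered by a single query on the warehouse."""
    return warehouse.execute("SELECT SUM(amount) FROM all_sales").fetchone()[0]

def sales_by_store(warehouse):
    """Per-store totals, for comparing branches side by side."""
    query = "SELECT store_id, SUM(amount) FROM all_sales GROUP BY store_id"
    return dict(warehouse.execute(query).fetchall())

# Eight separate store databases with made-up figures.
stores = {
    store_id: make_store_db([("rice", 100.0 * store_id), ("tea", 50.0)])
    for store_id in range(1, 9)
}
warehouse = load_warehouse(stores)
print(total_sales(warehouse))
print(sales_by_store(warehouse))
```

Without the central table, the same total would require querying each of the eight databases and adding the results by hand, which is the tallying the example describes.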
Cuzzocrea, A. (2011). Pushing artificial intelligence in database and data warehouse systems. Data & Knowledge Engineering, 70(8), 683-684.
Wang, X., Shen, J., & Sun, C. (2013). Data Warehouse Oriented Data Integration System Design and Implementation. Applied Mechanics and Materials, 321-324, 2532-2535.
Big data refers to collections of data that are so enormous in size, so varied in content, and so fast to accumulate that they are difficult to store and analyze using traditional approaches. The three "Vs" are the defining features of big data (Wallance, 2015):
Volume: Data collections can take up petabytes of storage and are continually growing.
Velocity: Many data sources change and grow at very fast speeds. The nightly ETL process often used for data warehouses is not adequate for many real-time demands.
Variety: Relational databases are very efficient for structured information stored in tables, but businesses can benefit from analyzing semi-structured and unstructured data as well.
As a philosophical discipline of study, ethics is a systematic approach to understanding, analyzing, and distinguishing matters of right and wrong, good and bad, and admirable and deplorable as they relate to the well-being of and the relationships among sentient beings (Rich, 2016).
The capabilities of big data analytics bring to the fore a larger array of ethical concerns, with potentially more wide-reaching implications for individuals, organizations, and society.
First, privacy has been explored already (Lyon, 2014), but other, potentially greater concerns are not well understood. For example, algorithmic decision-making, profiling of individuals and discrimination, control and surveillance of individuals, and lack of transparency in the big data value chain all raise concerns regarding the ethical use and the social consequences of big data analytics (Clarke, 2016).
Second is profiling. It can occur when individuals are classified into groups, intentionally or unintentionally, based on race, ethnic group, gender, or social and economic status, while special treatments or services are offered to or withheld from individuals or groups. The recent Facebook data theft issue is a suitable example of profiling.
Third is data aggregators, which combine data from multiple sources and create a new picture of individuals based on their data. Ethical issues might arise in each segment of the value chain, with the final owner of the data using it for purposes that can be very different from the initial intention.
Fourth is monitoring and surveillance of individuals’ behavior. Organizations that continuously observe and monitor individuals’ behavior can offer personalized services and products, which also implies that these individuals are no longer exposed to all the options and choices available in a marketplace.
The Big Data revolution raises a range of ethical issues related to privacy, confidentiality, transparency, and identity. Big Data is about much more than just correlating database tables and creating pattern recognition algorithms. Big Data, broadly defined, is producing increased institutional awareness and power that require the development of what we call Big Data Ethics. The Facebook acquisition of WhatsApp and the whole NSA affair show just how high the stakes can be. Having effective ethical standards and governance for using customer data can ensure that your organization gains the benefits of Big Data while managing the associated risks. Consumers are likely to expect such transparency more and more, and a healthy dose of self-regulation may prove to be the best way to avoid outside regulation.
Clarke, R. 2016. "Big data, big risks," Information Systems Journal (26:1), pp. 77-90 (doi: 10.1111/isj.12088).
Lyon, D. 2014. "Surveillance, Snowden, and Big Data: Capacities, consequences, critique," Big Data & Society (1:2), pp. 1-13 (doi: 10.1177/2053951714541861).
Rich, K. (2016). Introduction to ethics. MIS Quarterly.
Wallance, P. (2015). Introduction to Database Management System. UK: PEARSON.
Raw data cannot be presented in required formats or readily interpreted. A database, on the other hand, is a system for organizing raw data into meaningful information that can be easily retrieved. The database holds information from various sources, stored in a logical format. This collection helps to reduce inconsistency and to improve accuracy, performance, and security (Wallace, 2015, p. 102).
With the substantial and consistent increase in everyday data from various sources and databases, managing it becomes a challenge. How can you give special attention to collecting and transforming it into meaningful information that is then used to make strong business decisions? Data warehousing is the key to making this possible (Rogers, 2000).
A database is flexible, and its data can be changed. It can also help provide reasonable decision-making options. But when two or more types of systems need to be combined, speed may suffer. This can hamper decision making because the information is not reliable (Rogers, 2000). A data warehouse, on the other hand, not only accepts data from various sources but also cleanses it and places it in the correct form (Rogers, 2000).
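To make the idea of cleansing concrete, here is a minimal Python sketch of what putting data from different sources into one form might involve. The field names, the two date formats, and the sample records are assumptions made for illustration; they are not taken from Rogers (2000).

```python
from datetime import datetime

def clean_record(record):
    """Normalize one raw record into a single common form."""
    name = record["customer"].strip().title()
    # Amounts may arrive as numbers or as strings such as "1,250.50".
    amount = float(str(record["amount"]).replace(",", ""))
    # Accept ISO dates or day/month/year dates (assumed source formats).
    raw_date = record["date"]
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            date = datetime.strptime(raw_date, fmt).date().isoformat()
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unrecognized date: {raw_date!r}")
    return {"customer": name, "amount": amount, "date": date}

def cleanse(sources):
    """Clean every record from every source and drop exact duplicates."""
    cleaned = []
    seen = set()
    for rows in sources:
        for row in rows:
            rec = clean_record(row)
            key = (rec["customer"], rec["amount"], rec["date"])
            if key not in seen:
                seen.add(key)
                cleaned.append(rec)
    return cleaned

# Two hypothetical source systems that record the same sale differently.
billing = [{"customer": "  sita rai ", "amount": "1,250.50", "date": "2015-03-01"}]
point_of_sale = [
    {"customer": "Sita Rai", "amount": 1250.5, "date": "01/03/2015"},
    {"customer": "ram thapa", "amount": 300, "date": "02/03/2015"},
]
records = cleanse([billing, point_of_sale])
print(records)
```

Once both sources agree on spelling, number format, and date format, the duplicate sale becomes visible and can be removed before analysis.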
For example, MS Access is an example of a database (a spreadsheet such as Excel or a statistics package such as SPSS is sometimes used in a similar way for small data sets), while Redshift and BigQuery are examples of data warehouses. Databases are a smaller part of a data warehouse. In a business setting, an entrepreneur will have to look after staff salary, inventory, sales,
Rogers, C. (2000). Data warehousing. Allied Academies International Conference. Academy of Information and Management Sciences. Proceedings; Arden, 6-10.
Wallace, P. (2015). Information System in Action. In P. Wallace, Introduction to Information System (pp. 4-9). New Jersey: Pearson Education, Inc.
A database is a collection of information stored in a systematic manner. It is a repository of information used as backing data storage for a specific application or set of applications. A database is often used for online transaction processing, which involves accessing and updating data online in near real time with multiple concurrent accesses. According to Wallace (2015), "database is an integrated collection of information that is logically related and stored in a way to minimize duplication and facilitate rapid retrieval."
The use of a database helps an organization reduce redundancy and inconsistency, improve information integrity and accuracy, enhance its ability to adapt to changes, improve performance and scalability, and increase security. To create and manage databases, organizations use a database management system (DBMS). It simplifies data recording and retrieval, which helps operations run efficiently.
A data warehouse, on the other hand, is a system that pulls together data from various sources within an organization for the purpose of analysis and reporting. A data warehouse is generally used for online analytical processing, which assists the decision-making process. A data warehouse can be defined as a central source of data that have been cleaned, transformed, and cataloged so that they can be used by managers and other business professionals for data mining, online analytical processing, business analysis, and decision support. Thus, a database is a component of a data warehouse.
For example, in a hospital, general information about patients such as name, age, address, diagnosis, referred doctor, and date of appointment is recorded in the database on a daily basis. The database contains only the current data, ignoring historical data. The database can be accessed by multiple users simultaneously without affecting the system’s performance. The data warehouse is used to analyze the ages of people visiting the hospital, the hospital admission rate, the medicines in highest demand, and the number of patients with similar diagnoses, all of which supports decision making.
To conclude, a database focuses on the systematic recording of information, while a data warehouse provides historical information that supports decision making. However, integrating the database and the data warehouse into a single system that supports both transaction recording and analytics can eliminate data latency and avoid delays in business insights.
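The difference between the two kinds of use in the hospital example can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the `visits` table, its columns, and the patient rows are invented, and a single in-memory SQLite table is used for both query styles, whereas a real hospital would normally keep the operational database and the warehouse as separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (patient TEXT, age INTEGER, diagnosis TEXT, visit_date TEXT)"
)
# Invented sample rows for illustration only.
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?)",
    [
        ("Asha", 34, "flu", "2024-01-10"),
        ("Bikram", 61, "diabetes", "2024-01-10"),
        ("Chandra", 45, "flu", "2024-02-02"),
        ("Dipa", 8, "asthma", "2024-02-02"),
        ("Esha", 67, "diabetes", "2024-03-15"),
        ("Gopal", 29, "flu", "2024-03-15"),
    ],
)

def appointments_on(conn, day):
    """Database-style query: look up individual records for one day."""
    query = "SELECT patient, diagnosis FROM visits WHERE visit_date = ? ORDER BY patient"
    return conn.execute(query, (day,)).fetchall()

def patients_per_diagnosis(conn):
    """Warehouse-style query: count patients with the same diagnosis."""
    query = (
        "SELECT diagnosis, COUNT(*) FROM visits "
        "GROUP BY diagnosis ORDER BY COUNT(*) DESC, diagnosis"
    )
    return conn.execute(query).fetchall()

def visits_by_age_band(conn):
    """Warehouse-style query: group all visits into 20-year age bands."""
    query = (
        "SELECT (age / 20) * 20 AS band, COUNT(*) FROM visits "
        "GROUP BY band ORDER BY band"
    )
    return conn.execute(query).fetchall()

print(appointments_on(conn, "2024-03-15"))
print(patients_per_diagnosis(conn))
print(visits_by_age_band(conn))
```

The first function touches a handful of current rows, as front-desk staff would; the other two scan the whole history and return summaries, which is the kind of question a manager asks of a warehouse.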
Wallace, P. (2015). Introduction to Information Systems (2nd ed.). New Jersey: Pearson Education, Inc.
Today’s information management systems rely on the database and the software that manages it. According to Wallace (2015), in Introduction to Information Systems, a database is an integrated collection of information that is logically related and stored in such a way as to minimize duplication and facilitate rapid retrieval.
For instance, I worked for Jobs Dynamics, a recruitment agency. Recently, we announced a vacancy for the position of Management Trainee at Janata Bank Nepal Ltd., and there were almost 1,600 applicants for that position. Every applicant has to visit the website, log in to www.jobsdynamics.com, and apply for the position with all the required documents. So far, this system has helped to store the profiles of all the candidates (the database) in a systematic way so that they can be retrieved at any time.
Its major advantages over file processing systems include:
· Reduced redundancy and inconsistency: Previously, Jobs Dynamics had to hold each applicant profile in hard copy; today the profiles are held as soft copies in a systematic order.
· Improved information integrity and accuracy: The time needed to handle a hard copy for every individual has been reduced. The system has made applying easier for applicants and stores their data accurately.
· Improved ability to adapt to changes: Manual work would need a number of staff for the shortlisting process, but the system has made it easy for a single person to handle all the data.
· Increased security: The data stored in the system is backed up, so it remains secure over a long period. Most large organizations do not store all their records on one individual’s computer but rather on a centrally hosted system (Wang, Shen & Sun, 2013). Jobs Dynamics likewise uses a system with controlled access, power backup, security, and other precautions that restrict physical access to the facility.
Another integration strategy involves data warehouses. The data warehouse is a central data repository containing information drawn from multiple sources that can be used for analysis, intelligence gathering, and strategic planning. The process commonly used to build a data warehouse is called extract, transform, and load (ETL). For instance, Poonnawat, Komlayut, and Henchareonlert (2010) describe an OLAP cube data warehouse, combined with data mining techniques, built to support a university’s public relations, admissions, and planning divisions in recruiting students efficiently. Its requirements were gathered by surveying, through interviews, the opinions of management and operational personnel and, through documents, the attributes in application forms and annual reports.
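The three ETL stages can be sketched as three small Python functions. This is a toy illustration, not the method used in the cited study: the CSV columns, the applicant rows, and the `applications` table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
RAW_CSV = """applicant,position,applied_on
A. Sharma,Management Trainee,2018-03-04
B. Karki,management trainee ,2018-03-19
C. Lama,Accountant,2018-04-02
"""

def extract(text):
    """Extract: read raw rows from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize position names and derive a year-month value."""
    return [
        (row["applicant"].strip(), row["position"].strip().title(), row["applied_on"][:7])
        for row in rows
    ]

def load(rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications (applicant TEXT, position TEXT, month TEXT)")
    conn.executemany("INSERT INTO applications VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

def applications_per_position(conn):
    """Analysis on the loaded data: applications per position per month."""
    query = (
        "SELECT position, month, COUNT(*) FROM applications "
        "GROUP BY position, month ORDER BY position, month"
    )
    return conn.execute(query).fetchall()

warehouse = load(transform(extract(RAW_CSV)))
print(applications_per_position(warehouse))
```

Keeping the stages separate means a new source only needs its own extract step, while the transform and load steps can be reused.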
The data warehouse makes data easily accessible for strategic planning, and opens up a wealth of opportunities for managers seeking insights about their markets, customers, industry, and more. It often becomes the main source of business intelligence that managers tap to understand their customers and markets, and make strategic plans.
For instance, the database of all the requirements of the company is stored at the website (www.jobsdynamics.com), and every month Jobs Dynamics extracts an analysis of those requirements and of the positions most in demand in the market. This has helped the company forecast future requirements by position and salary details. This is how the database works as a data warehouse for strategic planning and opens up a wealth of opportunities for managers seeking insights about their markets, customers, industry, and more.
Poonnawat, W., Komlayut, S., & Henchareonlert, N. (2010). The enhancement of ODL student recruiting campaign with data warehouse development and data mining techniques. Asian Association of Open Universities Journal, 5(1), 41-47.
Wallace, P. (2015). Introduction to information systems. Pearson Higher Ed.
Wang, X. G., Shen, J., & Sun, C. (2013). Data Warehouse Oriented Data Integration System Design and Implementation. In Applied Mechanics and Materials (Vol. 321, pp. 2532-2538). Trans Tech Publications.