Privacy in Big Data and Decision Making
A Logical Data Warehouse is an architectural layer that sits above the data stored in a data warehouse (DW). It makes it possible to view data held in the DW and throughout the organization without moving or changing that data. It does not replace the traditional warehouse, whose functions include persistence, data gathering, and transformation; instead, it works alongside it, fetching and converting data in real time. This article discusses the concerns and steps involved in gaining control of security and privacy in a data warehouse, and how government is keeping up with advances in big data for decision making.
As stated by Shueh (2014), data warehouses have their own set of security challenges, which become harder when the data is stored in vast warehouses that serve many clients with diverse security needs. For a data warehouse to combine flexibility with strong security, it needs to run in a strictly monitored environment that keeps the data safe from hackers and data theft while remaining accessible to users. The system must also log user activity to guard against intrusion, because the data is aggregated from different sources, which makes the warehouse an attractive target (Kim, Trimi and Chung 2014).
Here are some common scenarios of security in Data Warehouses:
A firm stores data from multiple sources in its data warehouse. It has to ensure that employees can view only the data relevant to their own division, while staff at the corporate offices have full privileges to access the data of all divisions and sections.
An organization handling sensitive personal data of its clients has to comply with its own privacy policy and with the privacy laws and government regulations that apply to that data.
A company sells data from its data warehouse to clients. Each client should be able to view only the data it has paid for or subscribed to, and should have no access to other clients' data.
Basic database security is tied to system and object privileges, which prevent users from executing unauthorized SQL commands that would give access to the full data set. The first step is to define roles or groups with different levels of data privilege. This is done at the administration level, which has full control over the system.
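The role profiling described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular database implements privileges; the role names, the privilege sets, and the `is_allowed` helper are all hypothetical.

```python
# Hypothetical roles mapped to the SQL commands they may execute.
# Real databases manage this through their own grant mechanisms.
ROLE_PRIVILEGES = {
    "division_analyst": {"SELECT"},
    "corporate_admin": {"SELECT", "INSERT", "UPDATE", "DELETE"},
}


def is_allowed(role: str, command: str) -> bool:
    """Return True if the role has been granted the leading SQL verb."""
    stripped = command.strip()
    if not stripped:
        return False
    # The first word of the statement identifies the command type.
    verb = stripped.split(None, 1)[0].upper()
    # Unknown roles get an empty privilege set, so they are denied.
    return verb in ROLE_PRIVILEGES.get(role, set())
```

With these assumed roles, `is_allowed("division_analyst", "DELETE FROM sales")` is rejected, while the same statement from `corporate_admin` is accepted.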
Fine-grained control over data access can be enforced on SQL statements, with security policies tied to individual rows, by using a Virtual Private Database (VPD). VPD lets multiple users access data securely while keeping their data segregated. For example, it ensures that users can see their own salary and account details but not those of others, with security enforced down to the column level, even if the enterprise holds details of many firms. The access logic is applied at the database level, where the security is implemented, rather than inside each application. No application code needs to change to enforce these policies, whether the application is standard or custom built. Because security is built into the warehouse, it cannot be bypassed by ad hoc tools or other applications, which makes VPD a critical component.
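The row-level idea behind VPD can be illustrated with a small Python sketch that uses the standard `sqlite3` module. This is not Oracle's VPD API: the `accounts` table, its sample rows, and the `own_rows_policy` and `secured_select` helpers are assumptions made for the example. The point it shows is that a policy function supplies a predicate, and every query is issued with that predicate attached, so the caller never sees rows that belong to someone else.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT, firm TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [
            ("alice", "acme", 5000),
            ("bob", "acme", 4200),
            ("carol", "globex", 6100),
        ],
    )
    return conn


def own_rows_policy(session_user: str) -> tuple:
    """Policy function: a predicate limiting rows to the session user."""
    return "owner = ?", (session_user,)


def secured_select(conn: sqlite3.Connection, session_user: str) -> list:
    """Run the query with the policy predicate always appended."""
    predicate, params = own_rows_policy(session_user)
    sql = f"SELECT owner, firm, salary FROM accounts WHERE {predicate}"
    return conn.execute(sql, params).fetchall()
```

In this sketch the filtering lives in one place, so any tool that goes through `secured_select` gets the same restriction without changes to its own code. A real VPD enforces the equivalent inside the database itself.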
VPD can be extended with Oracle Label Security, an Oracle Database security option that provides label-based access control. It is VPD enabled and increases safety by providing row-level access control and secure deployment of data in warehouses. It allows information to be pulled from multiple sources into one large system while keeping management and control easy and centralized. No SQL or PL/SQL programming is needed, because it is supplied ready to use. Because security is built in at the row level, it provides fine-grained access control and minimizes the risk that comes with aggregating data (Wright 2014).
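Label-based row access, as described above, can be sketched as a comparison between a row's label and the user's clearance. The sketch below is a simplified assumption, not Oracle Label Security itself: it models only ordered sensitivity levels, and the level names, the `label` field, and the `visible_rows` helper are hypothetical.

```python
# Hypothetical sensitivity levels; a higher number means more sensitive.
LEVELS = {"PUBLIC": 0, "CONFIDENTIAL": 1, "SECRET": 2}


def visible_rows(rows: list, user_label: str) -> list:
    """Return the rows whose label does not exceed the user's clearance."""
    clearance = LEVELS[user_label]
    return [row for row in rows if LEVELS[row["label"]] <= clearance]


# Rows pulled from several sources, each tagged with a label.
SAMPLE_ROWS = [
    {"id": 1, "source": "sales", "label": "PUBLIC"},
    {"id": 2, "source": "payroll", "label": "CONFIDENTIAL"},
    {"id": 3, "source": "legal", "label": "SECRET"},
]
```

A user cleared for `CONFIDENTIAL` would see rows 1 and 2 here, while a `PUBLIC` user would see only row 1.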
Transparent Data Encryption secures data in database columns as it is stored in operating system files. It handles encryption keys through secure storage and management in an external security module. Encryption routines do not need to be embedded in existing applications, which reduces the cost and complexity of encryption, since simple commands are enough to encrypt data. Integrating encryption into the Oracle Database simplifies the problem, because most other encryption methods call encryption functions from inside the application code, and writing that software well requires considerable expertise. With SQL*Loader, it works only with direct path loads.
Even if the RDBMS has multiple levels of security, data is not protected while it moves over a network unless the traffic itself is encrypted. The network encryption algorithms supported by Oracle are industry standards.
The Data Encryption Standard (DES) encryption toolkit enables encryption and decryption of data with the DES algorithm; the MD5 cryptographic hash protects data integrity with a one-way hash algorithm; and the Triple DES (3DES) toolkit supports two-key and three-key modes (Biham and Shamir 2012).
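The one-way hashing mentioned above can be shown with Python's standard `hashlib` module. This is a minimal sketch of computing and checking an MD5 digest, not a description of Oracle's toolkit; the `md5_hex` and `matches` helpers are names chosen for the example. MD5 is used only because the text names it, and it is no longer considered collision resistant.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def matches(data: bytes, expected_hex: str) -> bool:
    """Check that the data still hashes to a previously stored digest."""
    return md5_hex(data) == expected_hex
```

The digest cannot be reversed to recover the input, but any change to the data produces a different digest, so a stored digest can be used to detect tampering in transit.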
Big Data and Decision Making:
As stated by Bondarev and Zakirov (2015), big data has already penetrated the business sector, and the public sector is now embracing the idea of making real-time decisions from a vast pool of data, including emails, the internet (journals, yellow pages, and whitepapers), audio and video media, and sensors used in various industries. Multiple sources indicate that government can use it to make decisions on challenges of national interest, such as disaster management, job opportunities, terrorism, and improvements to the health care system, in order to serve citizens better. The government can therefore adopt recent technologies, including Hadoop and NoSQL, to organize data into useful statistics. A study by IT professionals shows that government agencies could save up to 14 percent, equivalent to $500 billion, through such statistical programs (McAfee et al. 2012).
These programs would have taken off easily in government agencies were it not for resistance among their employees. Public sector employees who oppose them argue that big data is a passing fad, or say that they cannot keep up with newer technological trends.
Globally, cities such as Dublin, Singapore, New York, and London have embraced big data as an analytical tool to foster the overall development of city infrastructure or to help choose the right leader. This has created new job opportunities and job profiles such as chief data officer and chief innovation officer. Outdated legislation and jurisdictional rules, however, are holding governments back from implementing experimental ideas (Kim, Trimi and Chung 2014).
In a notable innovative step, Chicago adopted the SmartData project, which collects data, analyzes it, identifies trends, and offers solutions. Because it is open source, other cities are encouraged to adopt and promote the idea. IBM, for example, has taken this a step further by providing analytic solutions for government in the areas of cyber security, defense, finance, national intelligence, social welfare programs, and public safety (Choucair, Bhatt and Mansour 2015).
Some of the strategies that need to be adopted are:
iii. Defense and Intelligence – All Source ISR, PMQ for the Military, i2 Enterprise Insight Analysis for Security and Defense
iv. Social Programs – Watson Analytics, Social Program Compliance, Social Program Performance (Ibm.com 2016)
Obstacles in adopting Big Data:
The obstacles that companies face in decision making are:
i. Inadequate support from top executives for using data analysis in company planning
ii. Inconsistent information reporting across business divisions and functional operations
iii. Lack of correct, relevant, and timely data across the business
iv. Insufficient tools for collecting, integrating, and analyzing information
v. Insufficient computational and data expertise among employees and top leaders
Hence, companies can run a survey and gather feedback so that they can incorporate strategies and implement ideas without hindering progress (Aber 2015).
Data warehouse security is now more than a necessity, given growing concerns about user data security, accessibility, and privacy at large enterprises. Virtual Private Database is a leading choice for its convenience: it segregates user data, applies the same access logic to all applications, requires no code changes, and provides strong security that cannot be bypassed by other applications or tools. Oracle Label Security focuses on security at the level of database rows, and Transparent Data Encryption encrypts stored data using managed encryption keys. Additional security measures include encryption algorithms such as DES and 3DES. Big data has long been established in private firms, which were quick to adopt it, and governments are now taking up the idea despite some resistance among employees. Governments are gradually adopting it to overhaul old bureaucratic methods, and many international cities have already used it to improve city infrastructure and address other pressing issues. Chicago has developed SmartData, the first open source project of this kind, to gather data and provide solutions, and IBM has made great strides with big data in law enforcement, social programs, defense and intelligence, and emergency management, providing these solutions to governments for easier adoption.