Data Warehousing in Information Technology

A data warehouse is a centralized repository that integrates different types of data from various sources, such as transactional systems, external applications, and legacy systems. It provides an environment separated from operational functions to support an organization’s analysis, reporting, data mining, and other Business Intelligence (BI) functions. This isolation allows queries to run without affecting the transactional and operational systems that handle primary transactions.

A data warehouse streamlines the continuous extraction of data from transactional systems and its conversion into ready-to-use information. Moreover, a data warehouse is used to process very large amounts of complex data and to query that data efficiently.
Large organizations prefer data warehousing because of the numerous improvements and gains that follow a successful implementation.

Enhanced Business Intelligence
Through improved data access, decision makers are able to query actual data to retrieve information based on their needs. Because data is drawn from various sources, managers and executives no longer have to base their decisions on limited data. Moreover, data warehouses can be applied directly to business processes such as inventory management, market segmentation, sales, and financial management.

Increased System and Query Performance
Data warehouses are designed to optimize the speed of data analysis and retrieval. They are also designed to store huge volumes of data and to query that data at high speed. Distributing system load efficiently across an organization’s technology infrastructure reduces the burden on the operational environment.

Timely Access of Data
Data warehouses have scheduled data integration routines known as ETL (Extraction, Transformation, and Loading), which consolidate data from multiple sources and transform it into actionable information, so that business users can access the data easily from one interface. As a result, the consistent use of query tools and a consolidated data repository enables business users to spend more time analyzing data and less time gathering it.
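
As a rough illustration of the ETL routine described above, the following Python sketch extracts records from two sources, transforms them into one common format, and loads them into a single table. The source names (CRM_ROWS, LEGACY_ROWS), their field layouts, and the sales table are assumptions made up for this example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical extracts from two source systems with different conventions.
CRM_ROWS = [
    {"customer": "Acme", "amount_usd": "120.50", "date": "2023-01-05"},
    {"customer": "Globex", "amount_usd": "75.00", "date": "2023-01-06"},
]
LEGACY_ROWS = [
    ("ACME ", 8000, "07/01/2023"),  # amount in cents, date as DD/MM/YYYY
]


def transform_crm(row):
    """Normalize a CRM record to (customer, amount in cents, ISO date)."""
    cents = round(float(row["amount_usd"]) * 100)
    return (row["customer"].strip().lower(), cents, row["date"])


def transform_legacy(row):
    """Normalize a legacy record to the same common format."""
    name, cents, raw_date = row
    iso_date = datetime.strptime(raw_date, "%d/%m/%Y").date().isoformat()
    return (name.strip().lower(), cents, iso_date)


def run_etl(conn):
    """Transform both extracts and load them into one consolidated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount_cents INTEGER, sale_date TEXT)"
    )
    rows = [transform_crm(r) for r in CRM_ROWS]
    rows += [transform_legacy(r) for r in LEGACY_ROWS]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
loaded = run_etl(conn)

# Business users now query one interface instead of two source systems.
totals = dict(
    conn.execute("SELECT customer, SUM(amount_cents) FROM sales GROUP BY customer")
)
```

In a real deployment the extract step would read from live systems and the routine would run on a schedule; both are left out here to keep the sketch short.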

Enhanced Data Quality and Consistency
Because data from various sources is converted efficiently into a common, actionable format, business units and other departments can produce consistent results within the organization. Consistent data from each department boosts confidence in the accuracy of that data. Consequently, overall confidence in the organization’s data also increases.

Historical Intelligence
A data warehouse can store large volumes of historical data, so the organization can analyze data across different time periods and trends to make predictions about the future. Improved reporting and analysis across multiple time periods is one of the main benefits of data warehousing.
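
To make the idea of analysis across time periods concrete, here is a minimal Python sketch that totals historical records by year and computes year-over-year growth. The HISTORY data and both function names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical historical records: (ISO date, sales amount).
HISTORY = [
    ("2022-01-15", 100),
    ("2022-06-01", 150),
    ("2023-02-10", 200),
    ("2023-11-30", 175),
]


def totals_by_year(rows):
    """Sum amounts per calendar year, using the year prefix of the ISO date."""
    totals = defaultdict(int)
    for iso_date, amount in rows:
        totals[iso_date[:4]] += amount
    return dict(totals)


def year_over_year(totals):
    """Return the fractional growth of each year relative to the previous one."""
    years = sorted(totals)
    return {
        year: (totals[year] - totals[prev]) / totals[prev]
        for prev, year in zip(years, years[1:])
    }


yearly = totals_by_year(HISTORY)
growth = year_over_year(yearly)
```

The same pattern applies to quarters or months by changing how the period key is derived from the date.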

High Return on Investment
Implementing a data warehouse and other business intelligence systems can generate more revenue and greater cost savings. Studies suggest that organizations that have implemented data warehouses see higher revenue and lower expenses than organizations that have not.

Problem Areas When Implementing Data Warehousing:

Data Quality
When large volumes of data from various sources are combined with inconsistent data from other sources, errors arise. The warehouse may encounter data quality challenges such as duplicates, inconsistent data, missing data, and logic conflicts. Poor data quality affects analytics and reporting.
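
The checks below sketch how the quality problems listed above (duplicates, missing data, and conflicting values) might be detected before data is loaded. The record layout, the required fields, and the quality_report function are assumptions for illustration, and treating "same id, different country" as a logic conflict is just one possible rule.

```python
from collections import Counter


def quality_report(records, required=("id", "email", "country")):
    """Report duplicate ids, rows with missing fields, and conflicting ids."""
    # Duplicates: the same id appears on more than one record.
    id_counts = Counter(r.get("id") for r in records)
    duplicates = sorted(
        key for key, count in id_counts.items() if key is not None and count > 1
    )

    # Missing data: any required field is absent or empty.
    missing = [
        index
        for index, r in enumerate(records)
        if any(r.get(field) in (None, "") for field in required)
    ]

    # Logic conflict (assumed rule): one id mapped to several countries.
    countries = {}
    for r in records:
        if r.get("id") is not None and r.get("country"):
            countries.setdefault(r["id"], set()).add(r["country"])
    conflicts = sorted(key for key, seen in countries.items() if len(seen) > 1)

    return {
        "duplicate_ids": duplicates,
        "rows_missing_fields": missing,
        "conflicting_ids": conflicts,
    }


# Hypothetical records merged from two sources.
SAMPLE = [
    {"id": 1, "email": "a@example.com", "country": "US"},
    {"id": 1, "email": "a@example.com", "country": "CA"},
    {"id": 2, "email": "", "country": "DE"},
    {"id": 3, "email": "c@example.com", "country": "FR"},
]
report = quality_report(SAMPLE)
```

Running such a report during the transformation step lets bad rows be fixed or rejected before they reach analytics and reporting.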

Cost
Although a data warehouse is implemented to save expenses, it brings hidden cost problems of its own. According to the survey, there are very few highly skilled staff available to lead the non-BI technicians, and with so few experienced staff it is not easy to deliver effective results. In practice, these efforts are very costly. Lastly, high-maintenance systems carry high maintenance costs.

Integration
Integrating data collected from various sources is one of the most difficult tasks. Different types of tools are used for each operation of the data warehouse. To arrive at a workable solution, the organization must spend a considerable amount of time analyzing how the various sources of data can be integrated.

Suitable Approaches for Data Warehousing:

Inmon’s top-down approach
According to Inmon, a data warehouse is a centralized repository for the entire organization. Dimensional data marts are created only after the complete data warehouse has been built. Atomic data is stored at the lowest level of detail in the data warehouse. Inmon defines a data warehouse as a subject-oriented, time-variant, non-volatile, and integrated collection of data.

Kimball’s bottom-up approach
According to Kimball, the data marts are created first. These data marts provide views of organizational data and, when needed, can be combined into a larger data warehouse. Kimball’s approach focuses on ease of end-user access and high query performance. In Kimball’s definition, a data warehouse is nothing more than the combination of all its data marts.
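
As a small illustration of what a dimensional data mart can look like, the Python sketch below builds one fact table and one dimension table in an in-memory SQLite database and answers a typical end-user question. The table names (fact_sales, dim_product), columns, and data are invented for this example and are not taken from Kimball or Inmon.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a sales data mart
conn.executescript(
    """
    -- Dimension table: descriptive attributes used to slice the facts.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name TEXT,
        category TEXT
    );

    -- Fact table: numeric measures keyed to the dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        sale_date TEXT,
        quantity INTEGER,
        revenue_cents INTEGER
    );

    INSERT INTO dim_product VALUES
        (1, 'Widget', 'Hardware'),
        (2, 'Gadget', 'Hardware'),
        (3, 'Manual', 'Books');

    INSERT INTO fact_sales VALUES
        (1, '2023-01-05', 2, 4000),
        (2, '2023-01-06', 1, 3500),
        (3, '2023-01-06', 5, 2500);
    """
)

# A typical query: total revenue per product category.
revenue_by_category = dict(
    conn.execute(
        """
        SELECT d.category, SUM(f.revenue_cents)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.category
        """
    )
)
```

If several marts share the same dimension tables, their fact tables can be queried together, which is one way marts can be combined into a larger warehouse.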

Finally, while designing a data warehouse, an organization must focus on both long-term and short-term business objectives, analyze the quality and quantity of its data sources, and evaluate the level of resources available.