What is the best methodology to use when creating a data warehouse? Once you decide to build a data warehouse, the next step is deciding between a normalized and a dimensional approach to storing its data. A key advantage of the dimensional approach is that the data warehouse is easier for users to understand and use, and data retrieval tends to be very fast. On the other hand, if you are used to working with a normalized approach, it can take a while to fully understand the dimensional approach and become efficient in building one. The normalized structure divides data into entities, which creates several tables in a relational database.
|Published (Last):||25 June 2017|
We are living in the age of a data revolution, and more corporations are realizing that to lead—or in some cases, to survive—they need to harness their data wealth effectively. The data warehouse, due to its unique proposition as the integrated enterprise repository of data, is playing an even more important role in this situation.
There are two prominent architecture styles practiced today to build a data warehouse: the Inmon architecture and the Kimball architecture. This paper compares the pros and cons of each style and recommends which one to pursue based on certain factors. The two methods represent distinct schools of thought on how to architect the data warehouse, but they share common ground: both view the data warehouse as the central data repository for the enterprise, both primarily serve enterprise reporting needs, and both use ETL to load the data warehouse.
The key distinction is how the data structures are modeled, loaded, and stored in the data warehouse. This difference in the architecture impacts the initial delivery time of the data warehouse and the ability to accommodate future changes in the ETL design. When a data architect is asked to design and implement a data warehouse from the ground up, what architecture style should he or she choose to build the data warehouse?
The Inmon approach to building a data warehouse begins with the corporate data model. This model identifies the key subject areas, and most importantly, the key entities the business operates with and cares about, like customer, product, vendor, etc.
From this model, a detailed logical model is created for each major entity. For example, a logical model will be built for Customer with all the details related to that entity. There could be ten different entities under Customer. All the details including business keys, attributes, dependencies, participation, and relationships will be captured in the detailed logical model.
The key point here is that the entity structure is built in normalized form. Data redundancy is avoided as much as possible. This leads to clear identification of business concepts and avoids data update anomalies. The next step is building the physical model. The physical implementation of the data warehouse is also normalized. This normalized model makes loading the data less complex, but using this structure for querying is hard as it involves many tables and joins.
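The trade-off described above can be illustrated with a minimal sketch. The table and column names below (`customer`, `customer_address`, `sales_order`, `order_line`) are hypothetical and chosen only for illustration; they do not come from Inmon's work. The sketch uses Python's built-in `sqlite3` module and shows that even a simple reporting question requires several joins against a normalized structure.

```python
import sqlite3


def build_normalized_demo():
    """Create a tiny normalized schema in memory (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- City lives in its own entity, so it is stored only once per address.
        CREATE TABLE customer_address (
            address_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            city TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE sales_order (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
        );
        CREATE TABLE order_line (
            order_id INTEGER NOT NULL REFERENCES sales_order(order_id),
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            amount REAL NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO customer_address VALUES (10, 1, 'Boston'), (11, 2, 'Denver');
        INSERT INTO product VALUES (100, 'Widget'), (101, 'Gadget');
        INSERT INTO sales_order VALUES (1000, 1), (1001, 2);
        INSERT INTO order_line VALUES
            (1000, 100, 50.0), (1000, 101, 25.0), (1001, 100, 40.0);
    """)
    return conn


def sales_by_city(conn):
    """Answer 'sales per city': three joins are needed in this schema."""
    return conn.execute("""
        SELECT a.city, SUM(l.amount)
        FROM order_line AS l
        JOIN sales_order AS o ON o.order_id = l.order_id
        JOIN customer AS c ON c.customer_id = o.customer_id
        JOIN customer_address AS a ON a.customer_id = c.customer_id
        GROUP BY a.city
        ORDER BY a.city
    """).fetchall()


if __name__ == "__main__":
    print(sales_by_city(build_normalized_demo()))
```

Each fact is stored exactly once, which keeps loading simple, but the query has to walk from order lines through orders and customers to addresses before it can group by city.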
So, Inmon suggests building data marts specific to departments. The data marts will be designed specifically for Finance, Sales, etc. Any data that comes into the data warehouse is integrated, and the data warehouse is the only source of data for the different data marts. This ensures that the integrity and consistency of data are kept intact across the organization.
Figure 1.
The Kimball approach to building the data warehouse starts with identifying the key business processes and the key business questions that the data warehouse needs to answer.
The key sources (operational systems) of data for the data warehouse are analyzed and documented. ETL software is used to bring data from all the different sources and load it into a staging area. From here, data is loaded into a dimensional model. Here comes the key difference: the model proposed by Kimball for data warehousing, the dimensional model, is not normalized. The fundamental concept of dimensional modeling is the star schema.
In the star schema, there is typically a fact table surrounded by many dimensions. The fact table has all the measures that are relevant to the subject area, and it also has the foreign keys from the different dimensions that surround the fact. The dimensions are denormalized completely so that the user can drill up and drill down without joining to another table. Multiple star schemas will be built to satisfy different reporting requirements. So, how is integration achieved in the dimensional model?
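A star schema of this kind can be sketched in a few lines. The names below (`fact_sales`, `dim_customer`, `dim_date`, and their columns) are hypothetical examples, not taken from Kimball's books. The sketch uses Python's built-in `sqlite3` module and shows a single denormalized customer dimension that carries both `city` and `region`, so the same one-join query can drill up or down.

```python
import sqlite3


def build_star_demo():
    """Create one fact table surrounded by two dimensions (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Denormalized dimension: city and region sit in the same row.
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            name TEXT, city TEXT, region TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            day TEXT, month TEXT, year INTEGER
        );
        -- The fact table holds the measure plus foreign keys to each dimension.
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            sales_amount REAL
        );
        INSERT INTO dim_customer VALUES
            (1, 'Acme', 'Boston', 'East'),
            (2, 'Globex', 'Denver', 'West'),
            (3, 'Initech', 'Newark', 'East');
        INSERT INTO dim_date VALUES
            (20240101, '2024-01-01', '2024-01', 2024),
            (20240201, '2024-02-01', '2024-02', 2024);
        INSERT INTO fact_sales VALUES
            (1, 20240101, 50.0), (2, 20240101, 40.0),
            (3, 20240201, 10.0), (1, 20240201, 25.0);
    """)
    return conn


def sales_by(conn, level):
    """Total sales at one level of the customer hierarchy ('region' or 'city')."""
    if level not in ("region", "city"):  # whitelist: level is placed in the SQL text
        raise ValueError(f"unsupported level: {level}")
    query = (
        f"SELECT d.{level}, SUM(f.sales_amount) "
        "FROM fact_sales AS f "
        "JOIN dim_customer AS d ON d.customer_key = f.customer_key "
        f"GROUP BY d.{level} ORDER BY d.{level}"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_star_demo()
    print(sales_by(conn, "region"))  # drilled up
    print(sales_by(conn, "city"))    # drilled down, same single join
```

Moving between `region` and `city` changes only the grouping column, not the number of tables involved, which is the practical benefit of denormalizing the dimension.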
The key dimensions, like customer and product, that are shared across the different facts will be built once and be used by all the facts (Kimball et al.). This ensures that one thing or concept is used the same way across the facts. These shared dimensions are known as conformed dimensions, and they are documented in a bus matrix. This is the document where the different facts are listed vertically and the conformed dimensions are listed horizontally. Wherever a dimension plays a foreign key role in a fact, it is marked in the document. This serves as an anchoring document showing how the star schemas are built and what is left to build in the data warehouse.
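As a rough illustration of such a matrix, the sketch below represents it as a plain Python dictionary that maps each fact to the dimensions it references. The facts and dimensions listed (`sales`, `inventory`, `returns`, and so on) are invented for the example, and treating "used by more than one fact" as the test for a shared dimension is a simplifying assumption.

```python
# Hypothetical matrix: each fact (row) maps to the dimensions (columns) it uses.
BUS_MATRIX = {
    "sales": {"customer", "product", "date"},
    "inventory": {"product", "date", "warehouse"},
    "returns": {"customer", "product", "date"},
}


def shared_dimensions(matrix):
    """Return dimensions referenced by more than one fact, sorted by name."""
    counts = {}
    for dims in matrix.values():
        for dim in dims:
            counts[dim] = counts.get(dim, 0) + 1
    return sorted(dim for dim, n in counts.items() if n > 1)


def render_matrix(matrix):
    """Render facts vertically and dimensions horizontally as CSV text."""
    dims = sorted(set().union(*matrix.values()))
    lines = ["fact," + ",".join(dims)]
    for fact in sorted(matrix):
        # Mark the cell with X where the dimension is a foreign key in the fact.
        marks = ["X" if dim in matrix[fact] else "" for dim in dims]
        lines.append(fact + "," + ",".join(marks))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_matrix(BUS_MATRIX))
    print(shared_dimensions(BUS_MATRIX))
```

Keeping the matrix as data makes it easy to check which dimensions must be built once and reused before a new star schema is added.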
Now that we have seen the pros and cons of the Kimball and Inmon approaches, a question arises. Which approach should be used when? This question is faced by data warehouse architects every time they start building a data warehouse.
Several deciding factors can help an architect choose between the two. It has been proven that both the Inmon and Kimball approaches work for successfully delivering data warehouses. The two can also be combined: in a hybrid model, the data warehouse is built using the Inmon model, and on top of the integrated data warehouse, the business-process-oriented data marts are built using the star schema for reporting.
We cannot generalize and say that one approach is better than the other; they both have their advantages and disadvantages, and they both work fine in different scenarios. The architect has to select an approach for the data warehouse depending on the different factors; a few key ones were identified in this paper.
Breslin, Mary. Accessed May 22.
Inmon, W. Building the Data Warehouse, Fourth Edition.
Marakas, George M. Prentice Hall.
Data Warehouse Design – Inmon versus Kimball
When it comes to designing a data warehouse for your business, the two most commonly discussed methods are the approaches introduced by Bill Inmon and Ralph Kimball. Debates on which one is better and more effective have lasted for years. But a clear-cut answer has never been arrived at, as both philosophies have their own advantages and differentiating factors, and enterprises continue to use either of them. Inmon defines a data warehouse as a centralised repository for the entire enterprise. Dimensional data marts are created only after the complete data warehouse has been created. Thus, the data warehouse is at the centre of the corporate information factory (CIF), which provides a logical framework for delivering business intelligence. In the Kimball approach, by contrast, data marts are created first, keeping in mind the most important business aspects or departments.
Summary: in this article, we discuss the differences between the Kimball and Inmon approaches to data warehouse architecture. Both architectures have an enterprise focus that supports information analysis across the organization. This focus makes it possible to address business requirements not only within a subject area but also across subject areas. Bill Inmon recommends building the data warehouse following a top-down approach. Kimball, by contrast, recommends building data marts first and then integrating them for data consistency through a so-called information bus. Kimball makes use of the dimensional model to address the needs of departments in various areas within the enterprise.
Kimball vs. Inmon Data Warehouse Architectures
When it comes to data warehouse design, two of the most widely discussed approaches are the Inmon method and the Kimball method. For years, people have debated which one is better and more effective for businesses. Initiated by Ralph Kimball, the Kimball data warehouse concept follows a bottom-up approach to data warehouse architecture design in which data marts are formed first based on the business requirements. The primary data sources are then evaluated, and an Extract, Transform and Load (ETL) tool is used to fetch data in different formats from several sources and load it into a staging area.
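The staging step can be sketched with the Python standard library alone. This is a toy illustration rather than a real ETL tool: the source names (`crm_csv`, `web_json`), the field names (`id`, `name`, `amount`), and the staging table `stg_sales` are all assumptions made for the example.

```python
import csv
import io
import json
import sqlite3


def extract_csv(text):
    """Extract: parse CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: parse a JSON array into a list of dict records."""
    return json.loads(text)


def transform(record, source):
    """Transform: bring a record from any source into one common row shape."""
    return (
        source,
        str(record["id"]),
        record["name"].strip().title(),  # normalize whitespace and casing
        float(record["amount"]),
    )


def load_staging(conn, rows):
    """Load: append the transformed rows to a staging table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stg_sales "
        "(source TEXT, source_id TEXT, customer_name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO stg_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, csv_text, json_text):
    """Run all three steps for two hypothetical sources; return rows loaded."""
    rows = [transform(r, "crm_csv") for r in extract_csv(csv_text)]
    rows += [transform(r, "web_json") for r in extract_json(json_text)]
    load_staging(conn, rows)
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(
        conn,
        "id,name,amount\n1, acme ,50.5\n",
        '[{"id": 7, "name": "GLOBEX", "amount": 40}]',
    )
    print(conn.execute("SELECT * FROM stg_sales ORDER BY source").fetchall())
```

Once both formats land in the same staging shape, the later load into dimensions and facts no longer needs to know where a row came from.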
Inmon or Kimball: Which approach is suitable for your data warehouse?