Session
Trends in Data Warehouse Data Modeling: Data Vault and Anchor Modeling
October 25th, 10:15 AM
Newgate Suite
Abstract
Since the 1990s Data Warehouse design has been dominated by two quite different approaches:
- The Corporate Information Factory or Top-Down approach advocated by Bill Inmon.
- The Data Warehouse Bus Architecture or Bottom-Up approach advocated by Ralph Kimball.
In the past 10 years, a number of hybrid and alternative approaches to data warehouse modeling have gained popularity. Of these, the most important are:
- Hub-and-spoke architecture
- The Data Vault method proposed by Dan Linstedt
- Anchor Modeling proposed by Lars Rönnbäck (with others)
The hub-and-spoke architecture is best seen as a hybrid approach, integrating an Inmon data warehouse with a collection of Kimball data marts.
Data Vault and Anchor Modeling depart from both Inmon and Kimball.
These two approaches have a few characteristics in common:
- extremely normalized "anchor-oriented" data model
- (more or less) extensible schemas allow a more agile approach to data warehouse design and maintenance
- more granular approach to recording data history, potentially resulting in smaller data volumes and increased auditability
Alongside these benefits, the methods also bring a number of unique challenges, such as an explosion in the number of tables and, with it, greater query complexity.
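To make the shared characteristics and the table explosion concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is a deliberate simplification and follows neither method's full conventions: the table names (customer_anchor, customer_name, customer_city) and the valid_from column are hypothetical, chosen only to illustrate one identity table plus one history table per attribute.

```python
import sqlite3


def build_demo():
    """Create a tiny anchor-style schema in memory and load sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- One narrow table holds only the identity of the entity.
        CREATE TABLE customer_anchor (
            customer_id  INTEGER PRIMARY KEY,
            business_key TEXT NOT NULL UNIQUE
        );
        -- Each attribute gets its own table with its own history.
        CREATE TABLE customer_name (
            customer_id INTEGER NOT NULL REFERENCES customer_anchor(customer_id),
            valid_from  TEXT NOT NULL,
            name        TEXT NOT NULL,
            PRIMARY KEY (customer_id, valid_from)
        );
        CREATE TABLE customer_city (
            customer_id INTEGER NOT NULL REFERENCES customer_anchor(customer_id),
            valid_from  TEXT NOT NULL,
            city        TEXT NOT NULL,
            PRIMARY KEY (customer_id, valid_from)
        );
    """)
    conn.execute("INSERT INTO customer_anchor VALUES (1, 'CUST-001')")
    conn.execute("INSERT INTO customer_name VALUES (1, '2012-01-01', 'Alice')")
    conn.execute("INSERT INTO customer_city VALUES (1, '2012-01-01', 'London')")
    # The customer moves: only the city table receives a new row.
    conn.execute("INSERT INTO customer_city VALUES (1, '2012-06-01', 'Leeds')")
    return conn


def row_counts(conn):
    """Return the number of rows in each demo table."""
    tables = ("customer_anchor", "customer_name", "customer_city")
    return {
        t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
        for t in tables
    }


def current_city(conn, customer_id):
    """Return the most recent city recorded for a customer, or None."""
    row = conn.execute(
        "SELECT city FROM customer_city "
        "WHERE customer_id = ? ORDER BY valid_from DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None
```

In this sketch a change of city adds one narrow row and leaves the name history untouched, which is the granular history recording described above. The cost is also visible: a single customer with two attributes already needs three tables, and reading it back as one record requires joining them.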
Already, some of these challenges are being addressed:
- a particular database optimization called table elimination (implemented in MariaDB) helps execute complex joins more efficiently
- data warehouse modeling and model generation tools such as Quipu and RapidACE (both open source) aid in constructing database schemas and generating the code or ETL procedures to load these data warehouses.
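The reasoning behind table elimination can be sketched as follows: if a query outer-joins a table on its unique key and uses none of its columns, that join cannot change the result, so an optimizer is free to skip it. The Python sketch below, using the standard sqlite3 module, only checks this logical equivalence on sample data. It does not exercise or measure MariaDB's optimizer, and the names (anchor, attr_name, attr_city, the customer view) are hypothetical, with one row per attribute assumed for brevity.

```python
import sqlite3


def build():
    """Create an anchor table, two attribute tables and a view joining them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE anchor (id INTEGER PRIMARY KEY);
        CREATE TABLE attr_name (
            id   INTEGER PRIMARY KEY REFERENCES anchor(id),
            name TEXT NOT NULL
        );
        CREATE TABLE attr_city (
            id   INTEGER PRIMARY KEY REFERENCES anchor(id),
            city TEXT NOT NULL
        );
        -- Convenience view that reassembles the full record.
        CREATE VIEW customer AS
            SELECT a.id, n.name, c.city
            FROM anchor a
            LEFT JOIN attr_name n ON n.id = a.id
            LEFT JOIN attr_city c ON c.id = a.id;
        INSERT INTO anchor VALUES (1), (2), (3);
        INSERT INTO attr_name VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO attr_city VALUES (1, 'London'), (3, 'Leeds');
    """)
    return conn


def names_via_view(conn):
    """Query the view but ask only for columns that do not need attr_city."""
    return conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()


def names_without_city_join(conn):
    """The same question with the unused join written out of the query."""
    return conn.execute(
        "SELECT a.id, n.name FROM anchor a "
        "LEFT JOIN attr_name n ON n.id = a.id ORDER BY a.id"
    ).fetchall()
```

Both functions return the same rows because attr_city is joined on its primary key (so it can never multiply rows) through a LEFT JOIN (so it can never remove rows). That equivalence is what lets an optimizer drop the unused join when a query touches only a few attributes of a wide view.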
During this session, the key components of both Data Vault and Anchor Modeling will be explained and contrasted with the more traditional Inmon and Kimball approaches to data warehousing, and a demonstration will be given of the open source tools that help in designing and loading these data warehouses.
Slides
Speaker

Roland Bouman
BI and Web Developer
Roland Bouman is a business intelligence and web developer. He is part of the MySQL and Pentaho communities and maintains a technical blog at http://rpbouman.blogspot.com/. He has authored two Pentaho-related books: "Pentaho Solutions" and "Pentaho Kettle Solutions".