Operational Data Storage (The ODS!)

March 8th, 2010 by Henk

Introduction

In several projects I have seen an ODS in use. Although it is common in the field of data warehousing, it seems to take on different forms in different projects. Its place in the architecture also sometimes looks strange (for example, I’ve seen the ODS in between the staging and data warehouse layers instead of beside the staging or data warehouse area).

I’m not going to be yet another person who tells you how to use it or what exactly it is. This post only shows some findings and practices I’ve come across and learned.

What is an ODS?

Some definitions

Found via Google

Googling “What is operational data storage” gives, for example, the following definitions:

“An operational data store (ODS) is a type of database often used as an interim area for a data warehouse. Unlike a data warehouse, which contains static data, the contents of the ODS are updated through the course of business operations. An ODS is designed to quickly perform relatively simple queries on small amounts of data (such as finding the status of a customer order), rather than the complex queries on large amounts of data typical of the data warehouse. An ODS is similar to your short term memory in that it stores only very recent information; in comparison, the data warehouse is more like long term memory in that it stores relatively permanent information.”

or

“The Operational Data Storage (ODS) system provides a way to save data that can be shared by multiple flow components or flows. ODS is a type of database that serves as a quick-access data storage. An ODS system lets you perform many queries on small amounts of data, and differs from a data warehouse, in which large amounts of information is stored and queries are run on a large volume of data.”

Found on Wikipedia:

An operational data store (or “ODS”) is a database designed to integrate data from multiple sources to make analysis and reporting easier. Because the data originates from multiple sources, the integration often involves cleaning, resolving redundancy and checking against business rules for integrity. An ODS is usually designed to contain low level or atomic (indivisible) data (such as transactions and prices) with limited history that is captured “real time” or “near real time” as opposed to the much greater volumes of data stored in the Data warehouse generally on a less frequent basis.

According to Inmon

Inmon defines the ODS as an integrated, volatile, up-to-the-minute picture of the business. This is, for example, a useful structure for one-to-one marketing and customer relations, in addition to other areas where only the most recent transactions are important to the operational business process.

According to Kimball

Kimball describes the ODS as “the ‘front edge’ of the data warehouse, no longer warranting a separate designation or environment. Thus, the only free-standing ODS we should expect to encounter is supporting the operational needs with integrated transaction data” (from The Data Warehouse Lifecycle Toolkit).

According to D. Linstedt

A Data Vault supports all aspects of the operational information need. In a Data Vault architecture an ODS therefore cannot exist: the principles of an ODS return in, or are integrated into, the Data Vault principles.

History of the ODS

(found on the web)

“In the early 1990s, the original ODS systems were developed as a reporting tool for administrative purposes. They were usually updated daily and provided reports about business transactions for that day, such as sales totals or orders filled. This type of system is now referred to as a Class III ODS. With changes in technology and business needs, the Class II ODS evolved to track more complex information such as product and location codes, and to update the database more frequently (perhaps hourly) to reflect changes. Class I ODS systems arose from the development of customer relationship management (CRM). In Class I systems, synchronous or near-synchronous updates are used to provide customers with consistently valid and organized information. Another version, the Class IV ODS, was recently developed with an added capacity for more interaction between the data warehouse or data mart and the ODS. ”

Key Elements

Based on my experience and the information above, I see the following core elements of an ODS:

  • Data is volatile / temporarily stored. The ODS contains no or limited history.
  • Integration of one or more sources is still possible. Like the data warehouse, an ODS can contain data from multiple sources. The only reason for the ODS to exist is that more than one source is needed to integrate the data (with only one source, the source system itself should be best suited to provide the information!).
  • Data is loaded more frequently than into a data warehouse. An ODS is therefore closer to (near) real time than a DWH (unless the DWH is based on, for example, Data Vault principles).
  • It is not primarily intended for tactical or strategic use.
  • There is no standard way of modelling the datasets (you are free to use any model that suits its purpose).
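
To make the first two elements concrete, here is a minimal sketch in Python. It is only an illustration, not a reference design: the class name OperationalDataStore, the source names "crm" and "orders", the key "customer-42" and all attribute names are hypothetical. It shows one current-state record per business key that is fed by more than one source and overwritten on every load, so no history is kept.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class OperationalDataStore:
    """Toy ODS (hypothetical): one current-state record per business key."""
    records: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def load(self, source: str, key: str, attributes: Dict[str, Any]) -> None:
        # Integration: attributes from any source end up in the same record.
        record = self.records.setdefault(key, {})
        # Volatile: new values overwrite old ones; nothing is versioned.
        record.update(attributes)
        record["last_source"] = source

    def current(self, key: str) -> Dict[str, Any]:
        # Operational lookup: a simple query on a small amount of data.
        return dict(self.records.get(key, {}))


ods = OperationalDataStore()
ods.load("crm", "customer-42", {"name": "Jansen", "segment": "retail"})
ods.load("orders", "customer-42", {"order_status": "picked"})
ods.load("orders", "customer-42", {"order_status": "shipped"})  # replaces "picked"
print(ods.current("customer-42"))
```

After the third load the earlier status "picked" is gone, which is exactly the difference from a data warehouse, where that change would normally be kept as history.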

Is an ODS a source for your DWH?

Because data can flow from a source or the staging area directly into the ODS, the ODS can in turn be used as a source for the data warehouse. In my opinion, data that was extracted from the source directly should then formally be loaded into the staging area first. In practice, however, there is no objection to loading it into the data warehouse directly, because the data is already on the same platform.
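
The two routes described above can be sketched as a small decision function. This is a simplified illustration under my own assumptions: the function name dwh_load_path, the flag formal and the layer labels "ods", "staging" and "dwh" are hypothetical and do not refer to any tool.

```python
from typing import List


def dwh_load_path(ods_fed_from: str, formal: bool) -> List[str]:
    """Return the hops data takes from the ODS to the data warehouse.

    ods_fed_from: "source" if the ODS was loaded straight from the source
                  system, "staging" if it was loaded from the staging area.
    formal:       True to follow the formal route, False for the pragmatic one.
    """
    if ods_fed_from not in ("source", "staging"):
        raise ValueError("ods_fed_from must be 'source' or 'staging'")
    if formal and ods_fed_from == "source":
        # Formal route: data that bypassed staging goes through it first.
        return ["ods", "staging", "dwh"]
    # Pragmatic route: the data is already on the same platform.
    return ["ods", "dwh"]


print(dwh_load_path("source", formal=True))
print(dwh_load_path("source", formal=False))
```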

In practice

I am trying to avoid giving my view on whether an ODS should be used in the field of data warehousing or not. Maybe that is better covered in a separate post. I’ll keep it in mind.

My Opinion

In practice I’ve seen the term ODS used with different definitions, meanings, intentions and reasons. In my opinion, whether an ODS is justified depends on the chosen architecture. Still, I try to avoid using one. The only fair reason to implement such a module in a DWH/EDW architecture is that data from more than one source is required in near real time for operational BI purposes.

Note from the author: yes, yet another blog post in English, my apologies! This post had been sitting in draft for a long time, since before my switch to blogging in Dutch.
