Andy on Enterprise Software

How healthy is your data warehouse?

June 1, 2006

Not all data warehouses are created equal. Indeed both custom-built and some packaged data warehouse products can have surprising limitations in terms of their functionality. Just as I referred recently to Dave Waddington’s excellent checklist of things that would indicate a master data management problem, I would like to propose a series of questions that could be used to assess the depth of functionality of your data warehouse, whether it is custom built or packaged. For this list I am indebted to Dr Steve Lerner (until recently IS Director, Global Finance Applications and Integration at pharmaceutical firm Merial), who was kind enough to set out a series of symptoms that he had found indicated a problem with a data warehouse application. What I like about these is that they are all real business problems, and not a series of features defined by a software vendor or database designer. They are as follows.

1. Do you have difficulty conducting what-if analysis for a variety of business or product or geographical hierarchies?

2. Would it be hard for your current system to determine the impact of a business organization change on Operating Income?

3. Would it be hard for your current system to determine the impact of realigning geographical associations on regional profitability estimates?

4. Do you have difficulty restating historical data?

5. Can you view historic data using both a time-of-transaction basis and a current basis?

6. Can you currently restate historical data using new account structures?

7. Do you have difficulty viewing composites of data from sources with different granularities along key dimensions (i.e., comparing daily sales for a month, to forecast sales done monthly, to your annual profit plan, and to a five year long-range projection)?

8. Do you have difficulty with “bad data” getting into your current data warehouse?

9. Do you have difficulty maintaining the accuracy of your reference data?

10. Do you have difficulty with traceability from source to report?

So, how did your data warehouse application score? If it did not do well (i.e. failed on several of these ten points) then you should be concerned, because business users are likely to do exactly these types of things with the data in the warehouses, if not today then at some point. When they struggle, they will come looking for you.

A potential application of this checklist would be identify the best and worst data warehouses in your company. This type of “health check” could be useful in prioritising future investment e.g. it may highlight that some systems are in urgent need of overhaul or replacement. If you work in an IT department then going to business users with this kind of health check could be seen as being very pro-active and enhance the IT department’s credibility with the business. If you are a systems integrator then creating a process for measuring the health of a data warehouse along these lines could be a useful tool that could be sold as a consulting engagement for clients.

del.icio.us:How healthy is your data warehouse?  digg:How healthy is your data warehouse?  reddit:How healthy is your data warehouse?  Y!:How healthy is your data warehouse?

Casting spells

I write this column using some software called Blogger, which is fairly simple to use but is rather limited in some ways, so I am probably going to switch to a more flexible blog editor soon. However one cause of constant entertainment is the Blogger spell check function, which almost makes the thing worthwhile. My typing is erratic at best, so I frequently encounter the Blogger spell checker. At first I found its eccentric suggestions annoying, or even inept, but now I find that they have a certain charm of their own. It takes me back to the early days of word processing, when spell checkers were crude, and their alternative suggestions for one’s typographical errors were sometimes wildly inappropriate. Blogger’s spell checker recalls that era, as it presents sometimes surreal suggestions for what to a human eye is a pretty easy mistake to spot. For example, if you misspell:

“management” as “managemnet”

then you are presented with two alternatives. Its best guess is “mincemeat”, which is somehow appropriate in a couple of cases of managers I can recall, but not really a very likely error. Its only other attempt is “mankind”. This is not an isolated case. If you write about federated databases then it is endearing to see the typo:

“federaion” have the two alternatives: “bedroom” or “veteran”

proposed by the beastie. “Bedroom”? I would love to understand the algorithm that came up with that one. I was also impressed by:

“peformance”

Instead of the pretty obvious “performance” it rather sweetly suggests “peppermints”.

However my favorite is that if you type “Blogger” as a phrase then not only does it not recognize it. The term “blog” also sadly is a complete mystery to it, which might seem an omission given that it is intended as a spell checker for, er, blogs, or perhaps “blocs” as the spell checker so helpfully proposes. For “blogger”then it suggests the wonderfully ironic:

“blocker”

How true, how true. I would be interested to hear of your worst spell check horror, or indeed of a spell checker whose ineptness rival Blogger’s. Any offers?

del.icio.us:Casting spells  digg:Casting spells  reddit:Casting spells  Y!:Casting spells