Andy on Enterprise Software

If all you have is a hammer…

June 29, 2007

Claudia Imhoff raises an important issue in her blog regarding the cleansing of data. When dealing with a data warehouse it is important for data to be validated before being loaded into the warehouse in order to remove any data quality problems (of course, ideally you would have a process to go back and fix the problems at source also). However, as she points out, in some cases, e.g. for audit purposes, it is important to know what the original data actually was, not just a cleansed version. This gets at the heart of a vital issue surrounding master data, and neatly illustrates the difference between a master data repository and a data warehouse.

In MDM it is accepted (at least by those who have experience of real MDM projects) that master data will go through different versions before producing a “golden copy”, which would be suitable for putting into a data warehouse. A new marketing product hierarchy may have to go through several drafts and levels of sign-off before a new version is authorised and published, and the same is true of things like drafts of budget plans, which go through various iterations before a final version is agreed. This is quite apart from actual errors in data, which are all too common in operational systems. An MDM application should be able to manage the workflow of such processes, and have a repository that is capable of going back in time and tracking the various versions, not just the finished golden copy. A good MDM repository should allow you to track back through master data as it is “improved” over time. Only the golden copy should be exported to the data warehouse, where data integrity is vital.
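To make the distinction concrete, here is a minimal sketch of a repository record that keeps every draft as well as the golden copy. The class names, fields and sample hierarchy are all illustrative assumptions, not taken from any particular MDM product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One draft or published state of a master data item (hypothetical model)."""
    number: int
    payload: dict
    status: str          # "draft" or "published"
    valid_from: date


@dataclass
class MasterRecord:
    """Keeps every version of an item, not just the latest one."""
    key: str
    versions: list = field(default_factory=list)

    def add_version(self, payload: dict, status: str, valid_from: date) -> Version:
        version = Version(len(self.versions) + 1, payload, status, valid_from)
        self.versions.append(version)
        return version

    def as_of(self, when: date) -> Optional[Version]:
        """Return the latest version (of any status) in force on the given date."""
        candidates = [v for v in self.versions if v.valid_from <= when]
        return max(candidates, key=lambda v: (v.valid_from, v.number), default=None)

    def golden_copy(self) -> Optional[Version]:
        """Return the most recent published version, the only one to export."""
        published = [v for v in self.versions if v.status == "published"]
        return published[-1] if published else None


# Made-up example data: two drafts followed by a signed-off version.
hierarchy = MasterRecord("marketing_product_hierarchy")
hierarchy.add_version({"Snacks": ["Crisps"]}, "draft", date(2007, 1, 10))
hierarchy.add_version({"Snacks": ["Crisps", "Nuts"]}, "draft", date(2007, 2, 1))
hierarchy.add_version({"Snacks": ["Crisps", "Nuts"]}, "published", date(2007, 3, 1))
```

A warehouse load would read only `golden_copy()`, while an auditor could call `as_of()` with an earlier date to see what the hierarchy looked like while it was still in draft.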

People working on data warehouse projects may not be aware of such compliance issues, as they usually care only about the finished state warehouse data. MDM projects should always be considering this issue, and your technology selection should reflect the need for your MDM technology to track versions of master data over time.

Master data: from jungle to garden in several not so easy steps

June 20, 2007

I very much liked a succinct article by the ever-reliable Colin White on MDM approaches. Companies still struggle to get to grips with what a roadmap for MDM is all about, with apparently competing (and incomplete and immature) MDM technologies and management consultants who are only a few pages ahead of the customers in the manual. This piece neatly sets out the end goal of MDM and the various approaches to getting there (via analytic MDM or operational MDM as a start). It would have been even better had it explained in more detail how the alternatives can be run in parallel, and gone into more depth on the issues of each sequence of steps. However, by clearly separating out operational and analytic MDM and showing how these are complementary he is already doing a significant service.

The issue he mentions with “approach 1”, i.e. the “complexity of maintaining a complete historical record of master data”, can be dealt with if you choose an analytic MDM technology which has built-in support for analysis over time. Colin points out that a key step is to end up with a low-latency master data store as the system of record for the enterprise, acting as a provider of golden copy master data to other sources, both transaction systems and analytical ones such as an enterprise data warehouse. If properly implemented, this will result in a change of the centre of gravity of master data, from the current situation where the system of record is ERP to a situation where the enterprise master data repository is actually the system of record, providing data through a published interface (and an enterprise service bus) through to all other systems, including ERP. This is a desirable end state, and is a key step to starting to unlock the monolithic ERP systems that companies use today into more manageable components.

I really hope that this paper gets the attention that it deserves. Getting most of the key messages into a two-page article is quite an achievement. I would like to see this developed further, and hopefully it will be.

The other shoe drops

June 8, 2007

For some time I had been wondering which company Microsoft would buy to enter the MDM market. This is a key area in the broader business intelligence arena that they aspire to progress in, and was a major gap in their offering. Stratature was their choice, and it was a smart choice. Stratature plays in the analytical MDM area rather than being an operational transaction hub (like Siperian, say). It had built up a good reputation for flexible hierarchy management, an important feature of most MDM applications. They competed directly with Razza (an excellent tool which Hyperion purchased but Oracle seems to have now buried) and Kalido.

Stratature is the kind of bite-sized (16 employees) acquisition that Microsoft likes. It prefers to catch a company when it is small so that it can easily absorb the technical staff and mould them into the Microsoft way of doing things. When it has deviated from this rule (Great Plains, Navision) it has discovered why this was a good rule in the first place.

Congratulations to Ian Ahern, who impressed me on the several occasions I met with him. He also supports my (possibly biased) thesis that all the best MDM people are Brits. The terms of the deal are not public, and it would have been interesting to see what valuation a good MDM vendor achieved; I am sure it worked out well for Stratature’s shareholders. This now leaves Kalido as the main remaining independent analytic MDM vendor. This is not necessarily a bad thing for Kalido. Informatica has shown how you can thrive once your competitors get swallowed by the behemoths. Being stack-neutral in data management carries advantages.

Master data initiatives need co-ordination

June 3, 2007

A generally good article by Colin Beasty about CDI shows a common misconception regarding data warehousing. The article rightly points out that CRM (via Siebel etc) essentially failed to resolve the “single versions of the truth” about customer, with apparently 20-40 systems in a large company having customer data (this sounds plausible but he doesn’t quote a source for this). However he says that data warehouses can’t address this since “data integrity and validity are optional”. Here he seems to be mixing up an operational data store and a data warehouse, or at least a good data warehouse. An operational data store might well be a dump of data straight from a transaction system without work being done on the data (purely for performance reasons) but a data warehouse should definitely not be. A data warehouse is supposed to be pulling together data from multiple systems and providing a single, consistent view across the enterprise. It cannot do that without having a stage of validation of data, rejecting data that is inconsistent with the company’s business rules. If not, it is a case of “garbage in, garbage out”. Now certainly, if you have a source of customer data that is a well implemented CDI hub, rather than several sources (an ERP system, a CRM system etc) then essentially the CDI hub has carried out the validation and resolution stage already, i.e. it is acting as a single system of record for customer data. However the warehouse cannot relax, since it also has to deal with all the other kinds of transaction and master data as well. Indeed, I would argue that a hub-based approach carries with it some dangers. If you implement a CDI hub, then do the same for product using a PIM solution, then you will realise that you need another hub for employee, asset, etc. CDI hub technology typically does not handle other types of master data as it is hard coded around the (important) class of master data called customer.
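The validation stage mentioned above can be sketched very simply. This is only an illustration under stated assumptions: the two rules, the field names and the sample rows are invented, and a real warehouse would apply far richer business rules.

```python
# Hypothetical reference data used by one of the example rules.
KNOWN_COUNTRIES = {"GB", "US", "IN"}


def has_customer_id(row):
    """Rule: every row must carry a non-empty customer_id."""
    return None if row.get("customer_id") else "missing customer_id"


def has_known_country(row):
    """Rule: country must be one of the agreed reference codes."""
    return None if row.get("country") in KNOWN_COUNTRIES else "unknown country"


def validate_rows(rows, rules):
    """Split rows into those that pass every rule and those that fail any."""
    accepted, rejected = [], []
    for row in rows:
        errors = [msg for rule in rules if (msg := rule(row)) is not None]
        if errors:
            # Keep the original row with its errors so it can be fixed at source.
            rejected.append((row, errors))
        else:
            accepted.append(row)
    return accepted, rejected


rows = [
    {"customer_id": "C1", "country": "GB"},
    {"customer_id": "", "country": "US"},
    {"customer_id": "C3", "country": "ZZ"},
]
accepted, rejected = validate_rows(rows, [has_customer_id, has_known_country])
```

Only `accepted` would be loaded; `rejected` keeps each failing row alongside the reasons, which is what you need in order to go back and correct the source system.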

The article acknowledges that CDI is a subset of MDM, but does not draw attention to the danger of a piecemeal hub implementation one datatype at a time. What is needed is a master data repository that can act as a system of record for all types of master data, itself feeding both data warehouses and other systems (possibly via SOA as the article mentioned, but that is essentially optional). Without this realisation we are in danger of creating yet another set of master data sources without really getting to the heart of the issue. You can have multiple hubs, but somewhere you need a single repository which at least knows where every version of master data is in the enterprise, whether in hubs, ERP or elsewhere; better still if that MDM repository can act as an active provider of master data elsewhere, since it will have the enterprise-wide business rules needed to ensure data quality, which systems closer to operational processes may not have. Without a fully integrated approach to master data we are in danger of just adding unnecessary duplicate sources of master data (since these data are, after all, not going away in the ERP systems). Somewhere a true “master of master data” needs to exist, and that needs to be owned by business people with the authority to resolve inter-department disputes over master data (and not just customer data). Otherwise we are just adding another layer to the spaghetti.

MDM and risk

May 31, 2007

It is not often that I even bother to read articles written by vendors, but there were some good points made in an article by a practice manager for Siperian regarding MDM and regulation. The point being made was how increased regulation, both in the US with its Sarbanes-Oxley and Patriot Act, but also elsewhere with things such as Basel 2 in financial services, should be a significant external “push” for MDM to complement internal “pull” by corporations. In order to measure the overall risk levels at a bank you need to know the total aggregate positions taken with counter-parties, and be able to see whether there are any high exposures with particular clients (the case of Enron springs to mind). In order to do this you need to know exactly who you are doing business with, including subsidiaries of that company, and yet how well do companies really know this?

Many MDM projects set out to get a better understanding of the total picture of either customers or suppliers, since their multiple source systems and classifications of these make it very hard to get a single consistent picture. Certainly many years ago Shell realised that it had no idea how much business it did with, say, Ford or Unilever, since quite apart from internal classification overlap, it was not clear exactly what “Ford” or “Unilever” consists of. This was a key reason why it invested heavily in an enterprise data warehouse project. Multinational companies have so many subsidiaries, often with different trading names (for example Shell owns companies like Bharat Petroleum, Unilever is known as “Hindustan Lever” in India) that it is unlikely that individual operating units have carefully checked the Dun & Bradstreet numbers of all these companies and classified them correctly.
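The roll-up itself is simple once the legal-entity hierarchy is trustworthy; the hard part is the master data behind it. A minimal sketch, assuming a clean child-to-parent mapping (the entity names and amounts below are invented placeholders):

```python
from collections import defaultdict

# Hypothetical legal-entity hierarchy: child -> parent (None marks the top).
PARENT_OF = {
    "ParentCo": None,
    "ParentCo UK Ltd": "ParentCo",
    "Trading Name India": "ParentCo UK Ltd",
    "OtherCo": None,
}


def ultimate_parent(entity, parent_of):
    """Walk up the hierarchy until an entity with no known parent is reached."""
    seen = set()
    while parent_of.get(entity) is not None:
        if entity in seen:
            raise ValueError(f"cycle in hierarchy at {entity}")
        seen.add(entity)
        entity = parent_of[entity]
    return entity


def exposure_by_group(positions, parent_of):
    """Sum position amounts under each counterparty's ultimate parent."""
    totals = defaultdict(float)
    for entity, amount in positions:
        totals[ultimate_parent(entity, parent_of)] += amount
    return dict(totals)


positions = [
    ("ParentCo", 10.0),
    ("ParentCo UK Ltd", 5.0),
    ("Trading Name India", 2.5),
    ("OtherCo", 1.0),
]
totals = exposure_by_group(positions, PARENT_OF)
```

Note that an entity missing from the mapping is silently treated as its own group, which is exactly how an unclassified subsidiary ends up hiding exposure.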

This is important enough when dealing with a global account, but can be critical when dealing with financial trades. I know of one MDM initiative at a financial services organisation that started off as a direct result of Enron, when it transpired that in fact the organisation thought it knew how much exposure it had with Enron, but rapidly discovered that it did not when Enron collapsed. I certainly know of one famous financial institution where a former VP admitted to me that the bank had “no clue” how much business it did with a large, complex beast like Deutsche Bank, for all the usual MDM reasons.

The thing I find curious is that these regulations are all pretty much in place now, and although companies have spent money on compliance, it is clear from these two cases that the problems are far from solved. The next time an Enron-like event happens (and it will) companies will not only be nursing losses from their exposed positions, but may also have regulatory problems if it turns out that they actually did not truly know the extent of their exposure. Given the state of data quality and master data in most large organisations, I wonder whether companies are being complacent or regulators simply sleepy in checking the effectiveness of the systems at companies. Having a report that tells you your exposure level is all very well, but how reliable are the numbers that make that up? My experience of working with data warehouse and MDM applications tells me that they are likely to be a lot less reliable than many people think.

If you find all this talk of banks rather abstract, consider this: the average hospital has 25 systems that record patient information. If you are one of those patients, how confident are you that these will all tie up?

How do I love thee - let me count the ways

May 26, 2007

Those in the industry know that there is a dance that goes on between vendors who crave analyst endorsement, and analyst firms who portray vendor independence to end-user firms while happily trousering large fees from the vendors that actually constitute most of their income. The Cranky PM has written entertainingly on this corrupt relationship before, but usually the analyst firm at least makes a pretence of playing hard to get. However a recent puff piece for Informatica by Ventana analyst David Stodder raises the bar on sycophantic behaviour by analysts towards vendors who have taken out paid contracts with them. David gushes about Informatica’s support for MDM and how all right-minded “companies starting out with MDM look at what Informatica has to offer”. How about an alternative view:

“Informatica does not make a MDM application”

Strong stuff. What kind of vicious competitive slur can this be? Perhaps it is from a jealous competitor, or maybe some cynical spurned journalist or analyst that Informatica was less generous towards with its cheque book? Er, no, the source of this counter-statement is actually on the official Informatica web site. Last time I looked, having an ETL tool and a purchased data quality product, however good, does not equate to an MDM application, and all credit to Informatica for not pretending otherwise.

What amused me was this piece was entitled “analyst insight”. Yep, dazzling insight there all right.
(with apologies to Elizabeth Barrett Browning).

The missing link

May 14, 2007

I thought that Connie Moore made a good point in an article regarding BI and BPM: BI vendors are missing out on the “process” end of things. I would go broader than that and say that MDM vendors are similarly missing a trick, and that in MDM it matters more. If you are building some reports then process may certainly have relevance, but when it comes to master data it is central. How does master data get created, read, updated and deleted? For example a marketing manager may want to introduce a new consumer type (as an aside, I discovered to my general mortification this week via garlik.com that I am classified in marketing terms as a “contented grey”, which I suppose was better than some of the painful sounding alternatives like “constrained solo”). This is a new type of master data and you can be sure that a major new type will have impacts on several systems, so will quite probably require a review or two before it goes upstairs for sign-off. That process of creation, review, revision and sign-off currently probably happens by email, yet it should really be managed properly by a workflow tool. This is exactly what MDM should be all about, and yet most of the vendors I saw at the recent trade show in London had a look as blank as a Woolworths shop assistant when I asked them about workflow and process within their tools.

Several of the successful MDM projects I have seen make process quite central to the project. A few MDM products have support for workflow, but most are missing out and will need to work with other vendors to provide this, which is a less appealing proposition to customers than having an integrated approach.
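The creation, review, revision and sign-off cycle described above is essentially a small state machine. Here is a rough sketch of the idea; the state names, roles and transitions are my own assumptions for illustration and do not describe any vendor's workflow engine.

```python
# Allowed transitions for a hypothetical master data change request.
TRANSITIONS = {
    "draft": {"in_review"},
    "in_review": {"draft", "approved"},   # reviewers may send it back for revision
    "approved": {"published"},
    "published": set(),
}


class ChangeRequest:
    """Tracks one proposed change to master data and who moved it along."""

    def __init__(self, item, author):
        self.item = item
        self.state = "draft"
        self.history = [("draft", author)]

    def move_to(self, new_state, actor):
        """Apply a transition, refusing any step the workflow does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        self.history.append((new_state, actor))


request = ChangeRequest("new consumer type", "marketing manager")
request.move_to("in_review", "data steward")
request.move_to("draft", "data steward")            # sent back for revision
request.move_to("in_review", "marketing manager")
request.move_to("approved", "head of marketing")
request.move_to("published", "data steward")
```

The `history` list gives the audit trail that an email thread never will: every state change, in order, with the person responsible.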

Master Data comes to London - day 2

May 2, 2007

The CDI/MDM Institute conference has just wrapped up in Kensington, and I felt it was very successful. Organisers IRM ran an efficient conference, and attendance on day 2 held up pretty well. There was a fine keynote speech given by a certain dashing ex-Kalido founder, and the rest of the day was in a series of tracks with customer case studies and some vendor and consultant presentations. The second day felt a bit light on customer case studies compared to day one, but overall this conference did well at getting a reasonable number of real customers talking about projects, rather than just being a string of subtle or not-so-subtle sales pitches from vendors.

Aaron Zornes gave a reasonably balanced overview of the main thirteen MDM vendors (D&B, Dataflux, Data Foundations, Hyperion, I2, IBM, Initiate, Kalido, Oracle Purisma, SAP, Siperian, Teradata, Visionware) with a recurring theme that few have really got to grips with providing support for the data governance/collaboration element of MDM. Certainly the workflow around master data is a key part of most MDM projects, but it looks as if most of the projects out there at present are doing this via home-grown efforts, e.g. one UK company I spoke to had done this, using Biztalk as a framework. As more and more MDM projects go beyond pilots into production then the aspect of maintenance of master data will become more pressing, so I would expect to see customers increasingly demanding this. Indeed they should really be insisting that workflow support for MDM processes is one of the key evaluation criteria for software selection.

Overall, it seems that MDM is moving into the mainstream in the UK.

Master Data comes to London - day 1

May 1, 2007

This week is the CDI/MDM Institute London conference. It is a useful bellwether of MDM progress in the UK, and based on the attendance today it looks like MDM interest is indeed picking up in the UK. There are just over 300 attendees and 22 vendors exhibiting. Compared to the one last year there are encouragingly more customer case studies (last year the speakers were mostly vendors presenting), for example from Panasonic, BT, Harrods, Allied Bakeries, M&S and the Co-Op.

It is noticeable that the CDI v MDM debate continues to favour the broader view that customer is just one (important) type of master data, with the MDM acronym now being used by most of the vendors. The Panasonic case study was a good example, starting out as a product master data initiative and now spreading to customer, and then on to market information. The speaker was able to share some real business benefits from the initiative (enabling new products to be launched two weeks quicker as well as data quality savings), measured in millions of pounds. IBM claimed to be integrating its various acquired technologies, which is an improvement from the conversation I had with them a year ago when it was claimed that there was nothing wrong with having a clutch of separate, incompatible repositories, one for customer, one for product etc. When I asked how many different repositories would be needed to cope with all the different types of master data in an enterprise I received the mystical answer “seven”, at which point I gave up as the conversation had seemed to move into the metaphysical realm. We shall see where the integration efforts lead.

Aaron Zornes gave a useful high level split of the MDM market into the groupings of:

- operational e.g. CDI hubs like Siperian, Oracle
- analytical e.g. Hyperion Razza, Data Foundations
- collaborative i.e. workflow (e.g. Kalido does a lot of this)

which seems to me a useful split. Certainly no one vendor does everything, so understanding where the strengths of the vendors are, even in this simplistic way, at least helps customers narrow down which vendors are most likely to match their particular problem.

IBM, Initiate, SAS and Kalido are the main sponsors of the event, and once again SAP chose not to attend (to be fair, SAP did speak at two of the US MDM conferences). Nimesh Mehta assures me that SAP MDM is making steady market progress, but with no numbers he is willing to share I cannot verify this. However the buzz at the conference suggests that most customers here are using products from specialist vendors. One repeated theme in talking to SAP MDM early adopters is its apparent inability to deal well with customer data, perhaps not surprising given the A2i heritage of the product. No doubt SAP has lots of resources to throw at this problem, but at present it is not obvious that it is getting much in the way of production deployments. Clearly SAP’s dominant market position should get it on to every MDM shortlist, but how many real broad deployments there are in production is much less clear.

There were a couple of entertaining exhibit conversations. One Dilbert-esque one was with a sales person. I asked the following question: “what does your product do - is it a repository, or data quality tool, or something else?” The sales person took a sudden physical step back like a scalded cat and said “oh, a technical question; I’ll need to find someone else to answer that.” Now maybe I’m old-fashioned but “what does your product do?” seems to me a question that even a software salesman should be able to hazard a guess at. What kind of questions do you think this salesperson is likely to be able to field? I’m guessing anything beyond “where is the bar?” or “where do I sign the order form?” is going to prove challenging.

I was amused to see Ab Initio had a stand. Ab Initio is famous in the industry for its secretive nature, e.g. customers have to sign an NDA in order to see a product demo. This is driven by its eccentric founder Sheryl Handler, and makes life hard for its sales and marketing staff. There was indeed no printed brochure or material of any kind, and the (very charming) sales person I spoke to was unable to confirm very much about the company other than it seemed pretty certain that there was a UK office. Ab Initio’s technology has the reputation of being the fastest performing data transformation tool around, and in the UK has most of the really heavy-duty data users (BT, Vodafone, Tesco, Sainsbury etc) as customers. It must certainly make it interesting trying to sell the thing, but perhaps the aura of mystery paradoxically helps; after all, this is not a company that anyone could accuse of aggressive marketing.

Data Governance and MDM

April 30, 2007

There is a good article about business ownership and MDM in DM News, from the rather unlikely pen of the marketing director of Siperian. Although any article written by a vendor should get the reading equivalent of being held at a distance and handled with tongs, this piece actually has a lot of good sense in it. The key thesis is that MDM initiatives will not do well if owned by the CIO office or IT department, since success critically depends upon business engagement in the area of data ownership and governance. Now that MDM is developing momentum, some IT departments are embarking on large, enterprise-wide MDM initiatives. The hidden point of the article is that these projects are frequently being done using software from Oracle or SAP rather than an independent company like, well, Siperian, say, but you can forgive this subtly disguised message. The point is that without the business standing up and saying “Fred over there owns the notion of customer and will handle disputes over it between departments” and similarly for other master data, things will end in tears.

I was impressed recently by a client of mine who had already set up a cross-functional business team to do exactly this, and they could list not only the department which had been agreed to own various major master data elements (not just customer but asset, product, production facility, person etc) but actually had real people’s names attached to them i.e. it was not just wishful thinking. There were still plenty of issues with that project, but at least they had established the groundwork that would give them a chance of success. MDM initiatives which are driven by IT without this level of involvement to resolve boundary disputes are doomed to failure, whether the technology they use is from a mega vendor or an independent.
