Andy on Enterprise Software

Squaring the MDM circle

April 7, 2008

Jill Dyche raises an important point about how companies are tackling MDM. She mentions the “random acts of MDM” that are done in isolation in a particular business area, or involving a particular data domain, which are unlikely to evolve into an enterprise-wide MDM solution.

The tricky issue that companies face is that MDM is a genuinely large-scale endeavour, and because we all know how well giant enterprise projects usually go, they are understandably reluctant to take on an enterprise-wide project. Instead they pick off an easier piece, such as one particular data type, or perhaps a broader set of master data types but only in a subset of the enterprise, say across one division. As Jill says, such isolated initiatives won’t in themselves magically grow into enterprise MDM.

There is a further danger in disconnected initiatives. At this point the vendor technology out there is at very different stages of maturity depending on what kind of data you want to tackle, and on what scale. Some vendors have well-proven customer hub technology, but limited experience in tackling product data (and may lack key functionality to do this e.g. attribute inheritance), and they usually have very limited ideas about business process workflow and data governance support. Other vendors from the PIM or the analytic MDM world usually have much better business workflow support, yet may have limited scalability e.g. it would be a brave person who tried doing a 100 million record customer hub using a PIM product. The vendors with a CDI heritage are adding more workflow capability, and the PIM and analytic MDM vendors are working on scalability, but these are works in progress rather than completed and tested features and functions. Hence separate initiatives may end up using different technologies due to the demands of a particular area, and it would be easy to end up with one technology to handle product data, another for customer data, and maybe another where analytics were the driver.

In my view you need to combine an enterprise-wide vision with a practical, bite-sized approach i.e. think big but start small. You can build a broad enterprise strategy that encompasses data governance processes, for example, even if you decide to build out actual master data hubs in a stepwise fashion, beginning with certain high value data domains or company divisions that can best benefit from improved master data. However you need to keep the big picture in mind in order to avoid (or minimise) duplicate technology investments that may prove hard to fit together. There are no magic bullets here, but enterprise architects need to put in place the processes and broad strategy that will lead to better master data in the long term, even if the technology to deliver across the enterprise is only partly here today. Setting up proper data governance, and getting business people committed to it, should have real benefits and will be valid efforts whatever technologies are deployed.

XML is not enough

March 28, 2008

I just read a particularly clear explanation of how XML contributes to helping with, but does not really solve, the problem of data integration. This is a major issue as companies begin to deploy applications in the form of services, since as you bring elements of an application together via web services you usually also have to worry about how the data used by one application is going to be passed to another. There are just too many versions of XML, and insufficient semantic integration support, to just say “ah, we don’t need to worry about that - we are XML compliant”, yet this is exactly the marketing position of some vendors. As the article points out, a higher degree of semantic integration is needed. Master data management applications seek to provide this by establishing a repository of trusted information which has the necessary level of understanding to map the various definitions of “customer”, “product”, “fixed asset”, “location” etc together.

Whether you deploy such an application in a “co-existence” mode or “operational” mode is less important than going through the process of mapping together the competing definitions of master data strewn throughout any large company. Having a dial tone on my telephone enables me to phone someone in Argentina, but does not mean that we can communicate unless we also speak the same language. In the same way XML is a useful, but insufficient, building block in the path to data reconciliation in the enterprise. Only higher level semantic-based models are going to do that, and they will be hard work to implement given the amount of human interaction between different departments and company subsidiaries needed to resolve the differences that have built up over time.
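
To make the dial tone point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two systems, their element names and the mapping table are hypothetical, not taken from any real product. Both documents are perfectly valid XML, yet neither system can read the other's "customer" until someone has agreed the mapping.

```python
import xml.etree.ElementTree as ET

# Two hypothetical systems, both "XML compliant", describing the same customer.
CRM_XML = "<customer><custName>Acme Ltd</custName><custNo>C-1001</custNo></customer>"
BILLING_XML = "<account><holder>Acme Ltd</holder><accountRef>98765</accountRef></account>"

# Hypothetical semantic mapping: source element name -> shared attribute name.
# Agreeing this table is the human, cross-department work; XML does not supply it.
FIELD_MAP = {
    "crm": {"custName": "name", "custNo": "source_id"},
    "billing": {"holder": "name", "accountRef": "source_id"},
}


def to_canonical(system, xml_text):
    """Translate one system's XML record into the shared vocabulary."""
    root = ET.fromstring(xml_text)
    record = {"system": system}
    for child in root:
        canonical = FIELD_MAP[system].get(child.tag)
        if canonical is not None:  # unmapped elements are simply ignored here
            record[canonical] = child.text
    return record


crm_record = to_canonical("crm", CRM_XML)
billing_record = to_canonical("billing", BILLING_XML)
```

Only after the mapping is applied do the two records share a field called "name" that can be compared. The parsing is trivial; filling in the mapping table is the hard part.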

MDM outside the box

February 27, 2008

MDM is all about connecting up and managing versions of what should be shared data. Generally people worry about their internal systems, since numerous distinct systems in an enterprise typically think that they “own” a particular piece of master data such as customer, product or asset. However, in addition to this it is also important to consider external data i.e. data that you do not control within an enterprise but which you still need to relate to your own data. There is more of this than you might imagine, and MDM vendors would be well advised to take account of it. One example is Siperian’s recent link to Standard & Poor’s counter-party data, but there are many other cases. Purisma did such a good job of linking to D&B data that D&B bought the company.

Other types of common external data are consumer data from Acxiom or Experian, patient data from IMS, financial and news data from Reuters and doubtless many more, some of which will be industry specific. Mapping such data back to internal systems is an important and non-trivial job e.g. D&B company data changes regularly, so it is more than just a one-off exercise.

MDM platform vendors would be well advised to consider building more such pre-built links to their platforms, as at present there are fairly few such links, yet customers surely will want these more and more as they deploy MDM more widely.

Shameless Self Promotion

February 11, 2008

There is an MDM whitepaper which you can download (free with registration) from the Bloor website:

https://www.bloor-research.com/research/white_paper/908/master_data_management.html

It is a high level overview of the MDM market, and discusses general trends and issues rather than getting into vendor specifics; it does include a new high level functionality model for MDM products. Unsolicited feedback thus far has included:

“Very comprehensive and detailed”
“Great Job”
“Very well written”
“Right on the money”
“One of the best papers I have read on MDM”.

Of course, I may be a little biased, but it may be worth a look…

Thanks to some readers of this blog who provided feedback on my early drafts; much appreciated.

The MDM Blues

January 31, 2008

After living in denial for some time, IBM have got the “multi domain” message about MDM which I have been bleating on about at length for years. They have just announced a repackaging of their MDM offerings under the banner “IBM Infosphere MDM Server”. This puts IBM firmly on the path of a server architecture that can deal with multiple types of MDM data in a consistent manner, not just customer and product but all the many other kinds of master data e.g. location, asset, contract, brand, financial profile and so on. IBM has been sensibly enabling their MDM offerings in an SOA context, and MDM Server comes with 800 pre-packaged SOA services that can be invoked. IBM has bought high quality MDM technology and now at last has a strong vision of how to bring it all together.

However it is worth emphasising that this is a roadmap. For now there will remain the separate CDI hub technology (bought from DWL) and the PIM Hub technology (bought from Trigo). Over time these technologies will be integrated with common services, but this is a multi-release strategy. It is great news that IBM has finally realised that multi-domain is the right way to go, but prospects and customers need to reassure themselves about whether the roadmap meets their time horizons.

Orchestrating MDM Workflow

December 28, 2007

France is rarely associated with enterprise software innovation (test: name a French software company other than Business Objects) but in MDM there are two interesting vendors. I have already written about Amalto, but the more established French MDM player is Orchestra Networks. Founded in 2000, this company has been selling its wares in the French market since 2003, and has built up some solid customer references, mainly in the financial services arena but also with global names such as Sanofi Aventis and Kraft.

The great strength of their EBX technology is the elaborate support for complex business process workflow, an area neglected by most MDM vendors. For example a customer may have an international product code hierarchy, and distribute this to several regions. Each of the regional branches may make local amendments to this, so what happens when a new version of the international hierarchy is produced? EBX provides functionality to detect differences between versions or branches and to allow for merging of these versions, supporting both draft “project” master data and the production versions, keeping track of all changes and supporting the workflow rules to support the full life-cycle of master data creation and update.
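
The regional-versus-international scenario is essentially a three-way merge. The Python sketch below shows the general idea only; it is not EBX's actual algorithm or API. The hierarchy representation (a dict from code to parent code, with top-level codes under a "ROOT" marker), the function names and the sample product codes are all hypothetical.

```python
def diff(base, other):
    """Changes that turn `base` into `other`: {code: new_parent}, None = removed."""
    changes = {}
    for code in base.keys() | other.keys():
        if base.get(code) != other.get(code):
            changes[code] = other.get(code)
    return changes


def merge(base, regional, international):
    """Three-way merge; codes changed differently on both sides become conflicts."""
    regional_changes = diff(base, regional)
    international_changes = diff(base, international)
    merged = dict(base)
    conflicts = []
    for code in sorted(regional_changes.keys() | international_changes.keys()):
        in_r = code in regional_changes
        in_i = code in international_changes
        if in_r and in_i and regional_changes[code] != international_changes[code]:
            conflicts.append(code)  # leave the base value; a person must decide
            continue
        new_parent = regional_changes[code] if in_r else international_changes[code]
        if new_parent is None:
            merged.pop(code, None)
        else:
            merged[code] = new_parent
    return merged, conflicts


# Hypothetical product hierarchy: code -> parent code.
base = {"BEVERAGES": "ROOT", "SOFT_DRINKS": "BEVERAGES", "JUICES": "BEVERAGES"}
# The region adds ICED_TEA and moves JUICES under SOFT_DRINKS.
regional = {"BEVERAGES": "ROOT", "SOFT_DRINKS": "BEVERAGES",
            "JUICES": "SOFT_DRINKS", "ICED_TEA": "SOFT_DRINKS"}
# The new international version adds HEALTH and moves JUICES under it.
international = {"BEVERAGES": "ROOT", "SOFT_DRINKS": "BEVERAGES",
                 "HEALTH": "ROOT", "JUICES": "HEALTH"}

merged, conflicts = merge(base, regional, international)
```

Here the two additions merge cleanly, while JUICES is flagged as a conflict because each side moved it somewhere different. That is the kind of case where workflow rules and a human decision are needed rather than an automatic answer.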

Typically such functionality is delivered only by PIM vendors (Kalido is an exception), yet EBX is fully multi-domain by design, so is not restricted to any one class of master data. This will give it an advantage in competitive situations with vendors who have historically designed their technology around one type of master data (customer or product) and are only now realising the need to support multiple domains.

So far Orchestra Networks has confined itself to France, but opens its first overseas office in London soon. The company has taken the time to build out its technology to a solid level of maturity, and has productive partnerships with Informatica (for data quality and ETL) and Software AG, who OEM EBX and sell it globally at the heart of their own MDM offering.

In my own experience of MDM projects, the handling of the business processes around creating and updating master data is a key issue, yet most hub vendors have virtually ignored it, assuming this is somehow “out of scope”. Hub vendors typically focus on system to system communication e.g. validating a new customer code by checking a repository, and perhaps suggesting possible matches if a similar name is found. This is technically demanding as it is near real-time. However human to system interaction is also important, especially outside the customer domain, where business processes can be much more complex. By providing sophisticated support for this workflow Orchestra Networks can venture into situations where CDI vendors cannot easily go, and as I have written previously there are plenty of real business problems in MDM beyond customer.
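
For readers unfamiliar with the system to system check described above, here is a rough Python sketch using the standard library's difflib. The repository contents, the function name and the 0.8 similarity cutoff are assumptions for illustration; real hubs use far more sophisticated matching and must do it in near real-time at scale.

```python
import difflib

# Hypothetical repository of existing customers: code -> name.
REPOSITORY = {
    "C-1001": "Acme Holdings Ltd",
    "C-1002": "Globex Corporation",
    "C-1003": "Initech Inc",
}


def check_new_customer(code, name, cutoff=0.8):
    """Validate a proposed customer against the repository."""
    if code in REPOSITORY:
        return {"status": "duplicate_code", "matches": [code]}
    # Compare names case-insensitively and suggest up to three near matches.
    by_name = {existing.lower(): c for c, existing in REPOSITORY.items()}
    close = difflib.get_close_matches(name.lower(), list(by_name), n=3, cutoff=cutoff)
    if close:
        return {"status": "possible_match", "matches": [by_name[m] for m in close]}
    return {"status": "ok", "matches": []}


result = check_new_customer("C-2000", "Acme Holdings Limited")
```

In this sketch the new record is not rejected. It is returned with a suggested match, and what happens next (who reviews it, who approves a merge) is exactly the human to system workflow that hub vendors tend to leave out.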

It will be interesting to see how Orchestra Networks fares as it ventures outside of France in 2008.

Labours of Hercules

December 22, 2007

If you want to understand the key issues around an MDM project then I would encourage you to read an excellent case study that has just been published in CIO magazine. It discusses a major project to try and sort out the master data at Nationwide Insurance (one of the largest US insurers). The case study illustrates the kind of organisational and business decisions that need to be made in order to succeed with this type of project. It is an unusually detailed write up of how a company with fourteen general ledgers, seventeen finance data warehouses alone and 300,000 spreadsheets took a root and branch approach to radically improving the situation of their master data, and by and large seem to have succeeded.

It is also daunting reading in one way, as it shows the level of business commitment that is required to sustain a project of this scale. Anyone thinking that a purchase of an MDM tool and a small project team with a million dollars or so of budget will do the job needs to read this case study. I have seen a few serious MDM projects in my time but this was certainly one of the more ambitious ones.

Finally, a very happy Christmas to you, my readers. Have a lovely holiday.

The Gaul of it

December 18, 2007

I came across an interesting new MDM vendor recently called Amalto, a start-up from Paris (though they already have a California office). They have only been selling their software for less than a year, but already have a good set of early customers, such as Rio Tinto, Total, SNCF and BNP Paribas. Their Xtentis product offers a generic MDM repository with data movement (EAI like) functionality, and they make heavy use of standards (Eclipse, Ajax etc). Unusually, they use an XML database rather than a relational database as their underlying storage mechanism. Given the relatively low data volumes typical in MDM applications, this approach seems interesting, since XML databases are strong at handling data with complex structures (e.g. variable depth hierarchies) that one often encounters in master data. In case you think XML databases are unproven, Berkeley DB is probably the most widely deployed DBMS in the world, being embedded in many mobile phones, for example, and most phone users don’t have deep DBA skills. On a parochial note, it is nice to see a European software company emerging for a change (another MDM vendor is Orchestra Networks, also French).
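
As a small illustration of why XML suits variable depth hierarchies, the Python sketch below walks a nested location structure with the standard library's ElementTree. It is not Xtentis or any XML database; the element names and sample locations are hypothetical. The point is only that the nesting itself carries the hierarchy, so no fixed number of levels has to be designed in up front.

```python
import xml.etree.ElementTree as ET

# Hypothetical location master data; branches have different depths.
LOCATIONS = """
<location name="Europe">
  <location name="France">
    <location name="Paris">
      <location name="La Defense office"/>
    </location>
  </location>
  <location name="Luxembourg"/>
</location>
"""


def walk(node, depth=0):
    """Yield (depth, name) for every location, however deep the nesting goes."""
    yield depth, node.get("name")
    for child in node.findall("location"):
        yield from walk(child, depth + 1)


rows = list(walk(ET.fromstring(LOCATIONS)))
max_depth = max(depth for depth, _ in rows)
```

A relational design would typically need either a fixed set of level columns or a self-referencing parent key with recursive queries to represent the same structure.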

Though an early stage company, Amalto is making good progress in the French market and in 2008 will start to expand to the USA. If they can firm up their positioning (confusingly, they also have a product for B2B exchanges, a quite different market, resold by Ariba) and develop good systems integration partnerships in the US then they should be an interesting addition to the MDM space. Their technology is innovative and their early customer stories sound promising.

Never mind the quality, feel the width

December 7, 2007

Frank Buytendijk (ex Gartner analyst, now with Oracle) makes an important point about data quality on his blog: it is inherently dull. This in itself causes problems both for people within organisations who care about data quality (there must be a few of you out there) and for data quality vendors, who struggle to sell their products at a decent price point in sufficient numbers. I have written about this before, when I pointed out just a couple of real life cases of poor data quality that I have personally encountered, each of which cost many millions of dollars.

The reason that data quality is generally excellent in the area of salary and expense processing is that people care deeply about what they get paid, and you can be pretty sure that any clerical errors get spotted and complained about very quickly. However in most cases poor data quality occurs because people are asked to enter or maintain data for which they see no personal or even obvious company benefit. Data that is useful for “some other department” is never going to receive the same care and attention that your own personal expense claims get.

As Frank says, in order to move data quality higher up the enterprise priority list, it needs to widen its perspective: move beyond talking about customer names and addresses. Yes, this is important if you are doing mailshots, and certainly poor customer name and address management can have more serious consequences, but most executives have got better things to do than worry about whether their mailshots are being duplicated.

Despite numerous acquisitions over the years (First Logic, Similarity, Vality, …) there are still plenty of small data quality vendors out there, some with very interesting technology. Yet aside from Trillium, few have managed to get even into double figures of millions of revenue. This is not due to an absence of a real problem to address.

Some data quality vendors rightly see master data management as a way of repositioning their offerings in a more fashionable area, but they need to realise that data quality is just a feature of a complete MDM solution. Hence they need to partner with broader-based MDM repository vendors who themselves often lack proper data quality technology, rather than pretending they themselves are a complete solution. They should also do a better job of highlighting quantified customer dollar benefits achieved from the use of data quality technology. This should not be hard to do since data quality projects usually have excellent payback. Yet time after time the example used in data quality collateral is the tired name and address cleanup, followed by an esoteric discussion about whether probabilistic or deterministic matching is better (paying customers don’t care - they are interested in what benefits they see). Far too few data quality case studies mention hard-dollar benefits to the customer.

Data quality should have much going for it: it is a very real problem, the condition of data quality in most large organisations is horrible (and far worse than generally realised), and the costs of this are significant and cause genuine and in some cases very serious operational problems. Yet the industry as a whole has done a poor job of explaining itself to the people with the cheque books in enterprises.

The dust settles

November 29, 2007

I had a chance recently to dig a little deeper into the recent acquisition of Purisma by Dun & Bradstreet. The way that this news leaked out was a case study in how not to do software PR. The news came out in an investor briefing by D&B, and there were no clues as to whether D&B was even going to continue selling the Purisma technology, or just use it for internal purposes. After all, D&B has daunting master data issues. It tracks 150 million companies, and one and a half million updates a day are made to the company information it sells: plenty of data management implications there. So what did this mean to Purisma customers and prospects? No clues were offered.

Having now spoken to Bob Hagenau, who was VP of products and co-founder of Purisma, the smoke has cleared a little. Purisma will be retained as a stand-alone business unit, with its own enterprise sales force. The Purisma technology will continue to be sold in its present form, though it is too early to say what the technology roadmap will look like; I am going to take a wild stab in the dark and bet that further integration with D&B data will feature. Clearly the D&B name brings many benefits: a parent with deep pockets, a customer base that is essentially all large corporations and so a potentially wonderful leads channel. However the botched news release shows the dark side of a large parent in a different core industry: falling foul of the corporate bureaucracy, in this case the corporate press office.

Hopefully, as the acquisition beds down, Purisma will learn how to work the D&B corporate systems and avoid future press gaffes, while taking advantage of the undoubted resources that D&B can bring to bear.
