Andy on Enterprise Software

Opening up data quality

May 7, 2008

There is an interesting web forum which seeks to bring an open source approach to the world of data management. Of interest are topics involving the creation of open source de-duplication, profiling, matching and cleansing tools (hat tip to CW for pointing this out).

No doubt the tools here are at an early stage and won’t directly compare in broad functionality with a major data quality vendor. However, for many people with less sophisticated requirements that may not matter. The rise of products like MySQL has shown how influential an open source product can become given the right circumstances.

I would be very interested as to whether any readers of the blog have any experience with the tools here, or any views on the merits or otherwise of an open approach to data quality and data integration.


A Burning Platform

April 30, 2008

I was amused by a piece regarding data quality in which a data quality initiative at a chemical manufacturer was kicked off only after a warehouse burnt down and the company discovered that they had no way of tracing which customers would be affected. I recall a similar example at Shell, where a data quality initiative received a serious management boost when an oil well was drilled into an existing well (fortunately it was not in use at the time) due to faulty positional data in a computer system.

These kinds of incidents demonstrate that management care about data quality when there is a crisis, but when things are running smoothly it is usually a long way down the priority lists of businesses. This is a puzzle in many ways, and frustrating for the data quality software industry, where few vendors see widespread deployment in enterprise initiatives. Usually data quality software is implemented (if it is at all) in a piecemeal, project-by-project fashion, and many companies have no data quality software deployed at all.

Perhaps it is just one of those subjects that few get excited over. Author Kurt Vonnegut once said: “Another flaw in the human character is that everybody wants to build and nobody wants to do maintenance.” Perhaps this is an inherent problem with data quality - it is hard to make it appealing to executives. Some more creative marketing by the industry could perhaps change this perception.


Tilting at Windmills

April 22, 2008

I wrote recently about likely further consolidation in the MDM market. A further example, albeit on a small scale, happened today as FullTilt, a PIM provider, was bought by QAD. QAD is a public company selling ERP software, with around 1,500 employees; it has been listed since 1997, though its history goes back to 1987. FullTilt was known by industry insiders to be “in play” for many months, and has been openly for sale for some time. It is a relatively small company that has struggled to get scale, and so a deeper-pocketed parent makes some sense for it.

This is another example of companies in more mature markets seeking to get exposure to the fast growing MDM market. It will not be the last such move.


Snack Time?

April 11, 2008

The MDM market, from a term barely used before 2003, has grown up into a market where literally dozens of vendors compete. As with any fast growing market, the big players have tended to acquire rather than build their own technology. IBM bought Trigo and DWL, Oracle bought Siebel and Hyperion, SAP bought A2i, TIBCO bought Velosel, Teradata licensed the code to I2’s MDM offering, D&B bought Purisma, Microsoft has bought Stratature and the list goes on.

In my view this consolidation, as vendors seek to fill out their product lines, is far from complete. For any data integration vendor it makes sense to have an MDM offering. TIBCO has already made its move, but this still leaves Informatica (so far just buying a data quality tool in Similarity) and even Ab Initio, though it is hard to figure out anything much about that secretive vendor. This also leaves Sun (with SeeBeyond) and other EAI vendors, and perhaps even a dark horse like EMC from a storage perspective.

While a number of technologies have already been purchased, as noted earlier, there are still several MDM independents out there that could potentially be bought up: Initiate, Siperian, Kalido, Visionware and, in France, Orchestra Networks and Amalto. Some of these are big mouthfuls, but there are plenty of vendors with deep pockets who may like the look of a fast growing market (over 50% annual growth in the coming years predicted by Forrester).

MDM platform vendors themselves should consider whether to plough their own furrow or be vendor-neutral with regards to data quality. Some have their own algorithms, some partner with one or more data quality vendors, some offer a choice. There are a scary number of data quality vendors out there, many of them quite bite-sized (few have more than USD 10M in revenue) and so I would be surprised if nothing happened here in the next twelve months. Moreover MDM platform vendors from a CDI heritage often get a rabbit-in-the-headlights look when asked about data governance and complex workflow (a screen or two to handle customer address mismatches does not count as business process workflow support in my book). By contrast vendors from the PIM or analytic MDM world often have quite sophisticated offerings here, but may lack raw high volume performance. Hence there is room for some of the smaller vendors with tasty technology offerings to be snapped up by those wishing to fill out their technology stack.

If there were a big move, e.g. Informatica buying Initiate, this might trigger a response from other MDM platform vendors, who at the least could then decide to gobble up a data quality vendor like SilverCreek or Exeros in order to expand their own technology stack.

All in all, I expect the next 18 months will be an interesting time in the MDM market from a mergers and acquisitions perspective. After all, those silver-tongued investment bankers will need something to do with their time now they can’t jam dodgy credit derivatives down gullible corporate treasurers’ throats any more.


Squaring the MDM circle

April 7, 2008

Jill Dyche raises an important point about how companies are tackling MDM. She mentions the “random acts of MDM” that are done in isolation in a particular business area, or involving a particular data domain, which are unlikely to evolve into an enterprise-wide MDM solution.

The tricky issue that companies face is that MDM is a genuinely large-scale endeavour, and because we all know how well giant enterprise projects usually go, they are understandably reluctant to take on an enterprise-wide project. Instead they pick off an easier piece, such as one particular data type, or perhaps a broader set of master data types but only in a subset of the enterprise, say across one division. As Jill says, such isolated initiatives won’t in themselves magically grow into enterprise MDM.

There is a further danger in disconnected initiatives. At this point the vendor technology out there is at very different stages of maturity depending on what kind of data you want to tackle, and on what scale. Some vendors have a well proven customer hub technology, but with limited experience in tackling product data (and may lack key functionality to do this e.g. attribute inheritance) and usually have very limited ideas about business process workflow and data governance support. Other vendors from the PIM or the analytic MDM world usually have much better business workflow support, yet may have limited scalability e.g. it would be a brave person who tried doing a 100 million record customer hub using a PIM product. The vendors with a CDI heritage are adding more workflow capability, and the PIM and analytic MDM vendors are working on scalability, but these are works in progress rather than completed and tested features and functions. Hence separate initiatives may end up using different technologies due to the demands of a particular area, and it would be easy to end up with one technology to handle product data, another for customer data, and maybe another where analytics were the driver.

In my view you need to combine an enterprise-wide vision with a practical, bite-sized approach i.e. think big but start small. You can build a broad enterprise strategy that encompasses data governance processes for example, even if you decide to build out actual master data hubs in a stepwise fashion, beginning with certain high value data domains or company divisions that can best benefit from improved master data. However you need to keep the big picture in mind in order to avoid (or minimise) duplicate technology investments that may prove hard to fit together. There are no magic bullets here, but enterprise architects need to put in place the processes and broad strategy that will lead to better master data in the long term, even if the technology to deliver across the enterprise is only partly here today. Setting up proper data governance, and getting business people committed to it, should have real benefits and will be valid efforts whatever technologies are deployed.


XML is not enough

March 28, 2008

I just read a particularly clear explanation of how XML contributes to helping with, but does not really solve, the problem of data integration. This is a major issue as companies begin to deploy applications in the form of services, since as you bring elements of an application together via web services you usually also have to worry about how the data used by one application is going to be passed to another. There are just too many versions of XML, and insufficient semantic integration support, to just say “ah, we don’t need to worry about that - we are XML compliant”, yet this is exactly the marketing position of some vendors. As the article points out, a higher degree of semantic integration is needed. Master data management applications seek to provide this by establishing a repository of trusted information which has the necessary level of understanding to map the various definitions of “customer”, “product”, “fixed asset”, “location” etc together.

Whether you deploy such an application in a “co-existence” mode or “operational” mode is less important than going through the process of mapping together the competing definitions of master data strewn throughout any large company. Having a dial tone on my telephone enables me to phone someone in Argentina, but does not mean that we can communicate unless we also speak the same language. In the same way XML is a useful, but insufficient, building block in the path to data reconciliation in the enterprise. Only higher level semantic-based models are going to do that, and they will be hard work to implement given the amount of human interaction between different departments and company subsidiaries needed to resolve the differences that have built up over time.
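To make the dial tone point concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the two XML snippets, the element names and the `MAPPINGS` table are hypothetical and do not come from any particular product. Both systems emit perfectly valid XML, yet they only become comparable once someone has agreed a mapping onto a shared definition of “customer”.

```python
import xml.etree.ElementTree as ET

# Two hypothetical systems: both produce valid XML, but they disagree
# on what a customer record is called and how its fields are named.
BILLING_XML = "<customer><cust_no>C-1001</cust_no><name>Acme Ltd</name></customer>"
CRM_XML = "<account><id>C-1001</id><company>ACME Limited</company></account>"

# Hand-maintained semantic mapping: source element -> canonical attribute.
# Agreeing this table between departments is the hard, human part of the job.
MAPPINGS = {
    "customer": {"cust_no": "customer_id", "name": "customer_name"},
    "account": {"id": "customer_id", "company": "customer_name"},
}


def to_canonical(xml_text):
    """Translate one source record into the shared canonical vocabulary."""
    root = ET.fromstring(xml_text)
    mapping = MAPPINGS[root.tag]
    return {mapping[child.tag]: child.text for child in root if child.tag in mapping}


billing = to_canonical(BILLING_XML)
crm = to_canonical(CRM_XML)

# Only after mapping can the two records be recognised as the same customer.
same_customer = billing["customer_id"] == crm["customer_id"]
```

Note that the parsing is trivial; the value lies entirely in the mapping table, which XML itself does nothing to supply.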


MDM outside the box

February 27, 2008

MDM is all about connecting up and managing versions of what should be shared data. Generally people worry about their internal systems, since numerous distinct systems in an enterprise typically think that they “own” a particular piece of master data such as customer, product or asset. However, in addition to this it is also important to consider external data i.e. data that you do not control within an enterprise but which you still need to relate to your own data. There is more of this than you might imagine, and MDM vendors would be well advised to take account of it. One example is Siperian’s recent link to Standard & Poor’s counter-party data, but there are many other cases. Purisma did such a good job of linking to D&B data that D&B bought the company.

Other types of common external data are consumer data from Acxiom or Experian, patient data from IMS, financial and news data from Reuters and doubtless many more, some of which will be industry specific. Mapping such data back to internal systems is an important and non-trivial job e.g. D&B company data changes regularly, so it is more than just a one-off exercise.

MDM platform vendors would be well advised to consider offering more pre-built links of this kind for their platforms. At present there are fairly few of them, yet customers will surely want them more and more as they deploy MDM more widely.


Shameless Self Promotion

February 11, 2008

There is an MDM whitepaper which you can download (free with registration) from the Bloor website:

https://www.bloor-research.com/research/white_paper/908/master_data_management.html

It is a high level overview of the MDM market, and discusses general trends and issues rather than getting into vendor specifics; it does include a new high level functionality model for MDM products. Unsolicited feedback thus far has included:

“Very comprehensive and detailed”
“Great Job”
“Very well written”
“Right on the money”
“One of the best papers I have read on MDM”.

Of course, I may be a little biased, but it may be worth a look…

Thanks to some readers of this blog who provided feedback on my early drafts; much appreciated.


The MDM Blues

January 31, 2008

After living in denial for some time, IBM has got the “multi domain” message about MDM which I have been bleating on about at length for years. It has just announced a repackaging of its MDM offerings under the banner “IBM Infosphere MDM Server”. This puts IBM firmly on the path of a server architecture that can deal with multiple types of MDM data in a consistent manner, not just customer and product but all the many other kinds of master data e.g. location, asset, contract, brand, financial profile and so on. IBM has been sensibly enabling its MDM offerings in an SOA context, and MDM Server comes with 800 pre-packaged SOA services that can be invoked. IBM has bought high quality MDM technology and now at last has a strong vision of how to bring it all together.

However it is worth emphasising that this is a roadmap. For now there will remain the separate CDI hub technology (bought from DWL) and the PIM Hub technology (bought from Trigo). Over time these technologies will be integrated with common services, but this is a multi-release strategy. It is great news that IBM has finally realised that multi-domain is the right way to go, but prospects and customers need to reassure themselves about whether the roadmap meets their time horizons.


Orchestrating MDM Workflow

December 28, 2007

France is rarely associated with enterprise software innovation (test: name a French software company other than Business Objects) but in MDM there are two interesting vendors. I have already written about Amalto, but the more established French MDM player is Orchestra Networks. Founded in 2000, this company has been selling its wares in the French market since 2003, and has built up some solid customer references, mainly in the financial services arena but also with global names such as Sanofi Aventis and Kraft.

The great strength of their EBX technology is the elaborate support for complex business process workflow, an area neglected by most MDM vendors. For example a customer may have an international product code hierarchy, and distribute this to several regions. Each of the regional branches may make local amendments to this, so what happens when a new version of the international hierarchy is produced? EBX provides functionality to detect differences between versions or branches and to allow for merging of these versions, supporting both draft “project” master data and the production versions, keeping track of all changes and applying the workflow rules needed to cover the full life-cycle of master data creation and update.
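The branch-and-merge problem described above can be sketched in a few lines of Python. To be clear, this is not how EBX works internally, which I have not seen; it is a generic three-way merge under simplifying assumptions, with hypothetical product codes, where each hierarchy is just a mapping from a product code to its parent code.

```python
def diff(old, new):
    """Return the changes that turn hierarchy `old` into `new`.

    Each hierarchy maps a product code to its parent code.
    A value of None in the result means the code was removed.
    """
    changes = {}
    for code in old.keys() | new.keys():
        if old.get(code) != new.get(code):
            changes[code] = new.get(code)
    return changes


def merge(base, international, regional):
    """Three-way merge: replay regional amendments onto the new international version."""
    intl_changes = diff(base, international)
    regional_changes = diff(base, regional)
    merged = dict(international)
    conflicts = []
    for code, parent in regional_changes.items():
        if code in intl_changes and intl_changes[code] != parent:
            conflicts.append(code)  # both sides changed it differently: needs a human
        elif parent is None:
            merged.pop(code, None)  # the region removed this code
        else:
            merged[code] = parent  # the region added or moved this code
    return merged, sorted(conflicts)


# Hypothetical example: head office adds JUICE, the region adds MATE under TEA.
base = {"BEV": "ROOT", "COFFEE": "BEV", "TEA": "BEV"}
international_v2 = {"BEV": "ROOT", "COFFEE": "BEV", "TEA": "BEV", "JUICE": "BEV"}
regional = {"BEV": "ROOT", "COFFEE": "BEV", "TEA": "BEV", "MATE": "TEA"}

merged, conflicts = merge(base, international_v2, regional)
```

The codes that land in `conflicts` are exactly where workflow matters: a person with the right authority has to decide which version wins, and that decision needs to be recorded.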

Typically such functionality is delivered only by PIM vendors (Kalido is an exception), yet EBX is fully multi-domain by design, so is not restricted to any one class of master data. This will give it an advantage in competitive situations with vendors who have historically designed their technology around one type of master data (customer or product) and are only now realising the need to support multiple domains.

So far Orchestra Networks has confined itself to France, but opens its first overseas office in London soon. The company has taken the time to build out its technology to a solid level of maturity, and has productive partnerships with Informatica (for data quality and ETL) and Software AG, who OEM EBX and sell it globally at the heart of their own MDM offering.

In my own experience of MDM projects, the handling of the business processes around creating and updating master data is a key issue, yet most hub vendors have virtually ignored it, assuming this is somehow “out of scope”. Hub vendors typically focus on system to system communication e.g. validating a new customer code by checking a repository, and perhaps suggesting possible matches if a similar name is found. This is technically demanding as it is near real-time. However human to system interaction is also important, especially outside the customer domain, where business processes can be much more complex. By providing sophisticated support for this workflow Orchestra Networks can venture into situations where CDI vendors cannot easily go, and as I have written previously there are plenty of real business problems in MDM beyond customer.
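For readers who have not seen the system to system check mentioned above, here is a minimal Python sketch of it. The repository contents, the function name `check_new_customer` and the 0.6 similarity cutoff are all assumptions chosen for illustration, and the standard library's `difflib` stands in for the far more sophisticated matching algorithms that real hubs use.

```python
import difflib

# Hypothetical master repository: customer code -> registered name.
REPOSITORY = {
    "C-1001": "Acme Limited",
    "C-1002": "Globex Corporation",
    "C-1003": "Initech plc",
}


def check_new_customer(code, name, cutoff=0.6):
    """Validate a proposed customer against the repository.

    Returns (status, candidates), where candidates are the existing codes
    whose names look similar to the proposed one.
    """
    if code in REPOSITORY:
        return "code_exists", [code]
    # Compare names case-insensitively and map matches back to their codes.
    by_name = {existing.lower(): c for c, existing in REPOSITORY.items()}
    close = difflib.get_close_matches(name.lower(), list(by_name), n=3, cutoff=cutoff)
    if close:
        return "possible_duplicate", [by_name[match] for match in close]
    return "ok", []


status, candidates = check_new_customer("C-2000", "Acme Ltd")
```

What the sketch cannot show is the human side: when the status comes back as a possible duplicate, somebody still has to decide whether it really is the same customer, and that is where workflow support earns its keep.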

It will be interesting to see how Orchestra Networks fares as it ventures outside of France in 2008.
