Tuesday, October 11, 2011

MDM - Making it Actionable/Transactional as you Define it

How useful is your MDM, really? Does it just sit in a repository, waiting for your MDM team to update it? One of the most common criticisms of MDM projects is their sheer magnitude relative to the ROI they deliver. More than likely, you are in the middle of such a project, with great expectations of value.
[Diagram: the scope of metadata captured for MDM by Agile Integration Software]
When most people think of metadata, the scope is limited. It's a schema that defines a virtual data set, for example. A Data Master may include meaningful keywords and tags for identification. It may include a cross-reference in a lookup table. And maybe it includes definitions of what each element means and what unit of measure it is in. Then what? Then you have to add references to where the data ought to come from. But then what? You've spent quite a lot of resources defining this. Are you any better off than you were with the ancient "Corporate Dictionary"? How do you actually use it?
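
To make that starting point concrete, here is a minimal sketch, in Python with invented names, of the kind of definition described above: schema elements with meanings and units, identifying tags, a cross-reference lookup, and references to the intended sources. It is only a sketch of the idea, not any particular product's metadata model.

```python
# A minimal sketch (all names invented) of a master-data definition captured as
# metadata: schema elements with meanings and units, identifying tags, a
# cross-reference lookup, and references to where the data ought to come from.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ElementDef:
    name: str                         # element name in the master schema
    data_type: str                    # logical type, e.g. "string", "decimal"
    meaning: str                      # business definition of the element
    unit: Optional[str] = None        # unit of measure, if any
    source_ref: Optional[str] = None  # where the data ought to come from


@dataclass
class MasterDefinition:
    name: str
    tags: List[str] = field(default_factory=list)                  # keywords for identification
    cross_reference: Dict[str, str] = field(default_factory=dict)  # lookup table
    elements: List[ElementDef] = field(default_factory=list)


customer = MasterDefinition(
    name="Customer",
    tags=["customer", "party", "account"],
    cross_reference={"ERP_CUST_NO": "customer_id", "SFDC_ACCT_ID": "customer_id"},
    elements=[
        ElementDef("customer_id", "string", "Unique corporate customer identifier"),
        ElementDef("credit_limit", "decimal", "Approved credit limit",
                   unit="USD", source_ref="erp.customers.credit_limit"),
    ],
)
```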

The most common ways to implement Master Data definitions are indicative of Big Projects:

1. Define a data warehouse to store the data, so that it is accessible in the form defined in the Data Master. Once the data warehouse is designed, corresponding integration must be built to populate it from the appropriate sources, aggregating and transforming as needed, and running as often as necessary to keep latency low.

2. Write web services to access the data from the sources and make it available as Master Data sets (a sketch of such a service follows).
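
To make approach 2 concrete, here is a minimal hand-rolled sketch of such a service in Python (Flask is used purely for brevity; the source-access functions are hypothetical stubs standing in for real ERP/CRM/database calls).

```python
# An illustrative, deliberately simple version of approach 2: a hand-written
# web service that assembles a "Customer" master record from its sources on
# demand. The source-access functions are stubs, not real connectors.
from flask import Flask, jsonify

app = Flask(__name__)


def fetch_from_erp(customer_id):
    # stand-in for a real ERP query
    return {"customer_id": customer_id, "credit_limit": 50000}


def fetch_from_crm(customer_id):
    # stand-in for a real CRM query (e.g. Salesforce.com)
    return {"name": "Acme Corp", "account_owner": "jdoe"}


@app.route("/master/customer/<customer_id>")
def get_customer(customer_id):
    # Aggregate the pieces into the shape the Master Data definition calls for.
    record = {}
    record.update(fetch_from_erp(customer_id))
    record.update(fetch_from_crm(customer_id))
    return jsonify(record)


if __name__ == "__main__":
    app.run(port=8080)
```

The catch, of course, is that every such service is hand-maintained code: when a source or the master shape changes, someone has to change the service to match.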

When I talk about metadata, I think in terms of representing not only the data schemas but also the metadata that describes where the data is, what part of it is relevant, how it aligns with other data of interest, how you or the real or virtual destination (the master) needs to see it, and how it must be converted, or mapped, to be meaningful to the destination. Then there are the events that trigger data flows, and all the surrounding logic: notifications, security, and a host of other things. If you can capture all of this information as metadata, in reusable, separable "layers," you will have a highly flexible and "actionable" collection of metadata.
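
As a rough illustration of what such layers might look like when captured as data (the structure and every name below are assumptions of mine, not a specific product's format):

```python
# A sketch of the "layers" idea: each concern is captured as its own small,
# reusable piece of metadata, and a data flow simply references the layers it
# needs. Every name below is illustrative.

schema_layer = {
    "Customer": ["customer_id", "name", "credit_limit"],
}

source_layer = {
    "erp.customers": {"system": "ERP",            "relevant_fields": ["CUST_NO", "CRED_LIM"]},
    "sfdc.accounts": {"system": "Salesforce.com", "relevant_fields": ["Id", "Name"]},
}

mapping_layer = {   # how source fields are converted/mapped to the destination view
    "Customer": {
        "customer_id":  "erp.customers.CUST_NO",
        "name":         "sfdc.accounts.Name",
        "credit_limit": "erp.customers.CRED_LIM",
    },
}

event_layer = {     # the events that trigger data flows
    "customer_changed": {"trigger": "erp.customers.updated", "flow": "refresh Customer"},
}

policy_layer = {    # notifications, security, and the rest of the surrounding logic
    "Customer": {"notify": ["data-stewards@example.com"], "read_roles": ["sales", "finance"]},
}

# Because each layer is separate, swapping a source or changing a trigger only
# touches that one layer; the master schema and its consumers are untouched.
```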

If you define a metadata Master, say, "Customer," for use corporate-wide, you will have several different sources in play to ensure that each part of the virtual "Customer" definition gets the best information from the most appropriate source. Part may come from your ERP, part from Salesforce.com, and another part from an Oracle database. Does your Master definition encapsulate everything you need to use the data? That is, can your metadata be pumped onto a message bus? Can it be packaged as a web service? As an ADO.net object? As a SharePoint external content type? Does it incorporate the capabilities to perform CRUD (Create, Read, Update and Delete) operations at the endpoints? If one of the source schemas changes, do you have to do anything to accommodate it? Do you even need to know a source changed?
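
One way to picture what those questions demand is a resolver that is driven entirely by the metadata, so that a change in a source binding touches only the metadata, never the consumers. The sketch below is my own simplification, with stub connectors standing in for the ERP, Salesforce.com, and Oracle sources.

```python
# A sketch of "actionable" metadata at runtime: a generic resolver walks the
# mapping metadata for a master ("Customer") and pulls each element from
# whatever source the metadata names. Connectors and mappings are hypothetical.

def erp_lookup(field, key):
    return {"CUST_NO": key, "CRED_LIM": 50000}[field]

def sfdc_lookup(field, key):
    return {"Name": "Acme Corp"}[field]

def oracle_lookup(field, key):
    return {"REGION": "EMEA"}[field]

CONNECTORS = {"erp": erp_lookup, "sfdc": sfdc_lookup, "oracle": oracle_lookup}

# Mapping metadata: master element -> (source system, source field).
CUSTOMER_MAPPING = {
    "customer_id":  ("erp",    "CUST_NO"),
    "name":         ("sfdc",   "Name"),
    "credit_limit": ("erp",    "CRED_LIM"),
    "region":       ("oracle", "REGION"),
}

def resolve_master(mapping, key):
    """Assemble a master record purely by following the mapping metadata."""
    record = {}
    for element, (system, source_field) in mapping.items():
        record[element] = CONNECTORS[system](source_field, key)
    return record

print(resolve_master(CUSTOMER_MAPPING, "C-1001"))
# If credit_limit later moves to the Oracle source, only CUSTOMER_MAPPING
# changes; resolve_master and everything built on top of it stay the same.
```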

If I'm a programmer, I want to leverage the corporate Master Data for my programs and the users of my programs. I can look up the data definitions, sources, etc., and use them, but that still requires a lot of work. When the Master Data includes a full set of metadata, all I have to do is invoke the web service, the External Content Type in SharePoint, the ADO.net object, and so on. I simply select the Master I need and indicate how I want to use it. I don't need to know what the various sources even are, and if a source changes, I won't need to make any changes, since the metadata will reflect whatever it needs to. And I can pass that selection process on to the end user of my application or dashboard.
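
From that programmer's seat, the experience might look something like the following sketch. The catalog and its objects are invented for illustration; in practice the handle would be the web service, External Content Type, or ADO.net object mentioned above.

```python
# A hypothetical consumer-side view: the programmer only names the master and
# uses it; the catalog (faked here with an in-memory stub) hides the sources
# entirely. Nothing below is a real product API.

class MasterHandle:
    """Stand-in for whatever the MDM layer hands back: a CRUD-capable object."""

    def __init__(self, name):
        self.name = name
        self._store = {"C-1001": {"name": "Acme Corp", "credit_limit": 50000}}

    def read(self, key):
        return self._store.get(key)

    def update(self, key, **changes):
        self._store.setdefault(key, {}).update(changes)


def get_master(name):
    # In a real metadata-driven setup this would consult the master's metadata
    # and wire up the appropriate connectors; here it simply returns a stub.
    return MasterHandle(name)


customer = get_master("Customer")        # select the Master I need...
print(customer.read("C-1001"))           # ...and use it, without knowing the sources
customer.update("C-1001", credit_limit=75000)
```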

The diagram above shows the scope of metadata captured for MDM by Agile Integration Software. The metadata is generated from a GUI and has an atomic structure, so a change to any piece of metadata can be made without impacting the whole hierarchy of metadata. Using this type of metadata infrastructure, changes are absorbed without creating waves. Data is accessed directly from the original sources, eliminating the need for a costly data warehouse to resolve virtual relationships across sources.



