Two common criticisms of Master Data Management (MDM) projects are their sheer size and their low ROI. More than likely, you are in the middle of such a project, with great expectations of value.
Metadata and MDM
When most people think of metadata, the scope is limited. It's a schema that defines a virtual data set, for example. It may include a cross-reference in a lookup table. And maybe it includes definitions of what each element means and what unit of measure it is in.
Then you have to add references to where the data ought to come from.
But then what?
You've spent quite a lot of resources defining this. Are you any better off than with the ancient "Corporate Dictionary"? How do you actually use it?
The most common ways to implement Master Data definitions are indicative of Big Projects:
1) Define a data warehouse to store the data in, so that it is accessible in the form defined in the Data Master. Once the data warehouse is designed, corresponding integration must be built to populate it from the appropriate sources, aggregating and transforming as needed, as often as necessary for minimal latency.
2) Write web services to access the data from the sources and make them available as Master Data sets.
When I talk about metadata, I mean more than the data schemas. I also mean the metadata that describes where the data is, what part of it is relevant, how it aligns with other data of interest, how you or the real or virtual destination (the master) need to see it, and how it must be converted, or mapped, to be meaningful to that destination.
Then there are the events that trigger data flows, and all the surrounding logic, notifications, security, and a host of other things. If you can capture all of this information as metadata, in reusable, separable "layers," you will have a highly flexible and "actionable" collection of metadata.
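As a minimal sketch of what separable "layers" could look like, the Python below models each kind of metadata as its own small immutable object and composes them into one set. Every class and field name here (SchemaLayer, LocationLayer, MappingLayer, TriggerLayer, MetadataSet) is hypothetical and chosen only for illustration; it is not the structure of any particular product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SchemaLayer:
    fields: tuple  # names of the elements in the data set


@dataclass(frozen=True)
class LocationLayer:
    system: str  # where the data lives (hypothetical system name)
    entity: str  # table, object, or endpoint inside that system


@dataclass(frozen=True)
class MappingLayer:
    source_to_master: tuple  # (source field, master field) pairs


@dataclass(frozen=True)
class TriggerLayer:
    event: str     # event that starts a data flow
    notify: tuple  # who is told when the flow runs


@dataclass(frozen=True)
class MetadataSet:
    schema: SchemaLayer
    location: LocationLayer
    mapping: MappingLayer
    trigger: TriggerLayer


crm_customer = MetadataSet(
    schema=SchemaLayer(fields=("Name", "Phone")),
    location=LocationLayer(system="old_crm", entity="Accounts"),
    mapping=MappingLayer(source_to_master=(("Name", "name"), ("Phone", "phone"))),
    trigger=TriggerLayer(event="on_update", notify=("data-steward",)),
)

# Swap only the location layer; the other layers are reused untouched.
moved = replace(crm_customer, location=LocationLayer("new_crm", "Accounts"))
```

Because each layer is a separate object, relocating the data replaces one layer and leaves the schema, mapping, and trigger definitions exactly as they were.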
If you define a metadata Master, say, "Customer," for use corporate-wide, you will have several different sources in play to ensure that the various parts of the virtual "Customer" definition have the best information from the most appropriate sources. Part may come from your ERP, part from Salesforce.com, and another part from an Oracle database.
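To make the idea concrete, here is a small sketch of a virtual "Customer" whose fields are each drawn from a different source. The in-memory dictionaries stand in for the ERP, Salesforce.com, and Oracle systems, and all field names, keys, and values are invented for the example.

```python
# Hypothetical stand-ins for three source systems, keyed by customer id.
SOURCES = {
    "erp": {"C-100": {"CUST_NAME": "Acme Corp", "CREDIT_LIM": 50000}},
    "salesforce": {"C-100": {"Owner": "J. Smith", "Phone": "555-0100"}},
    "oracle": {"C-100": {"ship_addr": "1 Main St"}},
}

# The Master definition: each master field names its preferred
# source system and the field to read there.
CUSTOMER_MASTER = {
    "name": ("erp", "CUST_NAME"),
    "credit_limit": ("erp", "CREDIT_LIM"),
    "account_owner": ("salesforce", "Owner"),
    "phone": ("salesforce", "Phone"),
    "shipping_address": ("oracle", "ship_addr"),
}


def read_master(master, key, sources=SOURCES):
    """Assemble one virtual record by reading each field from its source."""
    record = {}
    for master_field, (system, source_field) in master.items():
        row = sources[system].get(key, {})
        record[master_field] = row.get(source_field)  # None if absent
    return record


customer = read_master(CUSTOMER_MASTER, "C-100")
```

The record is assembled at read time from the original sources, so nothing is copied into an intermediate store in this sketch.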
· Does your Master definition encapsulate everything you need to use the data?
· Can your metadata be pumped onto a message bus?
· Can it be packaged as a web service?
· As an ADO.NET object?
· As a SharePoint external content type?
· Does it incorporate the capabilities to perform CRUD (Create, Read, Update and Delete) operations at the
· If one of the source schemas changes, do you have to do anything to accommodate it?
· Do you even need to know a source changed?
If I'm a programmer, I want to leverage the corporate Master Data for my programs and the users of my programs. I can look up the data definitions, sources, and so on, and use them, but that still requires a lot of work. When the Master Data includes a full set of metadata, all I have to do is invoke the web service, the External Content Type in SharePoint, the ADO.NET object, and so on. I simply select the Master I need and indicate how I want to use it. I don't even need to know what the various sources are, and if a source changes, I won't need to make any changes, since the metadata will reflect what it needs to. And I can pass that selection process on to the end user of my application or dashboard.
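The sketch below illustrates that separation under simple assumptions: a hypothetical MasterCatalog holds the field mappings, and the application code only names the Master it wants. When a source field is renamed, only the catalog entry changes. The class, its methods, and the sample data are all invented for illustration.

```python
class MasterCatalog:
    """Hypothetical registry mapping Master fields to source fields."""

    def __init__(self):
        self._mappings = {}  # master name -> {master field: (system, field)}
        self._readers = {}   # system name -> function(key) returning a row

    def register_source(self, system, reader):
        self._readers[system] = reader

    def define(self, master, mapping):
        self._mappings[master] = dict(mapping)

    def remap(self, master, master_field, system, source_field):
        # A metadata-only change: consumers of the Master are not touched.
        self._mappings[master][master_field] = (system, source_field)

    def fetch(self, master, key):
        result = {}
        for master_field, (system, source_field) in self._mappings[master].items():
            result[master_field] = self._readers[system](key).get(source_field)
        return result


# Invented sample data standing in for an ERP table.
erp_rows = {"C-100": {"CUST_NAME": "Acme Corp"}}

catalog = MasterCatalog()
catalog.register_source("erp", lambda key: erp_rows.get(key, {}))
catalog.define("Customer", {"name": ("erp", "CUST_NAME")})


def greeting(key):
    # Application code: it knows the Master, not the sources behind it.
    return "Hello, " + catalog.fetch("Customer", key)["name"]


before = greeting("C-100")

# The source renames its column; only the metadata is updated.
erp_rows["C-100"] = {"CUSTOMER_NAME": "Acme Corp"}
catalog.remap("Customer", "name", "erp", "CUSTOMER_NAME")

after = greeting("C-100")
```

The greeting function is identical before and after the rename, which is the point of the argument: the change is absorbed in the metadata rather than in every program that uses the Master.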
The diagram above shows the scope of metadata captured for MDM by Agile Integration Software. The metadata is generated from a GUI and has an atomic structure so that a change to any metadata can be made without impacting the whole hierarchy of metadata. Using this type of metadata infrastructure, changes are absorbed without creating waves. Data is accessed directly from the original source, eliminating the need for a costly data warehouse to resolve virtual relationships across sources.