Monday, June 20, 2016

The Dirty "Little" Secrets of Legacy Integration


The more I learn about integration products on the market, the more astounded I become.  Fortune XXX companies buy politically “safe” products from IBM, Informatica, SAP, and Oracle and launch into tens of millions of dollars’ worth of services to implement them. Egad!   They’d be better off with holes in their collective heads!

Remember the children’s story, The Emperor’s New Clothes? Isn’t it time for someone to come out and tell the real story?

Shouldn’t an enterprise-grade integration solution simplify data integration instead of creating Rube Goldberg projects?  Does it really make sense to have to send personnel to intense training classes that take months?

Here are nine things that astound me about other enterprise-grade integration products, along with how Enterprise Enabler makes each of them easier, ultimately reducing time-to-value by 60% to 80%.

1.     Robust transformation engines are mostly non-existent. This means that anything beyond the simplest relationships and formulas must be hand coded. That’s a huge impediment to fast time-to-value. Enterprise Enabler has a powerful transformation engine that captures mapping, validation, federation, and business rules as metadata instructions through a single GUI without requiring programming. 

2.      Transformation engines cannot interact with the endpoints in their native state. This means each source needs its own upstream integration just to get its data into a compatible format before transformation. Enterprise Enabler's transformation engine receives data from multiple sources in their native formats, combines (federates) them live, and delivers results in any form, or by query.  

3.       Data federation is not applied to ETL, DV, or other modes directly from the sources. Each source is handled individually and posted to a staging area in memory or a database. Enterprise Enabler brings the data together logically "on-the-fly" without staging it in memory or anywhere else, and passes it through the transformation engine to the destination or the virtual model (see the sketch after this list). Sometimes, for performance, selected data is cached and refreshed as required. 
  
4.       Many, if not most, endpoints are accessed via some standard like ODBC as opposed to their native mode. This means that it is not possible to leverage special features of the native application, and it negates the possibility of being a strong player in IoT. Enterprise Enabler accesses each source in its native format, enabling it to leverage features specific to each endpoint at execution time. Because of its robust proprietary endpoint connectors, called AppComms, Enterprise Enabler easily incorporates legacy ERPs with electronic instrumentation in a single integration step.

5.   Data Virtualization does not support “Write-Back” to the sources (probably because of #4). Enterprise Enabler supports authorized, end-user-aware CRUD (Create, Read, Update, and Delete) write-back to the source or sources when data is changed or entered from the calling application or dashboard. 

6.     Implementing an integration solution is a matter of working with a number of mostly stand-alone, disconnected tools, each of which imposes its own rules for interaction. Enterprise Enabler is a single platform where everything is configured, tested, deployed, and monitored, with design-time validation and embedded C# and VB code editors and compilers for outlier situations. A developer or DBA never needs to leave the Integrated Development Environment. 

7.      Various data integration modes (e.g., ETL, EAI, DV, SOA) are delivered by separate tools and do not offer reusability across those modes. With Enterprise Enabler, all modes, including complex integration patterns, are configured within the same tool, leveraging and re-using metadata across modes. This means that an enterprise master virtual model can be easily re-purposed, with all the same logic, as an ETL. 

8.     Further, Enterprise Enabler has a data workflow engine that serves as a composite application builder, with full visibility into the active state and the process variables defined throughout the execution. 

9.   Finally, Enterprise Enabler's Integration Integrity Manager (IIM) monitors endpoints for schema changes at integration touchpoints. When a change is found, IIM traverses the metadata to determine the impact and notifies the owners of the affected objects.  
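
To make point 3 more concrete, here is a minimal sketch of what federating "on-the-fly" means, written in plain C# with LINQ rather than Enterprise Enabler's actual tooling; the record shapes and field names are invented for illustration. Two sources are read in their own shapes, aligned on a shared key in a single pass, and the combined, rule-checked result goes straight to the consumer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Invented record shapes standing in for two very different sources,
// e.g. an ERP order table and a live instrument feed.
record ErpOrder(string OrderId, string AssetTag, decimal Amount);
record SensorReading(string AssetTag, double Temperature);

static class FederationSketch
{
    static void Main()
    {
        // "Extract" each source in its own shape (stub data so the sketch runs).
        IEnumerable<ErpOrder> erp = new[]
        {
            new ErpOrder("PO-1001", "PUMP-7", 12500m),
            new ErpOrder("PO-1002", "PUMP-9", 8300m)
        };
        IEnumerable<SensorReading> plant = new[]
        {
            new SensorReading("PUMP-7", 71.2),
            new SensorReading("PUMP-9", 96.8)
        };

        // Federate on the fly: align the two sources on a shared key and apply a
        // simple validation rule in the same pass -- no staging table, no copy.
        var federated =
            from o in erp
            join r in plant on o.AssetTag equals r.AssetTag
            select new
            {
                o.OrderId,
                o.AssetTag,
                o.Amount,
                r.Temperature,
                Flag = r.Temperature > 90 ? "REVIEW" : "OK"
            };

        // Deliver the combined result directly to the consumer (here, the console).
        foreach (var row in federated)
            Console.WriteLine($"{row.OrderId} {row.AssetTag} {row.Amount} {row.Temperature} {row.Flag}");
    }
}
```

The thing to notice is what is missing: there is no landing zone between the sources and the consumer, and that is the difference between staging-based integration and live federation.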

In short, none of the legacy integration platforms can hold up to the demands for agility that are essential for maintaining competitive advantage.

Friday, April 22, 2016

Top 20 Questions you should ask when selecting a Logical Data Warehouse

If you are evaluating or researching a Logical Data Warehouse (LDW), it is likely that you are looking for a way to eliminate all the overhead and implementation time of your current Data Warehouse or to reduce the proliferation of databases and all the overhead associated with those. You may be looking to use it for one or more scenarios, for example:

  • Support Business Intelligence and Analytics with clean, fresh data consolidated live from multiple sources
  • Standardize usage of data throughout the enterprise with Data as a Service (DaaS)
  • Generate actionable and repeatable Master Data definitions
[Diagrams: Logical Data Warehouse vs. Classic Data Warehouse]

The following 20 questions will help you to make sure you won’t need to augment with additional tools once you are into your project.

Top 20 Questions to Ask When Selecting an LDW Platform


Basic:

  1. Is your LDW agile? Your requirements are constantly changing, and you need to be able to make changes in a matter of seconds or minutes.
  2. Can the LDW connect directly to every source you need? You don’t want to have to invent custom programming to force feed your LDW. That defeats the purpose.
  3. Are all the consumption methods you need supported? ODBC, JDBC, OData, SharePoint BCS, SOAP, REST, ADO.NET, and others (see the query sketch after the full list of questions).
  4. Can you configure in-memory and on-disk caching and refresh for selected sources that do not need to be refreshed at each query? This improves performance and alleviates load on the source system. In many situations, you really don’t need all of the source data updated on every query if it doesn’t change much or often. The best platforms will have full ETL capabilities.
Ease of Use and Configurability:

  1. Can you configure, test, deploy, and monitor from a single platform?
  2. Does it have design-time debugging? You don’t want to keep going in and out of the platform to test.
  3. Can you re-use all components, validations, rules, etc.?
  4. Is there numbered versioning on configurations, including who changed what and a time stamp?
  5. Is there auto packaging and self-hosting of all services?
Enterprise Readiness:

  1. Can you do data validation on source data?
  2. Is there a full transformation engine for complex business rules about combining data?
  3. Is it available both on-premises and as iPaaS?
  4. Is there an Execution audit trail?
  5. Can a virtual data model be a source to another virtual data model?
  6. Is there write-back to sources with transaction rollback?
  7. Is there end user authentication for full CRUD (Create, Read, Update, Delete) for “actionable” dashboards and such?
  8. Does it handle streaming data?
  9. Are there multiple points for performance tuning?
  10. Does the Platform have individual user logins specifying the permissions for each? For example, maybe a DBA can view and modify a transformation, but not create one.
  11. Is there definitive data lineage available including validation rules, federation rules, etc.?
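
As a rough illustration of the consumption question above (item 3 under “Basic”): if the LDW publishes its virtual models through ODBC, a report, script, or application consumes them like any ordinary database, never knowing whether the data behind the view is federated live or cached. The DSN and view name below are hypothetical, and this is generic ADO.NET/ODBC code, not any particular vendor’s API.

```csharp
using System;
using System.Data.Odbc;   // System.Data.Odbc package on .NET Core / .NET 5+

static class LdwConsumerSketch
{
    static void Main()
    {
        // Hypothetical DSN pointing at the LDW's ODBC endpoint; "CustomerMaster"
        // stands in for whichever virtual model the LDW publishes.
        const string connectionString = "DSN=MyLogicalDW;";

        using var connection = new OdbcConnection(connectionString);
        connection.Open();

        // ODBC uses positional '?' parameter markers.
        using var command = new OdbcCommand(
            "SELECT CustomerId, Region, TotalSales FROM CustomerMaster WHERE Region = ?",
            connection);
        command.Parameters.AddWithValue("@Region", "EMEA");

        using var reader = command.ExecuteReader();
        while (reader.Read())
            Console.WriteLine($"{reader["CustomerId"]}  {reader["Region"]}  {reader["TotalSales"]}");
    }
}
```

Whether the rows behind that view come live from three different systems or from a configured cache is invisible to this consumer, which is exactly the point of questions 3 and 4 above.
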
Want to learn more about the Logical Data Warehouse?  Download the whitepaper.

Friday, February 19, 2016

When Business Intelligence isn’t Intelligent Business (and what to do about it)


The great conference hall fell silent when the CFO stepped up to the front of the room.  He hadn’t called the whole IT team together since he had taken that position three years before.  The rumors and speculation of why he was doing this brought out everyone’s fears and excitement, not to mention creativity. There was even a rumor that he was going to leave the company and teach sky diving.  “There’s less risk in sky diving,” he often remarked.

“All of you know I’m the first person to encourage good Business Intelligence projects that can save us money or give us a competitive edge, but our track record with BI projects indicates that the cost simply isn’t justified in most cases. In fact, our track record is abysmal!”

The soon-to-be sky diver turned on the projector. “Here we are. This is the bottom line.”


“Last quarter we approved twelve projects, and only one has been completed… three months later. We can’t pretend any longer, and it’s my job to say this. These are all projects that theoretically could save us hundreds of thousands of dollars, but the data preparation is so complex and time-consuming that by the time we’re ready for the analytics, the business drivers have shifted, which means that the data requirements have changed. I’m seriously tempted to get us out of the BI business.

“I expect everyone in IT to put some thought into this. My door is open, so bring me a solution.”

After the meeting, Marvin-the-Millennial was already in the CFO’s office waiting for him.

“Hi, I’m Marvin. Fairly new here, but I have the answer. Remember Terri, the Data Warehouse architect, who… um… sort of disappeared a couple of months ago?” The CFO nodded. “Very smart but, I must say, a little odd.”

"Well, her legacy is a couple of huge data warehouses that, as you know, sir, constitute the official repositories for all BI and BA dashboards. If I may be so bold, I believe these are the root cause of the long implementation times.  Since Terri left, I have been working with Enterprise Enabler, which is a data virtualization technology.”

“And what might that be?” interrupted the CFO.

“Well, basically, instead of building ETL scripts and moving all the data into the Data Warehouse, you just grab the data from the original sources, live, with Enterprise Enabler resolving the BI queries on the fly and returning the data exactly the way it would if the data were physically stored in a database. You can avoid that whole classic data prep exercise and all the associated risk. On top of that, you can bring live data instead of data that is stale.”

“This whole story sounds like fiction to me. What if some of the data comes from systems like SAP instead of relational databases? That must require custom coding.”
“No – Out of the box.”
“Online data feeds?”
“Out of the box.”
“IOT, Big Data, flat files…”

“Same, same, same. Everything is configured from a single extensible platform, and the data relationships across these totally different sources are associated as easily as making a relationship across tables in a regular database. And by the way, these Virtual Data models can be used as Master Data definitions.

“You are not going to believe this, but yesterday I made a slide just like yours showing the data prep time with Enterprise Enabler.” Marvin unfolded his dog-eared slide on the table in front of the CFO.



“Come. I’ll show you what I’ve been working on. I’ve configured several virtual data models and set up various analytics dashboards using Tableau, Spotfire, and Power BI. Oh, and there’s another really cool thing! I can write back to the sources or take other actions from the dashboard based on decisions I conclude from playing around with the analytics.”

“Ok, Marvin, how are we going to get everyone trained on Enterprise Enabler?"
"They can start with www,stonebond.com"

"We are going to scrap any more Data Warehouse projects unless we need to capture the data for historic reasons. This will bring an important competitive advantage (unless our competitors discover Enterprise Enabler, too.)

“Guess sky diving will have to wait. Let’s get this show on the road!”

Friday, August 14, 2015

Some People Just Don't Get Data Virtualization


“But where’s the data?”  Terri-the-architect, as she was often called, had definitely been around the block, and had plenty of successes under her belt. She grew up in the halcyon days of the data warehouse, proudly touting star schemas and cubes to anyone who would listen. As it became drudgery, she carried the mantle as it got heavier and heavier. Poor Terri was clearly still buried in the heavy-duty design, extension, and redesign of the massive data warehouse model, and she was ETL-ing all over the place, which is always a messy proposition. She often longed to be back in the days when she was working with the very latest technologies.
Marvin-the-Millennial, who was tagging along, nudged Jerry-the-Gen-Xer, trying not to emit a guffaw. Both quickly busied themselves on their cell phones. When they recovered their composure, Marvin tried an explanation. “It’s a little like a hologram. It looks and acts like it’s there, but in reality it’s only an illusion.”

Terri was frantically scanning the network to see where the database was. “Better not be up there in the cloud! You know that’s sensitive data we’re working with. No, you’re better off loading it into the data warehouse.  I can arrange for a team to get that done for you. We’ll even expedite the project, so you could have it in, say, eight weeks.”
“Snicker, snicker.”

“Oh, good timing,” said Terri, “I was having my afternoon chocolate attack.” She stood up and walked all the way around her computer, even underneath it. “Ok, guys, where did you stash the data?” Clearly she was getting distraught. “Come on, is this some kind of a trick?"

“Hmm..Yes. Maybe Magic?” proposed Jerry. Marvin turned his back and madly double-thumbed his cell. His shoulders and head were shaking as he laughed silently.

“Ok. Here’s the scoop,” said Marvin. Marvin was a self-proclaimed data scientist, and most people would agree the title fits his expertise. He walked over to his cube and pulled up his latest analysis that he had set up in Spotfire. “Until a couple of weeks ago, the way we did this was that we had IT pull data from the data warehouse into a SQL database, and add the data from two or three other data sources. They set up something that ran every week to update all the data for me.”

“You’re talking about the ETL scripts that keep the data fresh,” Terri interrupted.  “But now there’s no database, and it looks like the ETL scripts aren’t anywhere either.” 

Marvin continued, “See this is data from SAP, Salesforce, Oracle, and even live data from the plant. I can make it sing and dance in Spotfire, without waiting a week to get new data. It’s always the latest and greatest!”  

Jerry added, “Yep, we bought this agile integration software called Enterprise Enabler® that does what’s called Data Virtualization.”

Terri interrupted Jerry before he could say any more. “Oh, so it IS in the cloud. You’ve virtualized the data warehouse into the cloud. Can’t do that. See what happens when I take a vacation? Everything goes caty-wampus!”

“Calm down, Terri. Let me finish. The data is NOT in the cloud at all. Enterprise Enabler grabs the data as it is needed, directly from the sources. No data warehouse or database needed. It aligns the data, resolves Spotfire’s queries, and returns the results essentially straight to the display. No copies and no data moving anywhere.”

“Well, my word!” Terri exclaimed. “I’ve never seen anything like this before. Must take a lot of programming to get that to work.”

"That’s another cool thing, Terri," exclaimed Jerry. "Stone Bond’s Enterprise Enabler is a single platform that you use to configure these “virtual models,” and it stores all of the configuration as metadata. Again, no data is stored, unless, of course, you need to cache part of the data for a bit so as not to bring SAP to its knees. That’s configurable, too and we did it in two weeks.”
Terri seemed confused. “No. But where’s the CODE? There HAS to be programming involved!”

Marvin and Jerry in unison, “Nope.”   Both exit stage left.

Terri sits down, exhausted. She knows she hasn’t kept up with the newest technologies, and she really misses the thrill of making successes out of them. She drifts off…

No one really knows what happened to Terri. She just disappeared. No one heard from her again.  But there were rumors of sightings late at night of a ghost-like lady with very white hair, madly searching the networks and mumbling something like, “Yoohoo! Code! Wheere aare yoooou? I’ll find you sooner or later.”
She awakes with a start. “Some people just don’t get it. No, Terri-the-architect is not going to disappear like that. Not me! So, where’s the documentation, so I can get started?”

Thursday, July 9, 2015

Agile ETL

It seems now that most enterprise architects are at least aware of the concept and meaning of data virtualization, something I, along with our team, had awaited for many years. It really is a fairly significant mind-shift to eliminate dependence on the idea of staging databases and data warehouses in order to make federated data available as needed. Until the term “data virtualization” was coined and analysts began spreading the word, we were a bit stuck, since even they could not understand, or perhaps articulate, what our integration platform, Enterprise Enabler®, actually does and how.

The interesting thing is that we started out applying the underlying concepts of live federation to ETL configurations. As far as I can tell, analysts haven’t grabbed onto this concept yet, but the power of this variation on DV makes it worth contemplating.  For now we’ll call this “Agile ETL,” or, I suppose it could be dubbed “Virtual ETL.” Yes, that sounds better. 

What is ETL? As everyone knows, it means:

Extract data from a source. Transform it to the destination format. Load it directly into a destination application or into a staging database or data warehouse. For each source, the process is the same. Then, when the data is needed elsewhere, the consuming application, dashboard, or other system queries the staging DB or Data Warehouse. So that’s EXTRACT. TRANSFORM. LOAD… three distinct steps for each source, generally involving considerable custom programming.
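
For contrast later on, here is a bare-bones sketch of that classic, one-source-at-a-time pattern. It is generic C# with an in-memory list standing in for the staging table purely so it runs stand-alone; the record shapes and names are invented, and in practice the load step would write to a staging database and the whole job would be repeated for every source.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Classic ETL, one source at a time. The staging "table" is an in-memory list
// purely so the sketch runs stand-alone; in practice it would be a staging
// database or warehouse table, and the whole job is repeated per source.
record SourceRow(string Id, string RawDate, string Amount);
record StagedRow(string Id, DateTime BookedOn, decimal Amount);

static class ClassicEtlSketch
{
    static readonly List<StagedRow> StagingTable = new();

    static void Main()
    {
        // EXTRACT: pull rows from source A in its own format.
        IEnumerable<SourceRow> extracted = new[]
        {
            new SourceRow("A-1", "2015-07-01", "100.50")
        };

        // TRANSFORM: reshape each row to the staging schema.
        var transformed = extracted.Select(r => new StagedRow(
            r.Id,
            DateTime.Parse(r.RawDate, CultureInfo.InvariantCulture),
            decimal.Parse(r.Amount, CultureInfo.InvariantCulture)));

        // LOAD: land the rows in the staging area, where consumers query them later.
        StagingTable.AddRange(transformed);

        Console.WriteLine($"Staged {StagingTable.Count} row(s); now repeat the whole job for source B, C, ...");
    }
}
```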

In our book, “transformation” and “federation” are always done together, so any time the “T” in ETL is performed, federation is also invoked if there are multiple sources involved. So Agile (or Virtual) ETL inherently involves one or more sources, and it is a point-to-point solution only as a special case, i.e., when there’s only one source.



The steps in Data Virtualization


First let’s look at Data Virtualization (DV). What is it that constitutes a DV? See the key elements in the diagram above:
  1. Data accessed live directly from sources
  2. ALL SOURCES are included (e.g., electronic devices, any application, data feeds, buses, Big Data, data lakes, cloud-based, on-premises, in the field, and, of course, all databases and relational or hierarchical formats, which by themselves constitute the total domain of other DV software products unless considerable programming is involved).
  3. Data is federated, transformed, and validated live as it comes from the sources. (This is not necessarily available without custom coding in competing products.)
  4. No data is stored en route, except where caching is applied for performance or to reduce the impact on source systems.
  5. The target entity or data model is defined up front, but can be easily modified.
  6. Each DV is packaged in one or many consumable and queryable formats.
  7. Each DV may include end-user awareness with full CRUD (Create, Read, Update, and Delete) functionality, honoring security permissions. This “write-back” capability has huge implications for simplifying synchronization and enabling users of consuming applications to actually take action on data (e.g., updating or correcting data). This is not a built-in capability for most DV platforms (see the sketch after this list).
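
Because write-back (item 7) is the least familiar of these elements, here is a rough sketch of the idea in generic C#. None of this is any vendor’s actual API; the record, the permission check, and the names are invented. The point is simply that the end user’s identity travels with the change and the update lands in the source system itself, not in a copy.

```csharp
using System;
using System.Collections.Generic;

// Invented source record and a toy permission check standing in for the real
// source system and its security model.
record Customer(string Id) { public string Email { get; set; } = ""; }

static class WriteBackSketch
{
    // The "source system": in reality this would be SAP, Salesforce, a database, etc.
    static readonly Dictionary<string, Customer> SourceSystem = new()
    {
        ["C-42"] = new Customer("C-42") { Email = "old@example.com" }
    };

    // The dashboard passes the end user's identity along with the change, and the
    // write-back layer enforces it before touching the source itself.
    static bool UpdateEmail(string userName, string customerId, string newEmail)
    {
        if (!UserMayUpdate(userName, customerId))
            return false;                              // CRUD honors the user's permissions

        SourceSystem[customerId].Email = newEmail;     // the "U" of CRUD, applied at the source
        return true;
    }

    static bool UserMayUpdate(string userName, string customerId) =>
        userName == "marvin";                          // stand-in for a real ACL lookup

    static void Main()
    {
        Console.WriteLine(UpdateEmail("marvin", "C-42", "new@example.com")); // True
        Console.WriteLine(SourceSystem["C-42"].Email);                       // new@example.com
    }
}
```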

Most people think of DV as only being for Business Intelligence and analytics. You can see that it is also a powerful tool for any on-demand use, such as portals, dashboards, an Enterprise Service Layer (ESL), and as a basis for Master Data Management (MDM).

Now compare the path of Agile ETL.


Who says that ETL has to be clunky, just because that’s the way it grew up? Who says it must be one-to-one? When we started out more than twelve years ago, our objective was to get data from lots of different sources, combine (federate) it in the process, and deliver the data, validated, however and whenever the destination application, data store, or electronic device required it.

Let’s look at what, in my book, constituted, and continues to define, Agile ETL. Hmmm…. Being a strong proponent of reusability, I’ll refer to the list above for brevity and clarity:
  1. Same as (1.) above. Data accessed live
  2. Same as (2.) above. ALL SOURCES
  3. See (3.) above. Data is federated, transformed, and validated live
  4. See (4.) above. No data is stored en route.
  5. Same list as SOURCES in (2.) above, but as ALL DESTINATIONS.
  6. Each Agile ETL is associated with one or more data workflows that include triggers, additional validations, business rules, multi-threaded logic, etc., essentially a configured composite application. Triggers are many, including web service calls, so the waters do get muddy (see the sketch after this list).
  7. Same as (7.) above. End-user awareness for authorization to execute ETLs and/or the associated workflow process.
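
Putting those pieces together, here is a rough end-to-end sketch of the Agile ETL shape, again in plain C# with invented names rather than the product’s actual configuration: something triggers a small workflow, two sources are extracted live, federated and filtered in one pass, and the rows are handed straight to the destination with nothing staged in between.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Invented shapes for two live sources and one destination.
record CrmAccount(string AccountId, string Name);
record ErpInvoice(string AccountId, decimal OpenBalance);
record DestinationRow(string AccountId, string Name, decimal OpenBalance);

static class AgileEtlSketch
{
    // Stub extractors standing in for live, native-format source access.
    static IEnumerable<CrmAccount> ExtractCrm() =>
        new[] { new CrmAccount("A1", "Acme"), new CrmAccount("A2", "Globex") };

    static IEnumerable<ErpInvoice> ExtractErp() =>
        new[] { new ErpInvoice("A1", 4200m), new ErpInvoice("A2", 0m) };

    // One pass: federate the sources, apply a filter rule, and hand each row
    // straight to the destination -- no staging store in between.
    static void RunAgileEtl(Action<DestinationRow> deliver)
    {
        var rows =
            from a in ExtractCrm()
            join i in ExtractErp() on a.AccountId equals i.AccountId
            where i.OpenBalance > 0                    // validation/filter business rule
            select new DestinationRow(a.AccountId, a.Name, i.OpenBalance);

        foreach (var row in rows)
            deliver(row);                              // destination in whatever form it needs
    }

    static void Main()
    {
        // The trigger could be a schedule, a file drop, or a web-service call;
        // here the "workflow" is simply invoked directly.
        RunAgileEtl(row => Console.WriteLine($"{row.AccountId} {row.Name} {row.OpenBalance}"));
    }
}
```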

Since we started out with Agile ETL, the DV became a matter of adding the packaging as services along with complex query capabilities.

Since Enterprise Enabler is a single secure platform where every element of configuration is stored as metadata, you can readily see that reusability becomes natural, and that added benefits abound, such as monitoring for change and analyzing its impact, tracing the actual data lineage, and MDM.

Even if you decide to continue using Data Warehouse architecture rather than going the DV route, isn’t it at least time to add agility to your ETL?




Wednesday, March 18, 2015

Agile Integration Software Rescues the Dead Side of Bimodal IT Architecture


I’m sure you’ve heard of the recent high profile discussions about how to bring much-needed innovation to mature companies that are carrying the comforting ballast of old fashioned infrastructure.  That infrastructure is the greatest impediment to agility and innovation (unless you count the people and culture that go along with it). I first heard about “bimodal” at a local Gartner program a few weeks ago and found the concept both thrilling and disturbing.
The idea of bimodal divorces the reliable and stable back office (Mode 1, “Core”) from all that is innovative (Mode 2, “Innovation”). This means that innovation is explicitly separate, with new, presumably agile, infrastructure to create new lines of business for generating new revenue streams, and to provide more contemporary modes of interacting with consumers and employees.
I’m a little concerned that a bi-modal declaration promotes an easy way out for Mode 1 laggards. Their management no longer has to worry about modernizing or even interacting with Mode 2. In approaching the problem this way, we continue to be enablers of the infrastructure and its management, who are addicted to their brittle, ancient technologies and methodologies and afraid to even try to wean themselves off them. I suspect that part of the reason we got to this point is their continued abdication of decisions and recommendations to vendors and consultants on the dead side of bimodal. With advancement generally limited to creative marketing and re-messaging of the same 20-year-old technologies and ideas, the name-brand consultants in enterprise IT nominally grab the buzz but only deliver it on the periphery.

I definitely agree that relying on Mode 1 (reliable, stable) IT is highly unlikely to bring significant innovation, and I also believe that the best way to get real innovation underway is to completely separate it out, with different people, skill sets, management, and objectives. But if we imagine how this will play out, there is likely to be a complete bifurcation where the innovative side is never able to leverage the back-office functions. They will inevitably invent their own (less reliable and less stable) versions of the back office. What happens then? Mode 1 eventually dies on the vine? We regress to pre-1980 basics? Business in general accepts worse performance on the back-end functions?

It’s probably obvious that my take is that BOTH modes should advance aggressively, though I do believe the innovative side should be unencumbered by the Old World. Mode 1 management should take this as a gauntlet to push hard to replace their integration infrastructure with Agile Integration Software, such as Stone Bond’s Enterprise Enabler®, which is a proven enterprise-ready framework that boosts agility, interacts with both Mode 1 and Mode 2 applications and data, and offers up to 90% reduction in time to production along with a huge reduction in tech debt. That is what is needed for companies to survive and enjoy a competitive advantage in the coming years.
You Mode 1 people do have a choice. You can continue as is and sit by waiting for your inevitable demise, or you can be the hero that solidly bridges Mode 2 and Mode 1. We are seeing this successfully implemented by forward-looking CIOs. So, find a leader and press GO!

Thursday, February 26, 2015

The Hyper-Converged Integration Platform

Frankly, I find it amazing that it has taken so long for the concept of convergence of integration to become a topic of discussion. In fact, it’s mind-boggling to me that almost all of the manifestations of integration functionality appeared on the scene as islands, with delineation only just now beginning to blur. ETL tools do ETL; EAI tools do transactions; SOA does web services; DV tools do data virtualization, and so on.

Stone Bond’s Enterprise Enabler® came on the scene ten years ago or so, as a platform with metadata shared across all “patterns,” rendering the classic integration patterns essentially moot. If someone stepped into data integration today, contemplating it as a general problem to be solved, they might identify these various patterns, but they would also quickly see that the patterns are not mutually exclusive. There is clearly more overlap in the demands across these patterns than the evolution of data integration tools would suggest.
The providers of integration tools were much too hasty in solving the problem, considering nothing beyond the particular integration style at hand. It’s reminiscent of the custom-programmed applications that are designed for a specific customer. Eventually it dawns on someone that this solves a problem for a large set of businesses. What happens? This nice, clean solution gets bells, whistles, and tweaks for the second customer and... Voila! It becomes a (usually lousy) “Product” that requires months of customization to implement. Now, think about how different the Product, or the integration tools, would have been had the initial design taken into consideration the superset of potential users and uses.

 
Whether you are physically moving data, packaging integrations as web services, or generating virtual data models for MDM or for querying, there are some critical elements that are necessary to have at the core of the integration platform.
Let’s go back to the idea of hyper-converged integration platform. It is only possible if the overall design takes into account the essence of shared functionality across the characteristics that will, or may, be needed in every pattern. Even if you don’t know what the patterns will be, you do know that the platform should always be able to, for example,

  • Access data and write data to any kind of endpoint - live
  • Federate and align that data across multiple sources, whether they are the same or totally different
  • Align and  transform to also ensure the data makes sense to the receiving endpoint, whether physical or virtual
  • Apply business logic
  • Validate and filter data
  • Manage various modes of security
  • Apply workflow, error handling, and notification
  • Package the integration in many different ways
  • Scale up and out
  • Reuse as much configuration as possible

A hyper-converged integration platform has all of these capabilities, and since it is a single platform, all of the objects configured are reusable and available for more universal value. For example, an ETL that brings five data sources together and posts to a destination (no staging) can also be reused as a Data Virtualization model for live querying on demand.
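
As a rough illustration of that reuse, in generic C# with invented names (not the product’s metadata format): if the mapping and filter logic is captured once as a definition instead of being baked into a job, the very same definition can drive a scheduled load and answer a live, on-demand query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record Product(string Sku, string Plant, decimal Cost);

static class ConvergedReuseSketch
{
    // The "metadata": one mapping/filter definition, declared exactly once.
    static readonly Func<IEnumerable<Product>, string, IEnumerable<Product>> Mapping =
        (source, plant) => source.Where(p => p.Plant == plant)
                                 .OrderBy(p => p.Sku);

    // Stub source standing in for live access to the real systems.
    static IEnumerable<Product> Source() => new[]
    {
        new Product("S-1", "HOU", 10m),
        new Product("S-2", "HOU", 12m),
        new Product("S-3", "NYC", 9m)
    };

    static void Main()
    {
        // Reuse 1: run the definition as an ETL, pushing results to a destination store.
        var destination = new List<Product>(Mapping(Source(), "HOU"));
        Console.WriteLine($"ETL loaded {destination.Count} rows");

        // Reuse 2: answer a live, on-demand query with the very same definition --
        // nothing loaded or staged.
        foreach (var p in Mapping(Source(), "NYC"))
            Console.WriteLine($"Live query: {p.Sku} {p.Cost}");
    }
}
```
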
Whatever mental picture you have of integration toolsets,  try thinking instead about an Enterprise Nervous System, with data flowing freely throughout the company exactly how and when it is needed.

Enterprise Enabler is a hyper-converged Integration platform, perhaps because the overall design came about from years of contemplating the essentials of integration as a whole. It’s easier to start out with a universal consolidated solution than to back into it from ten different, fully developed tools.
An integrated set of tools is highly unlikely to become a Converged Integration Platform, and it will forgo the powerful agility and elimination of tech debt that Enterprise Enabler can bring.