10/17/2005

Datapower: VC Lessons

IBM announced today that it was acquiring Datapower.  I’ve written another post on why I think this announcement is significant from an industry perspective, but given that I was an investor in Datapower, I thought I would also write a post about some of the venture capital aspects of the deal.

I invested in Datapower in early 2002 when the company had 6 employees and was based in a mouse infested former auto-body shop located between two housing projects.  Datapower was founded by Eugene Kuznetsov, a brilliant MIT engineer, who saw the promise and the challenges of XML messaging early on.

Like all venture deals, I learned a lot from my Datapower experience, but here are a few of the most important things I learned:

  1. Local presence matters.  I live and work on the west coast.   Datapower is in Boston.  When I first wanted to invest in Datapower my partners’ first reaction was “it’s too far away, you need a local partner”.  They were right.  I spent the next few months trying to find just the right partner.  Luckily Jeff Fagnan, who was then at Seed Capital (a fund I knew well) had already been looking at the space and quickly decided that he would like to join us in the investment.  Jeff proved to be an invaluable co-investor and ultimately got stuck with much of the day to day investment management chores that I could not effectively do.   It was an important lesson for me on the critical importance of having high quality local co-investors if you do a deal “out of market”.  Incidentally, Jeff left Seed early this year to become a partner at Atlas and his first investment at Atlas just happened to be in Datapower.  I suspect everyone at Atlas is happy with the IRR on that investment!
  2. Sometimes VCs should keep their mouths shut.  Just after Datapower had launched its first product, a performance oriented appliance, Eugene lobbied for the company to accelerate the launch of a second security oriented product that had been planned for a quarter or two in the future.  At the time, I remember cautioning Eugene on the potential distractions and costs of having two immature products in the market at the same time.  Eugene lobbied hard to take the risk and thankfully he won the day.  I say thankfully because not only did the company land a $300K order that quarter for the security product, but it was able to establish significant mindshare in the security space well ahead of its competitors.  To this day the security space continues to have the most robust market demand and competitors that failed to quickly launch a security product suffered in the market.  The lesson for me in this was that VCs have to be careful not to micro-manage product development in a rapidly emerging market because demand can move very quickly and in unexpected ways.
  3. Shotgun Weddings Don’t Work.  Early on in the company’s life we were trying to recruit another local investor into the deal.  That investor had an entrepreneur-in-residence (EIR) who helped them with due diligence and really liked the deal.  The new investor made recruiting an interim Chairman/CEO a condition of their investment and there was an implicit understanding that they would feel most comfortable with their own EIR taking that role.  The existing team was not 100% comfortable with the EIR but felt pressured to take him on in order to secure the funding.  As it turned out, the EIR was the wrong person for the job and tension started to develop between the existing team and the EIR to the point where it became a major distraction for the company.  Ultimately, the board ended up hiring a new CEO who turned out to be a much better fit, but we almost blew it by not taking action earlier.  The lesson for me as an investor is that you should never make hiring a specific person a condition of investing, as that dramatically raises the potential for conflict.  You are much better off investing in advance and helping the company recruit someone great that everyone is 100% confident in.
  4. VCs can indeed be very unethical.   Prior to raising his first significant round of venture financing, Eugene had raised a seed round from a few individuals and a couple of investment funds, one of which was a reasonably well known VC fund.  The partner at this fund had a strategy of sprinkling small seed investments around the Boston area and then trying to lead the first institutional rounds of any company that looked particularly promising.  In Datapower’s case, this partner invested a few hundred thousand dollars.  He also introduced Eugene to a technology executive affiliated with the fund who was in between jobs at the time and encouraged Eugene to involve the executive closely in the formulation of Datapower’s technology and market strategy.    Everything was ok until Eugene decided to raise his Series A financing.  At that point the VC fund submitted what was clearly a low-ball term sheet and pushed very hard to close it.   When Eugene objected to the terms and announced that he would try to generate some alternative offers to see if this was in fact “market”, he found that he couldn’t get any traction with other Boston based VCs, most of whom would either not meet with Eugene at all or told him that they would not do the deal without also including the original VC (at the terms they had proposed).   Now I don’t know if the original VC had an active campaign to try and discourage other investors from doing the deal, but they obviously knew that new investors would not want to do the deal without them (if the original investors don’t invest that is typically a big warning flag that something is wrong) and used that leverage to try and get a better deal.   While to this day I use this situation as a classic example of why entrepreneurs shouldn’t have a VC in their seed round, if that was all there was to it there wouldn’t be much to write about.
However after Eugene rejected their term sheet and instead ultimately accepted mine, the VC in question went ahead and not only invested in a competitor, but installed the same executive that they had installed at Datapower at their new investment.  Within months, this competitor began spouting very similar marketing messages and appeared to be executing against a carbon copy of Datapower’s product and market roadmap.  This brazenly unethical behavior by the VC fund was absolutely stunning and so egregious that it was almost a caricature of what you expect an “evil VC” to do.  To add insult to injury, when the Series A investors in Datapower approached this fund, politely pointed out the rather obvious conflicts, and requested that the fund sell its shares back to the company or to other investors, the fund refused.  Luckily Eugene got the last laugh though.  The competitor the original VC fund invested in was recently sold in a transaction that reportedly didn’t even return capital, handing many of its investors a substantial loss on their investment.  In contrast, Eugene is now a very deservedly wealthy man and all of his investors made a handsome return on their investments.  I guess good guys do sometimes win.

One last piece of trivia: I closed two investments on January 14, 2002.  (It’s highly unusual for a VC to close two investments the same day.)   The first was in a company called Cyanea and the second was in Datapower.  Both companies ended up being bought by IBM; Cyanea last summer and Datapower today.  While Cyanea generated a higher IRR, the difference in cash-on-cash return multiples between the two deals was less than 10%.  I have got to close two deals on the same day more often!
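The gap between IRR and cash-on-cash multiple comes down to holding period: two deals with the same multiple produce different IRRs if one exits sooner. The sketch below illustrates this for a single investment with a single payout. Every number in it is hypothetical (the actual returns are not disclosed here), the function name `annualized_irr` is mine, and the holding periods are rough guesses based on the dates mentioned above.

```python
# Hypothetical numbers throughout: actual returns on these deals are not public.

def annualized_irr(multiple: float, years: float) -> float:
    """IRR for one investment followed by one payout `years` later."""
    return multiple ** (1.0 / years) - 1.0


# Both deals closed on January 14, 2002. Holding periods below are rough
# guesses from "last summer" (mid-2004) and "today" (October 2005).
cyanea_years = 2.5
datapower_years = 3.75

# Illustrative only: assume both deals returned the same cash-on-cash multiple.
assumed_multiple = 5.0

cyanea_irr = annualized_irr(assumed_multiple, cyanea_years)
datapower_irr = annualized_irr(assumed_multiple, datapower_years)

print(f"Shorter hold (Cyanea-like):   {cyanea_irr:.1%}")
print(f"Longer hold (Datapower-like): {datapower_irr:.1%}")
```

With equal multiples, the deal that exits earlier always shows the higher IRR, which is consistent with Cyanea leading on IRR while the multiples were nearly the same.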

October 17, 2005 in Middleware, Software, Venture Capital

IBM Acquires Datapower, Software Will Never Be The Same

IBM announced today that it was acquiring Datapower, the pioneer of message aware networking.  As some may know, I invested in Datapower.  I’ve written another post on some of the venture capital aspects of the deal, but I thought I would also write this post about the higher level significance of the deal from an industry perspective, as I think it is pretty interesting for anyone involved in software.

From an industry perspective, IBM’s announcement is significant for a few reasons:

  1. It represents a very powerful endorsement of the long term promise of message aware networking.
    Message aware networking involves shifting the processing of software messages away from applications (and their associated middleware) into specialized hardware devices.   These devices dramatically improve the security, performance, and manageability of software messages.  As I have written before, message aware networking is one of the top trends in the software industry, but up until recently most of the major technology companies had yet to make a commitment to the space.  However in just the past few months a number of tech heavyweights have weighed in on the space.   First, Cisco announced its AON line of message aware network equipment and then Intel surprisingly announced that it was getting back into the space when it acquired one of Datapower’s smaller competitors, Sarvega.  IBM’s move now makes it the first major enterprise software vendor (and arguably the most influential one) to embrace the trend.  So in the space of just a few months, message aware networking has gone from the province of just a few enterprising start-ups to a major battle-zone between some of the tech industry’s biggest titans.  Much of this has to do with the growing realization that as software is broken into smaller and smaller pieces that are distributed further and further apart, the messages between these software pieces are becoming incredibly important.  In this environment “the message is becoming the software” to such an extent that the processing and handling of the messages is becoming as important as, if not more important than, the application itself.  IBM’s entry into the space, with its vast stable of enterprise customers and huge enterprise “stack”, will likely accelerate the adoption of message aware networking (and the Service Oriented Architectures that sit on top of it) and will put pressure on other software vendors to follow suit.
  2. It underscores the inevitable collision between enterprise software and enterprise networking vendors.
    Message aware networking sits in a supposed “no man’s land” in between enterprise software and networking. It looks a lot like networking because it requires high speed dedicated devices to process large numbers of standards-based messages, but it also looks a lot like software because it requires intelligent middleware to make content and context sensitive decisions.  Because message aware networking did not naturally fit into the networking space or the enterprise software space, the big guns in each space weren’t really sure what to do.  However with a potentially huge market at stake, neither side was prepared to concede the market to the other.  Ultimately, Cisco broke an uneasy truce and moved into the market with its AON products.  In this light IBM’s purchase of Datapower can be seen as a direct response to Cisco’s moves.  These moves and countermoves come despite the fact that Cisco and IBM are supposed to be the best of friends.  However, as I outlined in an earlier post, Cisco and IBM are destined to find themselves competing head-on much more frequently thanks in large part to the inexorable melding of the traditional networking world with the traditional middleware world.   Who knows, they might have even competed over Datapower.  This “battle of the stack” will likely be one of the most important enterprise computing stories of the next decade.
  3. It marks what is likely the beginning of a very aggressive push by IBM to develop a fully featured SOA “stack”.
    As a wise man once said “He who says A, must say B”.  In buying Datapower, IBM is making it clear that they intend to build a robust stack of message oriented products.   As relatively “dumb” yet critically important message processors, Datapower’s products will likely serve as the foundation for a wide array of message oriented products, which will mostly be grouped under the Service Oriented Architecture (SOA) label.  With the foundation in place, IBM will likely add products with other features such as SOA management and BPEL-based business process management.  Datapower’s acquisition is critical because it secures IBM’s rear flank from attack by the networking vendors and allows them to concentrate their full force on enterprise software related issues.

I admit, it’s a bit of a stretch to say that software will never be the same after IBM’s acquisition of Datapower, but I do think that the acquisition underscores the fact that some of the biggest names in technology now endorse the fundamental tenets of message oriented networking and that this promises to help spur long term changes in not just the architecture of software programs but in the competitive positioning of the technology industry.

October 17, 2005 in Middleware, Software, Venture Capital, Web Services

08/10/2005

SOA Under The Radar: Recap

Last night I served on a panel of VCs at IBD's "Under the Radar: SOA Death Match".  The event featured 4 companies with products that were either directly or indirectly focused on enabling Service Oriented Architectures (SOA).  Each company presented for 6 minutes, then the panel of VCs asked 6 minutes of questions.  At the end of the event, the VC panel picked a "best in show" and the audience picked their own "people's choice".

Perhaps what I found most interesting about the conference was that you could actually get 75 people into a room on a Tuesday evening to discuss Service Oriented Architectures.   Sure this is Silicon Valley and there are lots of tech geeks that are always up to discuss the latest and greatest technology trends, but I remember in 2001/2002 when the mere mention of XML, SOAP, etc. brought puzzled stares from many in Silicon Valley.   I think it just shows that the whole concept of XML and SOA has reached mainstream acceptance, at least within technology circles, and really is destined to become an important and long term part of the technology fabric.

In case you are interested, here's an overview of the 4 companies that presented:

Appistry:  Appistry was a bit of a mis-match for the conference in that they are more of an application virtualization play than an SOA play.  I actually like the application virtualization space quite a bit, although many of the big players have already made acquisitions in the space so the amount of opportunity remaining for start-ups is limited.   That said, Appistry seemed to have a very solid product and several good reference customers.  They were a bit of a sentimental favorite for me given that the CEO was a Wash U grad and they are located in Wash U's hometown of St. Louis (not exactly the tech start-up capital), but they clearly were at a disadvantage in the competition because SOA wasn't really their sweet spot.  I suspect they knew this and were really just looking to get some valley exposure for their business/fund-raising efforts, so they should have gotten an award for entrepreneurial pluck.

Blue Titan:  Blue Titan's main product is a web services management platform that enables companies to provision, secure and manage lots of different web services.  Their main competitors are Amber Point and SOA Software (which was supposed to present at this conference but canceled at the last moment).   Blue Titan's founder and CTO presented and he was probably the most engaging presenter of the evening.  Conceptually I like the web services management space a lot.  I actually funded a company in early 2001 to go after this space (Maaya), but I was *way* too early and I was lucky just to get my money back.  These days it looks as though the space is finally getting some traction, but the sales process is complicated by the fundamental architecture issues that come along with embracing SOA which means it's a technical sale that requires multiple sign-offs.  In one of the more humorous outcomes of the evening, Blue Titan actually won the "people's choice" award but finished last in the VC panel's voting.  I think we VCs were concerned with the difficult sales cycle that Blue Titan faces while the audience was more focused on the visionary nature of the product.  Blue Titan's CTO took the difference in stride and said that the vote just proved his belief that potential customers appreciated his business much better than potential VC investors.

Ipedo:   Ipedo is focused on Enterprise Information Integration (EII) which I like to call data abstraction.  They aren't really focused on SOA per se, but their technology is arguably critical to the enablement of SOAs.  Ipedo competes primarily with other start-ups, most notably Composite Software and MetaMatrix.   I like this space a lot and actually came very close to investing in the first round of Composite Software (which I still believe is the best company in the space) but wasn't able to get my partners over the goal line.  I believe Ipedo started out as more of an XML-database play, but they quickly (and correctly) realized that a more generalized EII platform had more long term promise.  One of the most interesting things the CEO mentioned in his presentation was that Ipedo had an office in Shanghai, that their Chinese operations were profitable on a stand-alone basis, and that they were seeing strong demand for their EII solutions in China.  Given that EII solutions are just now being adopted by many US corporations I would not have suspected that there was demand in China, but I think it just goes to show how quickly the software market is developing over there.  As it turns out, Ipedo ended up winning the VC panel award.  I think this had to do with the fact that Ipedo seemed to be addressing a more pragmatic and immediate business need (data integration) than SOAs, so in some senses it really isn't fair as that's really comparing apples and oranges.

Reactivity:  Reactivity is a Message Aware Networking company that sells a "software appliance" focused primarily on securing XML messages as they transit a company's network.   I funded one of their direct competitors, Datapower, so I am very familiar with the space.  XML appliances aren't theoretically required to build an SOA, but they provide a much more secure, reliable and manageable foundation for SOAs.  Reactivity has traditionally been focused almost exclusively on the security side of the equation (many refer to their product as an XML firewall).  To their credit this has really turned out to be the near term sweet spot of the market, however I think Reactivity's early focus has allowed some of their competitors to pigeon hole them as only security focused which may hurt Reactivity as customers begin to look for broader XML message platforms.   The big news in this space has been Cisco's recent announcement of its AON initiative which I think will likely force other big networking and software players to seriously consider buying some of the start-ups in the space.  I asked the CEO about Cisco and he gave a very honest, straightforward and mature response about Cisco's efforts which was very refreshing to hear from a start-up CEO.  Ultimately I think both Datapower and Reactivity will do well, as the space is growing quickly and strategically important to a number of companies.

August 10, 2005 in EAI, Middleware, Web Services

07/06/2005

AON: Why the IBM and Cisco Relationship Is Headed For A Break-Up

Cisco’s long awaited announcement of its Applications Oriented Networking (AON) products a couple weeks ago foreshadows a coming battle that may rip apart the cozy and long standing strategic alliance between IBM and Cisco.  Cisco and IBM have somehow defied the odds over the last few years to maintain a high-profile strategic alliance, one which many people felt would be over faster than Britney Spears' 1st marriage.  The two companies have partnered together on a myriad of initiatives from data center management, to security, to storage networking, seemingly secure in the knowledge that neither firm had any intentions of getting into the other’s core business.  However with the launch of AON it is now crystal clear that Cisco has designs on at least a portion of IBM’s core business and that IBM must respond before one of its crown jewels, its infrastructure software portfolio, is rapidly commoditized.

Et Tu, Taf?
The primary architect of Cisco’s new AON strategy, Taf Anthias, is none other than the former head of IBM’s MQ Series middleware messaging platform.  What Taf has done at Cisco is to try and create networking devices that are not packet aware, but message aware.  As I have outlined before, message aware networking is one of the most important trends in software today.  The focus of message aware networking is to migrate basic tasks such as security, transformation, and message routing away from application servers and message brokers and into network devices.  This migration should not only theoretically increase performance and enhance flexibility but it should also create the foundation necessary to properly run complex, highly scaled, Service Oriented Architectures.
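To make the three tasks concrete (security, transformation, and message routing), here is a minimal software sketch of what a message aware device does conceptually. It is not how AON or any vendor's appliance is implemented. The message format, the routing table, the size limit, and the function names are all assumptions made for illustration, and a real device would use a hardened XML parser rather than Python's standard one.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: order type -> backend name.
ROUTES = {"equity": "equity-backend", "bond": "fixed-income-backend"}
MAX_MESSAGE_BYTES = 10_000  # arbitrary limit chosen for this sketch


def route_message(raw: bytes) -> str:
    """Return the backend for a message, or 'reject' if it fails a check."""
    # Security: refuse oversized payloads before parsing them.
    if len(raw) > MAX_MESSAGE_BYTES:
        return "reject"
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return "reject"  # malformed XML never reaches the application
    # Content-based routing: the decision depends on data inside the
    # message, which a packet-level device would not look at.
    if root.tag != "order":
        return "reject"
    return ROUTES.get(root.get("type", ""), "reject")


def transform(raw: bytes) -> bytes:
    """Transformation: rename a legacy <qty> element to <quantity>."""
    root = ET.fromstring(raw)
    for elem in list(root.iter("qty")):
        elem.tag = "quantity"
    return ET.tostring(root)


message = b'<order type="bond"><qty>5</qty></order>'
print(route_message(message))
print(transform(message))
```

The point of the sketch is that every decision requires parsing and understanding the message body, which is why this work looks like middleware even when it runs on a network device.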

Same Problem, Different Perspective
The problem for networking companies, such as Cisco, has been that message aware networking is not a natural fit.  While it looks a lot like networking in that it needs dedicated devices to process a large volume of standards based information quickly, it also looks a lot like application software in that it is message, not packet, based, and you therefore need to understand the context and content of the message in order to be able to process it.  Cisco recognized this conundrum some time ago and instead of trying to turn some of its “packet heads” into application engineers, it hired Taf.

Infrastructure software vendors, such as IBM, face the reverse problem: message aware networking looks a lot like middleware message processing, but it requires a level of performance, security, flexibility and even dedicated hardware that makes it look a lot more like networking.

An Uneasy Truce Leads To A Long War
For the last couple years, both the networking companies and the infrastructure software companies have recognized message aware networking as a dangerous, but potentially lucrative “demilitarized zone” that separated their two industries.  Up until now, both sides have been content to let a small cadre of start-ups fight it out as none of the big boys wanted to risk upsetting the global order by making a major move into the space.

However with the explosion of interest in Service Oriented Architectures and the rapid adoption of XML-based messaging, it was only a matter of time before one of the big players made a move.   Now that Cisco has taken the first shot with AON,  IBM, and other infrastructure software players such as BEA, HP, CA, and Microsoft, must respond or they risk ceding a significant portion of their “value add” to Cisco and other networking vendors.

While there are many battles left to fight in the war for control over message aware networking, the first casualty will likely be the previously cozy relationship between IBM and Cisco as it's hard to partner with someone that has clearly made a strategic decision to try to destroy part of your core business.

July 6, 2005 in Middleware, Security

04/10/2005

Super Services, Process Portals and the Road to Composite Applications

Publicly accessible web services seem to be proliferating like rabbits these days.   Not only are high profile early adopters such as Amazon.com, Ebay, Google and FedEx launching a plethora of new services, but an increasing number of more obscure firms are throwing their hats into the ring, offering everything from commodity futures prices to bible quotes.

Super Services
Theoretically, this large pool of publicly accessible web services should foster the creation of a new class of “super services”.  Super services simply combine several different web services into one master service.  They can be custom-designed to serve the needs of a specific company or be repackaged and offered to the public as yet another service.  In fact, there are already some interesting examples of enterprising developers stringing together a few web services to create rudimentary websites which themselves could be exposed as super services, such as this "mashup" of Amazon/Google/Yahoo, this mixing of Flickr and the US Government's zip code database, and this combination of Google Maps and Craigslist.

Unfortunately, creating a true super service is much harder than these early examples might suggest.  To create super services developers must not only link web services at a semantic and programmatic level but they must also find a way to successfully orchestrate a business process across these services in an orderly enough fashion that a basic level of performance and transactional integrity is maintained.   Luckily, emerging business process orchestration technologies, most prominently BPEL, provide a standardized mechanism for creating the process logic underpinning super services.   However, while adding BPEL to the mix has tremendous benefits it also makes the act of building super services even more complex and less accessible.
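The transactional integrity problem can be sketched in a few lines. The example below is plain Python, not BPEL, and every service in it is a hypothetical stand-in for a remote web service call. It shows one common approach, under the assumption that each step has a compensating action: run the steps in order and, if one fails, undo the completed steps in reverse.

```python
from typing import Callable, List, Tuple

# Each step pairs an action with a compensating action that undoes it.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_super_service(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}: compensated")
            return log
        completed.append(step)
        log.append(f"{name}: ok")
    return log


# Stand-ins for remote web service calls (all hypothetical).
def reserve_stock() -> None:
    pass


def release_stock() -> None:
    pass


def charge_card() -> None:
    raise RuntimeError("card declined")


def refund_card() -> None:
    pass


result = run_super_service([
    ("reserve_stock", reserve_stock, release_stock),
    ("charge_card", charge_card, refund_card),
])
print(result)
```

Even this toy version has to track what has completed and how to reverse it, which hints at why real orchestration across independent services is hard to make accessible.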

Process Portals
In recognition of both the increasing number of web services and the increasing complexity of linking them together, a new crop of start-ups has emerged including such companies as eSigma, Bindingpoint, Xmethods, and Strike Iron.  Initially these start-ups appear to have the rather mundane goal of creating directories of publicly available web services or even libraries of proprietary web services (such as Strike Iron and Xignite have done), but dig a bit deeper and you realize that their ambitions may extend much further.

Take eSigma for example.  I had the opportunity to chat with its founder, Troy Haaland, the other day.  As Troy explained, the simple portal-like interface of eSigma actually hides an increasingly complex infrastructure.  Right now, at the core of this infrastructure is a fully functioning UDDI directory.  All of the services you can browse via the portal are actually formally registered in the UDDI directory making them programmatically discoverable.   The goal is to link this directory core to a higher level process management capability via a BPEL-based visual authoring/scripting platform.  Not only would such a platform allow enterprising developers to easily create, and theoretically re-sell, their own super services, but more importantly it would allow enterprises to create composite applications that exist solely in the “cloud”.  Such “cloud based” composite applications could then be used as a back-bone of inter-enterprise applications.
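"Programmatically discoverable" simply means that software, not just a person browsing a portal, can ask the directory what services exist. The toy registry below illustrates the idea only. It is an in-memory stand-in, not the UDDI API and not eSigma's implementation, and the class names, categories, and endpoint URLs are all made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    category: str
    endpoint: str  # placeholder URL, not a real service


class ServiceDirectory:
    """In-memory stand-in for a service registry (not the UDDI API)."""

    def __init__(self) -> None:
        self._by_category: Dict[str, List[ServiceEntry]] = {}

    def register(self, entry: ServiceEntry) -> None:
        """Add a service under its category."""
        self._by_category.setdefault(entry.category, []).append(entry)

    def find(self, category: str) -> List[ServiceEntry]:
        """Return a copy of all services registered under a category."""
        return list(self._by_category.get(category, []))


directory = ServiceDirectory()
directory.register(ServiceEntry("QuoteFeed", "finance", "http://example.com/quotes"))
directory.register(ServiceEntry("ZipLookup", "geo", "http://example.com/zip"))

# A program, rather than a person, discovers what is available.
for service in directory.find("finance"):
    print(service.name, service.endpoint)
```

An orchestration layer would sit on top of lookups like `find`, selecting services at run time instead of hard-coding their addresses.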

In this way, what appear at first to be simple directories may ultimately be transformed into Process Portals, or sites that not only centralize web services meta-data, but host a set of custom-designed super-services and composite applications as well as the visual authoring tools needed to create them.

The Road Ahead
While this is clearly a long term vision, there are indications that elements of this vision may be closer at hand than one might imagine.  Within the enterprise, there are already a number of products, from companies such as Amberpoint, Blue Titan, and Digital Evolution vying to manage the low-level provisioning and performance of intra-enterprise web services.  As the number of web services multiplies within an enterprise, a directory infrastructure is a logical next step (indeed some products have already taken this step) and some kind of orchestration layer will also clearly be necessary if enterprises want to foster re-usability and enable the creation of super services.   In some ways then, the writing is on the wall: Process Portals are an inevitable result of the increasing number of web services.  The key questions outstanding then are: 1. Will these portals first make their presence felt inside the enterprise as packaged applications or outside the firewall as publicly accessible Process Portals?  2.  Will de novo start-ups be best positioned to own this space or will the pre-existing web services management products “grow” into this space? and 3. Just when exactly will this space generate enough revenue to make it interesting from an investment standpoint?

April 10, 2005 in EAI, Middleware, Software, Web Services

03/17/2005

Software's Top 10 2005 Trends: #1 XML

XML, eXtensible Markup Language, is everywhere.  It serves as the foundation for just about every data exchange and interface standard created in the past 5 years.  It is imbedded into the core of just about every application that has been built in the last few years.  And it is at the heart of almost every significant trend in the software industry from Service Oriented Architectures, to Message Aware Networking, to Composite Applications, to Data Abstraction.

For all its importance though, XML has always played second fiddle to HTML.  However 2005 may well be remembered as the year in which XML eclipses HTML in terms of overall importance to the web.   That’s because XML is now the de facto language of machine-to-machine interaction on the web and such interaction is exploding thanks to adoption of web services and the proliferation of web-capable devices.

In some respects XML is not very impressive.  On its face, it is a highly simplistic and very expensive way to represent data structures and interfaces.  However the last decade’s massive improvement in raw compute power has made XML a much less expensive technology and its simplicity has allowed legions of HTML programmers to easily graduate to the supposedly more complex world of data and service representations.
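One way to see why XML is "expensive" is to encode the same small record two ways and compare sizes. The sketch below uses a made-up stock quote record; the binary layout and the element names are assumptions chosen for illustration, and size is only one dimension of the cost (parsing time is another).

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: a stock quote with an integer id and a price.
quote_id, price = 42, 101.25

# Fixed binary layout: 4-byte int plus 8-byte double, little-endian.
binary = struct.pack("<id", quote_id, price)

# The same record as XML text.
root = ET.Element("quote")
ET.SubElement(root, "id").text = str(quote_id)
ET.SubElement(root, "price").text = str(price)
xml_bytes = ET.tostring(root)

print(len(binary), "bytes as packed binary")
print(len(xml_bytes), "bytes as XML")

# The XML form is self-describing: a reader needs no layout agreement.
parsed = ET.fromstring(xml_bytes)
print(parsed.find("price").text)
```

The XML version is several times larger, but any program can read it without knowing the byte layout in advance, which is the simplicity that made the trade worthwhile once compute became cheap.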

Now XML is evolving to the point where many new XML-based standards seek to embed within XML aspects previously only associated with compiled code, such as business logic and state.   In this way XML messages are now becoming free standing bits of code and integral parts of applications.  In essence, XML messages are becoming software.

For software VCs, XML does not present many direct investment opportunities, but rather colors almost every opportunity they look at.  The existence of a universal machine-to-machine interface and data standard has huge implications for everything from middleware, to databases, to applications.  Of course at some point there will be something better than XML created, there always is, and that may create a whole new set of investment opportunities, but until then software VCs that invest without a deep understanding of the context, benefits and drawbacks of XML are shooting in the dark.

For a complete list of Software's Top 10 2005 trends click here.

March 17, 2005 in Middleware, Software

03/14/2005

Software's Top 10 2005 Trends: #4 Service Oriented Architectures

Service Oriented Architectures (SOAs) are all the rage these days.  Almost every software vendor is putting out some kind of SOA-related marketing spin and trade magazines buzz with the pros and cons of various approaches to building and deploying SOAs.

Despite all the talk, very few companies have actually built and deployed a robust SOA yet.  That’s because building an SOA requires not just the creation of numerous new web services interfaces but often requires significant re-architecting of existing systems.

2005 should see some serious progress on the SOA front though.   Over the past two years, many companies have successfully laid the foundation for SOAs by building out a small portfolio of independent web services.  With this foundation in place, constructing a full fledged SOA is not only possible now, but increasingly necessary as companies seek a cohesive way to manage their web services portfolios.

While SOAs have been cast as a sort of high-level application integration panacea, they will create some serious problems of their own in areas such as performance, security, and manageability.  While a new class of SOA management platforms from companies such as Blue Titan, Amberpoint, and SOA Software has sprung up to meet these challenges, there’s a strong argument that such SOA management capabilities should be built directly into the application server platforms, which already have many robust low level management capabilities.  In addition to the app server players, other folks, such as the directory management and network players, are also staking a claim to the space as evidenced by Oblix's recent acquisition of Confluent, CA's acquisition of Adjoin, and HP's acquisition of Talking Blocks.  Which group ultimately comes out on top will be one of the more interesting stories to watch in the SOA space.

For a complete list of Software's Top 10 2005 trends click here.

March 14, 2005 in Middleware | Permalink | Comments (1) | TrackBack

03/10/2005

Software's Top 10 2005 Trends: #5 Message Aware Networking

Message Aware Networking sits in the midst of a kind of “no man’s land” in between networking and software.  It looks a lot like networking because it requires high speed dedicated devices to process large numbers of messages, but it also looks a lot like software because it requires intelligent middleware to make content and context sensitive decisions.

With the number of messages proliferating rapidly thanks to the rapid adoption of loosely coupled applications that utilize XML-based messaging standards, the need for message aware networking products is growing rapidly.  This need will only grow faster as companies begin to deploy composite applications and inter-enterprise applications.

Several start-ups have taken the lead in defining this space including Datapower, Sarvega, and Reactivity.  While I am admittedly heavily biased, I’d have to say that Datapower is the clear leader for now thanks to its head start and deep technology base, but there will likely be many twists and turns in this space in the next few years.

Right now the hottest part of the Message Aware Networking space is the “security gateway” space in which edge devices basically scan incoming XML messages to make sure that they are kosher from a variety of perspectives.  As the volume of messages increases, other aspects such as performance, routing, and management features will become increasingly important as well.
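To give a rough sense of what "making sure a message is kosher" can involve, here is a minimal sketch of a gateway-style screening step. The size limit, the list of allowed message types, and the function name are all assumptions made for illustration; it does not describe how any particular vendor's appliance works.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

MAX_BYTES = 64 * 1024                  # assumed size limit for one message
ALLOWED_ROOTS = {"order", "invoice"}   # assumed set of expected message types


def screen_message(raw: bytes) -> Tuple[bool, str]:
    """Return (accepted, reason) for an incoming XML message."""
    if len(raw) > MAX_BYTES:
        return False, "message too large"
    # Reject DTDs outright: a blunt guard against entity-expansion tricks.
    if b"<!DOCTYPE" in raw or b"<!ENTITY" in raw:
        return False, "DTD not allowed"
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return False, "not well-formed"
    if root.tag not in ALLOWED_ROOTS:
        return False, "unexpected message type"
    return True, "ok"
```

A real gateway would add schema validation, signature checks, and so on, and would have to do all of it at wire speed, which is where the dedicated hardware comes in.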

If it is true that the message is becoming the software, then over time message aware networking equipment will perhaps become as important as, if not more important than, application servers.  The high stakes involved raise the question as to which of the big elephants will make the first move to stake their claim to the space.  Given that the space doesn’t fall cleanly into either the networking camp or the software camp there’s an opportunity for either side to make a move.  On the networking side, you have Cisco and Juniper being the most likely candidates while on the software side you have IBM and Microsoft.  The interesting thing to consider is that IBM and Microsoft have both had very cordial relationships with Cisco to date, but if either of these companies were to make an aggressive move into message aware networking (which I believe they must do) one would suspect that their relationships with Cisco could quickly turn a bit chilly.

Whatever the case, 2005 should see increasing adoption and acceptance of message aware networking as well as at least some preliminary moves by the elephants, who can’t afford to sit idly by and watch this potentially huge space get claimed by someone else.

For a complete list of Software's Top 10 2005 trends click here.

March 10, 2005 in Middleware | Permalink | Comments (3) | TrackBack

03/07/2005

Software's Top 10 2005 Trends: #8 Composite Applications

Software applications are gradually getting pulled apart thanks to greater bandwidth, lower latency, open standards and generic interfaces.  Composite applications represent the logical end of this current evolution.

While the term “composite application” has rapidly become a kind of marketing catch-all term for any kind of next generation EAI or web service technology, the most straightforward definition of composite applications is that they are applications created by loosely coupling several different services and data stores via standardized message layers.  Theoretically, the component parts of a composite application can be mixed and matched, much like lego blocks, allowing developers to create a wide variety of applications with a relatively small set of services.

To some extent, composite applications attempt to finally fulfill the promise of re-useable components, but do so at a much higher “service” layer.  Given all of the failed promises of the component “revolution”, there is a lot of justified skepticism that composite applications will be able to fulfill much of their hype, however web-based composite applications clearly have significant potential.

The key question then is whether composite applications will be limited to a small set of “super services”, ones that simply combine and/or repurpose a set of high level web services, or if composite application architectures will be practical for highly granular enterprise applications that must incorporate a wide variety of highly specialized services.

2005 should see increasing awareness and adoption of the concept and potential of composite applications, however actual implementation should lag as companies have yet to implement many of the web services and messaging infrastructure improvements that must be in place before composite applications can become a reality.

For a complete list of Software's Top 10 2005 trends click here.

March 7, 2005 in Middleware | Permalink | Comments (3) | TrackBack

12/22/2004

REST vs. SOAP: Which SOA Is More Popular?

While much of the hype around Service Oriented Architectures (SOA) revolves around SOAP and its request/response custom-API based brethren, REST-based SOAs are quietly, but quickly, permeating the web to such an extent that one wonders whether SOAP-based SOAs have already been permanently eclipsed.

Ironically, part of the power of REST-based SOAs is that very few people realize they are using them.  The stealth nature of REST-based SOAs has much to do with the fact that they typically only use simple URLs and small snippets of HTML to accomplish their mission whereas SOAP-based SOAs use a complex series of standards, IDEs, and custom APIs.  Indeed many people who “program” with REST-based SOAs don’t even know they are in fact programming a web service.  To most of them, they are simply cutting and pasting some text into a web page.

One of the most basic examples of a REST-based SOA is Amazon’s affiliate network (or for that matter any e-commerce affiliate network).  Amazon’s affiliate partners simply paste a small snippet of code into their website and then become part of a distributed web service in which they display Amazon’s goods and Amazon then pays commissions on any sales that occur as the result of their referrals.  This service, now with tens of thousands of participants, is all done by leveraging the existing web infrastructure.  Amazon does offer more complex SOAP-based web services, but these services have been adopted by only a small set of their affiliates due to their complexity.
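The difference in effort is easier to see side by side. The sketch below builds the same hypothetical product referral two ways: as a plain URL of the kind an affiliate pastes into a page, and as a SOAP request body. The host `shop.example.com`, the parameter names, and the `ItemLookup` operation are invented for illustration and are not Amazon's actual interfaces; only the SOAP 1.1 envelope namespace is real.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape


def rest_referral_link(product_id: str, affiliate_id: str) -> str:
    """REST style: everything the service needs fits in one URL."""
    query = urlencode({"product": product_id, "affiliate": affiliate_id})
    return f"https://shop.example.com/item?{query}"


def soap_lookup_request(product_id: str, affiliate_id: str) -> str:
    """SOAP style: the same two values wrapped in an envelope and body."""
    return (
        '<?xml version="1.0"?>'
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        '<ItemLookup xmlns="urn:example:shop">'
        f"<ProductId>{escape(product_id)}</ProductId>"
        f"<AffiliateId>{escape(affiliate_id)}</AffiliateId>"
        "</ItemLookup>"
        "</soap:Body>"
        "</soap:Envelope>"
    )
```

The first function produces something a non-programmer can paste into an HTML link. The second produces a payload that still needs an HTTP POST, a service description, and usually generated client code before it does anything.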

Another great example of a REST-based SOA is Google’s AdSense network. To the right of this article you can see some Google Ads that (hopefully) are relevant to the context of this article. By simply adding about 10 lines of HTML code to this site, not only have these context sensitive ads been placed there, but this site has been linked into a highly complex distributed paid placement ad network.   In fact, so far this month, I have earned over $43 just by placing these 10 lines of code on my web site.  To be fair, Google does use a fair amount of JavaScript to actually enable the service, but from the endpoint’s perspective (mine) the service only requires a small snippet of code that requires no programming knowledge to use.

It is this ease of use and simplicity that sets REST-based SOAs apart from their more complex and admittedly robust SOAP-based cousins. As a result of this relative simplicity, REST-based SOAs are bound to see much wider adoption throughout the web than SOAP-based SOAs.

December 22, 2004 in Middleware | Permalink | Comments (6) | TrackBack

10/05/2004

The Message Is The Software

Back in the 1990’s Sun Microsystems famously coined the term “the network is the computer” in an effort to illustrate that distributed computing, enabled by networks, was destined to triumph over monolithic CPUs. A similar and perhaps more important revolution is now underway in the software world. In this revolution, monolithic compiled binaries are rapidly being replaced by fragments of distributed code held together by increasingly robust message systems. Far from simply relaying information, these message systems are rapidly evolving into “stateful” clouds. As these vast, intelligent, clouds evolve they are becoming, in many ways, the heart and soul of modern software.

Falling to Pieces
The fragmentation of software binaries is a well-worn trend that started with some of the early component models and accelerated dramatically thanks to the adoption of the J2EE and .Net component models. With the advent of Web Services and Service Oriented Architectures (SOAs) such fragmentation has accelerated once again. As software fragments and distributes, the role of messaging systems increases in importance for it is these messaging systems that provide the virtual “glue” necessary to hold a distributed application together.

You’ve Come a Long Way Baby
Describing message systems as merely “glue” might have been appropriate back in the days of EDI, but today’s messaging systems are far more complex and capable than their predecessors. Increasingly these systems are not just intermediaries, but an integral and inextricable part of business processes.

The foundation of the modern message system is XML. This simple, yet powerful standard has leveraged its web-based heritage to become the de-facto foundation of almost every major message standard proposed and/or adopted in the last several years.

XML’s power lies not just in its accessibility and ubiquity but in its flexibility and extensibility. Despite its advanced capabilities, early implementations of XML-based message standards tended to simply replicate existing EDI/ANSI standards and thus treat messages as mere “data Sherpas” limited to hauling structured data back and forth between applications. However, as the true power of XML has become apparent, next-generation XML-based message standards have tried to incorporate some more advanced capabilities.

The Rise of “Stateful” Messages
Two of the most powerful types of next generation XML-based message standards are “transgenic” and “stateful” standards. Transgenic XML standards encapsulate text-based code fragments within an XML message. These code fragments can then be uploaded into binaries during run time. For example, it’s possible to map XML elements to Java objects and then upload those elements, via a parser, directly into a Java runtime environment.

This is at once both an incredibly powerful and an incredibly scary capability. It is powerful because it allows text-based messages to modify run-time code making it possible to do “on-the-fly” updates of user interfaces, business logic, or what have you. It also allows business logic to “travel” with data payloads which can ensure consistent execution (e.g. encapsulating the formula necessary to calculate a complex derivative within a message about that derivative). It is scary because it could potentially turn an innocuous looking XML message into the mother-of-all Trojan horses by enabling hackers to attack and change the business logic of programs while they are still running.
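The formula example can be sketched in a few lines, and the sketch shows both sides of the coin. Below, a message carries its own pricing formula alongside its inputs, and the receiver evaluates it with a deliberately tiny interpreter that only permits arithmetic. The element names (`input`, `formula`) and the message layout are assumptions for illustration, not part of any real standard.

```python
import ast
import operator
import xml.etree.ElementTree as ET

# Only these four arithmetic operators are allowed in a message formula.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def _evaluate(node, variables):
    """Evaluate a parsed expression, refusing anything but simple arithmetic."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left, variables),
                                   _evaluate(node.right, variables))
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and node.id in variables:
        return variables[node.id]
    raise ValueError("construct not allowed in message formula")


def price_from_message(xml_text):
    """Compute a value using the formula that travels inside the message."""
    root = ET.fromstring(xml_text)
    variables = {v.get("name"): float(v.text) for v in root.iter("input")}
    tree = ast.parse(root.findtext("formula"), mode="eval")
    return _evaluate(tree.body, variables)


message = """<quote>
  <input name="notional">1000</input>
  <input name="rate">0.05</input>
  <formula>notional * rate / 2</formula>
</quote>"""
print(price_from_message(message))
```

The whitelist is the whole point. Swap `_evaluate` for a general-purpose `eval` and the same message format becomes the Trojan horse described above, since a formula could then call anything the runtime can reach.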

Given the risks of transgenic XML, most next-generation XML standards are avoiding such capabilities and instead focusing on “stateful” standards. Stateful XML standards provide mechanisms for embedding/amending not just the state of a particular operation into a message but often the business logic necessary to complete that operation, and even the larger business process context of that operation. By embedding state and business logic within a message, these standards create a truly “decoupled” and asynchronous software environment in which the message truly becomes the central focus of a software system.
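A minimal sketch of the stateful idea, under heavy assumptions: the message below lists the steps of its own process and records which ones are done, so any endpoint holding the message knows what to do next without consulting a central coordinator. The `step` elements, the `status` attribute, and the handler names are hypothetical and are not taken from BPEL or any other standard.

```python
import xml.etree.ElementTree as ET


def advance(xml_text, handlers):
    """Run the next pending step named in the message and record it as done."""
    root = ET.fromstring(xml_text)
    pending = [s for s in root.iter("step") if s.get("status") == "pending"]
    if not pending:
        return xml_text                      # process already complete
    step = pending[0]
    handlers[step.get("name")](root)         # a handler may amend the message
    step.set("status", "done")
    return ET.tostring(root, encoding="unicode")


# Hypothetical handlers; each one writes its result back into the message.
handlers = {
    "approve": lambda root: root.set("approved_by", "credit-desk"),
    "ship": lambda root: ET.SubElement(root, "tracking").set("code", "T-1"),
}

order = ('<order id="42">'
         '<step name="approve" status="pending"/>'
         '<step name="ship" status="pending"/>'
         '</order>')
order = advance(order, handlers)   # could run on one server...
order = advance(order, handlers)   # ...and this on a completely different one
```

Because the state lives in the message rather than in either server, the two calls to `advance` need nothing in common beyond an understanding of the message format.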

BPEL in the Vanguard
One emerging example of an XML-based “stateful” message standard is Business Process Execution Language (BPEL). At one level, BPEL is simply a standard that defines how business partners interact on a particular business process. However, BPEL can also be used as a “stateful” standard in which the business process is both defined and managed by the message itself. As the BPEL spec itself says:

“It is also possible to use BPEL4WS to define an executable business process. The logic and state of the process determine the nature and sequence of the Web Service interactions conducted at each business partner, and thus the interaction protocols. While a BPEL4WS process definition is not required to be complete from a private implementation point of view, the language effectively defines a portable execution format for business processes that rely exclusively on Web Service resources and XML data. Moreover, such processes execute and interact with their partners in a consistent way regardless of the supporting platform or programming model used by the implementation of the hosting environment.”

While BPEL is obviously still in the early stages of becoming a “stateful” standard, it’s not hard to imagine later versions of the standard explicitly amending messages “on the fly” with state, data, and process information thus conferring on messages many of the same capabilities as compiled binaries.

The Intelligent Cloud
As messages begin to take on more of the capabilities and responsibilities traditionally assigned to compiled binaries, the supporting messaging infrastructure must necessarily become more secure and sophisticated. The combination of massive numbers of “stateful” messages with a sophisticated infrastructure effectively creates an intelligent messaging “cloud”. Inside this cloud, messages can be routed, modified, and secured with minimal endpoint interactions. Ultimately, interactions between messages and even the creation of new messages can be accomplished within this cloud all based on pre-defined “stateful” standards and without the need for pre-compiled business logic or processes.

Cloud of Opportunity
For venture investors, the emergence of this intelligent cloud and the migration towards messaging and away from compiled binaries offer a multitude of interesting investment opportunities. Clearly there will be increasing demand for intelligent message processing software and equipment. To that end, a nucleus of XML-aware networking equipment companies, such as Datapower and Reactivity, have already emerged as have some standards based message brokers (such as Collaxa which was recently purchased by Oracle). New companies are likely to emerge focused on brokering messages associated with emerging “stateful” standards and still others may find ways to acceptably secure and control “transgenic” messaging.

As these new companies emerge they will help cement the transition away from binary-centric software towards message-centric software and in doing so they will confirm what we can already see today: that the message is the software.

October 5, 2004 in EAI, Middleware | Permalink | Comments (3) | TrackBack

07/29/2004

Application Management Merger Mania

The game of musical chairs in the application management space just got a bit harder today as IBM announced that it had acquired one of the leading players in the space, Cyanea. The Cyanea acquisition is just the latest deal in the space which has seen Veritas buy Precise, Mercury buy Performant, and ASF buy Dirig. Only two independent companies of note are now left in the space, Wily and Altaworks.

Houston, We Have An Application Problem
Why all the interest in application management? Because, as many companies have unpleasantly discovered, getting distributed component-based applications to work correctly is a major pain in the neck. Unlike traditional mainframe applications, distributed applications are composed of many small pieces of software. What’s more, these pieces of software are often distributed across several servers. Throw in some web services and you can have an “application” that spans multiple computers in multiple locations. This may sound cool, but when something goes wrong in such a complex system, even if you are a rocket scientist it’s almost impossible to figure out what code is actually “broken”.

By monitoring the inner workings of applications, often down to the method and thread level, application management programs attempt not only to figure out what, if anything, is broken in a distributed application, but also to identify resource and performance problems before they end up taking an application down.
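Method-level monitoring is easiest to picture as a wrapper around each method that records how often it runs, how long it takes, and how often it fails. The sketch below does this with a Python decorator; the `STATS` table and the `monitored` name are made up for illustration and say nothing about how Cyanea, Wily, or anyone else actually instruments code.

```python
import functools
import time
from collections import defaultdict

# Per-method counters, keyed by function name.
STATS = defaultdict(lambda: {"calls": 0, "errors": 0, "seconds": 0.0})


def monitored(func):
    """Wrap a function so every call is counted and timed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        entry = STATS[func.__name__]
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            entry["errors"] += 1
            raise                      # the caller still sees the failure
        finally:
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return wrapper


@monitored
def lookup_customer(customer_id):
    # Stand-in for real work such as a database call.
    if customer_id < 0:
        raise ValueError("bad id")
    return {"id": customer_id}
```

A rising `seconds / calls` ratio for one method is the kind of early warning that lets an operator act before the application actually goes down.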

The Great Debate: Horizontal vs. Vertical
One major problem for application management software is that there are a lot of factors that can affect application performance outside of the application code itself. Even if the code is perfect, problems with other parts of the technology stack such as database resources, network performance, message brokers, etc. can still seriously affect application performance.

Given the inter-dependence of all these items, the holy grail of application management (and for that matter systems management in general) has always been to build a holistic map of all the hardware, software, and network resources associated with a particular application and to, somehow, build a management solution that can identify the actual root cause of any particular problem no matter where it lies in the stack.

Unfortunately, like most IT holy grails such as universal object libraries, consistent semantics, and stable Windows machines, the vision of a completely unified application management stack is a long way from reality.

In the interim, vendors have generally decided to focus on either horizontal or vertical management strategies. Horizontal strategies stress the importance of following a transaction “in-flight” as it flows throughout its life-cycle, no matter what platforms it may decide to travel on. To support this strategy, vendors must make their software compatible with as many application servers as possible including modern ones (such as J2EE and .NET) and legacy ones (CICS and IMS). The horizontal view is particularly important inside large companies with complex legacy systems as most of their new distributed applications must still interact regularly with legacy platforms.

Other vendors are pursuing a vertical strategy of trying to link together information from the database, network, and application layers in order to derive a view of all of the technology components that affect a particular application. This strategy is better suited to “self-contained” applications that don’t interact with legacy platforms.

The reality is that for most Global 2000 corporations, horizontal solutions are much more practical given the topographical and political realities in those organizations. Most of those companies still have lots of legacy applications and they generally have very complex IT infrastructures. This means that distributed applications not only have to play nice with other platforms, but managerial and budgetary control over IT resources is often widely dispersed throughout the organization. While it might be nice in theory to instrument all of the databases in a company with a particular application management platform, just try telling the database administrators that they are going to be forced to use the same tool as the application managers. In general, that’s just not going to happen.

Sayonara Cyanea
As it happens, Cyanea was pursuing a horizontal strategy. It had a unique “probe/repository” architecture that makes it easily extensible to multiple platforms and it was the first player in the space to support IBM’s venerable CICS and IMS mainframe “app servers”. Given this, plus IBM’s early investment and reseller relationship it’s really not surprising that IBM decided to bite the bullet and buy the rest of the company it didn’t already own.

For me personally, the acquisition wasn’t surprising because I was actually the first investor in Cyanea and had seen the IBM relationship grow in size and importance first hand. While it’s a bit bittersweet to see one of my promising investments swallowed up by Big Blue just as it is hitting its stride, I must admit that the ample return on investment provides me with more than a little comfort.

Wither Wily?
One of the big remaining questions following the Cyanea deal is what will become of Wily. Wily was the pioneer in the space and has always been the largest player (though Cyanea was rapidly catching up to them). It’s rumored that Wily turned down a $100M buy-out offer from Mercury in early 2003 before Mercury bought Performant for $22.5M (that may just be some good underground marketing on Wily’s part though).

On the one hand, Cyanea’s sale looks like good news for Wily. Not only does it leave Wily as the only substantive independent player in the space, but it removes a competitor that was increasingly beating it in competitive bake-offs.

On the other hand, with IBM buying Cyanea and integrating it more closely into its product lines, Wily will now face all-out competition from IBM, a platform that supposedly accounts for a majority of their sales. In addition, Veritas recently signed a wide ranging partnership with BEA making Precise the recommended application management solution for WebLogic. Thus with Cyanea at IBM and Precise at BEA, Wily faces the unappetizing prospect of having to face “in-house” competition for every WebSphere and WebLogic sale. In addition, Wily’s core product architecture, which relies on code “wrapping” to instrument it, is dated and not readily extensible outside of J2EE environments.

Despite this, some have suggested an IPO is imminent, but that does not seem likely until Wily figures out a growth story beyond J2EE. Fortunately, on the M&A front there are a few large players that have yet to make a major move in the space, most notably HP, Oracle, SAP, and Sun, so Wily may yet have an opportunity to make it to the altar on time. Whatever Wily decides to do they will have to do it quickly as they are now going from a situation of being top dog to underdog against some of the strongest software sales forces in the business, which is not an appealing prospect no matter how you look at it.

Many Miles Still To Travel
For the application management space in general, the consolidation of the independent players into the major platform players represents a logical and necessary industry evolution. While the industry is still a ways away from the holy grail of unified management it is making steady progress. Attention will now likely shift towards integrating a few other pieces of disparate infrastructure software, such as business activity management, dependency mapping tools and cluster management tools into the overall application management framework. While each step will take the industry closer to management nirvana, new technologies and corresponding challenges will undoubtedly emerge and thus push the goal further out into the future.

July 29, 2004 in Middleware, Network Management, Operations Management | Permalink | Comments (1) | TrackBack

07/13/2004

DIM: Hijacking IM for Data Transport

Move over teenagers, the heaviest users of instant messaging are about to become computers themselves. In the beginning, IM communication was strictly a human-to-human affair. A few years ago companies started sending alerts (and increasingly spam) via IM making it a computer-to-human affair. Now, with the advent of Data over Instant Messaging (DIM) technology, IM is rapidly set to become a computer-to-computer affair.

Why send data over IM? One reason is that IM infrastructures have solved a lot of tough technical problems such as firewall traversal, multi-protocol transformation, and real-time presence management. Sending messages over these networks allows applications to leverage the investments made to solve these tough problems. Another reason is that many companies already have IM “friendly” infrastructures which means that all the necessary firewall ports are open, the clients are already certified and installed, and operations infrastructure like logging, back-up, and even high-availability are already in place. Thus by using IM for computer-to-computer communication, developers are able to “hijack” all the valuable investment made in IM and use it for a purpose that its creators likely never intended.

Of course, DIM-based communications have many of the same drawbacks that human-to-human IM has. Because IM is a real-time “fire and forget” system, DIM lacks many of the hard-core transaction capabilities that most Enterprise Application Integration (EAI) solutions incorporate. Thus you wouldn’t want to rely on DIM for mission-critical transaction management. In fact, a full blown EAI system with a rich work flow capability, rules-based message management and semantic mapping capabilities is more capable and reliable than DIM for just about everything.

However, a full blown EAI system will also cost you millions and take at least 6 months to get up and running. With DIM, the infrastructure is already in place, so not only is the time to deploy radically accelerated, but the overall cost of the installation is also dramatically lower. In addition, because DIM is a relatively simple, lightweight technology it is comparatively easy to integrate into applications, especially desktop applications. DIM is just one of the low-end EAI technologies I have written about in the past that threaten to give the traditional “high-end” EAI vendors a run for their money.

To see a good example of DIM in action you need to look no further than Castbridge’s Data Messenger product. Castbridge just released the 2.0 Beta of their product and it is chock-full of DIM goodies. The Castbridge product essentially allows other applications to instant message each other both inside and outside the firewall. Most customers use the technology to link desktop applications together (such as linking two Excel spreadsheets over the Internet) but the platform itself can be integrated into just about any application or database out there.

Castbridge’s customers are putting the technology to use in some very innovative ways. For example, the Singapore Police Department is using Castbridge’s DIM technology as a way to quickly and easily share security information during major events (trade shows, parades, etc.). In the past, each agency had its own systems for collecting and reporting information on any activity (e.g. “Man arrested for chewing gum at entrance”) during a major event. While each agency had a representative in the overall command center, the only way to share information was by yelling across the room to a colleague. With Castbridge, each agency simply enters their data into a standard Excel spreadsheet. The Castbridge technology sends instant messages to all the other spreadsheets as soon as new data is entered effectively keeping everyone instantly up-to-date on the current security status and dramatically reducing the possibility for miscommunication. This problem is not unique. In fact, some F-16s almost shot down the Gov. of Kentucky’s plane over Washington DC recently because the FAA controllers had no easy way of notifying the Homeland Security Department and NORAD about that plane, so it sounds like the US government could use Castbridge’s solution as well.
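The spreadsheet scenario boils down to a simple pattern: every sheet that is online gets each new cell value the moment it is entered, and a sheet that is offline simply misses it. The in-memory sketch below illustrates that pattern only. The `Hub` and `Sheet` classes are invented stand-ins for an IM network and its clients and are not Castbridge's API.

```python
class Hub:
    """Stand-in for an IM network: relays to whoever is online right now."""

    def __init__(self):
        self.online = {}

    def join(self, sheet):
        self.online[sheet.name] = sheet

    def leave(self, name):
        self.online.pop(name, None)

    def send(self, sender, cell, value):
        # Fire and forget: no queueing or retry for peers that are offline.
        for name, sheet in self.online.items():
            if name != sender:
                sheet.receive(cell, value)


class Sheet:
    """Stand-in for a spreadsheet that messages its peers on every edit."""

    def __init__(self, name, hub):
        self.name, self.cells, self.hub = name, {}, hub
        hub.join(self)

    def enter(self, cell, value):
        self.cells[cell] = value
        self.hub.send(self.name, cell, value)

    def receive(self, cell, value):
        self.cells[cell] = value


hub = Hub()
police, fire = Sheet("police", hub), Sheet("fire", hub)
police.enter("A1", "Man arrested at entrance")   # fire sees this immediately
hub.leave("fire")
police.enter("A2", "Gate 3 closed")              # fire never receives this
```

The last two lines show the trade-off described earlier: the update sent while a peer is offline is simply lost, which is fine for a status board and unacceptable for a funds transfer.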

There are a myriad of other uses for DIM-like technology for everything from keeping sales forecasts up-to-date, to keeping inventory and financial information current. On Wall Street, where spreadsheets abound and real-time communication is paramount, use cases for this technology are rampant. Syndicate desks could create real-time distributed order books, while fixed income desks could give clients “live” lists of inventory and derivative traders could ensure that their pricing models instantaneously incorporate the latest data.

The strong potential for DIM on Wall Street is probably why one of the biggest vendors of traditional IM technology to Wall Street firms, IM Logic, recently announced its own DIM product called IM Linkage which is designed explicitly to help Wall Street firms leverage DIM.

As DIM starts to see wider adoption it will be interesting to see how the major IM networks respond. On the one hand they probably won’t take kindly to the idea of computers “hijacking” their networks to send data around the world (hard to monetize that kind of traffic) but on the other hand they may see DIM as a new revenue source where they can possibly take a cut of license sales in return for certifying DIM-apps on their networks.

However things evolve, you can be sure of one thing: DIM-based applications are here to stay and their impact will be felt by everyone from traditional EAI vendors to application owners, to IM networks. Let the data messaging games begin!

July 13, 2004 in EAI, Middleware | Permalink | Comments (2) | TrackBack

04/07/2004

The Data Abstraction Layer: Software Architecture’s Great Frontier

Abstraction has meaningfully permeated almost every layer of modern software architecture except for the data layer. This lack of abstraction has led to a myriad of data-related problems, perhaps the most important of which is significant data duplication and inconsistency throughout most large enterprises. Companies have generally responded to these problems by building elaborate and expensive enterprise application integration (EAI) infrastructures to try and synchronize data throughout an enterprise and/or cleanse it of inconsistencies. But these infrastructures simply perpetuate the status quo and do nothing to address the root cause of all this confusion: the lack of true abstraction at the data layer. Fortunately, the status quo may soon be changing thanks to a new generation of technologies designed to create a persistent “data abstraction layer” that sits between databases and applications. This Data Abstraction Layer could greatly reduce the need for costly EAI infrastructures while significantly increasing the productivity and flexibility of application development.

Too Many Damn Databases
In an ideal world, companies would have just one master database. However if you take a look inside any large company’s data center, you will quickly realize one thing: they have way too many damn databases. Why would companies have hundreds of databases when they know that having multiple databases is causing huge integration and consistency problems? Simply put, because they have hundreds of applications and these applications have been programmed in a way that pre-ordains each one has to have a separate database.

Why would application programmers pre-ordain that their applications must have dedicated databases? Because of the three S’s: speed, security and schemas. Each of these factors drives the need for dedicated databases in their own way:

1. Speed: Performance is a critically important facet of almost every application. Programmers often spend countless hours optimizing their code to ensure proper performance. However, one of the biggest potential performance bottlenecks for many applications is the database. Given this, programmers often insist on their own dedicated database (and often their own dedicated hardware) to ensure that the database can be optimized, in terms of caching, connections, etc., for their particular application.
2. Security: Keeping data secure, even inside the firewall, has always been of paramount importance to data owners. In addition, new privacy regulations, such as HIPAA, have made it critically important for companies to protect data from being used in ways that violate customer privacy. When choosing between creating a new database and risking a potential security or privacy issue, most architects will simply take the safe path and create their own database. Such access control measures have the additional benefit of enhancing performance as they generally limit database load.
3. Schemas: The database schema is essentially the embodiment of an application’s data model. Poorly designed schemas can create major performance problems and can greatly limit the flexibility of an application to add features. As a result, most application architects spend a significant amount of time optimizing schemas for each particular application. With each schema heavily optimized for a particular application, it is often impossible for applications to share schemas, which in turn makes it logical to give each application its own database.

Taken together, the three S’s effectively guarantee that the utopian vision of a single master database for all applications will remain a fantasy for some time. The reality is that the three S’s (not to mention pragmatic realities such as mergers & acquisitions and internal politics) virtually guarantee that large companies will continue to have hundreds if not thousands of separate databases.

This situation appears to leave most companies in a terrible quandary: while they’d like to reduce the number of databases they have in order to reduce their problems with inconsistent and duplicative data, the three S’s basically dictate that this is next to impossible.

Master Database = Major Headache
Unwilling to accept such a fate, in the 1990’s companies began to come up with “work arounds” to this problem. One of the most popular involved the establishment of “master databases” or databases “of record”. These uber databases typically contained some of the most commonly duplicated data, such as customer contact information. The idea was that these master databases would contain the sole “live” copy of this data. Every other database that had this information would simply subscribe to the master database. That way, if a record was updated in the master database, the updates would cascade down to all the subordinate databases. While not eliminating the duplication of data, master databases at least kept important data consistent.

The major drawback with this approach is that in order to ensure proper propagation of the updates it is usually necessary to install a complex EAI infrastructure as this infrastructure provides the publish & subscribe “bus” that links all of the master/servant databases together. However, in addition to being expensive and time consuming to install, EAI infrastructures must be constantly maintained because slight changes to schemas or access controls can often disrupt them.
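To make the fragility concrete, here is a minimal sketch of the master/subscriber pattern in Python. It is only an illustration: the class names, the field names and the in-memory "bus" are all assumptions, and a real EAI product does far more than this. The point it shows is that each subscriber carries its own mapping from the master's fields to its local columns, so a small schema change upstream breaks propagation downstream.

```python
class Replica:
    """A subordinate database that keeps its own copy under its own column names."""

    def __init__(self, field_map):
        self.field_map = field_map  # master field name -> local column name
        self.rows = {}

    def apply(self, record_id, record):
        # Raises KeyError if the master record lacks a field this replica expects.
        self.rows[record_id] = {
            local: record[master] for master, local in self.field_map.items()
        }


class MasterDatabase:
    """Holds the sole live copy and pushes every update to its subscribers."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []

    def subscribe(self, replica):
        self.subscribers.append(replica)

    def update(self, record_id, record):
        self.rows[record_id] = record
        for replica in self.subscribers:
            replica.apply(record_id, record)


# Hypothetical example: a billing system subscribes to customer contact data.
master = MasterDatabase()
billing = Replica({"name": "cust_name", "phone": "contact_phone"})
master.subscribe(billing)
master.update(42, {"name": "Acme Corp", "phone": "555-0100"})
```

If someone later renames the master's `phone` field to `telephone`, the next `update` call fails inside the billing replica with a `KeyError`, which is the kind of breakage that keeps EAI maintenance teams busy.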

Thus, many companies that turned to EAI to solve their data problems have unwittingly created an additional expensive albatross that they must spend significant amounts of time and money on just to maintain. The combination of these complex EAI infrastructures with the already fragmented database infrastructure has created what amounts to a Rube Goldberg-like IT architecture within many companies, which is incredibly expensive to maintain, troubleshoot, and expand. With so many interconnections and inter-dependencies, companies often find themselves reluctant to innovate as new technologies or applications might threaten the very delicate balance they have established in their existing infrastructure.

So the good news is that by using EAI it is possible to eliminate some data consistency problems, but the bad news is that the use of EAI often results in a complex and expensive infrastructure that can even reduce overall IT innovation. EAI’s fundamental failing is that rather than offering a truly innovative solution to the data problem, it simply “paves the cow path” by trying to incrementally enhance the existing flawed infrastructure.

The Way Out: Abstraction
In recognition of this fundamental failure, a large number of start-ups have been working on new technologies that might better solve these problems. While these start-ups are pursuing a variety of different technologies, a common theme that binds them is their embrace of “abstraction” as the key to solving data consistency and duplication problems.

Abstraction is one of the most basic principles of information technology and it underpins many of the advances in programming languages and technical architectures that have occurred in the past 20 years. One particular area in which abstraction has been applied with great success is in the definition of interfaces between the “layers” of an architecture. For example, by defining a standardized protocol (HTTP) and a standardized language (HTML), it has been possible to abstract much of the presentation layer from the application layer. This abstraction allows programmers working on the presentation layer to be blissfully unaware of and uncoordinated with the programmers working on the application layer. Even within the network layer, technologies such as DNS or NAT rely on simple but highly effective implementations of the principle of abstraction to drive dramatic improvements in the network infrastructure.

Despite all of its benefits, abstraction has not yet seen wide use in the data layer. In what looks like the dark ages compared to the presentation layer, programmers must often “hard code” to specific database schemas, data stores, and even network locations. They must also often use database-specific access control mechanisms and tokens.

This medieval behavior is primarily due to one of the Three S’s: speed. Generally speaking, the more abstract an architecture, the more processing cycles it requires. Given the premium that many architects place on database performance, they have been highly reluctant to employ any technologies which might compromise performance.

However, as Moore’s Law continues its steady advance, performance concerns are becoming less pronounced and as a result architects are increasingly willing to consider “expensive” technologies such as abstraction, especially if they can help address data consistency and duplication problems.

The Many Faces of Abstraction
How exactly can abstraction solve these problems? It solves them by applying the principles of abstraction in several key areas including:

1. Security Abstraction: To preserve security and speed, database access has traditionally been carefully regulated. Database administrators typically “hard code” access control provisions by tying them to specific applications and/or users. Using abstraction, access control can be centralized and managed in-between the data layer and the application layer. This mediated access control frees programmers and database administrators from having to worry about coordinating with each other. It also provides for centralized management of data privacy and security issues.
2. Schema Abstraction: Rather than having programmers hard code to database schemas associated with specific databases, abstraction technologies enable them to code to virtual schemas that sit between the application and database layers. These virtual schemas may map to multiple tables in multiple different databases but the application programmer remains blissfully unaware of the details. Some virtual schemas also theoretically have the advantage of being infinitely extensible, thereby allowing programmers to easily modify their data model without having to redo their database schemas.
3. Query/Update Abstraction: Once security and schemas have been abstracted it is possible to bring abstraction down to the level of individual queries and updates. Today queries and updates must be directed at specific databases and they must often have knowledge of how that data is stored and indexed within each database. Using abstraction to pre-process queries as they pass from the application layer to the data layer, it is possible for applications to generate federated or composite queries/updates. While applications view these composite queries/updates as a single request, they may in fact require multiple operations in multiple databases. For example, a single query to retrieve a list of a customer’s last 10 purchases may be broken down into 3 separate queries: one to a customer database, one to an orders database and one to a shipping database.
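The customer-purchases example above can be sketched in a few lines of Python. This is a toy under stated assumptions: the three in-memory SQLite databases, their table layouts and the `recent_purchases` function are all hypothetical, and a real federation engine would also handle query planning, pushdown and failure cases. It simply shows one logical request being answered by three physical queries.

```python
import sqlite3


def make_db(ddl, insert_sql, rows):
    """Create an in-memory SQLite database with one table and some sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Three physically separate databases (hypothetical schemas and data).
customers_db = make_db(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme Corp"), (2, "Globex")],
)
orders_db = make_db(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "item TEXT, placed_on TEXT)",
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(100, 1, "router", "2004-01-05"),
     (101, 2, "switch", "2004-01-09"),
     (102, 1, "firewall", "2004-02-11")],
)
shipping_db = make_db(
    "CREATE TABLE shipments (order_id INTEGER PRIMARY KEY, status TEXT)",
    "INSERT INTO shipments VALUES (?, ?)",
    [(100, "delivered"), (102, "in transit")],
)


def recent_purchases(customer_name, limit=10):
    """One logical query for the application, three physical queries underneath."""
    customer = customers_db.execute(
        "SELECT id FROM customers WHERE name = ?", (customer_name,)
    ).fetchone()
    if customer is None:
        return []
    orders = orders_db.execute(
        "SELECT id, item, placed_on FROM orders "
        "WHERE customer_id = ? ORDER BY placed_on DESC LIMIT ?",
        (customer[0], limit),
    ).fetchall()
    purchases = []
    for order_id, item, placed_on in orders:
        shipment = shipping_db.execute(
            "SELECT status FROM shipments WHERE order_id = ?", (order_id,)
        ).fetchone()
        purchases.append({
            "item": item,
            "placed_on": placed_on,
            "status": shipment[0] if shipment else "unknown",
        })
    return purchases
```

The application calls `recent_purchases("Acme Corp")` and never learns that customers, orders and shipments live in different places.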

The Data Abstraction Layer
With security, schemas and queries abstracted, what starts to develop is a true data abstraction layer. This layer sits between the data layer and the application layer and decouples them once and for all, freeing programmers from having to worry about the intimate details of databases and freeing database administrators from maintaining hundreds of bi-lateral relationships with individual applications.

With this layer fully in place, the need for complicated EAI infrastructures starts to decline dramatically. Rather than replicating master databases through elaborate “pumps” and “busses”, these master databases are simply allowed to stand on their own. Programmers creating new data models and schemas select existing data from a library of abstracted elements. Queries/updates are pre-processed at the data abstraction layer which determines access privileges and then federates the request across the appropriate databases.
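As a rough sketch of that pre-processing step, the Python below checks access privileges centrally and then works out which physical databases a request has to touch. Everything named here is an assumption made for illustration: the logical field names, the physical database, table and column names, and the `plan_read` function itself.

```python
# Hypothetical virtual schema: logical field -> (physical database, table, column).
VIRTUAL_SCHEMA = {
    "customer.name": ("crm_db", "cust_master", "full_nm"),
    "customer.email": ("crm_db", "cust_master", "email_addr"),
    "order.total": ("billing_db", "invoices", "amt_usd"),
}

# Hypothetical central access policy: application -> logical fields it may read.
ACCESS_POLICY = {
    "support_portal": {"customer.name", "customer.email"},
    "finance_report": {"customer.name", "order.total"},
}


def plan_read(app, fields):
    """Check privileges in one place, then group the requested fields by database."""
    allowed = ACCESS_POLICY.get(app, set())
    denied = [field for field in fields if field not in allowed]
    if denied:
        raise PermissionError(f"{app} may not read: {', '.join(denied)}")
    plan = {}
    for field in fields:
        database, table, column = VIRTUAL_SCHEMA[field]
        plan.setdefault(database, []).append(f"{table}.{column}")
    return plan
```

Calling `plan_read("finance_report", ["customer.name", "order.total"])` yields a plan that touches both `crm_db` and `billing_db`, while the same request from `support_portal` is refused before any database is contacted. Neither application needs to know the physical column names.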

Data Servers: Infrastructure’s Next Big Thing?
With so much work to be done at the data abstraction layer, the emergence of a whole new class of infrastructure software, called Data Servers, seems distinctly possible. Similar to the role application servers play in the application layer, Data Servers manage all of the abstractions and interfaces between the actual resources in the data layer and a set of generic APIs/standards for accessing them. In this way, the data servers virtually create the ever elusive “single master database”. From the programmer’s perspective this database appears to have a unified access control and schema design, but it still allows the actual data layer to be highly optimized in terms of resource allocation, physical partitioning and maintenance.

The promise is that with data servers in place, there will be little if any rationale for replicating data across an organization as all databases can be accessed from all applications. By reducing the need for replication, data servers will not only reduce the need for expensive EAI infrastructures but they will reduce the actual duplication of data. Reducing the duplication of data will naturally lead to reduced problems with data consistency.

Today however, the promise of data servers remains just that, a promise. There remain a number of very tough challenges to overcome before Data Servers truly can be the “one database to rule them all”. Just a couple of these challenges include:

1. Distributed Two-Phase Commits: Complex transactions are typically consummated via a “two phase commit” process that ensures the ACID properties of a transaction are not compromised. While simply querying or reading a database does not typically require a two phase commit, writing to one typically does. Data servers theoretically need to be able to break up a write into several smaller writes; in essence, they need to be able to distribute a transaction across multiple databases while still being able to ensure a two-phase commit. There is general agreement in the computer science world that, right now at least, it is almost impossible to consummate a distributed two-phase commit with absolute certainty. Some start-ups are developing “work arounds” that cheat by only guaranteeing a two-phase commit with one database and letting the others fend for themselves, but this kind of compromise will not be acceptable over the long term.
2. Semantic Schema Mapping: Having truly abstracted schemas that enable programmers to reuse existing data elements and map them into their new schemas sounds great in theory, but it is very difficult to pull off in the real world where two programmers can look at the same data and easily come up with totally different definitions for it. Past attempts at similar programming reuse and standardization, such as object libraries, have had very poor results. To ensure that data is not needlessly replicated, technology that incorporates semantic analysis as well as intelligent pattern recognition will be needed to ensure that programmers do not unwittingly create a second customer database simply because they were unaware that one already existed.
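For readers who have not seen a two-phase commit up close, here is a bare-bones Python sketch of the idea behind the first challenge. The `Participant` class and `two_phase_write` function are hypothetical stand-ins, not any vendor's protocol, and the sketch deliberately leaves out logging, timeouts and recovery.

```python
class Participant:
    """Stand-in for one database taking part in a distributed write."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # set False to simulate a "no" vote
        self.committed = {}
        self.pending = None

    def prepare(self, key, value):
        # Phase 1: stage the write and vote yes or no.
        if not self.can_prepare:
            return False
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2: make the staged write permanent.
        key, value = self.pending
        self.committed[key] = value
        self.pending = None

    def rollback(self):
        # Discard the staged write.
        self.pending = None


def two_phase_write(participants, key, value):
    """Return True if every participant committed, False if all were rolled back."""
    prepared = []
    for participant in participants:
        if participant.prepare(key, value):
            prepared.append(participant)
        else:
            for earlier in prepared:
                earlier.rollback()
            return False
    for participant in participants:
        # A crash between two of these commits is the hard case:
        # some databases hold the write and others do not.
        participant.commit()
    return True
```

The happy path and the "someone votes no" path are easy. The trouble lives in the second loop, where a failure partway through leaves the databases disagreeing, and that is where the real engineering effort goes.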

Despite these potential problems and many others, the race to build the data abstraction layer is definitely “on”. Led by a fleet of nimble start-ups, companies are moving quickly to develop different pieces of the data abstraction layer. For example, a whole class of companies such as Composite Software, Metamatrix, and Pantero are trying to build query/update federation engines that enable federated reads and writes to databases. On the schema abstraction front, not many companies have made dramatic progress but companies such as Contivo are trying to create meta-data management systems which ultimately seek to enable the semantic integration of data schemas, while XML database companies such as Ipedo and Mark Logic continue to push forward the concept of infinitely extensible schemas.

Ultimately, the creation of a true Data Server will require a mix of technologies from a variety of companies. The market opportunity for whichever company successfully assembles all of the different piece parts is likely to be enormous though, perhaps equal to or larger than the application server market. This large market opportunity combined with the continued data management pains of companies around the world suggests that the vision of the universal Data Server may become a reality sooner than many people think, which will teach us once again to never underestimate the power of abstraction.

April 7, 2004 in Database, EAI, Middleware | Permalink | Comments (6) | TrackBack

02/15/2004

"Low-end" EAI Is Where The Action's At

While most of the attention in the Enterprise Application Integration (EAI) space has been focused on the development of high-end features such as business activity monitoring and business process management, some of the most interesting innovations are actually occurring at the low-end of the EAI market.

High-End For a Reason
Historically there has been no such thing as “low-end” EAI. EAI, by its very nature, is a complex, costly and technically demanding space that generally involves integrating high-value, high-volume transaction systems. In such a demanding environment, failure is simply not an option. Thus, EAI software has typically been engineered, sold, and installed as a high-end product.

“High-end” is of course just another way of saying “very expensive” and EAI surely is that. The average EAI project supposedly costs $500,000 and that’s just to integrate two systems. Try to integrate multiple systems and you are soon easily talking about budgets in the millions of dollars.

Ideally, there would be a way to offer high-end EAI software at low prices, but unfortunately the economics simply don’t work. First off, the software engineering effort required to ensure a failsafe environment for high-volume transaction systems is non-trivial and therefore quite costly. Second, the infrastructure that vendors must build to sell, install, and service such high-end software is inherently expensive.

Thus, the very idea of low-end/low-priced EAI software was thought to be a pipedream and any vendor that was crazy enough to sell their software for $10,000 instead of $500,000 was thought to be on a fast path to going out of business.

A Volkswagen vs. A BMW
Despite the conventional wisdom that “low-end EAI software” is an uneconomic oxymoron, there are in fact an increasing number of start-ups quietly pursuing this space. These start-ups believe they will be successful not because they are trying to replicate high-end EAI at a lower price, but because they are creating a new “low-end” market by offering a different product to an entirely different, and potentially much larger, market.

To be specific, these low-end EAI vendors differ from their high-end compatriots in several important aspects:

1. Focused on Data Sharing vs. Transactions: High-end EAI vendors have traditionally been focused on building failsafe, ACID-compliant transaction systems that can handle a corporation’s most important and sensitive data. Low-end vendors do not even attempt to manage transactions; they simply enable basic data sharing between applications without guarantees, roll-backs or any other fancy features. Such software is much less robust than high-end offerings, but it’s also much less complicated and therefore easier to build and support.
2. User vs. Developer Centric: High-end EAI products are generally designed to be manipulated and administered by developers. They have extensive APIs, scripting languages and even visual development environments. Low-end EAI vendors are designing their products to be used by end-users or at worst, business analysts. By eliminating the need for skilled developers, the low-end software significantly reduces set-up and maintenance costs.
3. Hijacking vs. building: Most high-end EAI products come with their own extensive messaging infrastructures that have been painstakingly built by their developers. In contrast, low-end EAI vendors try to “hijack” or leverage existing infrastructures, such as the web or instant messaging, to support their products.
4. Indirect vs. Direct: Selling big expensive software is a difficult and complex task. That’s why hig