Tuesday, December 16, 2008
MapReduce-scale Analytics Change Business Intelligence Landscape as Enterprises Mine Ever-Expanding Data Sets
Sponsor: Greenplum.
Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, we present a sponsored podcast discussion on the architectural response to a significant and fast-growing class of new computing challenges. We will be discussing how Internet-scale data sets and Web-scale analytics have placed a different set of requirements on software infrastructure and data processing techniques.
Following the lead of such Web-scale innovators as Google, and through the leveraging of powerful performance characteristics of parallel computing on top of industry-standard hardware, we are now focusing on how MapReduce approaches are changing business intelligence (BI) and the data-management game.
More types of companies and organizations are seeking new inferences and insights across a variety of massive datasets -- some into the petabyte scale. How can all this data be sifted and analyzed quickly, and how can we deliver the results to an inclusive class of business-focused users?
We'll answer some of these questions and look deeply at how these new technologies will produce the payback from cloud computing and massive data mining and BI activities. We'll discover how the results can quickly reach the hands of more decision makers and strategists across more types of businesses.
While the challenge is great, the new value for managing these largest data sets effectively offers deep and powerful new tools for business and for social and economic progress.
To provide an in-depth look at how parallelism, modern data infrastructure, and MapReduce technologies come together, we welcome Tim O’Reilly, CEO and founder of O’Reilly Media, and a top influencer and thought leader in the blogosphere. Welcome, Tim.
Tim O’Reilly: Hi, thanks for having me.
Gardner: We're also joined by Jim Kobielus, senior analyst at Forrester Research. Thank you, Jim.
Jim Kobielus: Hi, Dana. Hi, everybody.
Gardner: Also, Scott Yara, president and co-founder at Greenplum. Welcome, Scott.
Scott Yara: Thank you.
Gardner: We're still dealing with oceans of data, even though we have harsh economic times. We see reduction in some industries, of course, but the amount of data and need for analytics across the Internet is still growing rapidly. BI has become a killer application over the past few years, and we're now extending that beyond enterprise-class computing into cloud-class computing.
I want to go to Jim Kobielus first. Jim, why has this taken place now? What is happening in the world that is simultaneously creating these huge data sets, but also making necessary even better analytics across more businesses?
Kobielus: Thanks, Dana. A number of things are happening or have been happening over the past several years, and the trend continues to grow. The data sets used for analytics are becoming ever more massive. It’s equivalent to Moore’s Law, in the sense that every several years, the size of the average data warehouse or data mart grows by an order of magnitude.
In the early 1990s or the mid 1990s, the average data warehouse was in gigabytes. Now, in the mid to late 2000s, it's in the terabytes. Pretty soon, in the next several years, the average data warehouse will be in the petabyte range. That’s at least a thousand times larger than the current middle-of-the-road data warehouse.
Why are data warehouses bulking up so rapidly? One key thing is that organizations, especially in tough times when they're trying to cut costs, continue to consolidate a lot of disparate data sets into fewer data centers, onto fewer servers, and into fewer data warehouses that become ever-more important for their BI and advanced analytics.
What we're seeing is that more data warehouses are becoming enterprise data warehouses and are becoming multi-domain and multi-subject. You used to have tactical data marts, one for your customer data, one for your product data, one for your finance data, and so forth. Now, the enterprise data warehouse is becoming the be all and end all -- one hub for all of those sets.
What that means is that you have a lot of data coming together that never needed to come together before. Also, the data warehouse is becoming more than a data warehouse. It's becoming a full-fledged content warehouse, not just structured relational data, but unstructured and semi-structured data -- from XML, from your enterprise content management (ECM) system, from the Web, from various formats, and so forth. It's coming together and converging into your warehouse environment. That’s like the bottom of the iceberg that’s coming up, you're seeing it now, and it's coming into your warehouse.
Also, because of the Web 2.0 world and social networking, a lot of the customer and market intelligence that you need is out there in blogs, RSS feeds, and various formats. Increasingly, that is the data that enterprises are trying to mine to look for customers, marketing opportunities, cross-sell opportunities, and clickstream analysis. That’s a massive amount of data that’s coming together in warehouses, and it's going to continue to grow in the foreseeable future.
Gardner: Let’s go to Tim O’Reilly. Tim, from your perspective, what has changed over the past 10 or 20 years that makes these datasets so important?
Long-term perspective
O'Reilly: If you look at what I would call Web 2.0 in a long-term historical perspective, in one sense it's a story about the evolution of computing.
In the first age of computing, business models were dominated by hardware. In the second age, they were dominated by software. What started to happen in the 1990s, underneath everybody’s nose, but not understood and seen, was the commodification of software via open industry standards. Open source started to create new business models around data, and, in particular, around network applications that built huge data sets through user participation. That’s the essence of what I call Web 2.0.
Look at Google. It's a BI company, based on massive data sets, where, first of all, they are spidering all the activity off of the Web, and that’s one layer. Then, they do this detailed analysis of the link structure of that Web, and that’s another layer. Then, they start saying, "Well, what else can we find?" They start looking at clickstream data. They start looking at browsing history, and where people go afterward. Think of all the data. Then, they deliver services against that.
That’s the essence of Web 2.0, building a massive data set, doing real-time analytics against it, and then figuring out what services you can deliver. What’s happening today is that movement is transferring from the consumer Web into business. People are starting to realize, "Oh, the companies that are doing better are better with their data."
A great example of that is Wal-Mart. You can think of Wal-Mart as a Web 2.0 company. They've got end-to-end analytics in the same way that Google does, except they're doing it with stuff. Somebody takes something off the shelf at Wal-Mart and rings it up. Wal-Mart knows, and it sends a signal downstream to the supplier.
We need to understand that this move to real-time understanding of data at massive scale is going to become more and more important as the lever of competitive advantage -- not just in computer businesses, but in all businesses. Data warehousing and analytics aren't just something that you do in the back office and it's a nice-to-have. It's the very essence of competitive advantage moving forward.
When we think about where this is going, we first have to understand that everybody is connected all the time via applications, and this is accelerating, for example, via mobile. The need for real-time analytics against massive data sets is universal.
Look at some of the things that are happening on the phone. Okay, where am I? What data is relevant to me right now, because you know where I am? Speech recognition is starting to come into focus on the phone. Again, it's a massive data problem, integrating not only speech recognition, but also local dialogs. Oh, wait, local again, you start to see some cross connections between data streams that will help you do better.
Even in the case of talking with someone from Nuance about why Google is able to do some interesting things in the particular domain of search and speech recognition, it’s because they're able to cross-correlate two different data sets -- the speech data set and the search data set. They say, "Okay, yeah, when somebody says that, they are most likely looking for this, because we know that. When they type, they also are most likely looking for that." So this idea of cross-correlation between data sets is starting to come up more and more.
This is a real frontier of competitive advantage. You look at the way that new technologies are being explored by startups. So many of the advantages are in data.
A great example is the company where I'm on the board. It's called Wesabe. They're a personal finance application. People upload their bank statements or give Wesabe information to upload their bank statements. Wesabe is able to do customer analytics for these guys, and say, "Oh, you spent so much on groceries." But, more than that, they're able to say, "The average person who shops at Safeway, spends this much. The average person who shops at Lucky spends this much in your area." Again, it's a massive data problem. That’s the heart of their application.
Now, you think the banks are going to get clued into this and they are going to start to say, "Well, what services can we offer?" Phone companies: "What services can we offer against our data?"
One thing that’s going to happen is the migration of all the BI competencies from the back office to the front office, from being something that you do and generate reports from, to something that you actually generate real-time services from. In order to do that, you've absolutely got to have high performance at massive scale.
Second, a lot of these data sets are not the old-fashioned data sets where it was simply structured data.
Gardner: Let’s go to Scott Yara. Scott, we need this transformation. We need this competitive differentiation and new, innovative business approaches by more real-time analytics across larger sets and more diverse sets of content and inference. What’s the approach on the solution side? What technologies are being brought to bear, and how can we start dealing with this at the time and scale that’s required?
A big shift
Yara: Sure. For Greenplum, one of the more interesting aspects of what’s going on is that big technology concepts and ideas that have really been around for two or three decades are being brought to bear, because of the big shift that Tim alludes to, and we are big believers. We're now entering this new cycle, where companies are going to be defined by their ability to capture and make use of the data and the user contributions that are coming from their customers and community. That is really being able to make parallel computing a reality.
We look at the other major computing trend today, and it’s a very mainstream thing like virtualization. Well, virtualization itself was born on the mainframe well over 30 years ago. So, why is virtualization today, in 2008, so important?
Well, it took this intersection of major trends. You had x86 and, as Tim mentioned, the commoditization of both hardware and software, and x86 and multi-core machines became incredibly cheap. At the same time, you had a high-level business trend, an industry trend. The rising cost of data centers and power became so significant that CIOs had to think about the efficiency of their data centers and their infrastructure and what could lower the cost of computing.
If you look at running applications on a much cheaper and much more efficient set of commodity systems and consolidating applications through virtualization, that would be a really compelling thing, and we've seen a multi-billion dollar industry born of that.
You're seeing the same thing here, because business is now driven by Web 2.0, by the success of Google, and by their own use and actions of the Web realizing how important data is to their own businesses. That’s become a very big driver, because it turns out that parallel computing, combined with commodity hardware, is a very disruptive platform for doing large-scale data analysis.
The fact is that you can take very, very cheap machines, as Google has shown -- off-the-shelf PCs -- and, with the right software, combine them into hundreds, thousands, and tens of thousands of systems to deliver analytics at a scale that people couldn’t reach before. It’s that confluence and that intersection of market factors that's actually making this whole thing possible.
While parallel computing has been around for 30 years, the timing has become such that it’s now having an opportunity to become really mainstream. Google has become a thought leader in how to do this, and there are a lot of companies creating technologies and models that are emblematic of that.
But, at the end of the day, the focus is in software that is purpose-built to provide parallelism out of the box. This allows companies to sift through huge amounts of data, whether structured or unstructured data. All the fault tolerance, all the parallelism, all those things that you need are done in software, so that you choose off-the-shelf hardware from HP, IBM, Dell, and white-box systems. That’s a model that's as disruptive a shift as client-server and symmetric multiprocessing (SMP) computing was on the mainframe.
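Editor's illustration: the MapReduce pattern discussed here can be sketched in a few lines of Python. This is only a minimal, single-process sketch, not Greenplum's or Google's implementation. The clickstream records, the `PARTITIONS` layout, and all function names are hypothetical, and a thread pool merely stands in for a cluster of commodity machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical clickstream records: (user, page). Each inner list stands in
# for the slice of data held by one commodity machine.
PARTITIONS = [
    [("ann", "/home"), ("bob", "/cart"), ("ann", "/cart")],
    [("cy", "/home"), ("bob", "/home")],
    [("ann", "/home"), ("cy", "/cart")],
]


def map_partition(records):
    """Map step: emit a (page, 1) pair for every click in one partition."""
    return [(page, 1) for _user, page in records]


def shuffle(mapped_partitions):
    """Shuffle step: group all emitted values by key across partitions."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(values):
    """Reduce step: combine the values collected for one key."""
    return sum(values)


def page_view_counts(partitions):
    # Threads stand in for separate machines; each map call is independent,
    # which is what lets the work spread across many cheap nodes.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_partition, partitions))
    return {key: reduce_group(vals) for key, vals in shuffle(mapped).items()}


counts = page_view_counts(PARTITIONS)
print(counts)
```

Because the map calls share no state, adding more partitions (and more machines to hold them) scales the map step without changing the code. A production system adds the fault tolerance and data distribution that Yara describes as being handled in software.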
Gardner: Jim Kobielus, speak to this point of moving the analytic results, the fruits of this impressive engine and architectural shift from the back office to the front office. This requires quite a shift in tools. We're not going to have those front-office folks writing long SQL queries. They're not going to study up on some of the traditional ways that we interact with data.
What’s in the offing for development, so developers can create applications that target this data now that’s in a format that we can get out and is cross-pollinated in huge data sets that are themselves diverse? What’s in store for app dev, and what’s in store for the people that are looking for a graphical way to get into the business strategist type of user?
Self-service paradigm
Kobielus: One thing we're seeing in the front-end app development is, to take Tim’s point even further, it’s very much becoming more of a Web 2.0 user-centric, self-service development paradigm for analytics.
Look at the ongoing evolution of the online analytical processing (OLAP) market, for example. Things that are going on in terms of user self service, development of data mining, advanced analytic applications within their browser, and within their spreadsheet. They can pull data from various warehouses and marts, and online transaction processing (OLTP) systems, but in a visual, intuitive paradigm.
That can catch a lot of that information in the front-end -- in other words, on the desktop or in the mobile device -- and allows the user to graphically build ever-richer reports and dashboards, and then be able to share that all out to the others in their teams. You can build a growing and collective analytical knowledge base that can be shared. That whole paradigm is coming to the fore.
At Forrester, we published a number of reports on it. Recently, Boris Evelson and I looked at the next generation of OLAP technology. One very important initiative to look at is what Microsoft is doing with Project Gemini. They're still working on that, but they demoed it a couple of months ago at their BI show.
The front office is the actual end user, and power users are the ones who are going to do the bulk of the BI and analytics application development in this new paradigm. This will mean that for the traditional high priesthood of data modelers and developers and data mining specialists, more and more of this development will be offloaded from them, so they can do more sophisticated statistical analysis, and so forth.
The front office will do the bulk of the development. The back office -- in other words, the traditional IT data-modeling professionals -- will be there. They'll be setting the policies and they'll be providing the tooling that the end users and the power users will use to build applications that are personalized to their needs.
So IT then will define the best practices, and they'll provide the tooling. They'll provide general coaching and governance around all of the user-centric development that will go on. That’s what’s going to happen.
It’s not just Microsoft. You can look at the OLAP tooling, more user-centric in-memory spreadsheet-centric approaches that IBM, Cognos, Oracle, and others are rolling out or have already rolled out in their product sets. This is where it’s all going.
Gardner: Tim O’Reilly, in the past, when we've opened up more technological power to more people, we've often encountered much greater innovation, unpredictably so. Should we expect some sort of a wisdom-of-crowd effect to come into play, when we take more of these data sets and analytic tools and make them available?
O'Reilly: There's a distinction between the wisdom of crowds and collective intelligence. The wisdom-of-crowds thesis, as expounded by Surowiecki, is that if you get a whole bunch of people independently, really independently, to weigh in on some subject, their average guess is better than any individual expert's. That’s really about a certain kind of quantitative stuff.
But, there's also a machine-learning approach in which you're not necessarily looking for the average, but you're finding different kinds of meaning in data. I think it’s important to distinguish those two.
Google realized that there was meaning in links that every other search engine of the day was throwing away. This was a way of harnessing collective intelligence, but it wasn’t just the wisdom of crowds. This was actually an insight into the structure of the data and the meaning that was hidden in it.
The breakthroughs are coming from the ability of people to discern meaning in data. That meaning sometimes is very difficult to extract, but the more data you have, the better you can be at it.
A great example of this recently is from the last election. Nate Silver, who ran 538.com, was uncannily accurate in calling the results of the election. The reason he was able to do that was that he looked at everybody’s polls, but didn’t just say, "Well, I'm just going to take the average of them." He used all kinds of deep thinking to understand, "Well, what’s the bias in this one? What’s the bias in that one?" And, he was able to develop an algorithm in which he weighted these things differently.
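Editor's illustration: the difference between a plain average of polls and a bias-adjusted, weighted average can be shown with a small worked example. This is not Nate Silver's actual algorithm, which the discussion does not describe. The poll numbers, bias estimates, and weights below are invented purely for illustration.

```python
# Each hypothetical poll: (candidate share in percent,
#                          estimated house bias in points,
#                          weight reflecting how much we trust the poll)
POLLS = [
    (52.0, 1.0, 3.0),
    (49.0, -1.0, 1.0),
    (51.0, 0.0, 2.0),
]


def simple_average(polls):
    """Treat every poll equally and ignore any known bias."""
    return sum(share for share, _bias, _weight in polls) / len(polls)


def weighted_adjusted_average(polls):
    """Subtract each poll's estimated bias, then weight by trust."""
    total_weight = sum(weight for _share, _bias, weight in polls)
    adjusted_sum = sum((share - bias) * weight for share, bias, weight in polls)
    return adjusted_sum / total_weight


simple = simple_average(POLLS)
weighted = weighted_adjusted_average(POLLS)
print(f"simple: {simple:.2f}  weighted and bias-adjusted: {weighted:.2f}")
```

The two numbers differ because the second calculation encodes knowledge about each source, which is the distinction O'Reilly draws between averaging a crowd and finding meaning in the structure of the data.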
Gardner: I suppose it’s important for us to take the ability to influence the algorithms that target these advanced data sets and put them into the hands of the people that are closer to the real business issues.
More tools are critical
O'Reilly: That’s absolutely true. Getting more tools for handling larger and more complex data sets, and in particular, being able to mix data sets, is critical.
One of the things that Nate did that nobody else did was that he took everybody’s polls and then created a meta-poll.
Another example is really interesting. You guys probably are familiar with the Netflix Challenge, where Netflix has put up a healthy sum of money to whoever can improve their recommendation algorithm by 10 percent. What’s interesting is that people seem to be stuck at about 8 percent, and they haven’t been able to get the last couple of percent.
It occurred to me in a conversation I was having last night that the breakthroughs will come, not by getting a better algorithm against the Netflix data set, but by understanding some other data set that, when mixed with the Netflix data set, will give better predicted results.
Again, that tells us something about the future of data mining and the future of business intelligence. It is larger, more complex, and more diverse data sets in which you are able to extract meaning in new ways.
One other thing. You were talking earlier about the democratization of these tools. One thing I don’t want to pass by is a comment that was made recently by Joe Hellerstein, who is a computer science professor at UC Berkeley. It was one of those real wake-up-and-smell-the-coffee moments. He said that at Berkeley, every freshman student in CS is now being taught Hadoop. SQL is an elective for seniors. You say, "Whoa, that is a fundamental change in our thinking."
That’s why I think what Greenplum is doing is really interesting, trying to marry the old BI world of SQL with the new business intelligence world of these loose, unstructured data sets that are often analyzed with a MapReduce kind of approach. Can we bring the best of these things together?
That fits with this idea of crossing data sets being one of the new competencies that people are going to have to get better at.
Kobielus: If I can butt in here just one moment, I want to tie into something that Tim just said, that I said a little bit earlier. One important thing is that when you add more data sets to, say, your analytic environment, it gives you the potential to see more cross-correlations among different entities or domains. So, that’s one of the value props for an all-encompassing or more multi-domain enterprise data warehouse.
Before, you had these subject-specific marts -- customer data here, product data there, finance data there -- and you didn’t have any easy way to cross-correlate them. When you bring them all together into a common repository, implementing common dimensions and hierarchies, and conforming with common metadata, it makes it a whole lot easier for the data miners, the power users, and the end users, to build the applications that can tie it all together.
There is the "aha" moment. "Aha, I didn’t realize all these hooked up in these various ways." You can extract more meaning by bringing it all together into a unified, enterprise data warehouse.
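Editor's illustration: the value of a common dimension can be sketched with two formerly separate "marts" that share a customer key. All table contents, field names, and the `segment` dimension are hypothetical. A real warehouse would do this with SQL joins over conformed dimensions rather than Python dictionaries.

```python
from collections import defaultdict

# Hypothetical shared customer dimension, keyed by customer_id.
CUSTOMERS = {
    1: {"name": "Acme", "segment": "retail"},
    2: {"name": "Birch", "segment": "retail"},
    3: {"name": "Corvid", "segment": "telecom"},
}

# Hypothetical finance mart: one row per order.
ORDERS = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.0},
    {"customer_id": 3, "amount": 200.0},
    {"customer_id": 1, "amount": 30.0},
]

# Hypothetical support mart: one row per ticket.
TICKETS = [
    {"customer_id": 1},
    {"customer_id": 1},
    {"customer_id": 3},
]


def summarize_by_segment(customers, orders, tickets):
    """Join both marts to the shared dimension and roll up by segment."""
    summary = defaultdict(lambda: {"revenue": 0.0, "tickets": 0})
    for order in orders:
        segment = customers[order["customer_id"]]["segment"]
        summary[segment]["revenue"] += order["amount"]
    for ticket in tickets:
        segment = customers[ticket["customer_id"]]["segment"]
        summary[segment]["tickets"] += 1
    return dict(summary)


by_segment = summarize_by_segment(CUSTOMERS, ORDERS, TICKETS)
print(by_segment)
```

Neither mart alone can relate revenue to support load. The shared `customer_id` key and `segment` attribute are what make the cross-correlation possible, which is the point about conforming to common dimensions and metadata.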
Gardner: To you, Scott Yara. There's a great emphasis here on bringing together different data sets from disparate sources, with entirely different technologies underlying them. It's not a trivial problem. It’s not a matter of scale necessarily.
What do you see as the potential? What is Greenplum working on to allow folks to mix and match in such a way that the analytics can be innovative and game-changing in a harsh economic environment?
Price/performance improvement
Yara: A couple of things. One, I definitely agree with the assertion that analysis gets easier the more data you have. Whether those are heterogeneous data sets or just the scale of data that people can collect, it's fundamentally easier, cheaper.
In general, these businesses are pretty smart. The executives, analysts, or people that are driving business know that their data is valuable and that insight in improving customer experience through data is key. It’s just really hard and expensive, and that has made it prohibitive for a long, long time.
Now, we're talking about using parallel computing techniques, open-source software, and commodity hardware. It’s literally a 10- to 100-fold improvement in price performance. When the cost of data analysis comes down 10 to 100 times, that’s when new things become possible.
O'Reilly: Absolutely.
Yara: We see lots of customers now from the New York Stock Exchange. These are all businesses that are across vertical industries, but are all affected by the Web and network computing at some level.
Algorithmic trading is driving financial services in a way that we haven’t seen before. They're processing billions of trades every day. Whether it's security, surveillance, or real-time support that they need to provide to very large trading companies, that ability to mine and sift through billions of transactions on a real-time basis is acute.
We were sitting down with one of our large telecom customers yesterday, and there was this convergence that Tim’s talking about. You've got companies with very large mobile carrier businesses. They're broadband service providers, fixed-line service providers, and Internet companies.
Today, the kind of basic personalization that companies like Amazon, eBay, or Google do, telecom carriers are just at the beginning of trying to do that. They have to aggregate the consumer event stream from all these disparate communication systems, and it’s at massive scale.
Greenplum is solely focused on making that happen and mixing the modalities of data, as Tim suggested. Whether it’s unstructured data, whether those are things that exist in legacy databases, or whether you want to mix and match SQL or MapReduce, fundamentally you need to make it easy for businesses to do those things. That’s starting to happen.
Gardner: I suppose part of the new environment that we are in economically is that incremental change is probably not going to cut it. We need to find new forms of revenue and be able to attain them at a very low cost, upfront if possible, and be transformative in how we can take our businesses out through the public networks to reach more customers and give them more value.
Now that we've established that we have these data sets, we can combine them to a certain degree, and that will improve over time. What are the ways in which companies can start actually making money in new ways using these technologies?
Apple’s Genius comes to mind for me as a way of saying, "Okay, you pick a song in your iTunes library, and we're going to use our data and our analytics, and come back with some suggestions on what you might like as a result of that." Again, this is sort of a first go at this, but it opens my eyes to a lot of other types of business development opportunities. Any thoughts on this, Tim O’Reilly?
O'Reilly: In general, as I said earlier, this is the frontier of competitive advantage. Sure, iTunes has Genius, but it's the same thing with Netflix recommendations. Amazon has been doing this for years. It's part of their competitive advantage. I mentioned earlier how this is starting to be a force in areas like banking. Think about phone companies and all of the opportunities for new local services.
Not only that, one of my pet hobbyhorses is that phone companies have this call-history database, but they're not building new services for users against it. Your phone still only remembers the last few people that you called. Why can’t I do a search against somebody I talked to three months ago? "Who the heck was that? Was it a guy from this company?" You should be able to search that. They've got the data.
So, as I said earlier, the frontier is turning the back office into new user-facing services, and having the analytics in place to be able to do that meaningfully at scale in real-time. This applies to supply chains. It applies to any business that has data that gets better through user interaction.
This is the lesson of the Web. We saw it first in Web applications. I gave you the example earlier of Wal-Mart. They realized, "Oh, wait a minute. Every time somebody buys something, it’s a vote." That’s the same point that Wesabe is trying to exploit. A credit card statement is a voting list.
I went to this restaurant once. That doesn’t necessarily mean anything. If I go back every week, that may mean something. I spent on average this much. It’s going up. That means something. I spend on average this much. It’s going down, and that means something. So, it's about finding meaning in the data that I already have. How could this be useful, not just to me but to my users and my customers, and what services could I build?
This is the frontier, particularly in the world that we are entering, in which computing is going mobile, because so many of the mobile services are fundamentally going to be driven by BI. You need to be able to say in real-time or close to real-time, "This is the relevant data set for this person based on where they are right now."
Needed: future view
Kobielus: I want to underline what Tim just said. Traditionally, data warehouses existed to provide you with perfect hindsight on the customer -- historical data, massive historical data, hopefully on the customer, and that 360 degree view of everything about the customer and everything they have ever done in the past, back to the dawn of recorded time.
Now, it’s coming down to managing that customer relationship and evolving and growing with that relationship. You have to have not so much a past or historical view, but a future view on that customer. You need to know that customer and where they are going better than they know themselves.
In other words, that’s where the killer app of the online recommendation engine becomes critical. Then, the data warehouse, as the platform for recommendation engines, can take both the historical data that persists, but also can take the continuing streams of real-time event data on pricing, on customer interaction in various channels -- be it on the Web or over the phone or whatever -- customer transactions that are going on now, and things and events that are going on in the customer social network.
Then, you feed that all into a recommendation engine, which is a predictive-analytics model running inside the data warehouse. That can optimize that customer’s interaction at every touch point. Let’s say they're dealing with a call-center person live. The call-center person knows exactly how the world looks to that customer right now and has a really good sense for what that customer might need now or might need in three months, six months, or a year, in terms of new services or products, because other customers like them are doing similar things.
It can have recommendations being generated and scripted for the call-center agent in real-time saying, "You know what we think. We recommend that you upgrade to the following service plan because it provides you with these features that you will find useful in your lifestyle, blah, blah, blah."
In other words, it's understanding the customer in their future, in their possible future, and suggesting things to the customers that they themselves didn’t realize until you suggested them. That’s the future of analytics, and competitive advantage.
O'Reilly: I couldn’t agree more.
Gardner: Scott Yara, we've been discussing this with a little bit of a business-to-consumer (B2C) flavor. In the business-to-business (B2B) world many things are equal in a commoditized market, with traditional types of products and services.
An advantage might be that, as a supplier, I'm going to give you analytics that I can derive from data sets that you might not have access to. I might provide analytical results to you as a business partner free of charge, but as an enticement for you to continue to do business with me, when I don’t have any other way to differentiate. What do you see are some of the scenarios possible on the B2B side?
Yara: You don’t have to look much further than what Salesforce.com is doing. In a lot of ways, they're pioneering what it means to be an enterprise technology company that sells services, and ultimately data, back to their customers. By creating a common platform, where applications can be built, they are very much thinking about how the data is being aggregated on the platforms in use, not by their individual customers, but in aggregate.
You're going to see lots of cases where for traditional businesses that are selling services and products to other businesses, the aggregation of data is going to be interesting and relevant. At the same time, you have companies where even the internal analysis of their data is something they haven’t been able to do before.
We were talking about Google, which is an amazing company. They have this big vision to organize the world’s information. What the rest of the business world is finding out is that while it’s a great vision and they have a lot of data, they only have a small fraction of the overall data in the world. Telecommunication companies, financial stock exchanges, retail companies, have all of this real-world data that's not being indexed or organized by Google. These companies actually have access to amazing amounts of information about the customers and businesses.
They are saying, "Why can’t we, at the point of interaction -- like eBay, Amazon, or some of these recommendation engines -- start to take some of this aggregate information and turn it into improving businesses in the way that the Web companies have done so successfully?" That’s going to be true for B2C businesses, as well as for B2B companies.
We're just at the beginning of that. That’s fundamentally what’s so exciting about Greenplum and where we're headed.
Gardner: Jim Kobielus, who does this make sense for right away? Some companies might be a little skeptical. They're going to have to think about this. But where is the low-lying fruit, where are the no-brainer applications for this approach to data and analytics?
Kobielus: No-brainers -- I always hate that term. It sounds like I am condescending, but low-hanging fruit should be one of those "aha!" opportunities that everybody realizes intuitively. You don’t have to explain it to them, so in a sense it's a no-brainer. It’s the call center -- the customer-contact center.
The customer-contact center is where you touch the customer, and where you hopefully initiate, cultivate, nurture, maintain, and grow the customer relationship. It's one of the many places where you do that. There are people in your organization who are in that front-line capacity.
It doesn’t have to be just people. It could be automated programs through your Website that need to be empowered continuously with the full customer context -- the history of that customer's interactions, the customer’s current state, current sentiment and feelings, and with a full context on the customer’s likely future evolution. So, really it's the call center.
In fact, I cover data warehousing for Forrester. I talk to the data warehousing vendors and their customers about in-database analytics, where they are selling this capability right now into real-world deployment. The customer call center is, far and away -- with a bullet -- the number one place for inline analytics to drive the customer interaction in a multi-channel fashion.
Gardner: How about you, Tim O’Reilly. Where are some of the hot verticals and early adopters likely to be on this?
O'Reilly: As I've already said several times, mobile apps of various kinds are probably highest on the list. But, I'm a big fan of supply chain. There's a lot to be done there, and there's a huge amount of data. There already is a BI infrastructure, but it hasn’t really been tuned to think about it as a customer-facing application. It's really more a back-office or planning tool.
There are enormous opportunities in media, if you want to put it that way. If you think about the amount of money that’s spent on polling and the power of integrating actual data, rather than stated preference, I think it's huge.
How do we actually figure out what people are going to do? There is a great marketing study. I forget who told this story, but it was about a consumer product. They showed examples of different colors. It was a boom box or something like that.
They said, "How many of you think white is the cool color, how many of you think black, how many, blah, blah, blah?" All the people voted, and then they had piles of the boom boxes by the door that the people took as their thank you gift. What they said and what they did were completely at variance.
One of the things that’s possible today is that, increasingly, we are able to see what people actually do, rather than what they say they will do or think they will do.
Gardner: We're just about out of time. Scott Yara, what’s your advice for those folks who are just getting their heads wrapped around this on how to get started? It’s not a trivial activity. It does require a great deal of concerted effort across multiple aspects of IT, perhaps more so than in the past. How do you get started, what should you be doing to get ready?
Yara: That’s one of the real advantages. In sort of an orthogonal way, the ability to create new businesses online in the age of Web 2.0 has been fundamentally cheaper and faster. Doing something disruptive inside of business with their data has to be a fundamentally cheaper and easier thing. So not starting with the big vision of where they need to go, and starting with something tactical -- whether it lives in the call center or at some departmental application -- is the best way to get going.
There are technologies, services, and people now that you can actually peel off a real project, and you can deliver real value right away.
I agree with Tim. We're going to see a lot of activity in the mobility and telecommunication space. These companies are just realizing this. If you think about the kind of personalization that you get with almost every major Internet site today, what level of personalization do you get from your carrier, relative to how much data they have? You're going to see lots of telecom companies do things with data that will have real value.
One of our customers was saying that in the traditional old data warehousing world, where it was back office, the service level agreement (SLA) was that when a call got placed and logged, it just needed to make its way into the warehouse seven days later. Seven days from the point of origination of a call, it would make itself into a back-office warehouse.
Those are the kinds of things that are going to change, if we are going to really provide mobility, locality, and recommendation services to customers.
It's having a clear idea of the first application that can benefit from data. Call centers are going to be a good area to provide the service representation of a profile of a customer and be able to change the experience. I think we are going to see those things.
So, they're tractable problems. Not being able to start small is what held back enterprise data warehousing before, where they were looking at these huge investments of people and capital and infrastructure. I think that’s really changing.
Gardner: I am afraid we have to leave it there. We've been discussing new approaches to managing data, processing data, mixing data types and sets, and extracting real-time business results from that. We've looked at tools and we've looked at some of the verticals and business advantages.
I want to thank our panel. We've been joined today by Tim O’Reilly, the CEO and founder of O’Reilly Media. Thank you Tim.
O'Reilly: Glad to do it.
Gardner: Jim Kobielus, Forrester senior analyst. Thank you Jim.
Kobielus: Dana, always a pleasure.
Gardner: Scott Yara, president and co-founder of Greenplum. Appreciate it, Scott.
Yara: Great. Thanks everybody.
Gardner: This is Dana Gardner, principal analyst at Interarbor Solutions. You've been listening to a sponsored BriefingsDirect podcast. Thanks, and come back next time.
Transcript of BriefingsDirect podcast on new computing challenges and solutions in data processing and data management. Copyright Interarbor Solutions, LLC, 2005-2008. All rights reserved.
Wednesday, August 29, 2007
SaaS Providers Increasingly Require 'Ecology' Solutions from Infrastructure Vendors
Edited transcript of BriefingsDirect[TM] podcast with Progress Software's Colleen Smith on SaaS, recorded
Listen to the podcast here. If you'd like to learn more about BriefingsDirect B2B informational podcasts, or to become a sponsor of this or other B2B podcasts, contact Interarbor Solutions at 603-528-2435.
Dana Gardner: Hi, this is Dana Gardner, principal analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, a podcast discussion about Software as a Service (SaaS), the burgeoning marketplace for off-the-wire business and consumer applications, and the infrastructure that's required for those delivering these applications and services to thrive and prosper.
To help us sort out this market and the needs for infrastructure, we're joined by Colleen Smith, managing director of Software as a Service for Progress Software. Welcome to the show, Colleen.
Colleen Smith: Thanks, Dana.
Smith: I was lucky. Progress had started to look at the application service provider (ASP) model back in the early 2000-2001 time frame to figure out whether there was an opportunity for some of the small ISVs who were using the Progress technology to become more of an application service provider. When I joined the company two years ago, I was basically asked to figure out how to build more of a SaaS partner program and look at ways in which we could work with our partners.
We basically stepped back and said, "All right, let’s look at a number of different areas," one being the technology enablement and how to build applications to go to market with SaaS. We also added a couple of other things, because we felt that one of the biggest challenges traditional software vendors had was around the business model, the go-to-market strategy, sales enablement, and figuring out ways in which we could actually help them to be more successful in this new business model. We were thinking of it more as a business model and not just as a technology.
Smith: Right. We’re an infrastructure provider, but we also look very carefully at this channel, which happens to be ISVs that bring our infrastructure products to market. We wanted to make sure they could be successful with SaaS. Sure, there are the technical components of multi-tenancy, being able to have a Web-based access, and being able to drive policy configuration and personalization.
More importantly, we work a lot with our partners, these ISVs, to make sure they realize that this requires different marketing. It requires a different sales and business model, because clearly there are financial implications in terms of cash flows. There are also a lot of things they need to think about in terms of who is the target market.
We've helped them focus on looking at new markets and going down-market. Our partners have always focused very much on the mid-market, but SaaS has enabled them to target some very niche verticals and go down into the "S" of SMB (small and medium business).
What's different between what was then conceived of as the ASP market and what we now call the SaaS market?
Smith: When I look back at the ASP market and what was going on, it was much more about the hosting. Everybody said that if you just take a business application and host it, you can be an application service provider. What they didn’t realize was that the folks who were trying to do the hosting really had no domain knowledge in terms of the business application. They didn’t understand how to focus on managing the business processes. They focused on getting the application up and running and hosting it.
There were a couple of problems with that. Number one, the applications were built for on-premises use. They weren’t built to be used by multiple customers. The other thing is that the people who had built those applications weren’t necessarily the ones who were doing the hosting. It was the hosting vendors who figured they could just load it up and run it.
SaaS and ASP in concept are still the same. The application is going to be housed and managed centrally, hosted somewhere, and run by different customers. The biggest difference is that the people who are now managing, building, and deploying those applications are more of the ISVs, who understand what it takes to run and manage that business process.
On the software side of it, there is much more of a focus on business-process automation, and the people who are building, deploying, and running those applications have a good, solid knowledge of the business itself. The second thing is that the applications are now architected specifically to be able to run for multiple customers, and it’s not a separate implementation for each customer.
The economy of scale is what killed a lot of hosting providers back in the ASP days and ran them out of business. They were just doing an implementation for every customer, as opposed to a single implementation that can now be used by multiple customers -- personalized and managed. The people who use the application run and use it differently, but the implementation is pretty much the same for all customers.
Smith: Prior to coming to Progress about two years ago, I spent five years as an industry analyst. I was looking at the ASPs back in those days and understood what was going on around the market, but I also looked at the overall infrastructure market. Prior to that, I spent 15 years in the enterprise-application space.
What's interesting is my background. I started in the mainframe days, saw the transition to client-server, then saw the transition to the Internet. So, I watched the different transitions in how the software industry has grown over the last 20 years or so. I even started off my career at EDS doing service bureau and outsourcing. I've seen it come full circle.
When I joined Progress, I said, "Let’s look at this new business model, SaaS, and figure out how an infrastructure company can make a play in terms of being part of this new business model through using our technology, but at the same time working with a lot of ISVs who are out there trying to figure out how to make that transition."
Smith: Yes. We’ve been in business 25 years, and I would say for 23 of those years our go-to-market has been to work with ISVs who take our technology, build applications on top of it, and bring it to market. We also have a direct arm, but a long-standing portion of what we’ve done over the years is to provide the software infrastructure for application partners or ISVs to be able to bring their applications to market, build them, and deploy them.
Smith: I do agree with you. I think the timing is right. There are a bunch of reasons why. Number one, the Web is finally viewed as a business platform. Seven or 10 years ago, the Web wasn't viewed as the way in which business applications were going to be run and managed. Because of that, as I mentioned before, a lot of traditional ISVs have been selling to the upper end of the mid-market, or to large enterprises. Those have been the folks buying business applications.
The "S" in SMB really couldn’t do a number of things. They couldn’t afford the dedicated IT staff to manage and maintain the applications. They didn’t necessarily have the infrastructure and the technology to run these business applications. A lot of business applications are much too complex and require too much manpower to manage and maintain the app.
A couple of things have happened. One, the price of computing has come down. People now have access via a Web browser to business applications.
The other thing, one we’ve all seen, is that ISVs realize there’s a whole new market. There’s that long tail, if you will, of the software market that allows them to be able to go after new people. In the past, software just wasn’t accessible to them, and now there’s a whole new market opportunity.
We stress to our ISVs, "You can continue to be in the traditional software business for your core market and the market that you’ve been going after, but there’s a whole new opportunity for you to look at new markets, whether they be the low-end of your current market, adjacent markets, or even new geographic territories."
Throughout
On the other side of the equation, on the supply side of how these ISVs can deliver, there’s a new support ecology available to them. They don’t have to create their own data centers themselves. They can find partners. We’ve heard a lot about Amazon, for example, and there are others, of course. These ISVs can focus on what they do well, which is their software, their logic, and then also take advantage of some hosting.
Tell us a little bit from your vantage point as a software infrastructure and tools provider how that ecology works when it comes to these hosting options?
Smith: You’re exactly right. Back in the ASP days, it was all about hosting. I’m not saying that in the SaaS world hosting isn’t important, because it absolutely is. What has changed over the last 7 to 10 years is that now you look at it in terms more of an ecosystem.
You’ve got your infrastructure providers, your application providers, and your hosting and managed-service providers. The biggest change that I have seen now is that each realizes they have a role to play, they have a core expertise, and that through building of this ecosystem and through partnerships you can be much more successful in being able to lower your deployment cost, but still being able to target and go after these new markets.
I look at our ISVs and our ecosystem within SaaS. We use partners like OpSource, for example, to be able to do some of the hosting and managed services. Our ISVs are the ones with the business and domain expertise, and know what their business application is. They know their particular vertical niche and they know how to best deploy, manage, and build out business processes to be able to support it. What we provide in the equation is the underlying infrastructure that helps them to develop, deploy, integrate, and manage and monitor their business applications.
Together, what we have is more of an ecosystem that’s able to go out and lower the overall total cost, because each one of us is playing our role in the system. Because it’s a partnership, the pricing and licensing is all done based upon what we call the shared risk/shared reward model.
Smith: Something like that.
Can you address the opportunity for applications to be decomposed into services as a service, and also what new requirements there are for the infrastructure to support that?
Smith: Sure. What happens early on in a market is that we see a lot of these niche, vertical, best-of-breed or single applications or components in the first wave of coming out to market with SaaS. So, whether it's in the legal sector, healthcare sector, or financial services, they say, "Here’s one specific business application -- mortgage applications, loan applications, or patient billing." What's slowly happening is being able to start integrating business processes and offer them out to the community.
If you look at financial services, instead of just being able to offer loan applications, there’s now a whole suite of different types of business services or business components. As long as somebody who’s part of the financial services arena has the ability to integrate those different business processes and offer them out to their community, they basically have become more of a business service provider.
What we’re seeing is that you no longer are an application vendor like you were in the traditional business model. If you do this right and you use the underlying technology of governance, policy enforcement, integration, and development, then you can build out a whole service delivery environment or platform, where you can now offer multiple business services to a community that might be very vertical-based.
We see a lot of this happening in financial services, healthcare, the legal sector, and even in agriculture, which needs to now manage and maintain a lot of different business processes because of federal regulations, mad-cow disease, and all the other reasons people have to manage and monitor business processes a lot more thoroughly.
They also have more flexibility in how they create, in that they can exploit reuse more generally. They can also shop around for business services that might be on the market and available to them through the ecology. So, there seems again to be a two-prong benefit: one in how they can deliver, but also in how they can aggregate and create. Does that sound right?
Smith: Yes, and what that comes down to is the winner in all this is the end customer, who is looking for a single business-service provider who knows their business, whether they know healthcare, legal, or any business sector, and they’re able to provide a number of different business services for them. The big challenge that most large organizations had was integration, because they’d go to one vendor and buy a business app, then go to another vendor and buy a different business app.
In the SaaS world you could run into the same thing, if you’re going out to all these different SaaS providers. But, if you start to think in terms of those SaaS providers participating in this ecosystem that’s much more based upon who the end customer is, the end customer can end up benefiting. They’ll be able to go to a single business-service provider.
Maybe that business provider has built those services, or maybe they haven’t, but they’re able to pull in these multiple business services, Web services, or whatever technology they’ve been built in, and offer them out to the community. So, the end customer, who might be that smaller business user, can now have a single point where they go to access a number of different business services.
Smith: Exactly. The biggest challenge people have about going after that long tail is the fact that you’re really talking about millions of markets of a dozen. It’s very difficult to get your cost model to a point where you’re able to go after all of those millions.
But, if you really think about it and you build the business application that’s very specific to what they need to do, and you’ve built them based upon the small services, then the customer chooses which business services they need to run their business. You’re offering a spectrum of different services, because you understand what their marketplace looks like and what vertical they’re in.
Now, Progress Software has made some acquisitions and has focused quite a bit of energy and investment on this data issue -- the real time, management, semantic issues around not just data availability, but about how things are termed, labeled, and classified, application by application, instance by instance, and even site by site. Help us understand how Progress is helping the ISVs in the SaaS environment deal with some of these complex data and semantic issues.
Smith: When people initially thought of integration, they thought about point to point, and it was more at the business-process level. What we realized was that if you’re not addressing data-level integration semantic issues, then you’re not going to solve all of the problems that customers have.
We made an acquisition last year of a company called Pantero, and we have built a product that we’ve termed Data Xtend Semantic Integrator. It's all about looking at the semantics of the data and being able to match and manage that data from one system to another. So, it’s just another product that we’ve added to our infrastructure, and it allows customers or our ISVs to look at all different levels. We’ve got integration at a business-process level, at the enterprise service bus (ESB) level. We now have integration at the data level. We also have capabilities to govern and monitor and manage Web services.
Progress continues to add different technologies into our environment to support what’s going on in terms of all of the different integration and Web services challenges happening in the industry. Our basic focus is to continue to add software infrastructure components to support the needs of large enterprises, as well as to support the needs of business-service providers who are trying to offer these integrated business applications to their customers.
If you think about it, large enterprises can do a lot of this themselves and can buy the infrastructure to build, integrate, and manage. In a lot of ways, the SMB requires these services providers or these business services providers to be able to do all of that integration, and they expect that that integration will just be handled for them, and that’s what they’re really looking for.
So, that’s the big challenge of whether ISVs are going to become business-service providers or whether they’re going to partner with business-service providers. They almost become the manufacturer, if you will, of these small-business components that larger business-service providers will pull into their environment. So, you might see a different breed. We might move away from the traditional systems integrators and you might see more business-service providers focusing on supporting the customers. The ISVs are actually building some of the components that are used, but they aren’t necessarily going to be the service providers.
Colleen, I wonder if you could put on your analyst hat again for a minute and try to forecast how that market might shake out.
Smith: I think the SaaS market, in general, is really still in its nascency, and there are a lot of things that have yet to happen. But, the good news is this isn’t just a fad. We see a fundamental change in terms of the business model.
What I say a lot is that if we think in terms of the software industry over the last 20 years, we’ve come a long way in terms of building partnerships, and in terms of how systems integrators and service providers work with ISVs. What I see being the success of SaaS is that if we continue to enhance that model, it's going to be about hosting providers working more closely with system integrators and ISVs. The only way that the end customer is going to win in this is if we get into a business model where there is that shared risk and shared reward, but the customer pays for only what they need to use.
It's going to come down to pricing models. It still has to come down to some building of ecosystems out there, where everybody knows their role and plays that role, but doesn’t necessarily try to do the other person’s role. There are still a lot of things happening.
I believe it’s going to be vertically focused. I don’t think this is going to be a horizontal play. We’ve seen a lot of success in vertical business expertise. There's going to be content, business applications, data, and services. If all of those can be offered in a single environment through a single service provider, the customer will end up winning.
Smith: Absolutely. We’ve always worked with a lot of our partners and told them, "Figure out where your niche is, and, if you can be the best at your niche, you can be successful." We aren’t necessarily talking about creating the next SAP, but if you can be really successful within your specific niche area, then your customers are going to value your service.
In the SaaS model, there are two S's. One is the software part and the other part is the service. If you have business domain expertise, they’re going to look at you as a partner and they’re going to ask you to help to run, manage, and grow with their business. That’s the other part. If you’re focusing on SMB, you also have to help those small organizations figure out how they can scale to become large organizations. So, it’s your opportunity as well as your responsibility to make sure that you can scale with them.
Smith: Yes. I often talk about it. You’re moving from develop-package-ship into develop-build-service-deploy. It’s much more about the ways in which you can deploy and service the customer, as opposed to just packaging, shipping, and saying, "Well, now it’s somebody else’s responsibility."
Smith: Yes, because the switching costs are lower. It’s not necessarily happening in the industry today that everybody is dumping their SaaS application to go to another one, but you have to be able to service your customer. If they’re not getting the service that they require from you, they will look elsewhere.
Smith: I’ve got a couple of interesting examples in terms of ways in which you can think specifically about accessing the long tail, but being able to target a whole new set of users.
One of our partners is in the K-12 education area. They had a traditional business application that sold to faculty and to the student administration systems. They figured out that they could offer a business application that can now be accessed by parents over the Web. So, they re-architected their application for multiple school districts. They now allow parents to go in and track absenteeism, as well as how students are doing in terms of grades, and things like that.
They've completely re-architected and re-thought the way in which they’re building and deploying business applications for K-12. That’s an interesting example of thinking about SaaS and thinking in terms of a new market, as opposed to just looking at large universities or schools that can afford a system.
They’re selling at the state level and saying, "Here is a state-wide student administration system that can now be used at all schools. Even if they don’t have a large IT staff, all they need is access via the Internet." That’s interesting in terms of the education. This is one of my small partners called Skyward, and they’re located in
Another interesting area is in library management systems. They've been around for a while, and we’ve got a partner, Keystone, who has focused on a very small niche in terms of braille and applications for the visually impaired. What they’ve done is to rethink the way in which you can have access to books and to look up books available at libraries. And, it can all be done via the Web.
They’re actually coining themselves as the Netflix, if you will, of libraries, because what you can now do is use the Internet to look for availability and access to different books that might be available, not only in your town, but statewide, and have those books actually shipped to your house.
So, they've re-thought the way in which library systems are built and used, and are able to bring in access from the Internet, and, in this particular instance, can allow handicapped individuals to access information right from their homes.
These are just two examples where you start to think about who the user of a system is. It’s a very traditional backend accounting and business management system, but it’s now being used and serviced to expand their market, as well as just to be able to have a service for some small end users, who, in the past, wouldn’t have had access to these types of technologies.
Smith: It's looking at new markets, being able to target now at a statewide level, as opposed to a smaller individual school or individual library. So, it’s a new market opportunity, but they’ve been able to take something they’ve been doing for years with domain expertise and really expand their opportunity. What we see is that there are tremendous opportunities for ISVs out there, if they step back and think in terms of what they know, what business domain expertise they have, and where they could provide a better level of service to consumers or customers.
Gardner: So, depending on the type of organization you are, you have the opportunity to scale up -- if that makes sense -- to scale down, perhaps to do both, and then, as we mentioned earlier vis-à-vis globalization, in a sense, scale sideways.
Smith: Exactly, we’ve had other partners who have looked at this as an opportunity. I've got a law firm software company in the
They’re still running their main headquarters out of the
Smith: Thank you, Dana.
Listen to the podcast here. Produced as a courtesy of Interarbor Solutions: analysis, consulting and rich new-media content production.
If any of our listeners are interested in learning more about BriefingsDirect B2B informational podcasts or to become a sponsor of this or other B2B podcasts, please feel free to contact Interarbor Solutions at 603-528-2435.
Transcript of Dana Gardner’s Podcast on SaaS with Colleen Smith of Progress Software. Copyright Interarbor Solutions, LLC, 2005-2007. All rights reserved.
Monday, August 13, 2007
BriefingsDirect SOA Insights Analysts on IBM’s Telelogic Deal and Open Source ESBs
Listen to the podcast here. If you'd like to learn more about BriefingsDirect B2B informational podcasts, or to become a sponsor of this or other B2B podcasts, contact Interarbor Solutions at 603-528-2435.
Dana Gardner: Hello, and welcome to the latest BriefingsDirect SOA Insights Edition, Vol. 20, a weekly discussion and dissection of Services Oriented Architecture (SOA) related news and events with a panel of industry analysts and guests. I’m your host and moderator, Dana Gardner, principal analyst at Interarbor Solutions.
Our panel this week consists of Jim Kobielus, principal analyst at Current Analysis. Welcome back, Jim.
Jim Kobielus: Thanks, Dana. Hi, everybody.
Todd Biske: Thanks Dana.
Brad Shimmin: Thanks for having me, Dana. I focus on application infrastructure and spend most of my time thinking about middleware and SOA.
Dave Linthicum: It’s Dave Linthicum.
As I was saying, we’re just getting into our topics, and we’re going to look at some of the IBM news this week that came out of the Rational Division. There was the intent to buy Telelogic, a Swedish firm that’s got a lot of product across requirements, tests, QA, architecture and modeling, as well as embedded and system development. So, we’ll talk a little bit about that.
We’ll look at some of the announcements out of the Rational Developer Conference, including the Jazz Community, an innovative commercial/open-source community approach to development. Also, one of the more interesting product announcements was Rational Asset Manager, essentially a design-time metadata repository that can be used in conjunction with an operational or run-time registry and repository for a lifecycle approach to services.
What’s more, we’ll go around the table and look at what research folks have been conducting this week. That might include a look at WSO2, the IONA Artix announcement, and some announcements also from BEA.
So, let’s start with the Telelogic acquisition. Telelogic is a publicly held company in
Why don’t we start with Jim Kobielus? IBM and embedded -- we haven't seen anything along those lines for a while. They jettisoned the Rational embedded drive before IBM had acquired Rational, but now they seem to have a new-found interest in embedded. That's increasingly focused on end-to-end development, recognizing that the entertainment and media sectors are going to be creating devices different from a personal computer. IBM doesn’t want to miss out on the opportunity to leverage its back-end systems vis-à-vis these new types of devices. Any thoughts, Jim?
Kobielus: Clearly, IBM over the last few years has made stronger moves into the appliance space and really helped to define it. Obviously, with the DataPower acquisition a few years ago and increasingly taking their data warehousing appliances -- I think they’ve already revamped that entire product family -- they can scale from very small, or relatively small, data warehouses to very large ones.
IBM is trying to fuse their chief software products and technologies with their hardware engineering expertise to develop products for various markets. Embedded operating systems, embedded application components, and so forth, are key to all that.
So, when I saw that they were acquiring Telelogic, it made perfect sense. IBM is very much bringing together the hardware and software worlds into appliances, both for the business and potentially for the consumer market.
I think they see a wide-open opportunity in embedded that is still a fairly wide-open, roll-your-own test, roll-your-own tools, even roll-your-own IDE or real-time operating system. So, I wonder if anyone else has some thoughts on the embedded angle on this?
Shimmin: Well, Dana, not so much about the embedded angle, sorry to say, but perhaps something that might transition us into talking about Rational. When I look at what IBM is doing here -- and this is pursuant to what Jim was just saying about their synergies between hardware and software -- I see IBM doing two things.
One is renewing the focus on software development on the design time and development time side of things. Then, I see them taking their expertise in hardware and putting the two together to build what a lot of the software companies or platform companies really wish they had in the space, and that is the ability to do two things: create well-performing software and actually have software that performs well in the production environment. They’re going to be pretty well positioned to take advantage of those two things.
They seem to be involved now more and more with getting very close to the application, almost to the point of being in a position to be dominant in terms of custom application development. Do you have any sense from your perspective, Brad, about the whole notion of the shift in the definition of applications and how to create them?
Shimmin: Absolutely, Dana. It’s funny. Both IBM and Oracle are the two companies leading the chart with what you just said there. They’re taking tentative baby steps, because this is a pretty daunting task they’ve undertaken. That is, as you said, dive deep into a given vertical and a given set of business processes within that vertical to provide literally out-of-the-box functionality. You’ve got your data models. You’ve got your actual BPEL processes. You’ve got everything that surrounds whatever it is you need to actually get some sort of process up and running within their environments.
Maybe five years ago or so, platform vendors really focused on the foundation infrastructure of what these applications run on. They’ve realized in maybe the last six months that the success that they’re going to enjoy is going to come from their ability to save their customers money from the exorbitant consulting and professional engagement fees that usually come hand-in-hand with rolling out software like this.
So, what they're doing is pretty cool, and I applaud them for it. As I was saying a second ago it’s a pretty long row to hoe and it’s going to take a lot of energy on Oracle’s and IBM’s part to actually fulfill this and to say, “Okay, we’re going to give you a set of business processes with all the data you need to set them up and get them running for your vertical.”
Shimmin: I think IBM is actually leaning that way and they have a very strong ecosystem already in place.
Biske: I have not evaluated it myself. Momentum’s CEO, Jeff Schneider, said maybe we’ll see some more activity on that product. It’s interesting, because I’ve yet to run into an organization that’s really leveraging some of the EA-specific products that are out there. They still tend to do 90 percent of what they need to do in Visio, rather than some of the specialized tools in that area, but, as the discipline matures, we’re going to see a lot more of that.
I want to talk a little more about this notion of it bringing IBM closer to the application business that you first commented about in your blog. At least, that was the first time that that was brought to my attention.
I have a friend from college who has worked in the embedded space pretty much since we graduated, and he contacted me recently about applying SOA to some of the work that he was doing. It surprised me a little bit, because it’s not a space that I’ve had to deal with. I’ve typically been in big IT and big enterprises. It opened my eyes that their environment for developing is maturing as well. They are getting to higher levels of abstraction and taking advantage of the same types of programming models a typical enterprise developer is.
I thought they were going to be so focused on performance and real-time behavior of these systems, that they may not have a strong interest in some of the things that Web service standards and XML have to offer, because of the tradeoffs from a performance standpoint. He said, “No, we are looking at all of that.”
Now, with IBM making this acquisition of a company that’s dealt in the embedded space, it really shows that development is still development. IBM is now recognizing that it’s not all just about “build whatever you want.” We are getting more specialized, and maybe the right way to get into the applications market is to create specialized tools for particular vertical domains, rather than providing the applications themselves.
It’s definitely something to pay attention to, as we go along. They did it on the tool side; they did it on the software side with the Webify acquisition last year; and I would guess that we’ll continue to see more in this direction.
Linthicum: What’s missing in the space is a holistic design tool around SOA. I looked at Telelogic at the EA Conference in
Linthicum: I did look at it from the online piece, and I think that it’s going to have value in this space as well. The folks at IBM are not dumb. They’re out in the back, trying to figure out how all this stuff is going to fit together. They want to have not only the mega-stack in terms of deploying technology and development technology, but the mega-stack in terms of the design time stuff, including holistic enterprise architecture, asset management, service management, and SOA governance.
So, it’s going to be very difficult not to see IBM in almost all the larger SOA implementations out there, once they have a critical mass of tools. They’re investing right now. They see this as a long term strategy and a way to gain revenue 10-15 years down the line. I think they’re making some smart moves. I would have acquired Telelogic as well, if I were IBM.
Linthicum: Oh, it’s a bargain. They did very well buying it, and they’re going to reap a lot of benefits from it. This is the right move from IBM and the investors are going to love that three or four years down the line.
Kobielus: That’s exactly what I’ve seen too, Dana. Rational is becoming the crown jewel within IBM. In my area of focus, master data management (MDM), the Rational tool has become the primary master data modeling and domain modeling tool for all of IBM’s MDM products. I agree. It was probably their most important acquisition in the last 10 years.
One last item in the IBM news. They announced this Jazz community, jazz.org, and it’s essentially an open environment, which people can join and help contribute to the development of Rational products. This is somewhat of a trial balloon with IBM saying, “Wow. Look, how successful Eclipse was as a governance environment and a community development force in the market. How can we take what was good about Eclipse, but apply it to commercial product development, not just open-source development?”
It strikes me that if the companies who partner with IBM that have a vested interest in how their products relate to the Rational products contribute and help define the Rational products, then the same model could be applied to other commercial aspects within IBM, and they could then perhaps even take the model to other products. Has anyone had a chance to look at Jazz? Do you think that this is a wacky idea, or do you think it will get traction?
Shimmin: I looked at it a little bit. This is not the first time this has been done, and it certainly won’t be the last. As you said, they saw the success of Eclipse and they saw that it was an environment that fostered innovation. As we all know, it’s very hard for large closed-source vendors to innovate quickly, while maintaining a customer base accustomed to once-a-year big upgrades, punctuated with little patches here and there.
I look at what IBM has done, as well as TIBCO, Sun, BEA, and Red Hat -- even though they’re not closed source -- and I see them using the tools that the open-source community fostered in order to collaborate over a large-scale network of developers, and they’re applying it, just as you said, to closed source. There are many benefits to that.
First and foremost is a quicker turnaround on bug fixes and getting to a GA. When you’re dealing with the traditional closed-source development cycle, you build your software, you send it out to maybe 10 trusted customers. They hammer on it a little, and you have your own internal people hammering on it, and that’s maybe a three month venture.
Using 10 customers, who have their own jobs to do and don’t give this a lot of shrift, sets you up for failure. That’s why we see so many post-release patches going out. What this is going to do, if it succeeds and can be applied to closed source, is let these large vendors get their code out quicker in a much more tip-top, enterprise-ready fashion.
Shimmin: Absolutely. IBM is taking a tiered approach to it, and some of the others have too. As we were saying with these trusted customers and partners, partners in particular are going to play a big role in this, but they would get access to source code at deeper levels. Folks that maybe are smaller customers or just interested parties, who want to make this product go forward, will have more limited access, unlike a traditional open source. They’ll have more limited access to the details of the source code.
Shimmin: Should we be thanking Microsoft, Novell and SUSE for that a little bit?
Shimmin: Absolutely.
Biske: It’s an interesting idea. I wonder how different it is from some of the efforts already going on. Clearly, we have commercial efforts built on open-source products. The key question here involves open-source products that are not available for free. If developers are working on it and they can build it and use it, how is that model going to come together if they say, “No, you only have a license to run it in a development mode and nothing more than that?” Is that going to be followed or not? Are they going to bother to enforce it, or do they really know in the long run it’s going to take the same direction that Eclipse did.
IBM has a commercial version of Eclipse, but largely people just go with the free product, because they can still include all the plug-ins that they need to. If they need to buy add-ins to it, they’re okay with that. So, they can focus their attention on Eclipse and create a framework, and can plug-in commercial components as they need to.
The other risk that they take is that the community is just going to look at this and think they’re just looking for free work, trying to take advantage of developers who just love to code and could care less whether they’re getting paid for it or not.
Shimmin: There are no selfless acts. Right, guys? When I scroll some message boards for these development efforts, I see people in enterprises saying, “I need access to this API because I need to extend the product to work with something I’ve built in-house.” I think that’s the kind of work that’s going to drive us forward.
[Speaker unidentified]: It’s the same phenomenon as the developers who are employed by a vendor like IBM. They quite often work excessive unpaid overtime, just because they’re committed to their jobs, their products, etc. In a sense, now you’re roping in the partner ecosystem as well. They’re putting in essentially unpaid overtime to help out the mother-ship vendor get its products debugged and developed.
Okay, let’s move on to some other topics. WSO2 announced ESB 1, which is largely based on the Apache Synapse ESB. I want to make a disclosure that WSO2 is a client of mine, and we should consider that as I present comments. I wonder if anyone else took a look at this, and had some thoughts on, “Wow. Yet another ESB and yet another open-source support maintenance business model entrant in the SOA ecology?”
Shimmin: Dana, I talked to them briefly about this before they released it. Like you, I saw this as, “Oh, yes, here’s another one, and maybe Red Hat should worry.” But, I don’t think anyone else is going to worry, except maybe the pure play ESB vendors like
They’re focusing on what everybody in this space is trying to focus on, performance. Everyone has realized that ESB is at its level of maturity. You need to really be focusing on availability, reliance, reliability -- the “ilities” -- of deployment. This is the third vendor in the last month -- this would include
Shimmin: I got the same vibe from them too, but I feel as if every vendor with an ESB these days feels the same way. They realize that, as with databases, you’re going into a heterogeneous environment regardless, and most likely inter-departmental, inter-company you’re going to have multiple ESBs and different messaging platforms that need to interoperate.
Biske: It doesn’t bother me. I’d rather see a lot more in the open-source space. They’ve got the freedom to keep it more focused on some of the target areas. In the case of WSO2, they really are focused more on what I call the middle capabilities, rather than on service development and execution capabilities. You see a lot of the commercial ESBs going in the direction of giving you an orchestration platform and a composition platform. All of it is about building new services, and not about connecting existing consumers and existing service providers.
Some of these open-source ones are keeping it a little bit more constrained and targeted. Now, with the open-source model, if an enterprise needs to augment that for their particular needs it gives them the ability to do so. I’ve run into a few clients who are looking at some of these products and have a potential need to do that. The openness is a plus for them.
I don’t necessarily see it as too many ESBs out there. The market naturally will shake them out on its own. This is just the way these product spaces work. I don’t view it as a bad thing for SOA.
Biske: Absolutely. At Momentum, we’re vendor neutral, and I know Dave talked in a recent podcast about the importance, when dealing with system integrators, of having one that is not closely tied to a particular partner, unless your company has already decided that you’re married to this particular vendor. Then, it makes sense.
If you’re looking for your pure systems integrator, you have got to select solutions that are in the best interests of the client, not the best interests of the consulting company or the partner ecosystem around this. Having open-source products gives us a lot more flexibility in meeting the needs of the clients.
Linthicum: I don’t put as much value as everybody else does into the open-source equation. In fact, my client base -- and it’s Global 2000 and government folks -- are indeed buying ESBs and other things based on the notion of having access to the source code. I just can’t imagine in my lifetime that they would ever want to become a product development vendor and would have the skill set to actually maintain a middleware product, having built a bunch of those in my lifetime. I’m a little skeptical about the ultimate value there. To me, open-source needs to have marketing value. I’m not sure it’s going to have a lot of technical value going forward.
My larger concern with the number of ESBs out there is that, in many instances, these have a tendency to be queuing systems with service interfaces on top of them. Therefore, they’re more information- or data- oriented than transaction-oriented. That has a tendency to limit some of the emerging patterns I’m seeing within SOA. People are looking basically to automate these high-end business transactional systems well beyond just data consumption and production.
Most of the ESBs don’t really address that. They basically become queuing systems with a nice interface on them. To Todd’s point, they do have orchestration layers and other development technology. Some of the higher ESBs out there definitely have that capability. That seems to be nice, but it still seems to be limited by the underlying infrastructure that they’re selling.
I’m concerned about the number of the ESBs in the market, and I’m concerned about the ability of those ESBs to deliver the ultimate value as the SOA we’re building becomes much more sophisticated.
Shimmin: This week I looked at two things primarily. One was the release AmberPoint made regarding their SOA validation system and their SOA management system. They have two products and they just versioned them to 6. I found what they were talking about very interesting, because, as I was just saying a second ago, they too are focusing on performance.
What they’ve gotten at is their desire to be in a run-time environment, because, as you guys know, they are like the Switzerland of SOA run-time governance in our industry. They are focusing on being able to scale their platform, and they have a number of partnerships and potential partnerships with hardware manufacturers, going back to our earlier discussion about IBM and their DataPower acquisition.
They seem to have this idea of, “Well, we’re run-time governance only, and we have the capability to be design-time governance as well, but we’re not going to get into that space.” I found that interesting, because they’ve just announced this as a part of this pre-flight check that they do. This is similar to the type of the service that
In the AmberPoint product, but a little bit more predominantly positioned, they actually go look at that WSDL as it relates to other processes across the entire application and look for any interdependencies, broken relationships, and anything within the production environment that may cause a problem.
I think about companies like CA, HP, and IBM, who all really are trying to come at the same problem, but from the design and development side, and I think of AmberPoint coming at it from the run-time side. I feel like, “Well, why don’t you guys just get together? Let’s put these two notions together and make it so that we have a more coherent lifecycle management of our codes for SOA implementations that starts in design and ends in run-time.”
Shimmin: It’s a combination. When I talked to AmberPoint about that very question, Dana, we were talking about it with regards to this throttling technology they have, where they can look at KPIs and SLAs that have been defined inside the Registry Repository and tell the process to slow down on Thursday afternoons, for example, because I need these other processes to take priority on that date and time.
I said to them, “Well, gosh, I know there are a lot of tools out there at design/development time – a lot of BPM products for example -- let me establish these and stick them in a registry. You just pick those up automatically and start executing them.” They said, no, because the schema and the amount of data that comes from this is not really enough to do it.
Shimmin: Right.
Kobielus: One of the glues, of course, is a common registry and repository infrastructure between the design-time and the run-time environments, but just as important is the wetware, as you indicated, a common governance environment with roles and workflows. People who are doing the design and optimization of the Web services and the people who are administering the services in a run-time can be collaborating on an ongoing basis.
The common rules and policies of this common infrastructure, be it AmberPoint on the run time or tons of other vendors on the design time, they can all share. So, it’s a bit of the registry and it’s a bit of the governance human administrative workflow.
Biske: It’s interesting that you bring that up, Jim, because one of the things that I wanted to come back to was the Rational Asset Manager. It seems to be typical of IBM that in their SOA offerings they’ve got three of everything. If the registry repository is the key, and I agree with you that this is really the unifying component on some of these things that are dealing with policy on the metadata, they’ve got Rational Asset Manager, which is effectively a metadata repository, and then they’ve got WebSphere Registry Repository, which is another metadata repository.
I’ve had conversations with them going back a couple of years on this subject, asking “How are you going bridge those, and what’s
Brad hit on the other point, in his conversation with AmberPoint, that it’s not the fact that you have a metadata repository out there, but it’s the information that’s going into it. Until there’s some standard level of policy domain languages that these tools can leverage, you’re going to still see people just building their own fiefdoms and saying, “Well, you’ve got to have either AmberPoint everywhere or HP everywhere -- or whatever your management system of choice is -- to be able to do some of these things, like throttling across systems, and some of the run-time policy enforcement. It does need to bridge all the way back into the development tools.” So, again, IBM’s in a great position, providing products in all of those spaces, but it’s going to be difficult to pull off.
Linthicum: It’s possible for the newer offering, but they will have some pushback, given the fact that every domain is extremely different. One challenge that people have, when they try to get out in the market with this kind of stuff, is that ultimately what they think they’re implementing is different than what’s actually being implemented. I see a huge chasm between the perceptions.
For example, I was at the Gartner event this week to do a talk on ROI, and I got a chance to wander around and talk to a number of the people who are pushing in the market, both customers and users. There’s a very different perception as to what vendors think the problems are and what the problems are that users are actually experiencing. There’s going to be a bit of a sobering -- come to Jesus -- that’s going to happen over the next year or so, when these guys push out there.
Linthicum: In the user community, all the problem domains I’m seeing, definitely in my client base, are unique. There doesn’t seem to be any one set of solution patterns you can apply across the board. You see bits and pieces of a stack and how simplistic those problem domains are. The vendors don’t see that when they design SOA in general. They typically give you the same stack and the same problem description. I’m not seeing a consistent problem description out there to work with the client.
Linthicum: You can’t do that, and that’s the problem people are running into right now. I saw the same thing back in the integration days, back in the EAI days. We tried to take one problem domain, all the spaghetti code, and put it through a single hub. While that was applicable to a small percentage of the problem domains, it wasn’t widely applicable. It wasn’t applicable to all problem domains. SOA is even more complex than that. Vendors are going to find that they’re missing the boat in terms of understanding the needs of the people they’re trying to serve.
Linthicum: Dana, it’s all about people. As I’m getting further along in this stuff and learning more about it, I find that the people issues are really the core of all this. I can solve any problem with technology, and probably everybody in this space can do the same thing. However, getting people aligned with how that’s going to happen and setting them up for success is the ultimate challenge right now, and the vendors need to understand that.
Shimmin: Back to our example of AmberPoint. If I have a nice BPM tool that lets me find my KPIs and SLAs, and if it’s not in my best interest to include that additional data, the additional artifacts that AmberPoint is going to need to make that throttling automatic, why am I going to do it? I won’t.
Kobielus: That explains why I’m seeing more emphasis in the SOA space on pre-built domain models, essentially solutions that package up the rules, the best practices templates, the workflows, the policies, and so forth, for a particular problem domain, be it in the MDM space or be it in the ESB space. The customers are demanding these accelerators so they don’t need to hire people who are smart enough to build all that stuff from scratch. If the vendor and their partner ecosystem have already frozen all that expertise into the solution, the customer can be productive from day one. So, the customer could have a little bit longer to go find the appropriate smart people, wherever they happened to be, whether in
Shimmin: That’s why you look at the Rational and Jazz announcements this week and you see the reality that IBM is seeing, which is that, although we want to have this deep knowledge that’s in-house, that’s not likely to happen. You need to go where the talent is. So, the software is hopefully bridging those gaps.
Back to what you were saying earlier, Dana, about even Lotus being able to play in this, I think what’s going to become a much more predominant paradigm is that sort of telepresence for the entire life cycle of software for all involved in that.
Biske: I have a little bit different take on this. I still feel that it’s not that we need better developers, but that the technologists need to become more business aware and more business savvy.
There are a lot of businesses that may be looking offshore for a lot of these efforts, strictly from a cost reduction effort, but that doesn’t necessarily mean they’re going to get any better technical solutions than they would have with resources in-house. It just may mean that they’re going to get it less expensively, and even that is debatable.
So, they haven’t really improved things from that standpoint. In terms of business agility, they’ve reduced their cost, but are the IT solutions actually helping the business any more than they did before when all the work was being done internally? The only thing that’s really going to help push it along is to get people who are both knowledgeable about the business, as well as knowledgeable about the technology, and being able to bring those two worlds together.
Shimmin: I don’t think BPEL is the only way to do that, but that seems like that’s all vendors are focusing on right now.
Maybe we should be thinking about this as small pods or teams, where you’ve got a business person who is savvy, you’ve got a technologist who is savvy. And, then you’ve got a facilitator, a person who is very good at motivating and communicating, and create three-person pods to approach this, rather than think you’re going to get it in just one individual.
Kobielus: Right, because these are all separate domains of complexity. You can be really astute on business issues, if you focus on that and you continue to refresh your understanding and the nuances of all of that day in and day out. Likewise, all the technology areas are themselves entirely stand-alone spheres of complexity. Imagine one person trying to juggle those different spheres of complexity all day, every day, and do a good job of it. That’s really hard from a wetware perspective.
Shimmin: Do you guys know the notion of extreme programming? The idea is to have these micro teams with two people who are always working on a given aspect of a project. Why not have one of them be the IT technologist and the other one be the business analyst?
Gardner: There’s an opportunity here for someone like a McKinsey to come in and start analyzing the organizational dynamics of approaching SOA, what sort of teams should be put together, and what sort of people should have certain skills. This whole notion of one person doing it seems farfetched to me. These teams could be much more capable and much more distributed. You could push them into different activities within this problem set and they won’t necessarily have to be physically there.
Biske: Companies that have started to practice user-centered design and some formal usability practices are probably in a much better position. One thing you find in doing that is that you immediately have to get out of this customer-supplier relationship and into a team environment, as you describe, for developing solutions for business users. They’re part of the team. They’re not a customer.
Pointing to other potential groups, someone like Patricia Seybold Group, and their focus on customer innovation, some of those concepts really need to be brought in here to stop viewing IT as a supplier to the business and instead as a partner and working from a team standpoint.
You’re absolutely right that the T can be built by creating a team, rather than looking for one superstar individual that understands it all, because there aren’t too many of them.
Kobielus: My pleasure.
Biske: Thank you.
Shimmin: It’s been a pleasure everyone, thank you.
Listen to the podcast here.
Produced as a courtesy of Interarbor Solutions: analysis, consulting and rich new-media content production. If any of our listeners are interested in learning more about BriefingsDirect B2B informational podcasts or to become a sponsor of this or other B2B podcasts, please feel free to contact Interarbor Solutions at 603-528-2435.
Transcript of Dana Gardner’s BriefingsDirect SOA Insights Edition, Vol. 20. Copyright Interarbor Solutions, LLC, 2005-2007. All rights reserved.