BriefingsDirect Transcripts

Wednesday, December 14, 2011

Case Study: How SEGA Europe Uses VMware to Standardize Cloud Environment for Globally Distributed Game Development

Transcript of a BriefingsDirect podcast on how SEGA Europe has moved to a more secure and scalable VMware cloud solution for its worldwide development efforts.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: VMware.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on how a major game developer in Europe is successfully leveraging the hybrid cloud model.

We’ll learn how SEGA Europe is standardizing its cloud infrastructure across its on-premises operations, as well as with a public cloud provider. The result is a managed and orchestrated hybrid environment to test and develop multimedia games, one that dynamically scales productively to the many performance requirements at hand.

We’re joined by a systems architect with SEGA in London to learn more about how the hybrid approach to multiple, complementary cloud instances is meeting SEGA’s critical development requirements in a new way. [Disclosure: VMware is a sponsor of BriefingsDirect podcasts.]

Please join me now in welcoming Francis Hart, Systems Architect at SEGA Europe. Welcome to the podcast, Francis.

Francis Hart: Hi.

Gardner: We’re all very familiar with the amazing video games that are being created nowadays. And SEGA of course is particularly well-known for the Sonic the Hedgehog franchise going back a number of years, and I have to tell you, Francis, my son is a big fan of those games.

But I'm curious about how, behind the scenes, these games are made. How they come into being and what are some of the critical requirements that you have from a systems architecture perspective when developing these games?

Hart: We have a lot of development studios across the world. We're working on multiple projects. We need to ensure that we supply them with a highly scalable and reliable solution in order to test, develop, and produce the game and the code in time.

Gardner: And how many developers are you working with there at SEGA Europe?

Hart: We have a number of different development studios. We’re probably looking at thousands of individual developers across the world.

Gardner: For those folks who are not familiar with the process, there is the creation of the code, there is the test and debug, and builds. It's quite complicated. There's a lot going on, many different moving parts. How did you start approaching that from your IT environment, from building the right infrastructure to support that?

Targeting testing

Hart: One of the first areas we targeted very early on was the last process in those steps, the testing, arguably one of the most time-consuming processes within the development cycle. It happens pretty much all the way through as well to ensure that the game itself behaves as it should, it’s tested, and the customer gets the end-user experience they require.

The biggest technical goal that we had for this was being able to move large amounts of data, un-compiled code, from different testing offices around the world to the staff. Historically we had some major issues in securely moving that data around, and this is why we started looking into cloud solutions.

Gardner: How did you use to do it? What was the old-fashioned way?

Hart: For very, very large game builds, and we're talking game builds above 10 gigabytes, it ended up being couriered within the country and then overnight file transfer outside of the country. So, very old school methods.

We needed both to secure that up to make sure we understood where the game builds were, and also to understand exactly which version each of the testing offices was using. So it’s gaining control, but also providing more security.
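As an illustration only, the control Hart describes (knowing where each build is and which version each testing office is using) could be sketched as a small registry that records a checksum per build and accepts a delivery only when the received bytes match. The class name, method names, and office names here are hypothetical and are not SEGA's actual system.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BuildRegistry:
    """Hypothetical record of which build version each testing office holds."""
    checksums: dict = field(default_factory=dict)   # version -> SHA-256 hex digest
    deliveries: dict = field(default_factory=dict)  # office -> version

    def register(self, version: str, payload: bytes) -> str:
        """Store the checksum of a new build and return it."""
        digest = hashlib.sha256(payload).hexdigest()
        self.checksums[version] = digest
        return digest

    def deliver(self, office: str, version: str, received: bytes) -> bool:
        """Record a delivery only if the received bytes match the registered build."""
        if version not in self.checksums:
            raise KeyError(f"unknown build version: {version}")
        ok = hashlib.sha256(received).hexdigest() == self.checksums[version]
        if ok:
            self.deliveries[office] = version
        return ok

    def offices_on(self, version: str) -> list:
        """List the offices currently recorded as holding the given version."""
        return sorted(o for o, v in self.deliveries.items() if v == version)
```

A registry like this answers both questions at once: the checksum confirms the build arrived intact, and the delivery map shows who is testing which version.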

Gardner: Clearly one of the requirements here is to manage large files rapidly across geographic distances, but with security and management control, governance, and so forth. But as I understand, you're also dealing with this sort of peak-and-trough issue about the infrastructure itself. You need to ramp up a lot of servers to do the build, but then they sit there essentially unproductive between the builds. How did you flatten that out or manage the requirements around the workload support?

Hart: Typically, in the early stages of development, there is a fair amount of testing going on, and it tends to be quite small -- the number of staff involved in it and the number of build iterations. Later, when the game reaches the end of its product life cycle, we’re talking multiple game iterations a day, and the game size has gotten very large at that point. The number of people involved in the testing to meet the deadlines and get the game shipped on schedule runs into the hundreds.

Gardner: How has virtualization and moving your workloads into different locations evolved over the years?

Hart: We work on the idea of having a central platform for a lot of these systems. Using virtualization to do that allowed us to scale off at certain times. Historically, we always had an on-premises VMware platform to do this. Very recently, we’ve been looking at ways to use that resource within a cloud to cut down on some of the capex loading, but also to remain a little bit more agile with some of the larger titles, especially the online games that are coming around.

Gardner: Right. So we’re seeing a lot more of the role-playing game (RPG) types of games, games themselves in the cloud. That must influence what you're doing in terms of thinking about your future direction.

Hart: Absolutely. We’ve been looking at things like the hybrid cloud model with VMware as a development platform for our developers. That's really what we're working on now. We've got a number of games in the pipeline that have been developed on the hybrid cloud platform. It gives the developers a platform that is exactly the same and mirrored to what it would eventually be in the online space through ISPs like Colt, which should be hosting the virtual cloud platform.

Gardner: So if the end destination for the runtime, or the operational runtime, for the game is going to be the cloud, it makes sense to live "of, for, and by" the cloud, I suppose. It’s more complementary. It’s always going to be there, right?

Gaining cost benefits

Hart: Yes. And one of the benefits we're seeing in the VMware offering is that, regardless of which data center in the world we use, it is the standard platform. It also allows us to leverage multiple ISPs, and hopefully gain some cost benefits from that.

Gardner: Francis, tell me a little bit about the pilot project. No one is going to jump up and put their mission-critical activities into a cloud environment, especially a hybrid environment, overnight. So the crawl-walk-run approach seems to be the most prudent way. Tell me a little bit about what your goals were and what you've been able to attain even in a pilot setting?

Hart: Very early on we were in discussions with Colt and also VMware to understand what technology stack they were bringing into the cloud. We started doing a proof of concept with VMware and a professional services company, and together we were able to come up with a proof of concept to distribute our game testing code, which previously relied on a very old-school distribution system. So anything better would improve the process.

There wasn't too much risk to the company. So we saw the opportunity to have a hybrid cloud setup to allow us to have an internal cloud system to distribute the code to the majority of UK game testers and to leverage high bandwidth between all of our sites.

For the game testing studios around Europe and the world, we could use a hosted version of the same service, which was up on the Colt vCloud Director (vCD) platform, to supply this to trusted testing studios.

Gardner: When you approach this hybrid cloud model, it’s one thing to be able to technically do that, to have the standardization and to have the products in place that will support the workloads and the virtualization continuity, the similar environment. But what about managing that? What about having a view into what’s going on so that you know what aspects of the activity and requirements are being met and where? It must involve quite a bit of management?

Hart: Yes. Also the virtual cloud environment of vCloud Director has a web portal that allows you to manage a lot of this configuration in a central way. We’re also using VMware Cloud Connector, which is a product that allows you to move the apps between different cloud data centers. And doing this allows us to manage it at one location and simply clone the same system to another cloud data center.

In that regard, the configuration very much was in a single place for us in the way that we designed the proof of concept. It actually helped things, and the previous process wasn’t ideal anyway. So it was a dramatic improvement.

Gardner: Well, let’s dig into that a bit. What were some of the metrics of success, even on your pilots? I understand that you’re going to be expanding on that, but are there data points that we can look to whether it’s reduction in cost for servers, operation, security, time to development and test? What were some of the salient paybacks of doing development in this manner?

Hart: One of the immediate benefits was around the design process. It's very obvious that we were tightening up security within our build delivery to the testing studios. Nothing was with a courier on a bike anymore, but within a secured transaction between the two offices.

Risk greatly reduced

Also from a security perspective, we understood exactly what game assets and builds were in each location. So it really helped the product development teams to understand what was where and who was using what, and so from a risk point of view it’s greatly reduced.

In terms of stats and the amount of data throughput, it’s pretty large, and we’ve been moving terabytes pretty much weekly nowadays. Now we’re going completely live with the distribution network.

So it’s been a massive success. All of the UK testing studios are using the build delivery system day to day, and for the European ones we’ve got about half the testing studios on board that build delivery system now, and it’s transparent to them.

Gardner: Francis, in moving to a hybrid environment, in practical terms, was there anything that appeared, that crept in, that you weren’t anticipating? Was there something about this that caught you by surprise -- either good or bad?

Hart: Not particularly. VMware was very good at allowing us to understand the technology and that's one of the benefits of working with a professional services reseller. In terms of gotchas, there weren't too many. There were a lot of good surprises that came up and allowed us to open the door to a lot of other VMware technologies.

Now, we're also looking at automating a lot of processes within vCenter Orchestrator and other VMware products. They really gave us a good stepping stone into the VMware catalogue, rather than just vSphere, which we were using previously. That was very handy for us.

Gardner: I’d like to just pause here for a second. Your use of vSphere -- and I believe you’re on 4.1 if my notes are correct -- has gotten you to a fairly high level of virtualization. That must have been an important stepping stone to be able to have the dynamic ability to ramp up and down your environments, your support infrastructure, but also skills. I imagine there must have been a comfort zone with virtualization that you needed to have in order to move into the cloud level, too.

Hart: Absolutely. We already have a fair footprint in Amazon Web Services (AWS), and it was a massive skill jump that we needed to train members of the staff in order to use that environment. With the VMware environment, as you said, we already have a large skill set using vSphere. We have a large team that supports our corporate infrastructure and we've actually got VMware in our co-located public environment as well. So it was very reassuring that the skills were immediately transferable.

Gardner: Let’s get back to what you’re going to be doing, now that this pilot has been successful. You’ve had some success with meeting your requirements, also getting some benefits that you weren't anticipating and that all-important security control and governance aspect. What’s the next step? Where do you go from your initial stepping stone into hybrid cloud? How are you going to get into that run mode now that you've sort of walked and crawled?

Game release

Hart: As I mentioned before, the first part was dealing with the end of the process, and that was the testing and the game release process. Now, we’re going to be working back from that. The next big area that we’re actively involved in is getting our developers to develop online games within the hybrid environment.

So they’re designing the game and the game’s back-end servers to be optimal within the VMware environment. And then, also pushing from staging to live is a very simple process using the Cloud Connector.

Gardner: Well, that sounds a lot like what we know in the business as platform as a service (PaaS) where you are actually accomplishing much, if not all, of the development, test and deploy cycle -- the life-cycle of the applications in the cloud.

Hart: Absolutely. We're restructuring and redesigning the IT systems within SEGA to be more of a development operations team to provide a service to the developers and to the company.

Gardner: Great. I really appreciate your sharing your story with us, Francis. Now that you've done this a bit, any words of wisdom, 20/20 hindsight, that you might share with others who are considering moving more aggressively into private cloud, hybrid cloud, and ultimately perhaps the full PaaS value?

Hart: Just get some hands-on experience and play with the cloud stack from VMware. It’s inexpensive to have a go and just get to know the technology stack.

Gardner: Thanks. You've been listening to a sponsored podcast discussion on how a major game developer, SEGA, is leveraging the hybrid cloud model using the VMware cloud stack.

I’d like to thank our guest, Francis Hart, Systems Architect at SEGA Europe, based in London. Thanks again so much, Francis.

Hart: Thank you.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks to our audience for joining us as well, and come back next time.

Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.

Monday, December 12, 2011

Efficient Data Center Transformation Requires Consolidation and Standardization Across Critical IT Tasks

Transcript of a sponsored podcast discussion in conjunction with an HP video series on the best practices for developing a common roadmap for DCT.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: HP.

For more information on The HUB, HP's video series on data center transformation, go to www.hp.com/go/thehub.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you’re listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on quick and proven ways to attain significantly improved IT operations and efficiency.

We'll hear from a panel of HP experts on some of their most effective methods for fostering consolidation and standardization across critical IT tasks and management. This is the second in a series of podcasts on data center transformation (DCT) best practices and is presented in conjunction with a complementary video series. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

Here today we will specifically explore building quick data center project wins, leveraging project tracking and scorecards, as well as developing a common roadmap for both facilities and IT infrastructure. You don’t need to go very far in IT to find people who are diligently working to do more with less, even as they're working to transform and modernize their environments.

One way to keep the interest high and those operating and investment budgets in place is to show fast results and then use that to prime the pump for even more improvement and even more funding with perhaps even growing budgets.

With us now to explain how these solutions can drive successful data center transformation is our panel: Duncan Campbell, Vice President of Marketing for HP Converged Infrastructure and small to medium-sized businesses (SMBs); Randy Lawton, Practice Principal for Americas West Data Center Transformation & Cloud Infrastructure Consulting at HP; and Larry Hinman, Critical Facilities Consulting Director and Worldwide Practice Leader for HP Critical Facility Services and HP Technology Services. Welcome to you all.

Let's go first to Duncan Campbell on communicating an ongoing stream of positive results, why that’s important and necessary to set the stage for an ongoing virtuous adoption cycle for data center transformation and converged infrastructure projects.

Duncan Campbell: You bet, Dana. We've seen that when a customer is successful in breaking down a large project into a set of quick wins, there are some very positive outcomes from that.

Breeds confidence

Number one, it breeds confidence, and this is a confidence that is actually felt within the organization, within the IT team, and into the business as well. So it builds confidence both inside and outside the organization.

The other key benefit is that when you can manifest these quick wins in terms of some specific return on investment (ROI) business outcome, that also translates very nicely as well and gets a lot of key attention, which I think has some downstream benefits that actually help out the team in multiple ways.

Gardner: I suppose it's not only getting these quick wins, but effectively communicating them well. People really need to know about them.

Campbell: Right. So this is one of the things that some of the real leaders in IT realize. It's not just about attracting the best talent and executing well, but it's about marketing the team’s results as well.

One of the benefits in that is that you can actually break down these projects just in terms of some specific type of wins. That might be around standardization, and you can see a lot of wins there. You can quickly consolidate to blades. You can look at virtualization types of quick wins, as well as some automation quick wins.

We would advocate that customers think about this in terms of almost a step-by-step approach, knocking that down, getting those quick wins, and then marketing this in some very tangible ways that resonate very strongly.

Gardner: When you start to develop a cycle of recognition, incentives, and buy-in, I suppose we could also start to see some sort of a virtuous adoption cycle, whereby that sets you up for more interest, an easier time evangelizing, and so on.

Campbell: That’s exactly right. A virtuous cycle is well put. That really allows the team to get the additional green light to go to the next step in terms of the blueprint that they are trying to execute on. It gets a green light also in terms of additional dollars and, in some cases, additional headcount to add to their team as well.

What this does is, and I like this term the virtuous cycle, not only allow you to attract key talent, but it really allows you to retain folks. That means you're getting the best team possible to duplicate that, to get those additional wins, and it really does indeed become a virtuous cycle.

Gardner: I suppose one last positive benefit here might be that, as enterprises adopt more of what we call social networking and social media, the rank and file, those users involved with these products and services, can start to be your best word-of-mouth marketing internally.

TCO savings

Campbell: That’s right. A good example is where we have been able to see a significant total cost of ownership (TCO) type of savings with one of our customers, McKesson, that in fact was taking one of these consolidated approaches with all their development tools. They saw a considerable savings, both in terms of dollars, over $12.9 million, as well as a percentage of TCO savings that was upwards of 50 percent.

When you see tangible exciting numbers like that, that does grab people’s attention and, you bet, it becomes part of the whole social-media fabric and people want to go to a winner. Success breeds success here.

Gardner: Thank you. Next, we're going to go to Randy Lawton and hear some more about why tracking scorecards and managing expectations through proven data and metrics also contributes to a successful ongoing DCT activity.

Randy, why is it so important to know your baseline tracks and then measure them each and every step along the way?

Randy Lawton: Thank you, Dana. Many of the transformation programs we engage in with our customers are substantially complex and span many facets of the IT organization. They often involve other vendors and service providers in the customer organization.

So there’s a tremendous amount of detail to pull together and organize in these complex engagements and initiatives. We find that there’s really no way to do that, unless you have a good way of capturing the data that’s necessary for a baseline.

It’s important to note that we manage these programs through a series of phases in our methodology. The first phase is strategy and analysis. During that phase, we typically run a discovery on all IT assets that would include the data center, servers, storage, the network environment, and the applications that run on those environments.

From that, we bridge into the second phase, which is architect and validate, where we begin to solution out and develop the strategies for a future-state design that includes the standardization and consolidation approaches, and on that begin to assemble the business case. In a detailed design, we build out those specifications and begin to create the data that determines what the future-state transformation is.

Then, through the implementation phase, we have detailed scorecards that are required to be tracked to show progress of the application teams and infrastructure teams that contribute to the program in order to guarantee success and provide visibility to all the stakeholders as part of the program, before we turn everything over to operations.

During the course of the last few years, our services unit has made investments in a number of tools that help with the capture and management of the data, the scorecarding, and the analytics through each of the phases of these programs. We believe that helps offer a competitive advantage for us and helps enable more rapid achievement of the programs from our customer perspective.

Gardner: As we heard from Duncan about why it’s important to demonstrate wins, I sense that organizations are really data driven now more than ever. It seems important to have actual metrics in place and be able to prove your work each step of the way.

Complex engagements

Lawton: That’s very true. In these complex engagements, it’s normally some time before there are quick-win type of achievements that are really notable.

For example, in the HP IT transformation program we undertook over several years back through 2008, we were building six new data centers so that we could consolidate 185 worldwide. So it was some period of time from the beginning of the program until the point where we moved the first application into production.

All along the way we were scorecarding the progress on the build-out of the data centers. Then, it was the build-out of the compute infrastructure within the data centers. And then it was a matter of being able to show the scorecarding against the applications, as we could get them into the next generation data centers.

If we didn't have the ability to show and demonstrate the progress along the way, I think our stakeholders would have lost patience or would not have felt that the momentum of the program was going on the kind of track that was required. With some of these tools and approaches and the scorecarding, we were able to demonstrate the progress and keep very visible to management the movements and momentum of the program.

Gardner: Randy, I know that many organizations are diligent about the scorecarding across all sorts of different business activities and metrics. Have you noticed in some of these engagements that these readouts and feedback in the IT and data center transformation activities are somehow joined with other business metrics? Is there an executive scorecard level that these feed into to give more of a holistic overview? Is this something that works in tandem with other scorecarding activities in a typical corporation?

Lawton: It absolutely is, Dana. Often in these kinds of programs there are business activities and projects that are going on within the business units. There are application projects that work into the program and then there are the infrastructure components that all have to be fit together at some level.

What we typically see is that the business will be reporting its set of metrics, each of the application areas will be reporting their metrics, and it’s typically from the infrastructure perspective where we pull together all of the application and infrastructure activities and sometimes the business metrics as well.

We've seen multiple examples with our customers where they are either all consolidated into executive scorecards that come out of the reporting from the infrastructure portion of the program that rolls it all together, or that the business may be running separate metrics and then application teams and infrastructure are running the IT level metrics that all get rolled together into some consolidated reporting on some level.

Gardner: And that, of course, ensures that IT isn’t the odd man out, when it comes to being on time and in alignment with these other priorities. That sounds like a very nice addition to the way things may have been done five or 10 years ago.

Lawton: Absolutely.

Gardner: Any examples, Randy, either with organizations you could name, or use cases where you could describe, where the use of this ongoing baselining, tracking, measuring, and delivering metrics facilitates some benefits? Any stories that you can share?

Cloning applications

Lawton: A very notable example is one of our telecom customers we worked with during the last year and finished a program earlier this year. The company was purchasing the assets of another organization and needed to be able to clone the applications and infrastructure that supported business processes from the acquired company.

Within the mix of delivery for stakeholders in the program, there were nine different companies represented. There were some outsourced vendors from the application support side in the acquiree’s company, outsourcers in the application side for the acquiring company, and outsourcers in the data centers that operated data center infrastructure and operations for the target data centers we were moving into.

What was really critical in pulling all this together was to be able to map out, at a very detailed level, the tasks that needed to be executed, and in what time frame, across all of these teams.

The final cutover migration required over 2,500 tasks across these nine different companies that all needed to be executed in less than 96 hours in order to meet the downtime window required by the acquiring company’s executive management.

It was the detailed scorecarding and operating war rooms to keep those scorecards up to date in real-time that allowed us to be able to accomplish that. There’s just no possible way we would have been able to do that ahead of time.

I think that HP was very helpful in working with the customer and bringing that perspective into the program very early on, because there had been a failed attempt to operate this program prior to that, and with our assistance and with developing these tools and capabilities, we were able to successfully achieve the objectives of that program.
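As a rough illustration of the kind of scorecard Lawton describes for the cutover, the sketch below tallies completed tasks per company and checks progress against the 96-hour window. The data model, the function names, and the linear-pace rule are assumptions made for this example and do not represent HP's actual tooling.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Task:
    """One cutover task owned by one participating company (hypothetical model)."""
    task_id: int
    company: str
    done: bool = False


def scorecard(tasks):
    """Return {company: (done, total)} for a list of tasks."""
    total = Counter(t.company for t in tasks)
    done = Counter(t.company for t in tasks if t.done)
    return {c: (done[c], total[c]) for c in sorted(total)}


def on_pace(tasks, hours_elapsed, window_hours=96):
    """True if the completed fraction is at least the fraction of the window used.

    Assumes tasks complete at a roughly even rate, which is a simplification.
    """
    if not tasks:
        return True
    completed = sum(t.done for t in tasks) / len(tasks)
    return completed >= hours_elapsed / window_hours
```

Updating the `done` flags as work finishes and recomputing these two views gives each team and the executives the same picture of progress at any point in the window.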

Gardner: One thing that jumped out at me there was your use of the words real time. How important is it to capture this data and adjust it and update it in real-time, where there’s not a lot of latency? How has that become so important?

Lawton: In this particular program, because there were so many activities taking place in parallel by representatives from all over the world across these nine different companies, the real-time capture and update of all of the data and information that went into the scorecarding was absolutely essential.

In some of the other programs we've operated, there was not such a compressed time frame that required real-time metrics, but we, at minimum, often required daily updates to the metrics. So each program, the strategies that drive that program, and some of the time constraints will drive what the need is for the real-time update.

We often can provide the capabilities for the real-time updates to come from all stakeholders in the program, so that the tools can capture the data, as long as the stakeholders are providing the updates on a real-time basis.

Gardner: So as is often the case, good information in, good results back.

Lawton: Absolutely.

Organizing infrastructure

Gardner: Let’s move now to our third panelist today. We're going to hear about why organizing facilities and infrastructure planning in conjunction with one another is so important.

Now to Larry Hinman. Larry, let’s go historical for a second. Has there usually been a completely separate direction for facilities planning in IT infrastructure? Why was that the case, and why is it so important to end that practice?

Larry Hinman: Hi, Dana. If you look over time and over the last several years, everybody has data centers and everybody has IT. The things that we've seen over the last 10 or 15 years are things like the Internet and criticality of IT and high density and all this stuff that people are talking about these days. If you look at the ways companies organized themselves several years ago, IT was a separate organization, facilities was a separate organization, and that actually still exists today.

One of the things that we're still seeing today is that, even though there is this push to try to get IT groups and facilities organizations to talk and work with each other, there is still a gap in how to truly glue all of this together.

If you look at the way people do this traditionally -- and when I say people, I'm talking about IT organizations and facilities organizations -- they typically will model IT and data centers and, even if they are attempting to glue them together, they try to look at power requirements.

One of the things that we spotted a few years ago was that when companies do this, the risk of over provisioning or under provisioning is very high. We tried to figure out a way to back this up a few notches.

How can we remedy this problem and how can we bring some structure to this and bring some, what I would call, sanity to the whole equation, to be able to have something predictable over time? What we figured out was that you have to stop and back up a few notches to really start to get all this glued together.

So we took this whole complex framework and data center program and broke it into four key areas. It looks simplistic in the way we've done this, and we have done this over many, many years of analysis and trying to figure out exactly what direction we should take. We've actually spun this off in many directions a few times, trying to continually make it better, but we always keep coming back to these four key profiles.

Business and risk is the first profile. IT architecture, which is really the application suite, is the second profile. IT infrastructure is the third. Data center facilities is the fourth.

One of the things that you will start to hear from us, if you haven’t heard it already via the data center transformation story that you guys were just recently talking about, is this nomenclature of IT plus facilities equals the data center.

Getting synchronized

Look at that, look at these four profiles, and look at what we call a top-down approach, where I start to get everybody synchronized on what risk profiles are and tolerances for risk are from an IT perspective and how to run the business, gluing that together with an IT infrastructure strategy, and then gluing all that into a data center facility strategy.

What we found over time is that we were able to take this complex program of trying to have something predictable, scalable, all of the groovy stuff that people talk about these days, and have something that I could really manage. If you're called into the boss’s office, as I and others have been over the many years in my career, to ask what’s the data center going to look like over the next five years, at least I would have some hope of trying to answer that question.

That is kind of the secret sauce here, and the way we have developed our framework was breaking this complex program into these four key areas. I'm certainly not trying to say this is an easy thing to do. In a lot of companies, it’s culture changes. It’s a threat to the way the very organization is organized from an IT and a facilities perspective. The risk and recovery teams and the management teams all have to start working together collaboratively and collectively to be able to start to glue this together.

Gardner: You mentioned earlier the issues around energy and the ongoing importance around the cost structure for that. I suppose it's not just fitting these together, but making them fit for purpose. That is to say, IT and facilities on an ongoing basis.

It’s not really something that you do and sit still, as would have been the case several years ago, or in the past generation of computing. This is something that's dynamic. So how do you allow a fit-for-purpose goal with data-center facilities to be something that you can maintain over time, even as your requirements change?

Hinman: You just hit a very important point. One of the big lessons learned for us over the years has been the need not only to provide this kind of modeling and predictability over time for clients and for customers. We also had to get out of this mode of doing this once, deploying a future-state data center framework to keep the client pointing in the right direction, and then putting it on a shelf.

The data, as you said, gets archived, and they pick it up every few years and do it again and again and again, finding out that a lot of times there's an "aha" moment during those periods, the gaps between doing it again and again.

One thing that we have learned is to not only have this deliberate framework and break it into these four simplistic areas, where we can manage all of this, but to redevelop and re-hone our tools and our focus a little bit, so that we could use this as a dynamic ongoing process to get the client pointing in the right direction. Build a data center framework that truly is right-sized, integrated, aligned, and all that stuff. But then, to have something that was very dynamic that they could manage over time.

That's what we've done. We've taken all of our modeling tools and integrated them with common databases, where now we can start to glue together even the operational piece of data center infrastructure management (DCIM), or architecture and infrastructure management, facilities management, etc., so now the client can have this real-time, long-term, what we call a 10-year view of the overall operation.

So now, you do this. You get it pointing the right direction, collect the data, complete the modeling, put it in the toolset, and now you have something very dynamic that you can manage over time. That's what we've done, and that's where we have been heading with all of our tools and processes over the last two to three years.

EcoPOD concept

Gardner: I also remember with great interest the news from HP Discover in Las Vegas last summer about your EcoPOD and the whole POD concept toward facilities and infrastructure. Does that also play a part in this and perhaps make it easier when your modularity is ratcheted up to almost a mini data center level, rather than at the server or rack level?

Hinman: With the various what we call facility sourcing options, which PODs are certainly one of those these days, we've also been very careful to make sure that our framework is completely unbiased when it comes to a specific sourcing option.

What that means is, over the last 10 plus years, most people were really targeted at building new green-field data centers. It was all about space, then it became all about power, then about cooling, but we were still in this brick and mortar age, but modularity and scalability have been driving everything.

With PODs coming on the scene with some of the other design technologies, like multi-tiered or flexible data center, what we've been able to do is make sure that our framework is targeted at almost a generic framework where we can complete all the growth modeling and analysis, regardless of what the client is going to do from a facilities perspective.

It lays the groundwork for the customer to get their arms around all of this and tie together IT and facilities with risk and business, and then start to map out an appropriate facility sourcing option.

We find these days that POD is actually a very nice fit with all of our clients, because it provides high density server farms, it provides things that they can implement very quickly, and gets the power usage effectiveness (PUE) and power and operational cost down. We're starting to see that take a strong hold in a lot of customers.

Gardner: As we begin to wrap up, I should think that these trends are going to be even more important, these methods even more productive, when we start to factor in movement toward private cloud. There's the need to support more of a mobile tier set of devices, and the fact that we're looking for of course even more savings on those long-term energy and operating costs.

Back to you, Randy Lawton. Any thoughts about how scorecards and tracking will be even more important in the future, as we move, as we expect we will, to a more cloud-, mobile-, and eco-friendly world?

Lawton: Yes, Dana. In a lot of ways, there is added complexity these days with more customers operating in a hybrid delivery model, where there may be multiple suppliers in addition to their internal IT organizations.

Greater complexity

Just like the example case I gave earlier, where you spread some of these activities not only across multiple teams and stakeholders, but also into separate companies and suppliers who are working under various contract mechanisms, the complexity is even greater. If that complexity is not pulled into a simplified model that is data driven, that is supported by plans and contracts, then there are big gaps in the programs.

The scorecarding and data gathering methods and approaches that we take on our programs are going to be even more critical as we go forward in these more complex environments.

Operating the cloud environments simplifies things from a customer perspective, but it does add some additional complexities in the infrastructure and operations of the organization as well. All of those complexities add up, meaning that even more attention needs to be brought to the details of the program and where those responsibilities lie within stakeholders.

Gardner: Larry Hinman, we're seeing this drive toward cloud. We're also seeing consolidation and standardization around data center infrastructure. So perhaps more large data centers to support more types of applications to even more endpoints, users, and geographic locations or business units. Getting that facilities and IT equation just right becomes even more important as we have fewer, yet more massive and critical, data centers involved.

Hinman: Dana, that's exactly correct. If you look at this, you have to look at the data center facilities piece, not only from a framework or model or topology perspective, but all the way down to the specific environment.

It could be that based on a specific client’s business requirements and IT strategy that it will require possibly a couple of large-scale core data centers and multiple remote sites and/or it could just be a bunch of smaller types of facilities.

It really depends on how the business is being run and supported by IT and the application suite, what the tolerances for risk are, whether it’s high availability, synchronous, all the groovy stuff, and then coming up with a framework that matches all those requirements that it’s integrating.

We tell clients constantly that you have to have your act together with respect to your profile, and start to align all of this, before you can even think about cloud and all the wonderful technologies that are coming down the pike. You have to be able to have something that you can at least manage to control cost and control this whole framework and manage to a future-state business requirement, before you can even start to really deploy some of these other things.

So it all glues together. It's extremely important that customers understand that this really is a process they have to do.

Gardner: Very good. You've been listening to a sponsored BriefingsDirect podcast discussion on how quick and proven ways to attain productivity can significantly improve IT operations and efficiency.

This is the second in an ongoing series of podcasts on data center transformation best practices and is presented in conjunction with a complementary video series.

I'd like to thank our guests, Duncan Campbell, Vice President of Marketing for HP Converged Infrastructure and SMB; Randy Lawton, Practice Principal in the Americas West Data Center Transformation & Cloud Infrastructure Consulting at HP, and Larry Hinman, Critical Facilities Consulting Director and Worldwide Practice Leader for HP Critical Facility Services and HP Technology Services. So thanks to you all.

This is Dana Gardner, Principal Analyst at Interarbor Solutions. Also, thanks to our audience for listening, and come back next time.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: HP.

For more information on The HUB, HP's video series on data center transformation, go to www.hp.com/go/thehub.

Transcript of a sponsored podcast discussion in conjunction with an HP video series on the best practices for developing a common roadmap for DCT. Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.

Wednesday, November 30, 2011

Big Data Meets Complex Event Processing: AccelOps Delivers a Better Architecture to Attack the Data Center Monitoring and Analytics Problem

Transcript of a BriefingsDirect podcast on how enterprises can benefit from capturing and analyzing systems data to improve IT management in real-time.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: AccelOps.

Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect.

Today, we present a sponsored podcast discussion on how new data and analysis approaches are significantly improving IT operations monitoring, as well as providing stronger security. We'll examine how advances in big data analytics and complex events processing (CEP) can come together to provide deep and real-time, pattern-based insight into large-scale IT operations.

AccelOps has developed the technology to correlate events with relevant data across IT systems, so that operators can gain much better insights faster, and then learn as they go to better predict future problems before they emerge. [Disclosure: AccelOps is a sponsor of BriefingsDirect podcasts.]

With us now to explain how these new solutions can drive better IT monitoring and remediation response -- and keep those critical systems performing at their best -- is our guest, Mahesh Kumar, Vice President of Marketing at AccelOps. Welcome to BriefingsDirect, Mahesh.

Mahesh Kumar: Dana, glad to be here.

Gardner: It's always been difficult to gain and maintain comprehensive and accurate analysis of large-scale IT operations, but it seems, Mahesh, that this is getting more difficult. I think there have been some shifts in computing in general in these environments that make getting a comprehensive view of what’s going on perhaps more difficult than ever. Is that fair in your estimation?

Kumar: Absolutely, Dana. There are several trends that are fundamentally questioning existing and traditional ways of monitoring a data center.

Gardner: Of course we're seeing lots of virtualization. People are getting into higher levels of density, and so forth. How does that impact the issue about monitoring and knowing what’s going on with your systems? How is virtualization a complexity factor?

Kumar: If you look at trends, there are on average about 10 virtual machines (VMs) to a physical server. Predictions are that this is going to increase to about 50 to 1, maybe higher, with advances in hardware and virtualization technologies. So that’s one trend, the increase in density of VMs is a complicating factor for capacity planning, capacity management, performance management, and security.

Corresponding to this is just the sheer number of VMs being added in the enterprise. Analysts estimate that just in the last few years, we have added as many VMs as there were physical machines. In a very short period of time, you have in effect seen a doubling of the size of the IT management problem. So there are a huge number of VMs to manage and that introduces complexity and a lot of data that is created.

Moreover, your workloads are constantly changing. vMotion and DRS are causing changes to happen in hours, minutes, or even seconds, whereas in the past, it would take a week or two for a new server to be introduced, or a server to be moved from one segment of the network to the other.

So change is happening much more quickly and rapidly than ever before. At the very least, you need monitoring and management that can keep pace with today’s rate of change.

Cloud computing

Cloud computing is another big trend. All analyst research and customer feedback suggests that we're moving to a hybrid model, where you have some workloads on a public cloud, some in a private cloud, and some running in a traditional data center. For this, monitoring has to work in a distributed environment, across multiple controlling parties.

Last but certainly not least, in a hybrid environment, there is absolutely no clear perimeter that you need to defend from a security perspective. Security has to be pervasive.

Given these new realities, it's no longer possible to separate performance monitoring aspects from security monitoring aspects, because of the distributed nature of the problem. You can’t have two different sets of eyes looking at multiple points of presence, from different angles and then try to piece that together.

Those are some of the trends that are causing a fundamental rethink in how IT monitoring and management systems have to be architected.

Gardner: And even as we're seeing complexity ramp-up in these data centers, many organizations are bringing these data centers together and consolidating them. At the same time, we're seeing more spread of IT into remote locations and offices. And we're seeing more use of mobile and distributed activities for data and applications. So we're not only talking about complexity, but we're talking about scale here.

Kumar: And very geographically distributed scale. To give you an example, every office with voice over IP (VoIP) phones needs some servers and network equipment in their office, and those servers and network equipment have to be secured and their up-time guaranteed.

So what was typically thought of as a remote office now has a mini data center, or at least some elements of a data center, in it. You need your monitoring and management systems to have the reach to easily and flexibly bring those under management and ensure their availability and security.

Gardner: What are some of the ways that you can think about this differently? I know it’s sort of at a vision level, but typically in the past, people thought about a system and then the management of that system. Now, we have to think about clouds and fabrics. We're just using a different vocabulary to describe IT. I suppose we need to have a different vocabulary to describe how we manage and monitor it as well.

Kumar: The basic problem you need to address is one of analysis. Why is that? As we discussed earlier, the scale of systems is really high. The pace of change is very high. The sheer number of configurations that need to be managed is very large. So there's data explosion here.

Since you have a plethora of information coming at you, the challenge is no longer collection of that information. It's how you analyze that information in a holistic manner and provide consumable and actionable data to your business, so that you're able to actually then prevent problems in the future or respond to any issues in real-time or in near real-time.

You need to nail the real-time analytics problem and this has to be the centerpiece of any monitoring or management platform going forward.

Fire hose of data

Gardner: In the past, this fire hose of data was often brought into a repository, perhaps indexed and analyzed, and then over time reports and analysis would be derived from it. That’s the way that all data was managed.

But we really can't take the time to do that, especially when we have to think about real-time management. Is there a fundamental change in how we approach the data that’s coming from IT systems in order to get a better monitoring and analysis capability?

Kumar: The data has to be analyzed in real-time. By real-time I mean in streaming mode before the data hits the disk. You need to be able to analyze it and make decisions. That's actually a very efficient way of analyzing information. Because you avoid a lot of data sync issues and duplicate data, you can react immediately in real time to remediate systems or provide very early warnings in terms of what is going wrong.

The challenges in doing this streaming-mode analysis are scale and speed. The traditional approaches with pure relational databases alone are not equipped to analyze data in this manner. You need new thinking and new approaches to tackle this analysis problem.
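To make the streaming-mode idea concrete, here is a minimal Python sketch that evaluates each event as it arrives rather than querying a repository afterward. It is only an illustration under stated assumptions: the event tuple layout, the `stream_alerts` name, the 60-second window, and the threshold of three errors are all hypothetical and are not taken from any AccelOps product.

```python
from collections import deque

def stream_alerts(events, window_seconds=60, threshold=3):
    """Yield (timestamp, host) whenever a host logs `threshold` or more
    errors inside the trailing window. Events are handled one at a time,
    in arrival order, without first being written to a repository."""
    recent = {}  # host -> deque of error timestamps still inside the window
    for ts, host, level in events:
        if level != "ERROR":
            continue
        window = recent.setdefault(host, deque())
        window.append(ts)
        # Drop timestamps that have fallen out of the trailing window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield ts, host

# Hypothetical event stream: (seconds since start, host, severity).
events = [
    (0, "web-1", "ERROR"),
    (10, "web-1", "INFO"),
    (20, "web-1", "ERROR"),
    (30, "db-1", "ERROR"),
    (45, "web-1", "ERROR"),
    (200, "web-1", "ERROR"),
]
alerts = list(stream_alerts(events))
```

Because the function is a generator, an alert can be acted on the moment the third error arrives, which is the property the discussion above is after. Only the small per-host window is held in memory.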

Gardner: Also for issues of security, you don't want to find out about security weaknesses by going back and analyzing a bunch of data in a repository. You want to be able to look and find correlations about what's going on, where attacks might be originating, and how that might be affecting different aspects of your infrastructure.

People are trying different types of attacks. So this needs to be in real-time as well. It strikes me that if you want to solve security as well as monitoring, that is also something that has to be in real-time and not something that you go back to every week or month.

Kumar: You might be familiar with advanced persistent threats (APTs). These are attacks where the attacker tries their best to be invisible. These are not the brute-force attacks that we have witnessed in the past. Attackers may hijack an account or gain access to a server, and then over time, stealthily, be able to collect or capture the information that they are after.

These kinds of threats cannot be effectively handled only by looking at data historically, because these are activities that are happening in real-time, and there are very, very weak signals that need to be interpreted, and there is a time element of what else is happening at that time. What seems like disparate sets of activity have to be brought together to be able to provide a level of defense or a defense mechanism against these APTs. This too calls for streaming-mode analysis.

If you notice, for example, someone accessing a server, a database administrator accessing a server for which they have an admin account, it gives you a certain amount of feedback around that activity. But if on the other hand, you learn that a user is accessing a database server for which they don’t have the right level of privileges, it may be a red flag.

You need to be able to connect this red flag that you identify in one instance with the same user trying to do other activity in different kinds of systems. And you need to do that over long periods of time in order to defend yourself against APTs.
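The red-flag correlation described here can be sketched in a few lines of Python. This is a simplified illustration, not a description of how any real product detects APTs: the `ADMIN_RIGHTS` map, the user and system names, and the two-system threshold are all invented for the example.

```python
from collections import defaultdict

# Hypothetical privilege map: systems on which each user holds admin rights.
ADMIN_RIGHTS = {"dba_alice": {"db-1"}, "ops_bob": {"web-1"}}

def correlate_red_flags(access_events, min_systems=2):
    """Return users who attempted access without privileges on at least
    `min_systems` distinct systems, however far apart in time."""
    flagged = defaultdict(set)  # user -> systems with unprivileged attempts
    for day, user, system in access_events:
        if system not in ADMIN_RIGHTS.get(user, set()):
            flagged[user].add(system)
    return {u: sorted(s) for u, s in flagged.items() if len(s) >= min_systems}

# Hypothetical access log: (day number, user, system accessed).
access_events = [
    (1, "dba_alice", "db-1"),     # expected: admin on her own database
    (3, "ops_bob", "db-1"),       # red flag: no rights on db-1
    (40, "ops_bob", "hr-files"),  # weeks later, another red flag, same user
    (41, "dba_alice", "web-1"),   # a single red flag, below the threshold
]
suspects = correlate_red_flags(access_events)
```

The point of the sketch is that the per-user state persists across the whole stream, so two weak signals weeks apart are still connected.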

Advances in IT

Gardner: So we have the modern data center, we have issues of complexity and virtualization, we have scale, we have data as a deluge, and we need to do something fast in real-time and consistently to learn and relearn and derive correlations.

It turns out that there are some advances in IT over the past several years that have been applied to solve other problems that can be brought to bear here.

This is one of the things that really jumped out at me when I did my initial briefing with AccelOps. You've looked at what's being done with big data and in-memory architectures, and you've also looked at some of the great work that’s been done in services-oriented architecture (SOA) and CEP, and you've put these together in an interesting way.

Let's talk about what the architecture needs to be in order to start doing for IT what we have been doing with retail data or looking at complex events in a financial environment to derive inference into what's going on in the real world. What is the right architecture that we now need to move to for this higher level of operations and monitoring?

Kumar: Excellent point, Dana. Clearly, based on what we've discussed, there is a big-data angle to this. And, I want to clarify here that big data is not just about volume.

Doug Laney, a META and a Gartner analyst, probably put it best when he highlighted that big data is about volume, the velocity or the speed with which the data comes in and out, and the variety or the number of different data types and sources that are being indexed and managed. I would add to this a fourth V, which is verdicts, or decisions, that are made. How many decisions are actually impacted or potentially impacted by a slight change in data?

For example, in an IT management paradigm, a single configuration setting can have a security implication, a performance implication, an availability implication, and even a capacity implication in some cases. Just a small change in data has multiple decision points that are affected by it. From our angle, all these different types of criteria affect the big data problem.

When you look at all these different aspects of IT management and how it impacts what essentially presents itself as a big data challenge or a big data problem, that’s an important angle that all IT management and monitoring products need to incorporate in their thinking and in their architectures, because the problem is only going to get worse.

Gardner: Understanding that big data is the issue, and we know what's been done with managing big data in this most comprehensive definition, how can we apply that realistically and practically to IT systems?

It seems to me that you are going to have to do more with the data, cleansing it, discovering it, and making it manageable. Tell me how we can apply the concepts of big data that people have been using in retail and these other applications, and now point that at the IT operations issues and make it applicable and productive.

Couple of approaches

Kumar: I mentioned the analytics ability as central to monitoring systems – big-data analytics to be specific. There are a couple of approaches. Some companies are doing some really interesting work around big-data analysis for IT operations.

They primarily focus on gathering the data, heavily indexing it, and making it available for search, thereby deriving analytical results. It allows you to do forensic analysis that you were not easily able to do with traditional monitoring systems.

The challenge with that approach is that it swings the pendulum all the way to the other end. Previously we had very rigid, well-defined relational data models or data structures, and the index and search approach is much more of a free form. So the pure index-and-search type of an approach is sort of the other end of the spectrum.

What you really need is something that incorporates the best of both worlds and puts that together, and I can explain to you how that can be accomplished with a more modern architecture. To start with, we can't do away with this whole concept of a model or a relationship diagram or entity relationship map. It's really critical for us to maintain that.

I’ll give you an example. When you say that a server is part of a network segment, and a server is connected to a switch in a particular way, it conveys certain meaning. And because of that meaning, you can now automatically apply policies, rules, patterns, and automatically exploit the meaning that you capture purely from that relationship. You can automate a lot of things just by knowing that.

If you stick to a pure index-and-search approach, you basically zero out a lot of this meaning and you lose information in the process. Then it's the operators who have to handcraft queries to reestablish this meaning that’s already out there. That can get very, very expensive pretty quickly.

Even at a fairly small scale, you'll find more and more people having to do things, and a pure index and search approach really scales with people, not as much with technology and automation. Index and search certainly adds a positive dimension to traditional IT monitoring tools -- but that alone is not the answer for the future.

Our approach to this big-data analytics problem is to take a hybrid approach. You need a flexible and extensible model that you start with as a foundation, that allows you to then apply meaning on top of that model to all the extended data that you capture and that can be kept in flat files and searched and indexed. You need that hybrid approach in order to get a handle on this problem.
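One way to picture the hybrid approach is a free-text index over raw log lines whose search hits are enriched from a relationship model. The Python sketch below is purely illustrative: `MODEL`, `POLICIES`, the host names, and the log lines are hypothetical, and it assumes each log line begins with a host name.

```python
# Hypothetical relationship model: where each server sits in the network.
MODEL = {
    "srv-1": {"segment": "dmz", "switch": "sw-a"},
    "srv-2": {"segment": "internal", "switch": "sw-b"},
}

# Hypothetical policies that apply automatically by network segment.
POLICIES = {"dmz": ["no-telnet", "patch-weekly"], "internal": ["patch-monthly"]}

def build_index(log_lines):
    """Index-and-search half: inverted index of token -> line numbers."""
    index = {}
    for n, line in enumerate(log_lines):
        for token in line.lower().split():
            index.setdefault(token, set()).add(n)
    return index

def search_with_model(log_lines, index, term):
    """Free-text search, with every hit enriched from the model."""
    hits = []
    for n in sorted(index.get(term.lower(), ())):
        host = log_lines[n].split()[0]  # assumes the host is the first token
        entity = MODEL.get(host, {})
        hits.append({
            "line": log_lines[n],
            "segment": entity.get("segment"),
            "policies": POLICIES.get(entity.get("segment"), []),
        })
    return hits

logs = [
    "srv-1 telnet session opened",
    "srv-2 backup finished",
    "srv-1 backup finished",
]
hits = search_with_model(logs, build_index(logs), "telnet")
```

The operator only types a search term. The fact that `srv-1` sits in the DMZ, and that a telnet session there violates a policy, comes from the model rather than from a handcrafted query.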

Gardner: I suppose you also have to have your own architecture that can scale. So you're going to concepts like virtual appliances and scaling on-demand vis-à-vis clustering, and taking advantage of in-memory and streaming capabilities to manage this. Tell me why you need to think about the architecture that supports this big data capability in order for it to actually work in practical terms?

Kumar: You start with a fully virtualized architecture, because it allows you not only to scale easily, but also to extend your reach. With a virtualized architecture, you're able to reach into these multiple disparate environments and capture and analyze and bring that information in. So virtualized architecture is absolutely essential for you to start with.

Auto correlate

Maybe more important is the ability for you to auto-correlate and analyze data, and that analysis has to be distributed analysis. Because whenever you have a big data problem, especially in something like IT management, you're not really sure of the scale of data that you need to analyze and you can never plan for it.

Let me put it another way. There is no server big enough to be able to analyze all of that. You'll always fall short of compute capacity because analysis requirements keep growing. Fundamentally, the architecture has to be one where the analysis is done in a distributed manner. It's easy to add compute capacity by scaling horizontally. Your architecture fits how computing models are evolving over the long run. So there are a lot of synergies to be exploited here by having a distributed analytics framework.

Think of it as applying a MapReduce type of algorithm to IT management problems, so that you can do distributed analysis, and the analysis is highly granular or specific. In IT management problems, it's always about the specificity with which you analyze and detect a problem that makes all the difference between whether that product or the solution is useful for a customer or not.
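The MapReduce analogy can be sketched as follows in Python. This is a single-process illustration of the map and reduce steps only; in a real deployment each partition would be handled by a separate worker or VM. The partition contents and function names are assumptions made for the example.

```python
from collections import Counter
from functools import reduce

def map_partition(events):
    """Map step: count ERROR events per host within one partition."""
    return Counter(host for host, level in events if level == "ERROR")

def reduce_counts(left, right):
    """Reduce step: merge two partial per-host counts."""
    return left + right

# Hypothetical partitions, e.g. one per collector node: (host, severity).
partitions = [
    [("web-1", "ERROR"), ("db-1", "INFO")],
    [("web-1", "ERROR"), ("db-1", "ERROR")],
]

# Each map call is independent, so it could run on a different worker.
partials = [map_partition(p) for p in partitions]
totals = reduce(reduce_counts, partials, Counter())
```

Because no map call depends on another, adding capacity means adding partitions and workers, which is the horizontal scaling described above.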

Gardner: In order for us to meet our requirements around scale and speed, we really have to think about the support underneath these capabilities in a new way. It seems like, in a sense, architecture is destiny when it comes to the support and monitoring for these large volumes in this velocity of data.

Let's look at the other part of this. We talked about the big data, but in order for the solution to work, we're looking at CEP capabilities in an engine that can take that data and then work with it and analyze it for these events and these programmable events and looking for certain patterns.

Now that we understand the architecture and why it's important, tell me why this engine brings you to a higher level and differentiates you in the field around the monitoring.

Kumar: A major advantage of distributed analytics is that you're freed from the scale-versus-richness trade-off, from the limits on the type of events you can process. If I wanted to do more complex events and process more complex events, it's a lot easier to add compute capacity by just simply adding VMs and scaling horizontally. That’s a big aspect of automating deep forensic analysis into the data that you're receiving.

I want to add a little bit more about the richness of CEP. It's not just around capturing data and massaging it or looking at it from different angles and events. When we say CEP, we mean it is advanced to the point where it starts to capture how people would actually rationalize and analyze a problem.

For example, the ability for people in a simple visual snapshot to connect three different data points or three events together and say that they're interrelated and they point to a specific problem.

The only way you can automate your monitoring systems end-to-end and get more of the human element out of it is when your CEP system is able to capture those nuances that people in the NOC and SOC would normally use to rationalize when they look at events. You not only look at a stream of events, you ask further questions and then determine the remedy.

No hard limits

To do this, you should have a rich data set to analyze, i.e. there shouldn’t be any hard limits placed on what data can participate in the analysis, and you should have the flexibility to easily add new data sources or types of data. So it's very important for the architecture to be able to not only event on data that is stored in traditional models or well-defined relational models, but also event against data that’s typically serialized and indexed in flat file databases.

This hybrid approach basically breaks the logjam in terms of creating these systems that are very smart and that could substitute for people in terms of how they think and how they react to events that are manifested in the NOC. You are not bound to data in an inflexible vendor-defined model. You can also bring the more free-form data into the analytics domain and do deep and specific analysis with it.

Cloud and virtualization are also making this possible. Although they’ve introduced more complexity due to change frequency, distributed workloads etc., they’ve also introduced some structure into IT environments. An example here is the use of converged infrastructure (Cisco UCS, HP Blade Matrix) to build private-cloud environments. At least at the infrastructure level it introduces some order and predictability.
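The idea of connecting three related events and treating them as one problem, as described earlier in this answer, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three event types, the five-minute window, and the `correlate` function are hypothetical, and real CEP engines express such rules in their own languages.

```python
from collections import defaultdict

# Hypothetical rule: these three event types on one host within
# WINDOW_SECONDS are treated as a single correlated incident.
PATTERN = {"config_change", "cpu_spike", "login_failure"}
WINDOW_SECONDS = 300

def correlate(events):
    """Return hosts where every type in PATTERN occurs within the window."""
    by_host = defaultdict(list)
    for ts, host, kind in sorted(events):
        if kind in PATTERN:
            by_host[host].append((ts, kind))
    incidents = []
    for host, seq in by_host.items():
        for i, (start, _) in enumerate(seq):
            # Collect the event types seen within the window opened at `start`.
            kinds = {k for t, k in seq[i:] if t - start <= WINDOW_SECONDS}
            if kinds == PATTERN:
                incidents.append(host)
                break
    return incidents

events = [
    (0, "web1", "config_change"),
    (60, "web1", "cpu_spike"),
    (120, "web1", "login_failure"),
    (0, "db1", "cpu_spike"),
    (900, "db1", "config_change"),
    (1000, "db1", "login_failure"),
]
print(correlate(events))  # prints ['web1']
```

In the sample data, db1 sees the same three event types, but spread over more than five minutes, so only web1 is flagged.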

Gardner: All right, Mahesh, we've talked about the problem in the market, we have taken a high-level look at the solution and why you need to do things differently, and why having the right architecture to support that is important, but let's get into the results.

If you do this properly, if you leverage and exploit these newer methods in IT -- like big data, analytics, CEP, virtual appliances and clustered instances of workload and support, and when you apply all those to this problem about the fire hose of data coming out of IT systems, a comprehensive look at IT in this fashion -- what do you get? What's the payoff if you do this properly?

Kumar: I want to answer this question from a customer standpoint. It is no surprise that our customers don’t come to us saying we have a big data problem, help us solve a big data problem, or we have a complex event problem.

Their needs are really around managing security, performance and configurations. These are three interconnected metrics in a virtualized cloud environment. You can't separate one from the other. And customers say they are so interconnected that they want these managed on a common platform. So they're really coming at it from a business-level or outcome-focused perspective.

What AccelOps does under the covers is apply techniques such as big-data analysis, complex event processing, etc., to then solve those problems for the customer. That is the key payoff -- that the customer’s key concerns that I just mentioned are addressed in a unified and scalable manner.

An important factor for customer productivity and adoption is the product user interface. It is not of much use if a product leverages these advanced techniques but makes the user interface complicated -- you end up with the same result as before. So we’ve designed a UI that’s very easy to use, that requires one or two clicks to get the information you need, and that offers a UI-driven ability to compose rich events and event patterns. Our customers find this very valuable, as they do not need super-specialized skills to work with our product.

Gardner: What's important to think about when we mention your customers is not just applying this value to an enterprise environment. Increasingly the cloud, with the virtualization, the importance of managing performance to very high standards, these are also impacting the cloud providers, managed service providers (MSPs), and software-as-a-service (SaaS) providers.

Up and running

This sounds like an architecture, an approach and a solution that's going to really benefit them, because their bread and butter is about keeping all of the systems up and running and making sure that all their service level agreements (SLAs) and contracts are being managed and adhered to.

Just to be clear, we're talking about an approach for a fairly large cross-section of the modern computing world -- enterprises and many different stripes of what we consider as service providers.

Kumar: Service providers are a very significant market segment for us and they are some of our largest customers. The reason they like the architecture that we have, very clearly, is that it's scalable. They know that the architecture scales as their business scales.

They also know that they get both the performance management and the security management aspects in a single platform. They're actually able to differentiate their customer offerings compared to other MSPs that may not have both, because security becomes really critical.

For anyone wanting to outsource to an MSP, one of the first questions that they are going to ask, in addition to the SLAs, is how are you going to ensure security? So to have both of those options is absolutely critical.

The third piece really is the fact that our architecture is multi-tenant from day one. We're able to bring customers on board with a one-touch mechanism, where they can bring the customer online, apply the right types of policies, whether it's SLA policies or security policies, automatically in our product and completely segment the data from one customer to the other.

All of that capability was built into our products from day one. So we didn’t have to retrofit any of that. That’s something our cloud-service providers and managed service provider customers find very appealing in terms of adopting AccelOps products.

Subscription-based licensing, which we offer in addition to perpetual licensing, also fits well with the CSP/MSP business model.
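To make the multi-tenant point above concrete, here is a minimal Python sketch of one-step onboarding with per-tenant policies and data segmentation. The `Platform` and `Tenant` classes, the method names, and the default policy values are all hypothetical; the transcript does not describe how AccelOps implements this.

```python
from dataclasses import dataclass, field

# Hypothetical default policies applied when a tenant is brought online.
DEFAULT_POLICIES = {"sla": "standard", "security": "baseline"}

@dataclass
class Tenant:
    name: str
    policies: dict
    events: list = field(default_factory=list)

class Platform:
    def __init__(self):
        self._tenants = {}

    def onboard(self, name, **overrides):
        """One call creates the tenant and applies its policies."""
        if name in self._tenants:
            raise ValueError(f"tenant {name!r} already exists")
        self._tenants[name] = Tenant(name, {**DEFAULT_POLICIES, **overrides})
        return self._tenants[name]

    def record(self, tenant, event):
        """Store an event in the named tenant's own list."""
        self._tenants[tenant].events.append(event)

    def query(self, tenant):
        """Return a copy of one tenant's events; other tenants are never read."""
        return list(self._tenants[tenant].events)

platform = Platform()
platform.onboard("acme", security="strict")
platform.onboard("globex")
platform.record("acme", "disk latency alert")
print(platform.query("globex"))  # prints []
```

Keeping each tenant's events in its own container, and only ever querying by tenant, is one simple way to get the segmentation Kumar mentions; a production system would enforce it at the storage and access-control layers as well.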

Gardner: All right. Let's introduce your products in a little bit more detail. We understand you have created a platform, an architecture, for solving these issues for these very intense types of environments, for these large customers, enterprises, and service providers. Tell us a little bit about your portfolio.

Key metrics

Kumar: What we've built is a platform that monitors data center performance, security, and configurations, the three key interconnected metrics in virtualized cloud environments. Most of our customers really want that combined and integrated platform. Some of them might choose to start with addressing security, but they soon bring the performance management aspects into it also. And vice versa.

And we take a holistic cross-domain perspective -- we span server, storage, network, virtualization and applications.

What we've really built is a common consistent platform that addresses these problems of performance, security, and configurations, in a holistic manner and that’s the main thing that our customers buy from us today.

Gardner: It sounds as if we're doing business intelligence for IT. We really are getting to the point where we can have precise dashboards, and we are not just making inferences and guesses. We're not just doing Boolean searches on old or even faulty data.

We're really looking at the true data, the true picture in real-time, and therefore starting to do the analysis that I think can start driving productivity to even newer heights than we have been accustomed to. So is that the vision, business intelligence (BI) for IT?

Kumar: I guess you could say that. To break it down, from an IT management and monitoring standpoint, the goal is to continuously reduce the per capita management cost on an ongoing basis. As you add VMs or devices, you simply cannot scale the management cost in a linear fashion. You want a continuously falling management cost for every new VM added or new device introduced.

The way you do that is obviously through automation and through a self-learning process, whereby as you continue to learn more and more about the behavior of your applications and infrastructure, you're able to start to easily codify more and more of those patterns and rules in the system, thereby taking sort of the human element out of it bit by bit.

What we have as a product and a platform is the ability for you to increase the return on investment (ROI) on the platform as you continue to use that platform day-to-day. You add more information and enrich the platform with more rules, more patterns, and complex events that you can detect and potentially take automated actions on in the future.

So we create a virtuous cycle, with our product delivering a higher and higher return on your investment over time. Whereas, in traditional products, scale and longevity have the opposite effect.

So that’s really our vision. How do you reduce the per capita management cost as the scale of the enterprise starts to increase, and how do you increase automation as one of the elements of reducing the management cost within IT?

Gardner: You have given people a path to start in on this, sort of a crawl-walk-run approach. Tell me how that works. I believe you have a trial download, an opportunity for people to try this out for free.

Free trial download

Kumar: Most of our customers start off with the free trial download. It’s a very simple process. Visit www.accelops.com/download and download a virtual appliance trial that you can install in your data center within your firewall very quickly and easily.

Getting started with the AccelOps product is pretty simple. You fire up the product and enter the credentials needed to access the devices to be monitored. We do most of it agentlessly, and so you just enter the credentials, the range that you want to discover and monitor, and that’s it. You get started that way and you hit Go.

The product then uses this information to determine what’s in the environment. It automatically establishes relationships between the devices it finds and automatically applies the rules, policies, and basic thresholds that come out of the box with the product, so that you can actually start measuring the results. Within a few hours of getting started, you'll have measurable results, trends, graphs, and charts to look at and gain benefits from.

That’s a very simple process, and I encourage all our listeners and readers to download our free trial software and try AccelOps.
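The "credentials plus a range" input that Kumar describes can be pictured with a short Python sketch. It only expands an address range into a list of discovery targets and does not contact any device; the `discovery_targets` function, the sample range, and the credential label are hypothetical and say nothing about how the AccelOps product performs discovery.

```python
import ipaddress

def discovery_targets(cidr, credentials):
    """Pair every usable host address in the range with each credential."""
    network = ipaddress.ip_network(cidr)
    return [(str(host), cred) for host in network.hosts() for cred in credentials]

# Hypothetical input: a tiny range and one read-only credential label.
targets = discovery_targets("192.168.1.0/30", ["snmp-readonly"])
for address, credential in targets:
    print(address, credential)
```

For the /30 range used here, the standard library's `hosts()` method yields the two usable addresses, 192.168.1.1 and 192.168.1.2, so the sketch produces two targets.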

Gardner: I also have to imagine that your comments a few moments ago about not being able to continue on the same trajectory when it comes to management is only going to accelerate the need to automate and find the intelligent rather than the hard or laborious way to solve this when we go to things like cloud and increased mobility of workers and distributed computing.

So the trends are really in your favor. It seems that as we move toward cloud and mobile that at some point or another organizations will hit the wall and look for the automation alternative.

Kumar: It’s about automation and distributed analytics, and about getting very specific with the information that you have, so that you can make more predictable decisions that are 99.9 percent correct, and do that in an automated manner. The only way you can do that is if you have a platform that’s rich enough and scalable and that allows you to then reach that ultimate goal of automating most of the management of these diverse and disparate environments.

That’s something that's sorely lacking in products today. As you said, it's all brute-force today. What we have built is a very elegant, easy-to-use way of managing your IT problems, whether it’s from a security standpoint, performance management standpoint, or configuration standpoint, in a single integrated platform. That's extremely appealing for our customers, both enterprise and cloud-service providers.

I also want to take this opportunity to encourage those of you listening to or reading this podcast to come meet our team at the 2011 Gartner Data Center Conference, Dec. 5-9, at Booth 49 and learn more. AccelOps is a silver sponsor of the conference.

Gardner: I am afraid we will have to leave it there. You've been listening to a sponsored BriefingsDirect podcast. We've been talking about how new data and analysis approaches from AccelOps are attaining significantly improved IT operations monitoring as well as stronger security.

I'd like to thank our guest, Mahesh Kumar, Vice President of Marketing at AccelOps. Thanks so much, Mahesh.

Kumar: Thank you, Dana.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks again for listening and come back next time.

Listen to the podcast. Find it on iTunes/iPod. Download the transcript. Sponsor: AccelOps.

Transcript of a BriefingsDirect podcast on how enterprises can benefit from capturing and analyzing systems data to improve IT management in real-time. Copyright Interarbor Solutions, LLC, 2005-2011. All rights reserved.
