Commentary--Given the critical nature of today's business applications, an enterprise should develop a comprehensive strategy and consider deploying tools and processes that help IT quickly isolate and resolve application problems across the lifecycle. Because of the time, cost and negative impact of software defects, it is essential for businesses to keep their applications functioning properly and to mitigate support costs.
Application downtime costs can range from $100,000 per hour to millions of dollars per hour. Throughout the 7.2-year average lifespan of major enterprise applications, software maintenance and support costs represent 60 to 80 percent of total software lifecycle costs. The process typically employed today to resolve application performance problems remains a manual, error-prone and communications-intensive activity that consumes many resources, including support personnel, operations staff and developers.
The resolution process for enterprise companies usually includes iterative attempts to gather information and document what was happening at the time the problem occurred--either through system logs, end-user interviews, or even costly on-site trips (SWAT team deployments). This information is not usually used to resolve the problem, but only to replicate the conditions under which it occurred.
In a recent report, Fawcett Publishing discovered that only 9 percent of IT professionals surveyed described themselves as very satisfied with their diagnostic software and processes. About one-third reported that they rely mostly on manual and homegrown approaches, such as "eyeballing" logs, spreadsheets, or internally written scripts. Much of problem detection is still based on external input; however, symptoms often do not reflect the real root cause. On average, problem replication efforts involve more than 6.5 experts, who must replicate the problem 4.8 times (with finger-pointing and further interview attempts with frustrated end users) before establishing the root cause. Actually issuing the fix usually takes 20 percent or less of the resolution cycle.
However, all too often, the problem cannot be replicated, and thus cannot be resolved.
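To get a feel for what these averages imply, the figures quoted above can be combined into a rough estimate. The sketch below is illustrative only: the hours each expert spends per replication attempt and the outage duration are hypothetical inputs, not survey results, and it assumes for simplicity that every expert takes part in every attempt.

```python
# Figures quoted in the text above.
EXPERTS_PER_PROBLEM = 6.5           # average experts involved in replication
REPLICATIONS_PER_PROBLEM = 4.8      # average replication attempts per problem
FIX_SHARE_OF_CYCLE = 0.20           # issuing the fix: 20 percent or less of the cycle
MIN_DOWNTIME_COST_PER_HOUR = 100_000  # low end of the quoted downtime cost, in dollars


def replication_person_hours(hours_per_attempt: float) -> float:
    """Person-hours spent replicating one problem.

    hours_per_attempt is a hypothetical input: the hours each expert
    spends on a single replication attempt.
    """
    return EXPERTS_PER_PROBLEM * REPLICATIONS_PER_PROBLEM * hours_per_attempt


def non_fix_share_of_cycle() -> float:
    """Share of the resolution cycle spent on everything except issuing the fix."""
    return 1.0 - FIX_SHARE_OF_CYCLE


def minimum_downtime_cost(outage_hours: float) -> float:
    """Downtime cost at the low end of the quoted range (hypothetical outage length)."""
    return MIN_DOWNTIME_COST_PER_HOUR * outage_hours
```

For example, if each attempt took every expert two hours, `replication_person_hours(2.0)` would come to roughly 62 person-hours for a single problem, before any fix is written.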
The application performance paradox
One would think that as software applications become indispensable to business, they would likewise become more dependable and reliable. Instead, the rising tide of system complexity and interdependency is making application performance problems inevitable, intractable, and more elusive than ever.
What's needed is a performance management strategy that is not only optimized for rapid problem detection and isolation, but also provides sufficient insight into the application's behavior to support the rapid root cause analysis needed to actually solve the problem.
Application problem resolution systems (APRS) are proving to be a valuable asset in a comprehensive program to manage application performance. An APRS can be deployed to automatically capture the execution history of a poorly performing application, which dramatically accelerates the problem resolution process. An APRS can help ensure that application problems are not only detected as soon as they occur, but also resolved as quickly as possible--which is the end goal, after all.
Drilling down on the "T" for root cause analysis
To comprehend this approach to finding and resolving performance problems, picture the letter "T." Identifying and isolating performance problems is the top line of the T. Detection and isolation, however, only expose the surface issue: that a problem exists. Drilling down on the issue (the vertical line of the T) is what uncovers the root cause and completes the picture needed to diagnose and ultimately resolve performance problems.
Using an APRS, organizations can monitor actual application execution and automatically capture a synchronized, real-time log of how an application executes, spanning user actions, system events, performance metrics, configuration data, and code execution flow. The log can then show the exact root cause of the problem without anyone having to recreate the issue in a lab environment.
Solving problems with APR
Application problem resolution (APR) technology automates the entire process by recording full problem data. This eliminates the need to replicate a problem in order to pinpoint the root cause of any type of application problem, including functional, configuration, performance, and end-user errors.
As the application executes, APR systems record actual problems at multiple synchronized levels, including user interactions, application configuration, application performance, application calls, and code-level execution flow--all without interrupting use of the application. Recording data throughout the actual application environment is the only way to eliminate the need for recreating problems in the lab.
Instead of relying on guesswork to recreate scenarios, APR systems allow actual events to be analyzed to reveal the true end user activity, correlated with the system level activity and associated code. By capturing actual problem history in a centralized repository, APR systems provide the basis for team collaboration and communication.
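The idea of recording at several synchronized levels can be illustrated with a small in-process "flight recorder." This is a minimal sketch of the general technique, not a description of how any APR product works: the `FlightRecorder` class, the `Event` type, and the level names are all hypothetical. Events from every level share one sequence counter and one clock, so they can be read back as a single correlated timeline, and a bounded buffer keeps only the most recent history.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical level names mirroring the levels listed in the text.
LEVELS = ("user", "config", "performance", "call", "code")


@dataclass
class Event:
    seq: int           # global order across all levels
    timestamp: float   # seconds from a single monotonic clock
    level: str
    message: str
    data: dict = field(default_factory=dict)


class FlightRecorder:
    """Keeps the most recent events from all levels on one shared timeline."""

    def __init__(self, capacity: int = 1000) -> None:
        # A bounded deque silently drops the oldest event when full.
        self._events: deque = deque(maxlen=capacity)
        self._seq = 0

    def record(self, level: str, message: str, **data: Any) -> None:
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self._seq += 1
        self._events.append(
            Event(self._seq, time.monotonic(), level, message, dict(data))
        )

    def history(self, level: Optional[str] = None) -> list:
        """Return retained events in recorded order, optionally for one level."""
        return [e for e in self._events if level is None or e.level == level]
```

An application would call something like `recorder.record("user", "clicked Submit")` and `recorder.record("call", "POST /orders", status=500)` as it runs, then inspect `recorder.history()` after a failure to see user actions alongside the calls they triggered. The sketch is single-threaded; a real recorder would also need locking, persistence to a shared repository, and far lower overhead.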
Finding the needle in the haystack
Before your IT staff begins the next iterative process of triaging and resolving a complex performance problem, ask yourself if there is an easier path to resolution. Application problem resolution systems allow you to rapidly pinpoint the root cause of any application problem.
Such systems automate the manual information gathering, completely eliminate problem recreation, and dramatically accelerate the root cause analysis process--saving up to 80 percent of the cycle time typically required to solve problems, including performance issues. And that delivers both top-line and bottom-line advantages.
Biography
Paul Farr is a vice president of BMC Software, maker of Appsight--application problem resolution software.




