
3 ways to put predictive analytics to work in IT Ops

Sriram Parthasarathy, Senior Director, Product Architecture, Predictive Analytics, Logi Analytics

Predictive analytics can benefit your IT team in many ways, but one that's particularly attractive is the ability to monitor the health or status of an application so that you can predict and respond to application outages.

The ability to prevent failures before they happen is a big win. It saves time and money and creates a more resilient IT infrastructure.

As you consider how to put predictive analytics to work in your IT department, here are three ways to get started. 


1. Identify root causes for application performance 

By identifying root causes for application performance using unsupervised techniques, IT teams can focus on the right set of areas in which to take action.

In most instances, people view application, error, and performance logs only when there is a significant problem with the IT infrastructure. One technique that can provide good insight into application performance involves taking all of your log data, along with related configuration information, and grouping it into multiple clusters. Then you study the characteristics of the various attributes within each cluster.

Those clusters can show IT administrators which adjustments are likely to improve performance and avoid specific bottlenecks.

Figure 1. Use clustering algorithms to transform application log data into logical application performance clusters.

The great thing about this technique is that it can account for seasonal changes in the performance pattern if you add a time component to the clustered data. For example, what is important in the summer may be different from what's important in the winter.
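One way to add such a time component, sketched here under assumptions of my own, is to encode the month as a point on a circle so that December and January end up close together. The function names and the feature layout are hypothetical and are not taken from any particular product.

```python
import math

def encode_month(month):
    """Map a month number (1-12) to a point on the unit circle."""
    angle = 2 * math.pi * (month - 1) / 12
    return (math.sin(angle), math.cos(angle))

def add_time_component(record, month):
    """Append the encoded month to a numeric feature vector (hypothetical layout)."""
    return list(record) + list(encode_month(month))

def seasonal_distance(month_a, month_b):
    """Distance between two months after encoding; small means 'same season'."""
    return math.dist(encode_month(month_a), encode_month(month_b))

# December is treated as close to January, while July is far from it.
assert seasonal_distance(12, 1) < seasonal_distance(1, 7)
```

Feeding the raw month number into a clustering algorithm would instead place December and January at opposite ends of the scale, which is usually not what you want for seasonal patterns.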

In addition, by studying the clusters, you can determine which combination of parameters will contribute to the best application behavior for a set of conditions, and which others tend to lead to errors under specific conditions.
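As a rough illustration of the clustering step, the sketch below groups a handful of made-up records with a small k-means implementation and then averages each attribute per cluster. The article does not name a specific algorithm, so k-means is an assumption, as are the field names, the sample values, and the choice of two clusters.

```python
import math

def min_max_scale(rows):
    """Rescale every column to the 0-1 range so no attribute dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def k_means(rows, k, iterations=20):
    """Basic k-means; starts from k rows spread evenly through the data."""
    centroids = [list(rows[i * len(rows) // k]) for i in range(k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(row, centroids[c]))
                  for row in rows]
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def summarize(raw_rows, labels, k):
    """Average each original attribute per cluster so clusters can be compared."""
    summary = {}
    for c in range(k):
        members = [row for row, label in zip(raw_rows, labels) if label == c]
        if members:
            summary[c] = [round(sum(col) / len(members), 2) for col in zip(*members)]
    return summary

# Hypothetical records: (connection_pool_size, avg_response_ms, errors_per_min)
LOG_ROWS = [
    (50, 120, 0), (55, 130, 1), (48, 110, 0),
    (10, 900, 12), (12, 950, 15), (8, 870, 10),
]
labels, _ = k_means(min_max_scale(LOG_ROWS), k=2)
profile = summarize(LOG_ROWS, labels, k=2)
```

Comparing the per-cluster averages in `profile` is the "study the characteristics" step: in this toy data, the cluster with the small connection pool is also the one with slow responses and more errors. A production setup would more likely rely on an established machine-learning library and far more data.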

2. Monitor application health in real time

Performing real-time monitoring of application health via multi-variate machine-learning (ML) techniques allows IT teams to catch and respond to a degradation of application health in a timely manner.

Most applications rely on multiple services, so capturing the true health of an application means taking into account performance metrics from all of those services and sources. That makes it a multi-variate prediction problem.

The key to monitoring application health is to identify what is normal behavior within an application and what is not. To start this process, compile all of the available data generated by the application. That data might consist of configuration data, application logs, network logs, error logs, performance logs, and more.

Once you have the data compiled, analyze past data from a period in which the application was in a good state. By allowing the anomaly detection algorithm to learn what the application's normal behavior is, you establish the baseline for your predictive model.

As new data arrives from the application, this trained model will identify whether the incoming data exhibits normal behavior or not. If the data is tagged as not showing normal behavior, it is automatically flagged for an IT administrator to follow up.
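The learn-then-flag flow described above might look something like the sketch below. It is a deliberately simple stand-in, not the multi-variate ML technique itself: it learns a mean and standard deviation per metric from healthy data and flags any observation whose largest z-score exceeds a threshold. The metric names, sample values, and the threshold of 3.0 are all assumptions.

```python
import statistics

def fit_baseline(healthy_rows):
    """Learn the mean and standard deviation of each metric from healthy data."""
    columns = list(zip(*healthy_rows))
    # Fall back to 1.0 when a metric never varied, to avoid dividing by zero.
    return [(statistics.mean(c), statistics.stdev(c) or 1.0) for c in columns]

def anomaly_score(row, baseline):
    """Largest absolute z-score across all metrics in one observation."""
    return max(abs(v - mean) / std for v, (mean, std) in zip(row, baseline))

def is_anomalous(row, baseline, threshold=3.0):
    """Flag the observation for follow-up if any metric is far from normal."""
    return anomaly_score(row, baseline) > threshold

# Hypothetical metrics: (response_ms, cpu_percent, errors_per_min),
# sampled while the application was known to be healthy.
HEALTHY = [(120, 40, 1), (130, 42, 0), (125, 45, 2), (118, 38, 1), (127, 41, 1)]
baseline = fit_baseline(HEALTHY)

incoming = (480, 95, 30)  # made-up new observation
if is_anomalous(incoming, baseline):
    print("flag for IT administrator follow-up")
```

Because each metric is checked independently, this version would miss problems that only show up as an unusual combination of otherwise normal values. Catching those is the reason to move to a genuinely multi-variate model.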


3. Predict application outages before they happen

Predicting application downtime or outages before they happen helps IT teams initiate backup servers and perform needed maintenance on that application without any downtime. This can save an organization time and money, in addition to saving IT leaders from headaches.

Application downtime is a significant drain on a company's financial health and is a major pain point for IT leaders. The IT infrastructure typically leaves lots of indirect clues hours, or even days, before an application fails.

The key to predicting application outages is to create a predictive model based on historical data from the period leading up to past failures. Some examples of historical data that work well for this model include application logs, network logs, and error logs. You can then use this data to identify patterns that exist within a given application before it fails.

The model will learn those patterns and continue to monitor for similar occurrences, predicting future failures before they happen. With this type of predictive model in place, IT personnel can take preventive action at the right time.

Figure 2. A classification model trained to predict application outages. 
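To make the idea concrete, here is a minimal sketch of how such a classifier could be trained. It labels each historical observation by whether a failure followed within a lead window, then fits a small logistic regression with gradient descent. The choice of logistic regression, the feature names, the three-hour window, and every data value are assumptions made for illustration.

```python
import math

def label_windows(timestamps, failure_times, lead_hours):
    """Mark each observation 1 if a failure followed within lead_hours, else 0."""
    return [
        1 if any(0 < f - t <= lead_hours for f in failure_times) else 0
        for t in timestamps
    ]

def predict(weights, row):
    """Estimated probability that an outage follows this observation."""
    z = sum(w * v for w, v in zip(weights, list(row) + [1.0]))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, epochs=2000, rate=0.1):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights = [0.0] * (len(features[0]) + 1)  # last weight is the bias
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, y in zip(features, labels):
            error = predict(weights, row) - y
            for i, v in enumerate(list(row) + [1.0]):
                grads[i] += error * v
        weights = [w - rate * g / len(features) for w, g in zip(weights, grads)]
    return weights

# Hypothetical hourly history: (error_rate, memory_used_fraction) for hours 0-9,
# with one recorded outage at hour 10.
HOURS = list(range(10))
FEATURES = [
    (0.01, 0.40), (0.02, 0.42), (0.01, 0.41), (0.02, 0.45), (0.03, 0.44),
    (0.02, 0.47), (0.04, 0.50), (0.20, 0.80), (0.35, 0.88), (0.50, 0.95),
]
LABELS = label_windows(HOURS, failure_times=[10], lead_hours=3)
weights = train_logistic(FEATURES, LABELS)

risk_now = predict(weights, (0.45, 0.92))    # resembles pre-failure hours
risk_calm = predict(weights, (0.02, 0.42))   # resembles normal hours
```

The length of the lead window is a design choice: it should be at least as long as the time IT staff need to bring up a backup or perform maintenance. With real data you would also hold out some past failures to check how often the model raises false alarms.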

When creating this model, it's important to capture which action you took for different conditions and what the outcomes were. This enables you to create a subsequent model that can predict the best course of action for a given failure condition.

A simple example of this is service resource allocation. These predictive models can work behind the scenes of your IT infrastructure. So, for example, if an application is about to fail, the system could automatically pinpoint the microservice that's causing the problem, trigger an operation to point the app to a backup service, and alert IT staff to perform maintenance on the specific failing microservice.

This form of predictive modeling will help organizations better allocate their IT teams, since staff will spend far less time constantly monitoring and reconfiguring the data center.
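A record of past actions and outcomes could feed a recommendation step along the lines of the sketch below. It is only a naive stand-in for a trained model: it picks whichever action has had the highest observed success rate for a given condition. The condition names, action names, and incident log are all invented.

```python
from collections import defaultdict

def build_action_table(incidents):
    """Count successes and attempts for each (condition, action) pair."""
    table = defaultdict(lambda: [0, 0])  # [successes, attempts]
    for condition, action, resolved in incidents:
        stats = table[(condition, action)]
        stats[0] += int(resolved)
        stats[1] += 1
    return table

def recommend_action(table, condition):
    """Return the action with the highest observed success rate, or None."""
    candidates = {
        action: successes / attempts
        for (cond, action), (successes, attempts) in table.items()
        if cond == condition
    }
    return max(candidates, key=candidates.get) if candidates else None

# Hypothetical incident log: (failure condition, action taken, was it resolved?)
INCIDENT_LOG = [
    ("memory_leak", "restart_service", True),
    ("memory_leak", "restart_service", True),
    ("memory_leak", "scale_out", False),
    ("db_timeout", "failover_to_replica", True),
    ("db_timeout", "restart_service", False),
]
table = build_action_table(INCIDENT_LOG)
suggestion = recommend_action(table, "db_timeout")
```

Returning `None` for a condition that has never been seen keeps the automation from guessing and leaves the decision with IT staff in those cases.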

Get going

There are many ways, beyond the three mentioned here, that your IT operations teams can use predictive models. IT leaders are just starting to explore how applications embedded with predictive analytics can benefit their infrastructures and overall business objectives. As data becomes more central to how businesses operate, adoption is likely to keep growing.

Get started with predictive analytics by picking a key problem that has demonstrable customer pain. Then build out a proof of concept around that pain to showcase the value to key stakeholders who can help move the project along.

Infographics courtesy of Logi Analytics