7 Questions on How to Use Machine Learning for Anomaly Detection

Asking questions is one of the best ways to learn. But sometimes you don’t know where to start or what to ask—especially on a topic like anomaly detection that you’re still becoming familiar with. In that case, it’s best to listen to the questions of others and let their line of thinking guide your learning. Below are some questions we received during our “Ask Me Anything: Anomaly Detection” webinar to help you get started.

What’s the difference between outliers and anomalies?

Outliers are observations that lie far from the mean or location of a distribution. However, they don’t necessarily represent abnormal behavior or behavior generated by a different process. Anomalies, on the other hand, are data patterns generated by a different process than the one that produced the rest of the data.
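To make “distant from the mean” concrete, here is a minimal sketch that flags values more than a chosen number of standard deviations from the mean. The function name, the threshold of 3, and the readings are all made up for illustration; they are not from the webinar. Note that this only finds outliers. Whether a flagged point is an anomaly, in the sense of coming from a different process, still takes domain knowledge.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` sample
    standard deviations from the mean of `values`."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing is distant from the mean
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) > threshold * sigma]


# Hypothetical readings: 19 values near 10 and one far away.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0, 10.1, 9.9,
            10.2, 9.8, 10.0, 10.1, 9.9, 10.3, 9.7, 10.0, 10.2, 14.5]

flagged = zscore_outliers(readings)
print(flagged)  # only the last reading (index 19) is flagged
```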

Are there any applications for anomaly detection in pharmaceuticals?

There are many applications of anomaly detection in the pharmaceutical and life sciences space. These include process monitoring and quality control in pharmaceutical manufacturing, using Statistical Process Control (SPC) or Quality Control (QC) charts and Multivariate SPC (MSPC) charts. Timely anomaly detection there is critical to avoid abnormal events and adhere to safety standards. In pharma retail data, identifying anomalies in over-the-counter transactions can help fight prescription abuse. And real-time detection of anomalies in multi-parameter clinical trial data helps ensure the success of clinical trials.

Are GANs also used for anomaly detection? If so, could you please provide an industry use case?

Generative Adversarial Networks (GANs) are a relatively recent class of unsupervised learning methods that can be very effective at identifying anomalies. Because GANs are trained iteratively, and the adversarial training is optimized to leverage reconstructed samples to reduce residual loss, they work well on semi-structured and unstructured data. They are particularly useful for medical image analysis (helping radiologists find hard-to-identify tumors), facial recognition, text-to-image translation, etc.

Does correlation data affect anomaly detection? In what ways and how could we diminish the effects? Is it preferable to clean and remove correlations before starting anomaly detection?

As mentioned in the webinar, we do not think that correlations would affect anomaly detection, but many techniques are available to help determine how to treat correlated variables. One suggestion is to reduce the number of dimensions with a technique like Principal Component Analysis (PCA).

What algorithm would you suggest for anomaly detection aimed at identifying unusual activity in network activity or data?

As mentioned in the webinar, there are many methods and algorithms that work well for various applications and use cases of anomaly detection. Some of them are Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), isolation forests, deep autoencoders, etc. If you are specifically interested in network/graph analytics, two of the main methods used for identifying anomalies in network graphs are the Direct Neighbour Outlier Detection Algorithm (DNODA) and the Community Neighbour Algorithm (CNA).

In my current work “novelty” is the main thing we struggle to detect. Quality control charts work well for known patterns, but automated identification of new patterns is harder. I’m hoping to get some ideas for tools to help in that area.

For univariate quality control charts, the Western Electric rules are used to detect a handful of common patterns. Classic multivariate methods such as Partial Least Squares (PLS) will capture patterns involving more than one variable that univariate methods would not detect. An autoencoder is the most comprehensive tool and covers the widest range of patterns: it can capture patterns that are multivariate, cyclical, or nonlinear, or that involve interactions. You train the autoencoder on a set of ‘normal’ data; any patterns in new data that were not present in the training set will be flagged. We have a template for Spotfire on the TIBCO Community.
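As a rough illustration of the univariate rules mentioned above, the sketch below checks the four zone rules usually listed as the Western Electric rules. It is not the Spotfire template. The function names are hypothetical, the center line and sigma are assumed to be known from in-control data, and the run length for the fourth rule is left as a parameter because references differ between eight and nine points.

```python
def _count_beyond(z, end, window, limit, side):
    """Count points among the last `window` samples ending at index `end`
    that lie more than `limit` sigma from the center on the given side."""
    recent = z[max(0, end - window + 1): end + 1]
    return sum(1 for x in recent if side * x > limit)


def western_electric_violations(values, center, sigma, run_length=8):
    """Return (index, rule) pairs; index is the point that completes the pattern."""
    z = [(v - center) / sigma for v in values]
    hits = []
    for i, zi in enumerate(z):
        side = 1 if zi > 0 else -1
        # Rule 1: a single point beyond 3 sigma.
        if abs(zi) > 3:
            hits.append((i, 1))
        # Rule 2: two of three consecutive points beyond 2 sigma, same side.
        if abs(zi) > 2 and _count_beyond(z, i, 3, 2, side) >= 2:
            hits.append((i, 2))
        # Rule 3: four of five consecutive points beyond 1 sigma, same side.
        if abs(zi) > 1 and _count_beyond(z, i, 5, 1, side) >= 4:
            hits.append((i, 3))
        # Rule 4: a run of `run_length` points on the same side of the center.
        if (zi != 0 and i + 1 >= run_length
                and _count_beyond(z, i, run_length, 0, side) == run_length):
            hits.append((i, 4))
    return hits


# Hypothetical measurements with an assumed center of 10 and sigma of 1.
measurements = [10.2, 9.7, 10.4, 13.5, 9.9, 10.1, 12.3, 9.8, 12.4, 10.0]
violations = western_electric_violations(measurements, center=10.0, sigma=1.0)
print(violations)  # rule 1 fires at index 3, rule 2 at index 8
```

Rules like these only recognize the patterns they were written for, which is why they struggle with novelty. A model trained on normal data, such as the autoencoder described above, does not need the pattern to be specified in advance.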

Would reducing the dimensions by doing PCA affect the anomalies in a dataset? Would it lead to the disappearance of the anomalies? If so, how could it be prevented?

PCA captures some percentage of the variance in the original dataset. The way we use PCA for anomaly detection is to compute the ‘distance’ from the original point to its representation in the lower-dimensional space. The larger the distance (i.e., the more that was ‘lost’ when mapping the observation to the lower-dimensional space), the more we regard it as an anomaly. So rather than making anomalies disappear, the information lost in the reduction is what we use to find them.
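The distance described above is often called the reconstruction error. Here is a minimal sketch of the idea in plain Python, under several assumptions that are ours rather than the webinar’s: the components are fit on a reference set of normal data, they are estimated with simple power iteration, and all names and data are made up. In practice you would more likely use a numerical library for the decomposition.

```python
import math
import random


def fit_pca(rows, n_components, n_iter=100, seed=0):
    """Estimate the mean and the top principal components of `rows`
    using power iteration with deflation on the sample covariance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    components = []
    for _k in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _step in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        components.append(v)
        # Remove the found direction so the next pass finds the next component.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def reconstruction_error(row, mean, components):
    """Distance between a point and its projection onto the component subspace."""
    centered = [x - m for x, m in zip(row, mean)]
    scores = [sum(c * x for c, x in zip(comp, centered)) for comp in components]
    recon = [sum(s * comp[j] for s, comp in zip(scores, components))
             for j in range(len(row))]
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(centered, recon)))


# Hypothetical normal data: the second variable is roughly twice the first.
normal = [[0.0, 0.1], [1.0, 1.9], [2.0, 4.1], [3.0, 5.9], [4.0, 8.1],
          [5.0, 9.9], [6.0, 12.1], [7.0, 13.9], [8.0, 16.1], [9.0, 17.9]]
mu, comps = fit_pca(normal, n_components=1)

err_on = reconstruction_error([10.0, 20.0], mu, comps)   # follows the pattern
err_off = reconstruction_error([5.0, 4.0], mu, comps)    # breaks the pattern
print(err_on, err_off)
```

The point [5.0, 4.0] is within the normal range of each variable on its own, so a univariate check would not flag it. Its reconstruction error is large only because it breaks the relationship between the two variables.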

Chances are you had a question similar to the ones above but didn’t know how to ask it. In fact, if any of the questions above sparks another one in your mind, contact us and we can answer it for you. And watch the full on-demand “Ask Me Anything: Anomaly Detection and Machine Learning” webinar for more on the basics of anomaly detection, common use cases, and some key techniques to keep in mind as you get started.