Funnel plots are a graphical analysis technique for identifying outliers in your data. Funnel plots are visually appealing and really simple to understand, despite some fairly involved maths and stats underlying them. TIBCO Spotfire allows you to draw funnel plots using its inbuilt scatter plot visualisation, with statistical functionality included via the TIBCO Enterprise Runtime for R engine, which is embedded within Spotfire.

Funnel plots are particularly useful for comparing the performance of assets or institutions or evaluating some form of risk measure.

This funnel plot shows hospital readmission rates for acute myocardial infarction in the USA – each hospital is represented by a point on the funnel plot. It is immediately clear which hospitals are in the center of the readmission rate distribution – these points are colored Green! The orange/amber points are those hospitals which are slightly outside the norm and the red points are the outliers of the distribution. The funnel plot is naturally centered about the mean of the readmission ratio on the y-axis. The number of discharges is displayed on the x-axis, and this drives the confidence one has regarding the readmission rate. The boundaries are curved to reflect the lower confidence we have in readmission rate estimates for hospitals with a smaller number of discharges. One can think of this as a sideways bell curve, with the hospitals in green forming the body of the readmission rate distribution.

Let’s zoom in on part of the plot:

Three points are marked on the plot. We first examine the green point with approximately 100 discharges – the readmission rate (ratio between readmissions and discharges) is 28%. This hospital is nicely within the “green” or normal area of the readmission rate distribution. Now look at the two points we have marked at the right of the plot – with 500 and 650 discharges respectively. The amber point has a 27% readmission rate and the red point 28%. This illustrates the effect of sample size (number of discharges) on our confidence in assessing departure of readmission rate from the center of the distribution (21% readmission). As the number of discharges increases, we are more confident in a smaller difference in readmission rate representing a significant departure from the average readmission rate.
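To make the sample-size effect concrete, here is a rough Python sketch of the arithmetic for the three marked points. It is not the calculation behind the plot: it assumes a normal approximation to the Poisson limits, takes the 21% center from the text, and the function name `approx_limits` is made up for illustration.

```python
import math

MEAN_RATE = 0.21  # center of the distribution quoted in the text
Z_95 = 1.960      # two-sided 95% standard normal quantile
Z_999 = 3.291     # two-sided 99.9% standard normal quantile

def approx_limits(mean_rate, discharges, z):
    """Normal approximation to Poisson limits for a rate (an assumption)."""
    half_width = z * math.sqrt(mean_rate / discharges)
    return mean_rate - half_width, mean_rate + half_width

# The three hospitals marked on the zoomed plot: (discharges, readmission rate)
marked_points = [(100, 0.28), (500, 0.27), (650, 0.28)]

for discharges, rate in marked_points:
    _, upper95 = approx_limits(MEAN_RATE, discharges, Z_95)
    _, upper999 = approx_limits(MEAN_RATE, discharges, Z_999)
    print(f"{discharges} discharges, rate {rate:.2f}: "
          f"upper 95% = {upper95:.3f}, upper 99.9% = {upper999:.3f}")
```

Under this approximation the upper 95% limit is about 0.30 at 100 discharges, so 28% sits inside it. At 500 discharges the two upper limits are about 0.25 and 0.28, so 27% falls between them. At 650 discharges the upper 99.9% limit is about 0.27, so 28% lies outside it.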

Here are some further aspects of the funnel plot:

Where have all the green spots gone? Easy – we filtered them out using Spotfire’s in-built filtering functionality! This allows us to quickly focus on the amber and red points, representing hospitals with readmission rates that are significantly above or below the average readmission rate.

Hospitals with high readmission rates may be less desirable, from a patient perspective, for treating myocardial infarction. Moreover, these hospitals may be penalized under the Affordable Care Act, e.g. through reduced Medicare reimbursement. Hospitals with significantly lower rates than the norm are also interesting. Perhaps the hospital is under-reporting readmissions, or failing to track them in some other way? Even more seriously, the lower readmission rates could be due to more patients than normal dying once they return home. In general, hospitals with significantly higher or lower readmission rates deserve a closer look! This is simply achieved in Spotfire via drill-down or details visualizations linked to the funnel plot.

From a business perspective, hospitals and healthcare organizations can track readmission rates over time and trigger interventions when rates enter the amber zone. This enables corrective action to be taken before patient outcomes reach a critical state or readmissions reach levels where financials are impacted, whether through operations or reimbursement. This can be achieved with minimal effort via the inbuilt integration of Spotfire with the TIBCO Fast Data platform, e.g. via triggered analysis through TIBCO StreamBase or TIBCO BusinessEvents. The intervention actions may also be automated via Business Process Management (BPM) systems such as TIBCO ActiveMatrix BPM.

Other use cases to consider for funnel plots could be comparing oil well production, retail store performance or any other measure where like-for-like is being compared and it is important to identify outliers from the norm.

So how do we build a funnel plot in Spotfire? We use a standard Spotfire scatter plot, with the sample size (Number of Discharges) on the X axis and the rate (Readmission Rate) on the Y axis. In our example, the expression for calculating the rate is:

**Real([Number of Readmissions]) / Real([Number of Discharges])**

so it’s just the number of readmissions divided by the number of discharges.

How about those magical control limit lines? In this case, they are drawn according to the Poisson distribution, and we have configured the Spotfire analysis so that they adjust automatically to the size of the population and the distribution of the data.

Spotfire has a unique ability to embed statistical functions within calculated columns. The statistical functions are evaluated using the TIBCO Enterprise Runtime for R engine (TERR). TERR is a highly scalable and commercially supported implementation of the R language, and it integrates seamlessly with Spotfire and other TIBCO products. Within Spotfire, it’s possible to create custom Expression Functions (on-the-fly scripts within a Spotfire analysis that return columns of data) and Data Functions (server-side managed functions for enterprise use, returning general data structures). The Data Function Properties dialog is shown below; the Expression Functions tab shows the functions we have created for the funnel plot. We have defined the PoissonFunnelLower and PoissonFunnelUpper functions:

Now that these two functions have been defined, it is just a matter of using them in Spotfire e.g. to create calculated columns, like the below:

We have defined four columns: Low95 and High95 (the lower and upper 95% limits) and Low999 and High999 (the lower and upper 99.9% limits). Once defined, the PoissonFunnelLower and PoissonFunnelUpper functions can be used in any Spotfire calculated column.
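For readers who want to see the idea behind such limits outside Spotfire, here is a minimal Python sketch based on exact Poisson quantiles. It is not the TERR code behind PoissonFunnelLower and PoissonFunnelUpper, whose internals are not shown in this post. The names `poisson_quantile` and `poisson_funnel_limits` are hypothetical, and we assume the target rate is supplied directly (0.21, the center quoted earlier).

```python
import math

def poisson_quantile(p, mu):
    """Smallest count k with P(X <= k) >= p for X ~ Poisson(mu)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if mu <= 0:
        return 0
    k, cdf = 0, 0.0
    while True:
        # Log-space probability mass avoids overflow for large counts.
        cdf += math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        if cdf >= p:
            return k
        k += 1

def poisson_funnel_limits(discharges, target_rate, level):
    """Hypothetical helper: (lower, upper) rate limits at one sample size."""
    expected = discharges * target_rate  # expected number of readmissions
    tail = (1.0 - level) / 2.0           # probability left in each tail
    lower = poisson_quantile(tail, expected) / discharges
    upper = poisson_quantile(1.0 - tail, expected) / discharges
    return lower, upper

TARGET = 0.21  # center quoted in the text; an assumption for this sketch

for n in (100, 500, 650):
    low95, high95 = poisson_funnel_limits(n, TARGET, 0.95)
    low999, high999 = poisson_funnel_limits(n, TARGET, 0.999)
    print(n, low999, low95, high95, high999)
```

Because the limits are divided by the number of discharges, they tighten around the target as the sample size grows, which is what produces the funnel shape.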

We can now directly calculate the funnel plot colors for the points with this expression:

**If(((Real([Number of Readmissions]) / Real([Number of Discharges])) > [High999]) or ((Real([Number of Readmissions]) / Real([Number of Discharges])) < [Low999]), "Red",**
**If(((Real([Number of Readmissions]) / Real([Number of Discharges])) > [High95]) or ((Real([Number of Readmissions]) / Real([Number of Discharges])) < [Low95]), "Amber",**
**"Green"))**
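The same decision logic can be written as a short Python sketch, which may be easier to read than the nested If. The function name `funnel_color` is hypothetical, and the limit values in the usage lines are rounded normal-approximation figures for 500 discharges around a 21% center, assumed purely for illustration.

```python
def funnel_color(readmissions, discharges, low95, high95, low999, high999):
    """Mirror the nested If: test the 99.9% limits first, then the 95% limits."""
    rate = readmissions / discharges  # computed once instead of four times
    if rate > high999 or rate < low999:
        return "Red"
    if rate > high95 or rate < low95:
        return "Amber"
    return "Green"

# Illustrative limits for a hospital with 500 discharges (assumed values).
limits = dict(low95=0.170, high95=0.250, low999=0.143, high999=0.277)

print(funnel_color(105, 500, **limits))  # rate 0.21
print(funnel_color(135, 500, **limits))  # rate 0.27
print(funnel_color(150, 500, **limits))  # rate 0.30
```

The order of the tests matters: a point outside the 99.9% limits is also outside the 95% limits, so the wider limits must be checked first.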

The limit lines are drawn on the graph in a similar way.

These expressions can be saved as data functions in the Spotfire library so that anyone with legitimate credentials can use them in any Spotfire analysis. They can also be shared outside of the library as a Spotfire function definition file (.sfd); these are readily imported and exported from the library.

In summary, funnel plots are powerful graphical analyses that are easily constructed in TIBCO Spotfire. Funnel plots provide an automated method for identifying outliers when like-for-like rates or measures are compared across units. As deployed in Spotfire, these plots automatically adjust to the volume and distribution of the data in the analysis. We have illustrated the funnel plot with an underlying Poisson distribution, which is appropriate for rates. For other measures, e.g. proportions and continuous measurements, binomial and Gaussian reference distributions respectively are appropriate. Spotfire data functions for all of these analyses are available from the authors.
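As a rough guide to how the three reference distributions differ, the limits can be written down under the usual normal approximation. This is a sketch rather than the form used in our data functions. The symbols are our own notation: θ is the target rate or proportion, μ and σ the target mean and standard deviation, n the sample size, and z the standard normal quantile (about 1.96 for 95% limits and 3.29 for 99.9% limits).

```latex
\begin{aligned}
\text{Poisson (rates):} \quad & \theta \pm z \sqrt{\theta / n} \\
\text{Binomial (proportions):} \quad & \theta \pm z \sqrt{\theta (1 - \theta) / n} \\
\text{Gaussian (continuous measurements):} \quad & \mu \pm z \, \sigma / \sqrt{n}
\end{aligned}
```

In all three cases the half-width shrinks in proportion to one over the square root of n, which gives the plot its funnel shape.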

References:

Spiegelhalter, D.: Funnel plots for comparing institutional performance, Stat Med. 2005

Lane, P. et al: Graphics for Meta-analysis; in Krause and O’Connell: A Picture is Worth a Thousand Tables, Springer, 2011

Phillips, M: TIBCO Spotfire – A Comprehensive Primer, Packt Publishing, 2015

**Andrew Berridge**

**Michael O’Connell**

**TIBCO Data Science**

## 4 Comments

Great post. I am also a fan of the funnel plots. Where can we find the PoissonFunnelUpper/Lower functions?

Hi David, here’s a link to a DXP file that has those data functions defined:

https://drive.google.com/uc?export=download&id=0B8Kg0T75ytoiQ0RsdkszbmIxd1U

I tried to open the link above but don’t have version 7.0. I am running 6.5. Do you happen to have a txt file with the functions?

Dave… thanks for sending the DXP. I was able to open it and look at the function but it gives an error when you edit the calculated columns saying something like:

“can not open connection to c:/test.RData”

Does that file go together with the R functions?

here is the calculated column expression:

PoissonFunnelLower([Number of Readmissions],[Number of Discharges],0.95)

Any thoughts?