Machine Learning: A dynamic understanding of risk

Machine learning opens up many opportunities for the financial services industry. Its ability to autonomously sift through large amounts of data and surface new insights is unprecedented.

A large insurance company recently gave us access, for a selected line of business, to its claims and offer underwriting data covering a 3-5 year historical period. Our goal was to assess its risk appetite definition and see if we could create a better predictor of future claims performance.

We defined claims performance as a simple metric: the ratio of annualized insured amount to annualized claims volume. In layman’s terms: “how much, on average, are we insuring versus how much are we paying out?” We separated the insurance company’s customers into 3 broad categories using the claims performance metric: “low” with the bottom 50%, “medium” with the middle 20%, and “high” with the top 30%.
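The metric and the 50/20/30 split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the portfolio figures, and the treatment of customers with zero claims (given an infinite ratio, i.e. the best possible score) are ours for the example and are not taken from the project.

```python
import math


def claims_performance(annual_insured, annual_claims):
    """Ratio of annualized insured amount to annualized claims volume."""
    # Assumption: a customer with no claims gets an infinite ratio,
    # which ranks them at the top of the portfolio.
    if annual_claims == 0:
        return math.inf
    return annual_insured / annual_claims


def assign_buckets(scores):
    """Rank customers by score: bottom 50% 'low', next 20% 'medium', top 30% 'high'."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        # Integer comparisons avoid floating-point edge cases at the cut points.
        if rank * 10 < n * 5:
            labels[i] = "low"
        elif rank * 10 < n * 7:
            labels[i] = "medium"
        else:
            labels[i] = "high"
    return labels


# Hypothetical portfolio: (annualized insured amount, annualized claims volume)
portfolio = [
    (100_000, 50_000), (250_000, 20_000), (80_000, 0), (500_000, 400_000),
    (120_000, 6_000), (300_000, 150_000), (90_000, 45_000), (60_000, 1_000),
    (200_000, 180_000), (400_000, 10_000),
]
scores = [claims_performance(insured, claims) for insured, claims in portfolio]
buckets = assign_buckets(scores)
```

With ten customers, this assigns five to “low”, two to “medium” and three to “high”. Ties in the score are broken by input order here, which a real implementation would need to handle more deliberately.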

The goal was to predict which claims performance bucket a customer would land in, given only the information available at offer time. We opted for a “random forest” machine learning model and trained it on the available historical data, repeatedly tuning hyperparameters to avoid overfitting (3-5 years is not a lot of data for machine learning). This process took about 5 days; training a new model takes about 30 minutes.
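For readers who want to see what such a training loop can look like, here is a minimal sketch using scikit-learn. It is not the project's actual pipeline: the library choice, the synthetic data standing in for offer-time features, and the small hyperparameter grid are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for offer-time features (hypothetical, 5 numeric columns).
X = rng.normal(size=(600, 5))

# Synthetic claims performance signal, cut at the 50th and 70th percentiles
# to mimic the low / medium / high buckets (0 = low, 1 = medium, 2 = high).
signal = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600)
cuts = np.quantile(signal, [0.5, 0.7])
y = np.digitize(signal, cuts)

# Hold out a test set so the final accuracy is measured on unseen customers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Limiting tree depth and requiring larger leaves are two common ways
# to curb overfitting on a small data set.
param_grid = {"max_depth": [3, 6], "min_samples_leaf": [5, 20]}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation on the training set only
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 2))
```

Cross-validation picks the hyperparameters on the training data alone, so the held-out accuracy gives a fairer picture of how the model would behave on new offers.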

Image 1_Histogram-1

Our model correctly predicted, more than 80% of the time, whether a client would land in the “high” or “low” category. Note that we didn’t focus on predicting the “medium” claims performance layer (20% of customers), which we used as a “buffer” to separate “high” from “low”. If we include predictions for “medium”, the prediction accuracy drops to 70%, which is still an impressive result.
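The difference between the two accuracy figures comes down to which customers are counted. One possible reading, sketched below, is that the first figure is computed only over customers whose actual bucket is “high” or “low”, and the second over all customers. The helper name and the toy labels are made up for the example and do not reproduce the results above.

```python
def accuracy(actual, predicted, classes=None):
    """Share of correct predictions, optionally restricted to some actual classes."""
    pairs = [
        (a, p) for a, p in zip(actual, predicted)
        if classes is None or a in classes
    ]
    if not pairs:
        raise ValueError("no customers left to evaluate")
    return sum(a == p for a, p in pairs) / len(pairs)


# Toy labels for ten hypothetical customers.
actual = ["low"] * 5 + ["medium"] * 2 + ["high"] * 3
predicted = ["low", "low", "low", "low", "high",
             "low", "high",
             "high", "high", "low"]

high_low_only = accuracy(actual, predicted, classes={"high", "low"})
all_buckets = accuracy(actual, predicted)
print(high_low_only, all_buckets)
```

In this toy data the “medium” customers are the hardest to place, so including them lowers the overall figure, which is the same effect described above.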

As an aside, we initially thought that the predictions were too good to be true and spent a couple of days trying to debug our model, but found no mistakes. We think these findings indicate a new way forward: supplementing this insurer’s risk appetite definitions with a dynamic, data-driven assessment. We presented the results to the CEO and the company’s leadership team. It’s clear that we can build a more potent model, moving beyond our basic “claims performance” metric to predict more sophisticated metrics or tackle other use cases.

Image 2_Random_forest_overview

These are exciting times for companies willing to examine their “core beliefs”: metrics critical to operations that are usually managed in Excel and updated once a year. Machine learning is an ideal tool to supplement such “beliefs” with dynamic, fact-based and historically accurate predictions. We are continuously surprised by how it uncovers blind spots and creates tangible insights on a variety of topics: performance, pricing or customer profile behaviors.


