Introduction
The insurance industry is embracing artificial intelligence (AI) at an accelerating pace, with 77% of respondents to a recent 2024 survey indicating they are in some stage of adopting AI in their value chain.
As AI models become central to operations, evaluating their performance becomes increasingly critical. Traditional metrics like accuracy or mean squared error can only go so far in assessing model effectiveness.
In this blog, we will discuss some custom metrics that you can use in your evaluation and show how they can be monitored in NannyML Cloud.
How to Get Ahead With Custom Metrics
Insurance companies are often concerned not just with the accuracy of their predictions but also with their financial implications. Custom metrics offer a way to measure model performance in a more nuanced and context-specific manner than traditional metrics.
We’ll explore two machine learning applications in insurance and the custom metrics suited to each:
- Policyholder Churn Prediction
- Premium Value Prediction
Let’s get started with the regression use case.
A Regression Use Case: Predicting Premium Charges
In most insurance companies, premium charges are the main revenue driver, and they need to be calibrated to balance customer retention and company profitability. Overcharging might push customers to switch providers, while undercharging can lead to financial losses.
Several evaluation metrics, such as RMSE (root mean squared error), are commonly used to monitor model performance.
The RMSE metric is popular because it penalizes large errors more heavily than small ones, making it useful when large deviations from the actual value are particularly undesirable. In this case, it tells us how close the predicted premiums are to the actual premiums across all customers.
But there is a limitation: RMSE doesn't differentiate between a model undercharging a customer by 10% and overcharging them by 10%.
Both errors are treated the same, despite having potentially different business impacts.
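To make that concrete, here is a small sketch with made-up premium values showing that RMSE scores a 10% undercharge exactly the same as a 10% overcharge:

```python
import numpy as np

actual = np.array([1000.0, 2000.0])  # actual premiums

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

print(rmse(actual, actual * 0.9))  # undercharging every customer by 10%
print(rmse(actual, actual * 1.1))  # overcharging every customer by 10%: same RMSE
```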
Combined Ratio
The Combined Ratio is a key performance metric in the insurance industry that evaluates the profitability of an insurance company. A combined ratio of less than 100% indicates profitability, while a ratio above 100% signals an underwriting loss.
- Loss Ratio: This is calculated as the ratio of actual losses (`y_true`) to the predicted premiums (`premium`). It reflects the percentage of premiums that is lost due to claims.
- Expense Ratio: A fixed expense ratio (20% in this case) is applied uniformly across all instances. This represents the operational costs as a proportion of premiums.
Using these two values, you can derive the combined ratio like so:

Combined Ratio = Loss Ratio + Expense Ratio
To add any regression custom metric in NannyML Cloud, we need two Python functions:
a `loss` function that wraps the metric’s computation, and an `aggregate` function that defines how the per-row values should be aggregated.
In the following implementation, we define two functions: `loss` and `aggregate`.

```python
import numpy as np
import pandas as pd


def loss(
    y_true: pd.Series,
    y_pred: pd.Series,
    chunk_data: pd.DataFrame,
    **kwargs
) -> np.ndarray:
    expenses_ratio = 0.2  # fixed expense ratio (20%)
    premium = y_pred      # the model's predicted premium

    # Loss ratio: actual losses relative to the predicted premium.
    loss_ratio = y_true / premium

    # Constant expense ratio applied to every row.
    expense_ratio = np.full_like(loss_ratio, expenses_ratio)

    combined_ratio = loss_ratio + expense_ratio
    return combined_ratio.values  # one combined ratio per row
```
The `aggregate` function computes the mean of the combined ratios calculated by the `loss` function.

```python
import numpy as np
import pandas as pd


def aggregate(
    loss: np.ndarray,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    # Average the per-row combined ratios into a single value for the chunk.
    mean_combined_ratio = np.mean(loss)
    return mean_combined_ratio
```
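Before uploading the functions, you might want to sanity-check them locally on a handful of made-up values; a minimal sketch:

```python
import pandas as pd

# Made-up actual losses (claims) and predicted premiums for one chunk.
chunk = pd.DataFrame({
    'actual_loss': [600.0, 450.0, 800.0],
    'predicted_premium': [1000.0, 900.0, 1000.0],
})

per_row = loss(chunk['actual_loss'], chunk['predicted_premium'], chunk)
print(aggregate(per_row, chunk))  # mean combined ratio, roughly 0.83
```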
After adding these two functions in the NannyML Cloud dashboard, we get the following result.
In case your expense ratio is not fixed, you might need to implement this metric more dynamically. In that case, add the expense ratio as a feature in your model inputs and use it in your function, as sketched below.
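For example, here is a minimal sketch of what the `loss` function could look like in that situation, assuming a hypothetical `expense_ratio` column in the model inputs:

```python
import numpy as np
import pandas as pd


def loss(
    y_true: pd.Series,
    y_pred: pd.Series,
    chunk_data: pd.DataFrame,
    **kwargs
) -> np.ndarray:
    loss_ratio = y_true / y_pred
    # Hypothetical per-row expense ratio stored as a model input column;
    # assumes y_true, y_pred, and chunk_data share the same index.
    combined_ratio = loss_ratio + chunk_data['expense_ratio']
    return combined_ratio.values
```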
Metrics are at the heart of everything we do in data science, guiding our decisions and shaping our models.
The Little Book of ML Metrics is the guide you didn’t know you needed. Packed with everything about machine learning metrics, it’s the resource that can elevate your workflow.
Grab a copy here: https://www.nannyml.com/metrics
Let's now shift our focus to classification.
A Classification Use Case: Policyholder Churn Prediction
The goal of policyholder churn prediction is to identify which customers are likely to leave. By analyzing historical data and customer behaviors, insurers can proactively develop strategies to engage at-risk policyholders. This targeted approach not only enhances customer retention efforts but also ultimately safeguards revenue.
Understanding how well your model differentiates between customers who are likely to remain with the company and those who may leave is essential for developing effective retention strategies.
Most of the time, data scientists fall back on the ROC AUC metric to capture this aspect of the model.
If a model incorrectly predicts that a policyholder won’t churn (a false negative), the company may miss the opportunity to take action and retain that customer, potentially resulting in lost revenue. What metric can effectively capture that impact?
Expected Loss of Premium
The Expected Loss of Premium (ELP) metric is useful when we want to understand the financial ramifications of policyholder churn. It quantifies the potential revenue loss when customers switch to competitors.
The ELP metric focuses explicitly on the monetary risks associated with churn, allowing insurance companies to prioritize their retention strategies.
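For illustration, assuming a flat premium of $100 per policyholder (as in the implementation below): if three policyholders have predicted churn probabilities of 0.1, 0.5, and 0.8, the expected loss of premium is 100 × (0.1 + 0.5 + 0.8) = $140. That is the premium revenue the company should expect to lose from this group if it takes no retention action.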
For a classification use case, you again need only two Python functions, but this time a `calculate` function and an `estimate` function.

```python
import pandas as pd
import numpy as np


def calculate(
    y_true: pd.Series,
    y_pred_proba: pd.DataFrame,
    chunk_data: pd.DataFrame,
    **kwargs
) -> float:
    premium = 100.0  # assumed flat premium per policyholder

    # Read the targets and predicted churn probabilities from the chunk.
    y_true = chunk_data['y_true'].to_numpy()
    y_pred_proba = chunk_data['model_predicted_probability'].to_numpy()

    # Drop rows that have no ground-truth label.
    data = pd.DataFrame({'y_true': y_true, 'y_pred_proba': y_pred_proba})
    data.dropna(axis=0, inplace=True, subset=['y_true'])
    y_true, y_pred_proba = data['y_true'].to_numpy(), data['y_pred_proba'].to_numpy()

    if len(y_true) == 0:
        return np.nan

    # Expected loss of premium: premium weighted by each churn probability.
    total_expected_loss = premium * np.sum(y_pred_proba)
    return total_expected_loss
```
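As a quick local check, here is a minimal sketch with made-up values, assuming a chunk that uses the column names referenced above:

```python
import numpy as np
import pandas as pd

# Hypothetical chunk with the column names assumed by `calculate` above.
chunk = pd.DataFrame({
    'y_true': [0, 1, 0, np.nan],
    'model_predicted_probability': [0.1, 0.8, 0.3, 0.5],
})

# The row with a missing target is dropped inside `calculate`; the remaining
# probabilities sum to 1.2, so the expected loss is 100 * 1.2 = 120.0.
print(calculate(chunk['y_true'], chunk[['model_predicted_probability']], chunk))
```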
The `estimate` function computes the same expected loss from the estimated target probabilities, so the metric can also be tracked when ground-truth churn labels are not yet available.

```python
import pandas as pd
import numpy as np


def estimate(
    estimated_target_probabilities: pd.DataFrame,
    labels: list[str],
    **kwargs
) -> float:
    premium = 100.0  # assumed flat premium per policyholder

    # Take the estimated probability of the churn class and drop missing values.
    estimated_churn_probabilities = estimated_target_probabilities.iloc[:, 0].dropna()

    if estimated_churn_probabilities.empty:
        return np.nan

    total_expected_loss = premium * estimated_churn_probabilities.sum()
    return total_expected_loss
```
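Again purely as an illustration, with made-up probabilities and hypothetical class labels:

```python
import pandas as pd

# Hypothetical estimated probabilities with the churn class in the first column,
# as `estimate` above assumes.
estimated = pd.DataFrame({'churn': [0.2, 0.6, 0.7], 'no_churn': [0.8, 0.4, 0.3]})

print(estimate(estimated, labels=['churn', 'no_churn']))  # 100 * 1.5 = 150.0
```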
Here’s the resulting plot after adding them in NannyML Cloud.
Bonus: Tutorial for Adding Any Custom Metric in Your Dashboard
Log into your NannyML dashboard and find the Custom Metrics button in the top right corner.
You'll find all your metrics ordered according to their problem type.
On clicking “Add a new metric” you will find a window like the one below:
After entering the details, paste your well-tested metric code. In NannyML Cloud, you can set Metric limits to manage threshold values. If you don't set them, the thresholds might become too large, distorting the plot scale. Once all the information is filled out, save your metric.
To apply it, navigate to the Model Dashboard > Management > Settings > Performance for your chosen machine learning model. By clicking the "+ Add Custom Metric" button, you'll see a list of applicable metrics that can be added to your models.
After adding your custom metric, head over to Monitoring > Summary and run performance monitoring to calculate it.
If any issues arise, check the Logs section in the Model Management area. Click on the File Icon to download a detailed .txt file that can help with debugging.
If everything goes well, your metric will be displayed on the Performance page.
Conclusion
This blog covered two key use cases: Premium Value Prediction (regression) and Policyholder Churn Prediction (classification).
By adopting custom metrics, you can align your model's performance with the financial goals of your insurance business.
Machine learning models face inevitable challenges after deployment. As real-world conditions change, issues like covariate shift, concept drift, and data quality degradation can occur. These problems result from shifts in data distribution or the real-world environment, causing model performance to diminish over time.
This is exactly what NannyML helps you solve with Post-Deployment Data Science. It’s all about monitoring and maintaining models after they’ve gone live, identifying shifts in data, and taking corrective actions before they impact decision-making.
Want to know how to get started? Schedule a demo with the NannyML founders, who can help you find solutions tailored to your specific use cases.
Read More…
For more insights into how model performance is evaluated across various industries, explore these blogs: