Welcome to our new blog series, InsurTech Introspect: Thoughtful Conversations with Betterview. We invite you to join our Chief Science Officer, Dan Shoham, in a series that explores and explains the scientific concepts that underpin our solutions. InsurTech Introspect aims to make complex subjects easily understood so that together, we can continue to push the industry forward in a thoughtful way. If you have any feedback on this post, we’d love to hear it: email@example.com. Without further ado, here’s our first post:
Underwriting is one of the oldest forms of predictive analytics. The Babylonian Code of Hammurabi (circa 1750 BC) detailed merchant shipping insurance. After the Great Fire of London (1666), dozens of commercial fire insurers were established, heralding the modern age of property insurance. Benjamin Franklin founded America’s first fire insurance company (1752).
With such deep roots, the art and science of risk assessment and competitive premium pricing have had plenty of opportunities to evolve, self-correct, and mature. Whenever an insurer discovered a predictive method that incrementally improved risk assessment, they gained a competitive advantage, forcing their competitors to either follow suit or exit the market. Over the millennia, actuarial risk prediction became so good that finding new incremental predictors became an exercise in diminishing returns. The fact that something wasn’t measured before and is predictive of risk is no assurance that it adds incremental predictive power to an already mature methodology. In this post, we will endeavor to answer whether geospatial imagery AI really can provide a genuine, quantifiable, dollar-denominated, hard value proposition.
Geospatial imagery does bring new predictive information. If a roof has a defect or is poorly maintained, that information isn’t captured by standard actuarial predictive factors. That’s why, even in the mature underwriting process, site inspections are still economically justifiable, at least some of the time.
However, the question remains: Is the incremental predictive value enough to drive a legitimate business case?
How Does Geospatial Imagery AI Work?
With the wide availability of aerial and space imagery capturing nearly every structure on the planet, the question is not about obtaining the imagery, but rather what to do with it. Not surprisingly, early adopters of geospatial imagery in underwriting simply replicated the annotations and forms previously performed by on-site inspectors. In the past, a roof inspector would log observations of rust, debris, tarp, damage, ponding, or other maladies on-site; today, these annotations can be made using imagery. Shape, material, footprint, and other parameters can also be surmised. The process initially involved trained human annotators, and over time, transitioned to machine vision and other automated algorithms.
As a cost-savings and process standardization method, annotation automation is valuable. However, to generate game-changing value for insurers and society, geospatial-imagery-derived AI must directly impact the money side of the insurance equation: premiums and losses. The AI needs to find, directly in the geospatial imagery, a propensity toward claim losses. AI-generated insights must go beyond what’s already been accounted for during the actuarial underwriting process and reflected in the premiums. In other words, rather than being trained to find all annotations, the AI system should instead be trained to predict claims risk and severity.
Figure 1 (below) shows the architecture of a machine vision system trained to predict claim losses. Here we combine the traditional machine vision Deep Learning Convolutional Neural Network (DLCNN) system (which finds individual features) with a risk-prediction scoring system (which identifies claims risk and severity). By allowing the combined neural network to train directly against known claims, the system is optimized to locate the risks, rather than simply annotate what’s readily apparent. Furthermore, it can find correlations between observable, but not annotatable, features and claims. For example, a poorly maintained yard, while not directly contributing to structure risks, may indicate an overall attitude of indifference, which may be predictive of losses even if there are no externally visible defects in the structure itself.A trainable predictive model that combines the machine vision and scoring elements into a unified optimization can take advantage of such correlations.
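To make the idea concrete, here is a minimal, hypothetical sketch in PyTorch of that kind of combined architecture: a small convolutional backbone stands in for the DLCNN feature extractor, a scoring head maps its features to a single claims-risk score, and the whole network trains end to end against historical claim outcomes. The layer sizes, class names, and synthetic data are illustrative only; this is not Betterview’s production model.

```python
import torch
import torch.nn as nn

class ClaimsRiskModel(nn.Module):
    """Toy end-to-end model: a convolutional backbone extracts features
    from an aerial image tile, and a scoring head maps those features
    to a single claims-risk score (a logit)."""
    def __init__(self):
        super().__init__()
        # Stand-in for a full DLCNN feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Scoring head: one number per property, interpreted as relative risk.
        self.scoring_head = nn.Sequential(
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 1),
        )

    def forward(self, image):
        return self.scoring_head(self.backbone(image)).squeeze(-1)

# Train directly against known claim outcomes (1 = claim filed, 0 = no claim),
# so the network optimizes for risk prediction rather than for annotation.
model = ClaimsRiskModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

images = torch.randn(8, 3, 64, 64)          # batch of aerial tiles (synthetic)
claims = torch.randint(0, 2, (8,)).float()  # historical claim labels (synthetic)

optimizer.zero_grad()
loss = loss_fn(model(images), claims)
loss.backward()
optimizer.step()
```

Because the loss is computed against claims rather than against feature annotations, any visible signal that correlates with losses, whether or not it has a name on an inspection form, can contribute to the score.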
How Does Scoring Work?
Perhaps the best-known example of scoring is the FICO consumer credit score. We’ve all heard about it, and perhaps you even track your score, but how does it work?
To answer this question, we note that behind every established scoring solution there is a history of a human-mediated process that was gradually automated and scaled. Lending is at least as old as insurance – even the Hammurabi Code detailed credit risk pricing.
As with insurance underwriting, lenders must carefully balance risk management and pricing competitiveness. Since the late 19th century, consumer credit risk evaluation in the US has relied on credit bureaus. These organizations collect, store, and market detailed, individualized histories of consumer credit behavior: outstanding loans, payments made, defaults, bankruptcies, and similar data.
Before the advent of credit scoring (the earliest credit scores were created in the 1950s, but wide usage did not come about until the 1970s and ’80s), professional loan officers would review an applicant’s credit report to peek into the consumer’s overall credit posture, their other obligations, capacity to pay, and financial maturity – all presumed predictors of default risk. The modern age of predictive analytics, arguably, started with credit scores. Scorecards, as they were originally named, are statistical models that combine all the predictive information available in the credit report to generate a single number indicating the likelihood of future default.
Contrary to popular belief, the credit score does not evaluate the probability of default. That probability depends on many macroeconomic factors – for example, the future state of the economy and the employment rate – that are well outside the borrower’s control and not predictable from their credit report. Instead, the score ranks borrowers by their likelihood of default. All else equal, a borrower whose credit score is 699 is judged more likely to default than one whose score is 700, regardless of the details of the loan, the future state of the economy, or any other factor. From the lender’s perspective, that’s all that’s needed.
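As a toy illustration of what a scorecard does, the sketch below adds up points from a handful of hypothetical credit-report attributes to produce a single number. The attributes, weights, and base score are invented for this example and bear no relation to how FICO actually computes scores; the point is only that the resulting number rank-orders applicants rather than stating a probability of default.

```python
# A toy points-based scorecard: each credit-report attribute contributes
# points, and the total rank-orders applicants by default likelihood.
# Attributes, weights, and scaling are hypothetical.

SCORECARD = {
    "years_of_history":     lambda v: min(v, 20) * 5,    # longer history -> more points
    "on_time_payment_pct":  lambda v: v * 2,             # 0-100 -> 0-200 points
    "utilization_pct":      lambda v: (100 - v) * 1.5,   # lower utilization -> more points
    "recent_delinquencies": lambda v: -60 * v,           # each delinquency costs points
}

def score(applicant: dict) -> float:
    base = 300  # arbitrary base so totals land in a familiar range
    return base + sum(rule(applicant[attr]) for attr, rule in SCORECARD.items())

# Two applicants: the score says nothing about the absolute probability of
# default, but the higher-scoring applicant is judged less likely to default.
a = {"years_of_history": 12, "on_time_payment_pct": 98, "utilization_pct": 20, "recent_delinquencies": 0}
b = {"years_of_history": 3,  "on_time_payment_pct": 85, "utilization_pct": 70, "recent_delinquencies": 2}
print(score(a), score(b))  # a outranks b
```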
The lender now has an unlimited tuning mechanism with which they can micro-optimize their entire actioning and credit policy. Every lending policy, every due diligence protocol, every marketing outreach, every collateral demand, and every portfolio maintenance procedure can now be confidently designed around score-threshold cutoffs. Below a score of X, we won’t lend at all; above a score of Y, we don’t need employment verification; and so on. Figure 2 (below) shows the 7 steps of how credit scoring works in the real world.
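A hypothetical sketch of such threshold-driven actioning, with invented cutoff values and actions, might look like this:

```python
# Illustrative score-threshold policy: every business action is gated by a
# cutoff on the same score. Cutoffs and actions here are hypothetical.

DECLINE_BELOW = 580            # "below a score of X, we won't lend at all"
SKIP_VERIFICATION_ABOVE = 740  # "above a score of Y, we don't need employment verification"

def lending_policy(score: int) -> dict:
    if score < DECLINE_BELOW:
        return {"decision": "decline"}
    return {
        "decision": "approve",
        "employment_verification": score <= SKIP_VERIFICATION_ABOVE,
        "collateral_required": score < 650,  # another hypothetical cutoff
    }

print(lending_policy(560))  # {'decision': 'decline'}
print(lending_policy(700))  # approved, verification still required
print(lending_policy(780))  # approved, verification waived
```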
In the 1990s, and even more so in the 21st century, scoring expanded into other areas, including fraud control and marketing. The paradigm was the same: a low-frequency, high-impact event is statistically predicted using known parameters, and protocols are designed around score thresholds. For example, decline a credit card transaction if it scores above X, or send marketing outreach to prospective consumers scoring above Y. Figure 3 (below) shows how credit card fraud scoring works following substantially the same 7 steps.
The scoring methodology translates directly to the use of geospatial data in underwriting and renewing structure insurance. Following the well-trodden methodologies of credit and fraud scoring, the same 7-step protocol is intuitive and straightforward. Figure 4 (below) shows a typical Betterview scoring solution.
The ever-widening use of scoring created some interesting chicken-and-egg problems. Business managers needed to see model performance parameters to decide which thresholds, protocols, and action choices would generate the best financial returns, while modelers needed to know exactly how their model would be used in order to optimize those same performance parameters. What was needed was a model performance metric that both sides could work with: modelers could optimize this metric regardless of how the model would be used, and business managers could develop policies driven by this performance metric and decide how much investment in modeling is merited based on a quantifiable anticipated value proposition. Such a metric would need to be scientifically rigorous and precisely measurable for the modelers, while also being intuitive and directly translatable to quantifiable business benefits for the managers.
In Part 2, we will introduce the Kolmogorov-Smirnov (KS) metric, which achieves these goals but might require adjustments to work properly in mature underwriting environments. We will then explain those adjustments.