Technologies built around AI (Artificial Intelligence) are revolutionizing businesses across the world. Given how powerful these technologies can be, companies must be prudent in how they deploy them. AI must be not only effective but also transparent and unbiased. Exactly how to achieve this will vary, but the fundamental solution is unchanging: companies that provide AI products and solutions must prioritize transparency.
AI tools for P&C carriers should not only give insights on a property but also show exactly how those insights were generated. An AI that says “This is a bad property” isn’t useful. On the other hand, an AI that says “The shingles on the northwest side of the roof have come off, and therefore this property is a higher risk” is highly useful. Transparency should be baked into every step of the process, not thrown in as an afterthought.
Our team has thought hard about how to provide transparency without compromising the performance of our models. In this post, I will explain our current approaches to transparency and then detail how we’re expanding upon them with our new Total Confidence Score.
Transparency for Roof Classification
One way we use AI at Betterview is to classify the shape and material of each roof. The best way to be transparent in this case is to provide an intuitive score that indicates how likely we are to be correct. This is the classic confidence score, which is clear and intuitive, yet mathematically rigorous: If we say we are 95% confident, we expect to be correct 95% of the time. For example, if we say that a building has a gable roof, or that the roof is made of asphalt shingles, we provide an exact measure of how confident we are.
Figure 1: Confidence scores for shape and material visible in PropertyInsight.
We provide that intuitive definition of confidence because it is important to be clear about what a “confidence” score is. Some statistical models, such as neural networks, are very complex, and it’s easy to be misled by their output. The results of modern neural networks need to be carefully calibrated so that they represent accurate confidence scores. These calibrated confidence scores are what we provide to our customers.
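As a rough illustration of what "calibrated" means here, the sketch below groups predictions into confidence bins and compares each bin's average confidence with how often the predictions in it were actually correct. It is a generic calibration check, not Betterview's method: the function names, the bin count, and the assumption that confidences lie between 0 and 1 are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def reliability_bins(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> List[Tuple[float, float, int]]:
    """Return (mean confidence, observed accuracy, count) per non-empty bin.

    Assumes every confidence lies in [0, 1] (an assumption of this sketch).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    # One accumulator per bin: [sum of confidences, number correct, count]
    sums = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        sums[idx][0] += conf
        sums[idx][1] += int(hit)
        sums[idx][2] += 1
    return [(s / n, h / n, n) for s, h, n in sums if n > 0]


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Average gap between confidence and accuracy, weighted by bin size."""
    total = len(confidences)
    if total == 0:
        raise ValueError("need at least one prediction")
    return sum(
        (n / total) * abs(mean_conf - accuracy)
        for mean_conf, accuracy, n in reliability_bins(confidences, correct, n_bins)
    )
```

A well-calibrated model yields a value near zero: predictions made at 95% confidence turn out to be right about 95% of the time. A large value signals that the raw scores should not be read as probabilities.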
Transparency for Roof Maladies
When maladies are detected on a roof, we are transparent by spotlighting them for the user. We would never say a roof is in terrible condition without spotlighting the missing shingles, structural damage, or other severe maladies. In the figures below, we show all the maladies we spotlight and an example of spotlights in action.
Figure 2: List of roof maladies detected by Betterview, sorted from most severe to least.
Figure 3: An example of Betterview’s detections on a poor-quality roof.
These individual spotlights are combined into a single score, the Roof Spotlight Index. The Roof Spotlight Index is an aggregate of the spotlights weighted by severity. This score allows our customers to focus their attention on the properties that require immediate remediation.
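To make "an aggregate of the spotlights weighted by severity" concrete, here is a minimal sketch of one way such an aggregate could be computed. The weights, the use of affected roof area, the cap at 1.0, and the convention that a higher value means a worse roof are all assumptions made for illustration; the actual Roof Spotlight Index formula and scale are not described in this post.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping

# Hypothetical severity weights, invented for this example only.
SEVERITY_WEIGHTS = {
    "structural_damage": 1.0,
    "missing_shingles": 0.8,
    "tarp": 0.6,
}


@dataclass(frozen=True)
class Spotlight:
    malady: str           # key into the severity weights
    roof_fraction: float  # share of the roof area affected, 0.0 to 1.0


def severity_weighted_index(
    spotlights: Iterable[Spotlight],
    weights: Mapping[str, float] = SEVERITY_WEIGHTS,
) -> float:
    """Sum each spotlight's affected area scaled by its severity weight.

    The result is capped at 1.0; in this sketch a higher value means a
    roof in worse condition.
    """
    total = 0.0
    for spotlight in spotlights:
        if spotlight.malady not in weights:
            raise KeyError(f"no severity weight for {spotlight.malady!r}")
        total += weights[spotlight.malady] * spotlight.roof_fraction
    return min(total, 1.0)
```

With weights like these, a small patch of structural damage can outweigh a larger area covered by a tarp, which is the behavior a severity-weighted aggregate is meant to capture.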
Transparency for the Roof Spotlight Index
Ideally, once a property is assessed to be in good condition, an insurer could fast-track its approval. But without a measure of confidence in that assessment, how can the insurer be certain? Poor image quality could cause the AI to overlook a major defect in the roof. This is where a confidence score becomes necessary. The Betterview confidence score reflects not only how well our AI performs on an image, but also how well that image represents the real property. Let’s consider an example. Looking at the image below, is it possible to be sure there isn’t serious damage underneath all the foliage?
Figure 4: This building receives a Total Confidence Score of Poor due to all the overhang obscuring the view.
The model would likely be very confident in concluding that this image contains no tarp, but should we share that confidence? Looking at the image, I’m confident that there is no tarp in the image, but I’m not confident that there is no tarp on the roof itself. We combine all the factors that could lead to this kind of uncertainty into a single score: the Total Confidence Score.
The Total Confidence Score draws on a variety of metrics and measurements to ascertain whether our results on the image reflect the real world. For example, we know that our models’ performance depends heavily on the quality of the image: we’re far more likely to overlook missing shingles when the image quality is poor. So we factor image quality into the Total Confidence Score, along with other relevant attributes.
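The idea of folding several sources of uncertainty into one score can be sketched as follows. This is not the actual Total Confidence Score computation: the three factors, the "weakest link" rule of taking the minimum, and the thresholds are hypothetical. Only the tier names (Poor, Low, Medium, High) come from this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfidenceFactors:
    # All three fields are hypothetical inputs scaled to the range 0.0 to 1.0.
    image_quality: float     # 0 = unusable image, 1 = sharp and clear
    visible_fraction: float  # share of the roof not hidden by overhang
    model_confidence: float  # calibrated confidence of the model itself


# Hypothetical cut-offs, checked from highest to lowest.
TIER_THRESHOLDS = ((0.85, "High"), (0.60, "Medium"), (0.35, "Low"))


def total_confidence_tier(factors: ConfidenceFactors) -> str:
    """Map the weakest factor to a tier; anything below 0.35 is Poor."""
    # Weakest-link rule: one bad factor is enough to lower the tier,
    # however confident the model is about the pixels it can see.
    score = min(
        factors.image_quality,
        factors.visible_fraction,
        factors.model_confidence,
    )
    for threshold, label in TIER_THRESHOLDS:
        if score >= threshold:
            return label
    return "Poor"
```

Under these assumptions, a roof that is 80% hidden by foliage lands in the Poor tier even when the model itself reports 99% confidence, which mirrors the tarp example above.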
Testing the Total Confidence Score
The only way to test our score with the quality and accuracy we required was to run an extensive data collection campaign. We had human labelers painstakingly label thousands of images from all across the United States. Teams trained to analyze roofs in aerial imagery determined exactly what the roof conditions were. They labeled properties, counted tarps and overhanging trees, and looked at every missing shingle on every roof on every property.
Figure 5: Properties with High confidence are twice as likely to have accurate AI as those with Poor or Low confidence.
After the labeling campaign, we had the true condition of each property. We then measured how far our AI’s predictions were from that ground truth. Several important findings validated our approach:
- The majority of properties (90%) have Medium or High confidence.
- Very few properties (10%) have Low or Poor confidence.
- Properties with High confidence are the most likely to have accurate AI – twice as likely as those with Poor or Low confidence.
- Properties with Poor confidence are very rare (<4%), and they are the most likely to have AI errors.
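Findings like these come from grouping the labeled properties by confidence tier and comparing the AI's output with the human labels in each group. The sketch below shows one plain way to do that grouping. The record format and function name are assumptions for illustration, and any data passed in would be your own; no figures from the study are reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_by_tier(
    records: Iterable[Tuple[str, bool]],
) -> Dict[str, Dict[str, float]]:
    """Summarize (tier, ai_matched_label) records per confidence tier.

    Returns, for each tier, its share of all properties and the fraction
    of its properties where the AI agreed with the human label.
    """
    counts = defaultdict(lambda: [0, 0])  # tier -> [properties, correct]
    for tier, ai_matched_label in records:
        counts[tier][0] += 1
        counts[tier][1] += int(ai_matched_label)
    total = sum(n for n, _ in counts.values())
    return {
        tier: {"share": n / total, "accuracy": n_correct / n}
        for tier, (n, n_correct) in counts.items()
    }
```

Comparing the "accuracy" values across tiers is what supports a statement such as "High confidence properties are twice as likely to have accurate AI", while the "share" values show how common each tier is.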
The most exciting aspect of our efforts to provide transparency is how the different approaches can be used in combination. A model that admits when it isn’t confident is a good start, but its utility is limited. Users also need to understand what is behind the model’s score so that they can evaluate it themselves. That’s where the roof spotlights come in. Users can look at the specific spotlights and see whether they are valid. By combining all of these elements, we are putting the power of state-of-the-art AI into the hands of our customers.
AI is already having a huge impact on the insurance industry. But efforts so far have been held back by a lack of transparency. That’s why we are so excited to demonstrate our new transparency measures. Insurers who incorporate our confidence scores into their decision-making will be at a significant advantage: they can make faster and smarter decisions throughout the policy life cycle, cut down on their expenses, and predict and prevent future losses.