CSAA, the California-based insurer, traces its roots back to the early days of automobiles: 1914. It provides auto, homeowners, and other insurance to about 2.5 million AAA members. We spoke to Debbie Brackeen, CSAA’s Chief Strategy and Innovation Officer, and Beti Cung, the company’s Vice President of Innovation and Labs, about what they’ve learned from both building AI projects in-house and working with outside AI technology vendors like property intelligence provider CAPE Analytics and ZestyAI, which uses artificial intelligence to model wildfire and other catastrophic risk.
Tell us about the work you’ve been doing in AI.
Brackeen: For about five years now, we’ve had a formal AI and machine learning-oriented program. It’s a big journey, but insurance is one of the original, very data-centric industries. And for me as an innovation businessperson, some of the newer, more interesting areas to think about, in terms of overall company growth and diversification, are new business models where there is no historical data from an actuarial perspective to draw on when you think about how you might underwrite new emerging risks.
We started on the journey a while ago. We hired one of the company’s first data scientists to start building our first machine learning-based model, which is now fully operationalized, predicting people who are going to churn out of our book outside of the normal renewal cycle. And then you learn that before you can actually get to building the model, there is a whole lot of work that needs to happen to clean up the data.
One of the other functions that I lead is our venture capital function. And one of the first investments that we made is a company called CAPE Analytics. CAPE Analytics is an insurtech startup that leverages satellite and aerial imagery, data, and machine learning algorithms to assess roof type and condition and surrounding property condition. We started piloting that with our homeowners product team probably four-and-a-half or five years ago. You know, we don’t physically send human home inspectors out to every single property. You can’t have one go up on every single roof.
So CAPE Analytics is a more automated way to do something we’ve been doing for a long time, which is inspect homes, but it allows for more, better, richer, faster data that enables us to assess a property condition in the underwriting process.
And then there’s Zesty, which is just really interesting and unique. We use both actively in our business; Zesty has the Z-FIRE score that you can use in pricing, and that’s a really important partnership as well.
Maybe we can start with churn prediction. How did you start developing that?
Brackeen: We had created a huge data lake, and we didn’t have any machine learning-oriented models. We had prioritized a whole bunch of use cases for machine learning and AI: we did kind of a heat map: high risk and high reward. I think the churn model came about because we were trying to think of use cases that would force us to look at every single data source to see if certain trends or themes emerged looking across all these different data sets.
We usually think about retention in the context of a renewal cycle, which is 60 days before a policy expiration date, and we wanted to look more broadly than that. And ultimately the model was successful at doing that.
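As a rough illustration of the idea of looking beyond the renewal window, here is a minimal sketch of how mid-term churn labels might be derived from policy records. The field names, dates, and 60-day threshold logic are illustrative assumptions, not CSAA’s actual schema or model.

```python
from datetime import date, timedelta

# Hypothetical policy records; field names are illustrative, not CSAA's schema.
policies = [
    {"id": "A1", "expires": date(2024, 12, 1), "canceled": date(2024, 6, 15)},
    {"id": "B2", "expires": date(2024, 12, 1), "canceled": None},
    {"id": "C3", "expires": date(2024, 12, 1), "canceled": date(2024, 11, 20)},
]

RENEWAL_WINDOW = timedelta(days=60)  # the normal renewal cycle mentioned above

def is_midterm_churn(policy):
    """True if the policy was canceled before the 60-day renewal window opened."""
    canceled = policy["canceled"]
    if canceled is None:
        return False
    return policy["expires"] - canceled > RENEWAL_WINDOW

labels = {p["id"]: is_midterm_churn(p) for p in policies}
print(labels)  # A1 left mid-term; C3's cancellation falls inside the renewal window
```

Labels like these would be the training target for a churn model that looks beyond the usual 60-day retention focus.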
Everyone focuses on the model-building part, because that is the sexy part of it. But the actual hard work, and the time that it takes, is around extracting the data and putting it into a format [so] that you can train a model…
— Beti Cung, CSAA
Cung: In terms of the data cleansing, everyone focuses on the model-building part, because that is the sexy part of it. But the actual hard work, and the time that it takes, is around extracting the data and putting it into a format [so] that you can train a model, identifying the different features that contribute most meaningfully to the model.
People are always emphasizing, “Oh, I’ve got lots of data.” But a lot of data also means a lot of noise. So it is really the job, and the art, of distilling signal from noise to get these models to really perform and be interesting. And that’s where a lot of the time and the art go in building them out.
The team’s since looked at a lot of different areas. Another is around acquisition, what we call the propensity to bind. That helps us identify who is the likely best customer for us.
And how do you make the decision of when to build these models versus bring in outside technology?
Brackeen: Within my division, we have a venture capital arm, and strategy rolls up under me alongside innovation and labs. And I would say it’s really helpful to have all the kinds of innovation tools and strategy under one roof, or in the same toolbox, because I don’t actually care if we find a really great third-party company, which could be a company we invested in, or any other startup or mature company. CAPE Analytics is a great example. That’s not a capability that we were going to build in house, but they’re really good at what they do. And they’re serving a really useful function in our homeowners business alongside Zesty.
Then there’s certain things that make a lot of sense for us to build in-house.
Cung: I do think we apply that build-buy-partner lens to decide. Part of the filters that go into that decision have to do with, is it strategic to our business, in which case we may want to build and keep it in house, or because the generic models out there don’t really address what we really need. And so for retention, it is very specific to our particular customer profile. We’ve looked at the more generic models that are out there, and they’re really not better for us—it’s sort of a little watered down, if you will.
But certain things like Zesty, or Cape, their models will be better than what we potentially could build in house because they get a range of data across the industry, across different properties.
For the in-house models, like churn and propensity-to-bind, what did the process look like? How did you iterate to get something to put into production?
Cung: We use a framework called the AI business canvas, because ultimately, it’s about using the technology to solve a business problem, right? The canvas is very structured, so we’ve found it useful to use that framework and structure our work. (Here’s one example of a canvas.)
A lot of times people let the technology get ahead of them. We try to ground it in, what is the business question we’re trying to answer? And if we can’t frame it in a question that a predictive model can answer, then go back and retry.
For the propensity to bind, that was a collaboration with the marketing team on what they thought were their pressing problems and how we could help with a model. So the next step is usually a canvassing of data. There’s a phrase that, I think, one of the early leaders in data at Yahoo coined. He called it the “data safari”— the idea of looking through the data internally and externally to understand what’s there because sometimes you frame a question and you can’t answer it, because you don’t have the data or it’s not available. So there’s that safari idea, which is go look for the categories of animals or, in this case, categories of data you have to inform your model.
Then, the team starts running evaluations around the data, what’s generally called exploratory data analysis. It looks at the amount of data we have, the type of data, basic statistical analysis, any gotchas there, and then they start putting together a model pulling in data they think is relevant, running the model, and testing to see if it’s predictive. Like I said, everyone focuses on the “tada! The model’s built,” but that part is actually the least time intensive.
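The exploratory pass described above can be sketched in a few lines. This is a generic illustration of that kind of first look at the data — the columns and values are hypothetical, not CSAA’s actual features.

```python
import pandas as pd

# Illustrative customer data; columns are hypothetical, not CSAA's actual features.
df = pd.DataFrame({
    "tenure_years": [1, 8, 3, 15, 2],
    "num_policies": [1, 3, 2, 4, 1],
    "churned":      [1, 0, 1, 0, 1],
})

# Basic statistical profile: counts, means, spread -- the pass for "any gotchas".
print(df.describe())

# Simple feature screen: correlation of each candidate feature with the label,
# a first hint at which features might contribute meaningfully to a model.
print(df.corr()["churned"].drop("churned"))
```

In this toy data, both tenure and policy count correlate negatively with churn, which is the kind of early signal that guides which features go into a first model.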
How do you evaluate if it’s worth putting into production?
Cung: Usually you want to set up some kind of metrics. We will do back testing. We’ll also compare it against business as usual, to see if the model provides a lift.
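One common way to make the lift-over-business-as-usual comparison concrete is to measure, on held-out historical data, how well the model’s top-scored accounts capture actual churners versus the base rate. The scores, the metric choice, and the baseline rule below are illustrative assumptions, not CSAA’s actual evaluation.

```python
# Minimal sketch of a lift check against business as usual (BAU) on held-out data.
# Scores and the BAU rule are illustrative, not CSAA's actual metrics.
holdout = [
    # (model risk score, actually_churned)
    (0.91, True), (0.85, True), (0.40, False), (0.75, True),
    (0.20, False), (0.65, False), (0.10, False), (0.88, True),
]

def precision_at_top_k(scored, k):
    """Share of true churners among the k accounts the strategy flags."""
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    return sum(churned for _, churned in top) / k

# BAU here: flag accounts at random -- its expected precision is the base rate.
base_rate = sum(c for _, c in holdout) / len(holdout)
model_precision = precision_at_top_k(holdout, k=4)
print(f"lift: {model_precision / base_rate:.2f}x over base rate")
```

A lift meaningfully above 1.0x on back-tested data is the kind of evidence that justifies moving a model into production.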
And in these cases, did you come across any surprises in the process?
Brackeen: I’ll give you one from a less technical perspective. Building on Beti’s description, the actual building of the model is sort of the easiest part. Getting all the data ready, that’s a huge effort. And then after the model is built, when you think you’ve got it tuned enough to begin testing, what I discovered that was a little bit unexpected was getting people to trust and believe in what the model was doing, because it was a little bit of a black box.
It took a while for people to get comfortable trusting the output of the model… — Debbie Brackeen, CSAA
I think we’re more over that hump now, culturally, as an organization, but in the early days, when we first started sharing Version 1 of the model, which I think was a California-only model at that time…it took a while for people to get comfortable trusting the output of the model and planning actions that we might take as a result of that output from the model.
What got you over that barrier?
Brackeen: Time, and explanation, and designing experiments that we ran internally before we unleashed anything in the real world, if you will.
As far as what you mentioned about the challenges of cleaning and preparing data, has that changed at all, or is that still something you need to do for each model you build?
Cung: It’s still a journey, right? Our company already had a pretty sizable data transformation planned, moving to a large data lake in the cloud and breaking down a lot of the data silos that we have internally. So that was already planned by our IT organization. And once that’s complete, that’ll facilitate some of our work. But right now, while we’re kind of mid-flight, there is still quite a bit of manual aggregation. And also, because machine learning models look at data really broadly, there’s always going to be some work around normalizing data and putting it into the right formats to use. Because if you’re pulling data from two very disparate sources, at some point you might need to get them into the same configuration.
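Getting disparate sources “into the same configuration” often comes down to mapping each system’s conventions onto one common schema. A minimal sketch, with entirely hypothetical source systems and field names:

```python
from datetime import datetime

# Two hypothetical source systems with different conventions, mapped to one schema.
source_a = [{"POLICY_ID": "ca-001", "START": "03/15/2021"}]    # MM/DD/YYYY dates
source_b = [{"policy": "CA-002", "start_date": "2022-07-01"}]  # ISO 8601 dates

def normalize_a(rec):
    return {
        "policy_id": rec["POLICY_ID"].upper(),
        "start": datetime.strptime(rec["START"], "%m/%d/%Y").date(),
    }

def normalize_b(rec):
    return {
        "policy_id": rec["policy"].upper(),
        "start": datetime.strptime(rec["start_date"], "%Y-%m-%d").date(),
    }

# After normalization, records from both systems share one format.
unified = [normalize_a(r) for r in source_a] + [normalize_b(r) for r in source_b]
print(unified)
```

Until a central data lake removes the silos, this kind of per-source mapping is where much of the manual aggregation effort goes.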
Is there anything else you’re excited about here?
Cung: I like the Steve Jobs quote, “a bicycle for the mind.” With all the hype around AI, this is AI’s moment in the sun, even more so than before. But it is sort of a bicycle for the human mind, and we never think of taking the human completely out of the loop. And you see in all the work that we’re doing, there is always still a person, an expert in the loop, that continues to train that model and continues to review the results.
And I think the applications are pretty endless: roof assessment, property level assessment. We’re leveraging computer vision for water damage detection. We’re using it in marketing for acquisition and retention. We’re looking at it in terms of wildfire risk. As a company, we’re very focused on our purpose—to help prevent and prepare for and recover from life’s uncertainties—and I think AI will help us do that.
Brackeen: And just to add to that, we’re getting ready to run our first-ever open innovation challenge. We always do innovation challenges internally, and we’ve literally created new businesses as a result of some of the ideas that have been generated. But we’re really excited to open that up more broadly, and we’re partnering with IDEO on this Climate Resiliency Challenge.
The biggest part of our market share in homeowners is in Northern California. So wildfire risk is an ever-present danger. But I’m really hopeful that as we launch this open innovation challenge, we get all sorts of companies and AI experts and entrepreneurs thinking about new and innovative ways we can leverage new technologies like AI or any other ideas they may have to help not just our customers but any homeowner protect their home or prevent wildfires or whatever we can do.
Key insights
• Be patient in explaining and demonstrating to colleagues how AI models work; that’s the only way to overcome reluctance and resistance.
• Most companies can expect some heavy lifting, and manual aggregation, when creating data sets that AI can tap into.
• CSAA considers whether to build, buy, partner, or invest in AI solutions. Is it strategic to the business to “own” a tool? Or can more value be created by working with an outside partner, or making a corporate VC investment?
• Steve Jobs once called personal computers a “bicycle for the mind.” In similar ways, AI may augment work that humans perform, without taking them out of the loop.