GenAI Hits the Road with Mercedes-Benz

By Hadley Thompson |  June 10, 2024

In this episode, we connect with James Liu, the Head of MB.OS Customer Experience at Mercedes-Benz. Liu oversees a team responsible for human-machine interface concepts, UX design, front-end engineering, speech, and AI platforms.

That includes prototypes that leverage generative AI technologies like ChatGPT and Stable Diffusion. “We’re in a beta period where we are learning with the technology,” Liu says.

Liu reports to Mercedes-Benz’s Chief Software Officer, and one topic we explore in this episode is bringing new thinking about digital experiences and data into a company culture that is deeply rooted in engineering, design, and quality. “The hard transformation,” Liu says, “is on the operational side. It’s how a traditional metal-bending company acts a little bit more like a software company that cares not only about selling the product, but how a customer uses the product, how we then iterate over the feedback that we get and the data that we get. To do that at scale… is a really hard challenge.”

You can subscribe to our podcast, “Innovation Answered,” on Spotify, iTunes, Stitcher, or Google Podcasts. A transcript of this episode is below.


Scott Kirsner: 

Welcome. You’re listening to the Innovation Answered podcast. Innovation Answered is the podcast from InnoLead, the web’s most useful resource for corporate innovators and change-makers. If that sounds like you, we encourage you to subscribe to the podcast, so you’ll catch all of our future episodes.

I’m Scott Kirsner, CEO and Co-founder of InnoLead, and in this episode, we talk about generative AI, speech interfaces, and evolving the customer experience for one of the world’s top luxury car brands: Mercedes-Benz. At the Consumer Electronics Show earlier this year, Mercedes made a major splash with a new virtual assistant that leverages advanced software and generative AI — along with on-screen visuals — to provide more proactive support and a more intuitive way of interacting with the vehicle. 

[ Audio from Mercedes-Benz demo video. ]

One of the people behind that project is James Liu, the head of customer experience for the Mercedes-Benz operating system. Liu reports to the company’s Chief Software Officer.

Scott Kirsner: 

Tell us a little bit about how your career took you to Mercedes.

James Liu:

James Liu, Head of MB.OS Customer Experience, Mercedes-Benz

Nothing beats being born at the right time to the right people. My mom was an artist, and my dad was an engineer. I really just lucked out, coming to my professional life, at the explosion of the user experience as a profession. I was trained in design, fine art, applied mathematics, and computer science, and I started, from the beginning, building little websites all the way to starting to work on big platforms. I had a diverse career inventing products and services at Microsoft and at little startups. Ultimately, in 2013, I joined Daimler. Daimler was really investing in building the first software hub in Silicon Valley. 

I tried to bring software development in-house at Mercedes-Benz, and I was lucky enough to help ship the very first MB user experience in 2018, with the A-Class. I took a couple of detours and I came back to the company about a year ago to lead Global Customer Experience.

Scott Kirsner: 

Tell us a little bit about that customer experience role that you now hold, and where it sits in the organization.

James Liu: 

I always joke with my team that we’re the last layer of light and sound waves between all of our technologies and the customer. Practically speaking, my teams are responsible for HMI [human-machine interface] concept, UX [user experience] design, front-end engineering, speech, and the AI platform. I report to our chief software officer, Magnus Östberg, and, in reality, great customer experience is the result of teams working together. I like to think of myself as really orchestrating all of our global collaboration.

Scott Kirsner: 

You said one thing, HMI. Does that stand for human-machine interface? Am I right on that? 

James Liu: 

That’s correct. 

Scott Kirsner: 

That’s great. You also said that there’s a chief software officer at Mercedes-Benz. How long has that role been in existence?

James Liu: 

I think Magnus Östberg started with the company almost three years ago now. This is a recognition in our company that software development is such an important part of all the experiences and every customer feature we’re trying to bring to life. Magnus reports directly to both our CTO as well as to our CEO.

Scott Kirsner: 

It’s interesting, in a previous podcast episode, we were talking about the idea of the software-defined vehicle, which I think is a term that floats around Silicon Valley a lot. Maybe it floats around Detroit and Stuttgart and other places as well. You ask people, well, are there really software-defined vehicles out there today? It seems like we’re in this transition where every automaker would acknowledge we’re on the road to a vehicle where software in the vehicle, spoken interface software, probably mobile app software, is a bigger part. 

How do you think about where we are on that journey today?

James Liu: 

Oh, man, that’s a great question. I think you really have to break it apart from both operationally, how a company operates with a software-oriented mindset, and of course, on the other side, defining software-related products, features, and services to the customer. 

I think we’re still absolutely on that journey. I think Mercedes-Benz, we’ve been fortunate. I think we’ve had some foresight as an OEM to bring some of the software-driven experiences to the customer. I think the hard transformation is on the operational side. It’s how a traditional metal-bending company acts a little bit more like a software company that cares not only about selling the product, but how a customer uses the product, how we then iterate over the feedback that we get and the data that we get. To do that at scale, to do that globally manufacturing-wise at scale, is a really hard challenge. I think that part specifically, many OEMs, ourselves included, are still on a really large journey towards. 

Scott Kirsner: 

So what’s your big vision for the in-car experience? If you think about the next three to five years, how does AI filter into that? How does personalization, how does speech filter in? People are still getting accustomed to having a car that you can speak to, and maybe that’s the easiest way to get it to do something. 

James Liu: 

I think there are a couple of big factors that are driving the revolution there. Speech technology in the classic sense has always been really good in a vehicle. You’re driving, you don’t want to touch too many buttons, and you want to keep focus on the road. So cognitive-load-wise and driver-distraction-wise, the vehicle has always been a great environment for us to leverage the power of speech technologies. 

In the last year or two, we’ve had a huge revolution in GenAI, and that has had a dramatic impact on just the architecture of first-generation speech technologies. I think that’s a huge trend that we’re driving, obviously, a new invention from the technology side. I think the second big trend is, of course, ADAS systems: we’re seeing more and more autonomous driving capability in our vehicles. We are one of the few OEMs that are level three certified in the United States. If you do not want to have to think so hard, you can look at the convergence of those two core technologies. I believe that we’re starting to bring a vehicle to life that, I think, will really feel to the customer like their first robot. It’s not only something that I command and control, that takes me from A to B, but is part of my living room and part of my house, and I can interact with it in such a natural, empathic way. I really think it’s going to feel like this robot that you own, which I’m really excited to help bring to life.

Scott Kirsner: 

Say a little more about autonomy and why that’s going to be important to people because I do think with some of the experiments around the world where we’ve had autonomous taxis, people say, “Oh, that’s a use case that I would actually rather be in an Uber with a human driver.” That’s the context for autonomy that a lot of people have in mind when you use the word right now. 

James Liu: 

Oh, absolutely. I really like talking and working with our ADAS engineers; they try to drive as much as they can in the ADAS world. They literally, every day, going to work, running their groceries, always have an assistance system on. Especially looking at some of the more advanced newer-generation technologies, it’s less about the robotaxi scenario where people are picking you up and driving around everywhere, but really about giving you, the customer, time back. Our world is still very much designed around vehicles, especially in the US, and the amount of time you can get back in the vehicle. You can be on the highway, you can turn on your Drive Pilot, and just actually take your eyes off the road and do something with it. You can go out there if you want to relax and watch your movie, or answer your emails. 

We all have suffered from getting the notification while driving in the car. So for me, the context is less about the taxis that will replace Uber, but actually giving me more freedom. I think that is why customers want to own a car in the first place. Not only to give you freedom of mobility, but also to give you time back during that mobility. 

I think that is really going to be a huge, huge benefit to our customers.

Scott Kirsner: 

Talk a little bit about what you launched this year, what its capabilities are, and what it’s called.

James Liu: 

We have about 10 million connected vehicles out in the world. The things that we launch continuously are what I like. The things that are improvements to speech technology recognition, better ASR [automatic speech recognition], better models, and getting rid of those little customer complaints. We talked about software-defined vehicles; our job now is not just to sell a car, but actually to make sure that our customers and their experiences are improving every single quarter, every single year. Those little unnamed bug fixes, all those things, are already such a huge benefit that we’re able to provide for our customers. 

We did launch the E-Class, which brought a whole ton of really cool features to the customer. Speaking of AI, we actually have a feature there called Routines, which allows our customers to automate, both manually as well as automatically, repeatable things that you do in the car. The classic example is when I drive up to work, I always put down my window. Now with Routines, it actually does that automatically. A couple of little quality-of-life features there are great. We’ve also launched a lot of beta experiences. One of the key things that we launched was a ChatGPT beta last year in the US, and we’ve learned quite a lot from that experience as we’re trying to understand this GenAI world.

Scott Kirsner: 

Talk about what the ChatGPT experience does. I’m also curious, maybe we can talk a little bit about build/buy/partner, and how you think about that at this moment when technology is moving so quickly.

James Liu:

Absolutely. So we have a global team, we have R&D centers all around. I’m currently in Germany right now, and we have a large R&D team sitting in Sunnyvale. It’s actually that crew of people that previously developed our speech platform. These are engineers. Actually, they had some free time and really wanted to explore. They were able to use our speech platform to integrate ChatGPT as an experience and we built it. We launched it down to our public, our beta fleet in about three months. They’re a really interesting group of people. We, as a global company, try to capture the spirit of the zeitgeist from all of our different parts of the world. If you talk about China and our R&D team, there are different providers, different partners, and different technologies that they’re also exploring on that side of the world. 

To your partner question, our overall philosophy is that we are the architects. What that means is, at the end of the day, as an OEM, our core business is not to invent new AI models. That’s not what we do. We want to work with the best partners out there, and to bring the best technology to our customers, especially with things like ChatGPT, with Microsoft, and with Google. We have great partner relationships with both of them. Everywhere from cloud to ADAS [advanced driver assistance systems]. For us, it’s really about working with all these partners together to find the best experience to give our customers. Even this last week, we were at Microsoft Build and we’ve announced even a couple of new betas that we have working with Microsoft to bring to our US customers in the next couple of months.

Scott Kirsner: 

So give us an example or two of some of these betas, whether it’s with ChatGPT or Microsoft technology. What are some of the things that they can do?

James Liu: 

A lot of this comes from our learnings with ChatGPT. One of the core learnings that we had from there was if you’ve experienced first-generation speech technologies, if you talk with your Alexa or your Google Assistant or Siri, often, you’ll run into these white spots or dead ends, where they say, “Man, I can’t help you with that.” It’s not magic. We have to build and train all these models to understand what you’re saying. What’s awesome about GPT in that context is that it always has an answer. What that does is it builds trust in our customers. 

For the customers that sign up with beta, that use ChatGPT, surprise, all of their engagement went up. We were able to reach 99.9% consistency and quality. You’re always getting a response back and more trust in the system. We learned a lot of customers asked lots of different things, as opposed to older systems where they would only narrowly define that they could use speech for phone calls, or I can use it for navigation. They started to ask a lot of questions. One of the betas that we will be launching is the “Ask me anything” feature. We’re grounded with the idea that we are giving customers the ability to query the internet and give a great answer back in the voice context. 

It’s not a brute-force method; we can use the power of OpenAI and Microsoft to give that experience to the customer. We’re also exploring a lot with generative images. One of the fun experiences that we’re looking at is getting customers to generate pictures in their car and deliver that to the car itself for a little bit of fun.

Scott Kirsner: 

Were there some debates that you needed to have or challenges you needed to work through? When you are working with a technology like OpenAI that is pretty new, is pretty unproven, people obviously talk about how it can answer any question, but sometimes it gives you an incorrect answer. 

It is a squishy technology that is changing by the day. What were the discussions or what were the ways you had to get clearance to work with it?

James Liu: 

There’s a good reason we have not yet fully productized it. We’re in a beta period where we are learning with the technology. Everything, of course, from quality (I’ll put that under the perceived-quality umbrella) to latency to accuracy. Our cars aren’t just sitting at a Wi-Fi connection; these are cars on the road, driving through the city, driving through the country. All of this is huge learning for us and everything we’re working with. For something as simple as depending on the scale of how many questions are queried, and how many people are querying our ChatGPT beta, we need dedicated customers. Even scaling the ability for customers to use this system at low latency across the U.S., there are huge amounts of learnings there. 

From a quality and hallucination perspective, that’s lots of fine-tuning. Just using ChatGPT and APIs out the door, that’s not gonna work right away like we expect. We really want to have a high bar for quality for our customers. A lot of the work we’re doing is to make sure that the data is grounded. We’re not just using an open model to generate information, but actually querying a trusted model, using those APIs, and then formulating a response to the customer. That sort of chained action is a really huge part of this field and how it’s all developing. I think it is for us to just learn about all these technologies. We’re really trying to experiment out in the public, and really learning from this and finding the right way for us to then productize that and bring it to the rest of the world.

Scott Kirsner:

How do you personally stay up to speed with generative AI? Are there some tools or platforms you find yourself spending a lot of time with these days? 

James Liu:

I think I’m very fortunate to have a group of really dedicated, very curious individuals in our teams. They do a great job keeping me up to date. That’s probably the most important thing. Our design team, they love Stable Diffusion and they love Hugging Face. They’re always in the community. They’re really looking at what artists are doing. 

We’re doing some really cool work training our own Mercedes-Benz model on Stable Diffusion. What is our brand language? There’s lots of fun work there. We have huge, deep partnerships with both Microsoft and Google. We work with their engineers directly. Even for the GPT-4o model, our team had early access to that and we were working with them directly and understanding how to use it in our context. 

I feel very fortunate, of course, that we get to see it from all sides, both in public as well as from the inside.

Scott Kirsner:

You’re somebody who’s worked in technology for a long time. It feels like we’re at this interesting moment where the technology is moving so quickly, it’s also because AI is a probabilistic technology, you sometimes feel like this software gives me a great answer one day and the next day it’s not as good, and then the next day, it’s better than it was on the first day. There’s a lot that feels like it’s not just evolving, but it feels different than maybe previous waves of technology. How do you think about that? 

James Liu:

I was working in gaming when the game industry made that shift from PC- and console-based games to mobile games. I think a lot of people bring up the mobile transition as maybe something that we’re in. It’s really interesting, but not in the sense that it is disruptive. 

I think we’re seeing the whole industry put its dollars behind trying to understand this platform transition. Of course, things like “Hey, will there be a GPT store?” People will stop using applications and things like that. I will not venture to say I know exactly what’s going on, but you can absolutely see whole industries reacting to this platform change. 

I think what was really interesting in the gaming space was we learned that, turns out, people’s behaviors don’t really change. A lot of the first or even second generation games on mobile were people trying to bring AAA games to mobile, because gamers love playing Call of Duty on the console, so they’re gonna love playing Call of Duty on the mobile phone. That wasn’t true. They had to adapt, you had to create new mechanisms and new customer experiences, the whole industry had to learn how customers’ behaviors would change or not depending on the context of how they use their devices and technology. 

We’re really discovering those things as well. I can absolutely believe that, at the end of the day, we’re still going to use the devices that we have: screens, buttons, and UI. We’re highly visual, tactile people, we’re always going to want to look at things, we’re always going to look at a list. Not everything is going to be voice-based. I absolutely think that the interface itself will adapt.

Humans are humans. We will still have a lot of behaviors that are consistent with what we see today in the world, from a user-experience perspective. I don’t know if that’s a very clear answer, but I absolutely do believe that we’re gonna see some behavior change. I suspect, like with all technologies, people are still going to behave the way they will. This technology will simply allow us to provide value to them in a faster, cheaper, better way.

Scott Kirsner:

One of my hypotheses, which it sounds like you believe, is we’re going to continue living in a hybrid world. We’ve had a couple of decades of using our mouse and pointing and clicking and pulling down menus, but that’s not really what we want to be doing all day. We maybe just want to be speaking, or maybe sometimes typing questions and having systems do things for us. You seem to be arguing that the graphical user interface may not vanish overnight, even as speech-driven interfaces get a lot more capable.

James Liu:

There’s been a ton of research on this. You look at recognition over recall, voice systems are fantastic for recall. I know exactly what I want, there’s this perfect song that I want to play, I can ask my device to play this one song on this album. You can spearfish a piece of content really easily and that’s awesome. You can basically take something that would take a lot of clicks and a lot of mouse tracks, a lot of actions, string it together and just get the precise content you want. Human communication is very much based on recall. 

“Hey, I just saw Scott the other day, let me ask you about something.” But, of course, as humans, not only recall is part of the way we interact with the real world, but recognition is a big part of it. I see a new environment, I see a new webpage. It gives me a lot of visual information about content, about lists, about pictures; we need that as humans to make decisions about what to do. I really believe that’s not going to fundamentally change. We’re always going to want to see videos, we’re always going to want to see pictures, we’re always going to want to see lists in order to make a purchase decision. 

I do believe those hybrids, the best of both worlds, will exist. Obviously, not having to choose will be a revolution. We’re gonna see a lot of multimodal interactions. I absolutely believe, as humans, we’re going to have to use the best of our nerve endings for our eyes, and our voice, and our fingers to really interact with the world. It will certainly be more natural, but I don’t think it’s all going to go away and we’re only going to have one interface that you talk to.

Scott Kirsner:

What advice do you have for other people in large companies? Maybe they’re in the technology organization or innovation or R&D, and they’re trying to set up pilots and eventually roll out AI technology, which seems like it’s a fair amount of what you’ve been focused on over the last year or so.

James Liu:

The most important thing is you’re just not going to get it right the first time. I think this is where I think learning quickly, learning in the public, I believe, is very important here for a traditional OEM. I think that’s one of our biggest challenges, trusting both our customers and of course ourselves and saying, “We want to bring this technology, we want to experiment together with our customers, and really adapting and learning from that.” That is a cultural challenge a lot of companies I’m sure face. Especially with these technologies moving so fast, I think we really try to preach patience. 

We’re gonna have to learn a lot first, before we really make these decisions of how we want to scale what we want to do. This space is so exciting that a lot of people are trying to say we just bring this technology and we’re going to solve this problem, we’re going to save all this money. The reality is, you’re just going to have to learn to be a little bit patient. We have to get a lot of data. I think that the organizational cultural challenge is a little bit larger than the technology one because the technology will keep advancing. 

The best thing you can do is to be a platform. Build a platform that allows you to really experiment quickly, that allows you to get data, and to not be locked in by a particular partner or vendor. That gives you a lot more freedom and flexibility. We’ve been very lucky as Mercedes-Benz; in an automotive context, we work with lots of partners. That mentality has really enabled us to explore with a lot of partners, and to learn from so many different great companies out there. The last word of advice is what I said before: patience is the big one. You have to experiment and be patient.

Scott Kirsner:

The last thing I want to ask you about is some of the things that are going to be evergreen or constant. You can’t get so excited about a new technology and putting it out there in the world that you forget about, as an example, what is the job the customer’s trying to do. You touched on brand expectations. What do you think are the constants that people need to stay focused on, even in this era of crazy, chaotic AI evolution?

James Liu:

There’s nothing like really knowing who your customers are. That’s always both a challenge and a struggle, especially when you’re developing lots of products for lots of people. For us at Mercedes-Benz, we’re a luxury brand. Customers don’t buy our cars because it’s just a car. They buy it because of its appeal, they buy it because of how beautiful it is. They have expectations of safety, of quality, of technology. For us, making sure that whatever we build really speaks to that core value of who we are as a brand, and why customers bought us in the first place, I think is super important. 

Obviously, there’s a tension, as the automotive industry itself is also in transition from ICE to BEV, and so customers’ expectations are changing. For us, it’s really important to say, “What are the core values that we cannot lose?” and make sure that those things help set our high bar for why we are experimenting or we’re on the brink of product. Making that bar very clear and being patient with saying, “We don’t want to just launch this just for the sake of launching,” it really helps us build and maintain that trust with our customers. 

Where we are, it’s important for us to really just understand what it is that we are doing as a company, to our customers, for our customers. We are not a general AI company. Just because we can do all this crazy stuff with Gen AI, it can be very easy for us to confuse ourselves about what is actually the value that our customers are seeking. I think the real value is in the entire product experience and we want to make sure we can augment that.

Scott Kirsner:

James, I’ve so enjoyed this conversation. I just want to thank you for taking time to be on the podcast, I really appreciate it. 

James Liu:

Scott, I had fun as well. Thanks so much. 
