In this Digital CxO Leadership Insights series video, Mike Vizard talks to Faction CTO Matt Wallace about the need to extend the scope of data that generative artificial intelligence (AI) platforms can access beyond the Microsoft Azure cloud service.
Mike Vizard: Hello and welcome to the latest edition of the Digital CxO Leadership Insight series. I’m your host, Mike Vizard. Today, we are with Faction CTO, Matt Wallace, and we’re talking about ChatGPT, data, and all the sources thereof. Matt, welcome to the show.
Matt Wallace: Yeah, thanks for having me.
Mike Vizard: I think one of the challenges that people are having is that, while these are all wonderful advancements, the platform in question all comes back to Microsoft Azure. And a lot of shops, well, they may not be using Azure. So, the question then becomes, how do I, kinda, take advantage of similar generative AI capabilities using other platforms, what’s involved in that, and can I even do that at this point? What do you think’s gonna happen?
Matt Wallace: Yeah, that’s a great question. It really goes right to the heart of how I look at this from the Faction lens because, you know, if you think about cloud providers and their history, they spent a lot of time building capabilities that were very, very similar to one another, right? Everybody had to have the same instance types and block storage and so on.
You look at the partnership between OpenAI and Azure and now you have this set of OpenAI APIs. They’re being exposed natively inside of Azure and it’s really the only public cloud where you can do inference on those models. But, I think that’s part and parcel of a broader trend here: there is an explosion of AI capabilities out there right now, literally hundreds of newly funded startups that are working on various capabilities. And, we know, obviously, some of them in the generative AI space are pretty famous now, things like Stable Diffusion, things like Midjourney, and of course, ChatGPT itself has its competitors, right? Like the LLaMA model, and the Bard service from Google that’s based on its own model and so on.
And, I think a lot of people are wondering, how do I know which of these is gonna be best for my app or applications? Will it depend on which team and what they’re building? And all those questions lead them to wonder about access to these services because they know there’s not gonna be parity between what every cloud offers. And, I think that leads toward this question of, you know, I’m already multi-cloud, we know 85 percent of enterprises are. A lot of that was what I’d almost call disconnected multi-cloud, or it was multiple clouds, right? Where they had different apps and different teams using different clouds for different purposes. And, that’s fine, but it makes them look more holistically at, what I’d call, true multi-cloud, where they’re integrating services across multiple clouds because they know they’re gonna wanna leverage a certain API in Azure and a certain API in Amazon, and another API in Google. And, I think answering that question becomes critical for developing high velocity applications that are based on these different AI services.
Mike Vizard: Do you think as we go along that, not only will we invoke APIs to access multiple generative AI models on these different clouds, but I’ll need some sort of abstraction layer to, kinda, make that happen? And, what’s your vision for how that whole multi-cloud world might present itself?
Matt Wallace: Yeah, I mean, it’s not even a hypothetical, it’s happening right now. And, some of this is happening in the start up space and I would point, just as an example of something I dabbled with just a tiny bit, there’s a service called Eden.AI. Just to, kinda, scratch a personal itch, I went to go build an app and one of the things I needed to do was speech-to-text transcription. And, Eden actually has an interface where you send it the speech and it will actually send it to 11 different services that do speech to text and then return all of the results to you in one go, and that’s your abstraction right there.
Now, you know, I found it to be, obviously, a little slower because, maybe, you’re adding this, kind of, intermediary, right? So, you’re adding some extra steps and so, you have to make a choice: do I want variety and completeness, or do I want speed and performance? And I think the question, will you integrate multiple services, again, you know, I was doing this personal project and I wanted to build a text-to-speech, speech-to-text interface for the ChatGPT API ’cause I wanted to converse with it. And so, it was actually fairly easy to write the code, but I found over time I wanted to actually use OpenAI’s Whisper model on the OpenAI API to do the speech-to-text conversion, and then, of course, call the ChatGPT API to get an answer. But then, I wanted to use Google’s API to do text-to-speech because I found that the Google voices that were generated were very good and human sounding compared to some of the robotic voices.
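[Editor’s note: the mix-and-match pipeline Wallace describes — one provider for speech-to-text, another for the chat completion, a third for text-to-speech — can be sketched roughly as below. The provider functions here are hypothetical stand-ins for the real OpenAI and Google Cloud SDK calls; the point is the wiring, not any vendor’s API.]

```python
# A minimal sketch of a cross-provider voice assistant pipeline.
# Each stage is injected as a plain callable, so any vendor's SDK
# can be dropped in behind the same three-step flow.

from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceAssistant:
    transcribe: Callable[[bytes], str]   # e.g. OpenAI's Whisper API
    complete: Callable[[str], str]       # e.g. the ChatGPT API
    synthesize: Callable[[str], bytes]   # e.g. Google Cloud Text-to-Speech

    def converse(self, audio_in: bytes) -> bytes:
        """Audio question in, audio answer out, via three providers."""
        question = self.transcribe(audio_in)
        answer = self.complete(question)
        return self.synthesize(answer)


# Wiring with local stand-ins, just to show the flow end to end:
assistant = VoiceAssistant(
    transcribe=lambda audio: audio.decode("utf-8"),  # stand-in for Whisper
    complete=lambda q: f"Answer to: {q}",            # stand-in for ChatGPT
    synthesize=lambda text: text.encode("utf-8"),    # stand-in for TTS
)
reply_audio = assistant.converse(b"What is multi-cloud?")
```

Because each stage is just a callable, swapping Google’s voices in for a more robotic-sounding provider is a one-line change at construction time, which is exactly the kind of mixing and matching described above.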
Now, it’s just me and I’m just doing this as a hobby thing for fun, and yet, how many enterprises are facing that exact same question, right? They’re asking questions of, how do I build a better customer service experience, right? I want a more realistic voice, I want faster response times. And, they’re going to want to mix and match whatever makes sense for that use case. And, I think, even aside from questions of performance and compatibility there’s a very real evolution going on in AI to provide some capabilities to tune these models so that they are better for specific vertical cases. Now, if you know that you’re in retail, the base model can look a little bit different so that you can incrementally train it, but of course, you want it to already understand what it looks like to be good at customer service so you don’t have to build that from scratch.
Mike Vizard: Will we wind up needing to score these responses because, otherwise people will be just, kinda shopping around and picking things based on random choice, or maybe even some deliberate bias?
Matt Wallace: You’re kinda thinking about the end consumer, right? If they’re having an interaction, or are you talking about the enterprise and trying to select between AI services?
Mike Vizard: I think it’s gonna be on a scale, but both.
Matt Wallace: Yeah, I mean, I think either way, right? Consumers, I think there’s a whole interesting question around – it’s such an interesting thing to think about, right? Because, if you go back to the early 2000s there were a ton of questions about, how does Google’s algorithm sort pages, right, how does PageRank work? And, there was, obviously, a war between Google trying to get end users the best search results and people who were building webpages trying to use different techniques to game those results to, kinda, raise their rankings in the search. And then, of course, there were the, sort of, meta questions of, is Google getting paid for any of this in a way that’s not transparent, are they honest, are there some biases embedded in some way? You know, or are they trying to have something that’s truly neutral and is more based on content or getting the right thing for the user?
I think we’re about to have that same experience with AI. If you go to ChatGPT and go, “What are the best five cordless vacuums I can get?” I mean, it’s gonna recommend something to you, I’m sure, but, where are those results coming from? Is it derived from the training, and what sort of biases and selections does the training have that makes it come up with that result? So, that’s a really interesting question.
And, on the enterprise side, of course, there’s gonna be competing models, right? Just like my experience of going, “Well, which of the text-to-speech things do I wanna have?” There’s gonna be an amazing proliferation of base models for these generative use cases, right, what’s my base? Is it a GPT base from OpenAI, whether I use OpenAI or Azure? What about for images, and how is it tuned for my vertical? And, there’s gonna be so many choices. And, I’m sure things like UiPath, for example, they made this acquisition a few years back of Cloud Elements and they already have an adapter for saying, I wanna have one API call where I can touch every file service, like, say, Dropbox and Box. I think providers like that will absolutely look to enable that sort of flexibility at the developer level. Say, you can write your code one way and then, just by configuration, you can change the back end that it talks to. And so, that may have a big impact as well.
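[Editor’s note: the “write your code one way, change the back end by configuration” idea Wallace attributes to adapter-style providers can be sketched as a small registry. The provider names and adapter functions below are purely illustrative, not any real vendor’s SDK.]

```python
# A sketch of a configuration-driven backend swap: the application
# has one call site (synthesize), and a config value alone decides
# which registered provider adapter actually handles the request.

TTS_BACKENDS = {}


def register_tts(name):
    """Decorator that registers a provider adapter under a config name."""
    def wrap(fn):
        TTS_BACKENDS[name] = fn
        return fn
    return wrap


@register_tts("provider_a")
def tts_provider_a(text: str) -> str:
    return f"provider_a-audio({text})"  # stand-in for a real SDK call


@register_tts("provider_b")
def tts_provider_b(text: str) -> str:
    return f"provider_b-audio({text})"  # stand-in for a different vendor


def synthesize(text: str, config: dict) -> str:
    # One code path; the backend changes by configuration, not by code.
    return TTS_BACKENDS[config["tts_backend"]](text)


print(synthesize("hello", {"tts_backend": "provider_b"}))
```

Switching vendors then means editing one config value, which is the flexibility at the developer level described above.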
Mike Vizard: How feasible is it gonna be for all of us all the time to start consuming these AI generative models? Because, they do, kinda, consume a lot of data and cloud infrastructure. So, are we gonna need some mechanism to optimize the consumption of generative AI models? Because, otherwise, this may get prohibitively expensive before we know it.
Matt Wallace: Yeah. You know, I have two thoughts on that and the first was, “Wow, I can’t believe how inexpensive the GPT-3.5 Turbo model is.” So, I started building this app and I was just sending request after request, some of which my code was effectively throwing away ’cause I was working through the kinks in the code, it was just discarding things, for example. And, after days of heavy use developing, I’d go look at my OpenAI bill and my usage was, like, $0.06 and I’m just like, “Wow, this is so easy.” Now, GPT-4, totally different story, I wanna say, roughly, it’s 50 times as expensive, right? And so, you’re paying a lot for that extra horsepower.
But, if you look at just yesterday, you had the keynote at GTC and Jensen Huang is talking about their new hardware specifically to do inference. So, already Nvidia was a huge player, obviously. The dominant player, you might say, in training these large models so you get to that final version that can start answering questions. Now, they’re actually turning their sights toward, “Okay, the adoption on some of these things is geometric, what do we do to tune the hardware for giving answers from the model, not just training a model?” So, you’re seeing that hardware proliferation happen right now.
I think your question, though, is incredibly apt because I actually am looking to do an upgrade. There’s a new tool that came out called Rewind.AI and it basically lets you capture everything that you hear, say, or see on your screen and indexes it. So, like, constant speech to text, text indexing, so that everything that ever touches your display or your audio becomes searchable. Mind blowing. And, it only really works because the M2 chips have Apple’s neural engine that accelerates all this ML processing in the background. So, in a weird way, I think there’s actually an incredible movement that you see with Apple and others.
Maybe a week or so ago, couple weeks ago, Qualcomm had announced that one of their new chips, I think it was something out of the Snapdragon family, is actually able to run Stable Diffusion locally on a phone. So, there is a ton of work, I think, to get this stuff working at scale, but, to your point, this stuff is incredibly computationally expensive. And, if we all want to tote these personal assistant use cases with us, I mean, just think about how iPhones completely took over our lives, how everything changed between, let’s say, 2007 and today, and now we all carry what’s kinda the equivalent of a supercomputer in our hand. Now we want a whole other supercomputer because we want all this AI stuff to run a lot, all the time, and I think it’s only gonna grow.
If you look at things like the Apple reality glasses that are supposedly coming, and I start to imagine what happens when your Apple glasses can start indexing and contextualizing the world around you, I think the sky’s the limit on how much computing horsepower we could consume. It could be astronomical.
Mike Vizard: Do you think the sustainability people will start freaking out once they understand what’s going on here? ‘Cause, consumption of infrastructure equals carbon emissions.
Matt Wallace: Yeah, that’s a great question too. I mean, again, I think this is maybe less mature, but it’s a work in progress. There’s all these efforts around ESG for folks to understand all the scope of their emissions. It’s being, sort of, legislated as a requirement depending on your geography; you have hard requirements to report on your emissions with a certain scope, right? And, there’s Scope 1, Scope 2, Scope 3, and it depends on, is it directly what you do, is it what you do plus the things supplied to you, plus the whole supply chain coming toward whatever you ultimately do? I think all those things are on their way. I think there’s definitely a question of, how are the providers of these services going to get green, right?
Obviously, there’s a ton of innovation going on with power. In fact, and this is a little bit of an aside, but if you read the really neat State of AI Report from 2022 you realize these ML models, aside from doing really cool things like ChatGPT – folks actually, last year, built an ML model to fine-tune the containment magnetic field for a plasma torus in a fusion reactor. So they’re literally trying to apply ML to make fusion work to give us, effectively, limitless clean energy. So, you can look at this from all angles. But, I think the question of how the big providers, big data center providers in particular, get green is a really big question. And, I think as regulatory requirements hit people they will be looking to their data center partners and the cloud partners to tell them a good story around how they lower emissions, how they get green, how they get carbon neutral, et cetera. ’Cause it is a really critical aspect that absolutely – if nothing else changed, I think all of this horsepower being put toward this would drive up emissions in, at least, a measurable way.
Mike Vizard: All right, so you been playing around with these models and generative AI and I guess the big question of the day that everyone seems to be asking me, I don’t know about you, is prompt engineering a real job?
Matt Wallace: Yeah, that is an interesting question. I don’t know that I know the answer to that. I think today, certainly, you can tune prompts and get incredible results that vary a lot depending on what you prompt. And, actually, depending on how you slice and dice prompt engineering, from my own interactions there’s almost a view where there are things you can do to essentially decompose a question or a workflow: ask one thing and then use that result to ask another thing.
I myself actually have a process of interacting with ChatGPT where I wrote a piece of code to take the prompts and the responses and actually feed them into ChatGPT and say, “Distill this down, I want you to rewrite it for me so that it has the exact same syntactic and semantic meaning but it’s as small as possible.” It turns everything into an abbreviation, it takes out punctuation for me, and that allows me to fit more context in the limited number of tokens that it can process, right? With ChatGPT it’s about 4,000, and so, if you wanna deal with long form content it allows you to squish that down. That’s one example of dabbling with prompts.
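[Editor’s note: the context-squeezing trick Wallace describes can be sketched roughly as below. The instruction wording and the characters-per-token heuristic are assumptions for illustration, not OpenAI specifications; in practice a tokenizer library would give an exact count.]

```python
# A sketch of prompt compression: once a running transcript nears the
# model's context limit, wrap it in a "distill this down" instruction
# and keep the model's compressed rewrite in place of the original.


def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)


def compression_prompt(transcript: str) -> str:
    """Build the distillation instruction to send back to the model."""
    return (
        "Distill this down. Rewrite it so it has the exact same "
        "syntactic and semantic meaning but is as small as possible; "
        "abbreviations are fine and punctuation may be dropped.\n\n"
        + transcript
    )


def needs_compression(transcript: str, budget: int = 4000) -> bool:
    # Trigger compression as the transcript approaches the token limit.
    return rough_token_count(transcript) > budget
```

The compressed rewrite then replaces the verbose history in subsequent requests, which is how long-form context gets squished into a roughly 4,000-token window.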
But, even just that question of, what do I say? How do I tell it, “Write in this voice,” or, “Act as this expert,” in order to get a better answer? There’s tons of that going on. But, the real question is, in the long term is that necessary? And, what I see from a lot of experts is to say, “It’s necessary now, but we’re going to get to the point where that real human feedback on the results will eventually get rid of prompt engineering because it will get better and better at understanding intent and skip all those steps.” So, I think TBD.
Mike Vizard: The other thing that I was just made aware of and I’m scratching my head about is, so I can use ChatGPT to create a two page e-mail outlining a document and a strategy and then I send it to you and then you use ChatGPT to shrink my two page document down to two paragraphs to get the gist of it. So, why didn’t I just send you the two paragraphs in the first place?
Matt Wallace: That’s a very interesting question. I think part of that fits into, how are we gonna even talk to people about where content comes from? There’s already, sort of, a best practice that OpenAI recommends directly, it doesn’t require it but recommends about disclosing that content is generated by AI. And, you have to wonder at some point, what’s the bright line, what sort of expectations will emerge about how you reveal that, when you reveal that? How much human intervention means it’s no longer AI driven, right?
If I’m really specific and I give it a super detailed outline that has all the salient points and it’s really just rewriting it for grammar and fluidity, does it matter that it’s generated by AI? Versus, I brainstorm a whole set of ideas where I’m really just giving it, kind of, a nutshell and letting it flesh that out. And, it’s probably lower quality, I expect, as a result. Do I need to disclose that? It’s a bunch of really kinda interesting questions. I think what you asked also leads into another really fascinating question, which is, what is the AI-to-AI interface of the future gonna look like?
There was a really interesting demo where one company took ChatGPT and wrote a browser plugin and it went to a vendor’s website, consumer vendor, right, for a cell phone. And, they basically told it, “You’re gonna interact with this customer service agent as a slightly upset customer, you’re gonna seek a discount on your service and you’re gonna be willing to cancel if they won’t give it to you,” and it proceeds to spend 20 minutes going back and forth negotiating with the cell phone provider until it does, in fact, get a $10.00 a month discount on its bill. Is there gonna be some kind of battle, some war between dueling AIs? Because, honestly, the provider side probably was an AI already at that point, and what happens with those things? Will people close APIs because they don’t wanna make it easier for AI to go interact with their service? And, if so, does that lead to services that start causing AIs to interact much like a Selenium test suite for a browser does, so they can fake being a user and send keyboard and mouse input to a browser? There’s a really long tail of crazy questions that emerge from AI capabilities growing so much and everybody being woefully unprepared for the long tail of consequences.
Mike Vizard: All right. More is unknown than known and I’m almost certain that somebody’s gonna take this video and run it through a generative AI program to get the two minute version of it. But, if you do that, you’re just gonna miss all the fun in the first place.
Matt Wallace: Yeah, that’s true.
Mike Vizard: Hey, Matt, thanks for being on the show.
Matt Wallace: Absolutely, thanks for having me.
Mike Vizard: And thank you all for watching this latest edition of the Digital CxO Leadership Insights series, I’m your host Mike Vizard. You can find this one and others on digitalcxo.com and we’ll see you all next time.