One of the primary reasons so many Digital CxOs are heavily invested in artificial intelligence (AI) is that their organizations are chronically understaffed. The cause may be anything from a general lack of people with the skills they require to high turnover rates or a simple unwillingness to pay higher salaries.
Regardless of motivation, the general expectation is that AI will create new revenue opportunities as well as reduce the cost of labor. One of the biggest segments of business that will be impacted by AI is customer service. Organizations are already deploying various types of bots to automate customer service, with varying degrees of success. The next big opportunity to automate those processes will be to employ AI to replace bots with avatars that can have a meaningful, interactive conversation with an end customer.
During its online GPU Technology Conference (GTC) this week, NVIDIA unveiled NVIDIA Omniverse Avatar, a platform for building avatars that leverage speech AI, computer vision, natural language understanding, recommendation engines and simulation technologies to provide a better customer experience.
NVIDIA CEO Jensen Huang said machines running NVIDIA Omniverse Avatar can now verbally respond to questions in about half a second. The challenge is that AI platforms still have to hear the entire question before they can frame a response, but as graphics processing units (GPUs) continue to be connected across a distributed computing environment, that response time should come down to about half a second.
Projects that make use of NVIDIA Omniverse Avatar include Project Tokkio for customer support, NVIDIA DRIVE Concierge for always-on, intelligent services in vehicles, and Project Maxine for video conferencing.
At the core of NVIDIA Omniverse Avatar is Megatron 530B, a customizable language model that NVIDIA created in collaboration with Microsoft as an alternative to the GPT-3 research project. With little or no training, it can complete sentences, answer questions across a wide range of subjects, summarize long, complex stories and translate between languages. It can also be trained further to address additional knowledge domains.
Other elements include NVIDIA Riva, a software development kit that recognizes speech across multiple languages; NVIDIA Merlin, a recommendation engine; NVIDIA Metropolis, a computer vision framework for video analytics; and NVIDIA Video2Face and Audio2Face, a set of 2D and 3D facial animation and rendering technologies. These frameworks are composed into an application and processed in real time using the NVIDIA Unified Compute Framework.
It may be a while before avatars are widely employed and universally accepted. GPUs are still fairly expensive to employ, and not every end customer is willing to engage with a machine rather than a human. However, for every customer who prefers to engage with a human, there is another who would just as soon have their issue resolved without ever talking to anyone.
Avatars arguably represent an effort to bring a more human-like experience to a bot experience that still leaves much to be desired. The challenge and the opportunity now will be to train those avatars not only to answer questions correctly and even upsell services, but also to empathize with customers who are likely already feeling a level of frustration no existing bot could ever mollify.