Key points to remember
- Seasalt develops customizable speech recognition technology for enterprise call centers.
- The founders sold their last startup to Baidu in 2017.
- The company is partnering with cloud communications giant Twilio.
After selling their previous startup to Baidu, a pair of tech veterans are stepping back into the crowded speech and voice recognition space with a company called Seasalt.AI.
The startup sells a software platform to companies with contact centers and is initially focusing on the Southeast Asian market. Developers can use Seasalt to build apps, devices, and services that communicate conversationally with users.
The company was founded by Guoguo Chen and Xuchen Yao, experts in the field of speech and voice recognition software. Chen created the “OK Google” wake word for Android and co-wrote a speech recognition project called Kaldi, which Nvidia eventually integrated into its graphics card. Yao, meanwhile, holds a PhD from Johns Hopkins University and previously worked at the Allen Institute for AI (AI2) incubator in Seattle.
In 2015, Chen and Yao co-founded KITT.AI, a startup spun off from AI2. One of the company’s most popular products was a customizable wake-word engine called Snowboy, a software toolkit that allowed developers to add verbal keywords to their own hardware. The startup also launched ChatFlow, a framework allowing developers to create chatbots.
Baidu acquired KITT.AI in 2017. Chen and Yao worked for the Chinese tech giant for two years, leaving in 2019.
Seasalt, which has 22 employees, provides a customizable voice recognition engine. The startup describes its technology as the “next generation” of conversational AI. The founders put nearly $1 million of their own cash behind Seasalt, which also raised funding from Seattle venture capital firm Unlock Venture Partners when it launched in January 2020.
The company sells six apps that work in tandem as a full suite of services called “SeaSuite”:
- SeaChat allows users to create a framework for automated chatbot responses.
- SeaCode is a software development studio for conversational AI. Users can build tools such as chatbots on the platform.
- SeaVoice is a speech-to-text (STT) transcription feature that can be customized to understand different languages and nuanced speech, among other uses. This tool also has a text-to-speech (TTS) feature, which can be customized to sound like Tom Hanks or David Attenborough.
- SeaMeet’s secretary-like features can be used during conferences and meetings. It can identify up to 12 unique speakers in the room. Users can train the model to provide automatic meeting minutes and follow-up notes, among other actions.
- SeaWord can be customized to scan text and extract meaningful information. The tool can also highlight and redact words that constitute identifiable information.
- SeaX is a tool designed for contact centers. It can automate responses to incoming messages, calls, and social media posts, among other channels. The software also includes a tool that call center agents can use to transcribe and categorize incoming customer calls.
Seasalt aims to offer tools capable of understanding nuances in both speech and text. Its primary use case is in enterprise contact centers, where companies use the software not only to monitor and assess their agents but also to aggregate voice data and extract insights.
Global companies often operate call centers in many countries, which means they inevitably encounter low-resource languages and regional accents wherever they operate. In the U.S. alone, for example, there are at least 24 dialects of English.
Seasalt’s customers include major corporations such as Cathay United Bank, McDonald’s Taiwan, and Oppo.
“For any business, if you have really weird spelling or technical jargon, we can take care of that,” Yao told GeekWire.
The company generates revenue from both professional services and recurring subscriptions. “It’s not a pure software-as-a-service model, because enterprise contact centers are typically very complex,” Yao noted.
Seattle has become a hotbed for startups focused on natural language processing (NLP), many of which have spun off from AI2, which specializes in this type of research. Seattle-area companies developing NLP technology include Xembly, Read, Unwrap, and Augment, among others. Spoken Communications, a Seattle startup that sold voice recognition technology to call centers, was acquired by Avaya in 2018.
Yao said that due to the pandemic, many cross-border e-commerce sites are popping up in North and Southeast Asia, selling their products overseas. This creates a tailwind for Seasalt, he said. He added that the region is currently underserved by competitors.
The call center technology market already has many established players. Tech giant Google sells its natural language capabilities in a packaged solution called Contact Center AI. Amazon and Microsoft also have their own services: AWS Contact Center Intelligence and Azure Cognitive Services. Other notable players include Deepgram, Five9, Avaya, and 8x8.
Yao said Seasalt wouldn’t have the “muscle” to compete with Five9 or other publicly traded companies without its partnership with Twilio, which strengthens its sales channel. He explained that one lesson he learned from KITT.AI was that the software itself isn’t what creates a moat for a startup. Instead, he added, the moat comes from distribution, marketing, and the customer base.
Editor’s note: Additional investor and customer information has been added to this story.