Speechly develops voice command software designed to translate spoken words into actionable tasks. The company’s software uses audio processing, speech recognition, natural language processing, and voice activity detection for command recognition. The Speechly API is a streaming spoken language understanding API that handles complex tasks and offers a multimodal interface across platforms, for voice use in e-commerce, VR, gaming, digital health apps, and professional work.
Speechly is also a technology. Speechly is the tool our founders started to build in 2017. It’s a set of tools and APIs that enable developers to build voice-enabled user interfaces for their apps and services on any platform. It can be used in gaming, digital health, VR, point-of-sale terminals and more. We have ready SDKs for iOS, Android, Unreal Engine and web. We call this tool a spoken language understanding tool.
One of the most distinctive ideas in our technology is that it combines speech recognition and natural language understanding. Most other providers do these tasks separately. Combining them allows our tool to be faster (almost real-time!) and more accurate than other providers’ tools. What does this mean?
When the user starts talking, most other providers start listening to the audio stream and transcribe it into text. Many of them do this pretty well, and after the user stops talking, they give out a text string that represents the sentence the user just said. This is called speech recognition.
After we have the sentence, we need to make sense of it. This step is called natural language understanding (NLU). Let’s say we have two sentences, “turn off the lamp” and “shut the lights”. Even though they look different as text and sound very different when said out loud, they mean the same thing.
Current smart speakers transform audio first to text and from text to intent.
Traditional providers work like in the picture. They first listen to the audio stream and once that finishes, they send the data to another service that extracts meaning out of it. After we have the meaning, we know what the user wants to do and can fulfill the request.
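The two-stage flow described above can be sketched in a few lines of Python. This is only a toy illustration, not any provider's actual implementation: the function names, the keyword rules in `INTENT_RULES`, and the use of plain strings as stand-ins for audio chunks are all assumptions made for the example.

```python
import re

# Hypothetical keyword rules, invented for this illustration only.
INTENT_RULES = {
    "turn_off": [r"\bturn off\b", r"\bshut\b", r"\bswitch off\b"],
    "turn_on": [r"\bturn on\b", r"\bswitch on\b"],
}


def recognize_speech(audio_chunks):
    """Stage 1 stand-in: produce text only after all audio has arrived.

    Real recognizers decode audio; here each "chunk" is already a string.
    """
    return " ".join(audio_chunks)


def extract_intent(text):
    """Stage 2 stand-in: map a finished sentence to an intent label."""
    lowered = text.lower()
    for intent, patterns in INTENT_RULES.items():
        if any(re.search(pattern, lowered) for pattern in patterns):
            return intent
    return None  # no rule matched


def two_stage_pipeline(audio_chunks):
    """Run recognition to completion, then hand the text to understanding."""
    text = recognize_speech(audio_chunks)
    return text, extract_intent(text)


if __name__ == "__main__":
    print(two_stage_pipeline(["turn off", "the lamp"]))
    print(two_stage_pipeline(["shut", "the lights"]))
```

Both example sentences end up with the same intent label even though their text differs. Note that nothing is known about the intent until the whole utterance has been transcribed.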
This sounds simple, but it’s not how we humans talk. We don’t wait for others to finish what they want to say and only then think about what they were trying to say. We listen and understand simultaneously.
If we hear someone saying “Olive, the other reindeer”, we might guess that we misheard and that what we were supposed to hear is actually “All of the other reindeer” (at least if we were having a discussion on Christmas carols or Rudolph the Red-Nosed Reindeer). Understanding and context make hearing easier. We also give feedback to others about our hearing and understanding as they speak: we nod, say “a-ha” or look confused based on what we have just heard (or rather what we think we heard).
Our technology does something similar. When the user starts talking, Speechly starts evaluating the intent of the user. When Speechly hears “Turn off –” it can already guess that the user’s intent is to shut something off, maybe a light or another device somewhere. When the user finishes with “– living room TV”, Speechly has all the information it needs and can proceed with the actual task of shutting the TV down.
Speechly combines the mechanics of speech recognition and natural language understanding for a faster and more natural user experience.
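The idea of understanding while listening can be illustrated with a small Python sketch. It is not Speechly's API or algorithm: `stream_understanding`, the `INTENT_PHRASES` table, and the use of words as stand-ins for streamed audio are hypothetical, chosen only to show an intent guess being updated after every word instead of once at the end.

```python
# Hypothetical phrase-to-intent table, invented for this illustration only.
INTENT_PHRASES = {
    "turn off": "turn_off",
    "turn on": "turn_on",
}


def stream_understanding(words):
    """Yield (partial_text, intent, target) after every incoming word.

    intent and target stay None until enough of the utterance is heard.
    """
    heard = []
    for word in words:
        heard.append(word.lower())
        text = " ".join(heard)
        intent, target = None, None
        for phrase, name in INTENT_PHRASES.items():
            if text.startswith(phrase):
                intent = name
                # Everything after the intent phrase is treated as the target.
                target = text[len(phrase):].strip() or None
                break
        yield text, intent, target


if __name__ == "__main__":
    for update in stream_understanding("Turn off living room TV".split()):
        print(update)
```

In this sketch the intent is already available after the second word, while the target keeps growing until the user finishes speaking. An application could use those early updates to give feedback before the utterance ends.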
Another feature of Speechly is multi-modality. By multi-modality we mean that users should be able to interact with their apps using different modalities in different situations. Say you want to order a pizza. You don’t want to hear a long list of the pizzas on the menu but rather see the list and select the one that looks the best. But when it’s time to give your address for delivery, it’s easier to say it out loud than to type it on a clumsy mobile phone screen. Multi-modality is the option to choose the best interaction type for each task. Our technology is built for multi-modality from the ground up.
We founded Speechly in 2016 with the firm belief that voice is the future of interaction. Our mission is to use voice technology to enable more delightful human-computer voice interactions, and to empower and enhance communication between people.
Speechly’s efficient, real-time feedback and listening lets the technology do the work, leaving the users free to speak naturally. Our flexible Voice API gives companies total ownership of the experience, offering the simplest way to add AI Voice Technology and Reactive Voice User Interfaces to any experience.
Our technology team consists of expert machine learning researchers with academic backgrounds, combined with industry experience working in the companies that created the previous generation of voice technologies. We have deep empathy for developers – because we are developers. Our platform was built with the needs of our developer partners in mind, and our team is constantly working on new ways to improve the developer experience.
Easy-to-use client libraries and UI components, and a simple configuration language.
Our patented Spoken Language Understanding® technology is the fastest on the market.
Base accuracy is on par with industry leaders. Custom tuning and modeling improve accuracy by up to 60%, surpassing competitors.
Support all interaction modalities (touch, vision, and voice) when you build with Speechly.
Our API is fully streaming and supports arbitrarily long utterances.
The Speechly API is GDPR and CCPA compliant and takes a security-first approach to all development.
Google LLC is an American multinational technology company that focuses on search engine technology, online advertising, cloud computing, computer software, quantum computing, e-commerce, artificial intelligence, and consumer electronics.
Apple Inc. is an American multinational technology company that specializes in consumer electronics, software and online services headquartered in Cupertino, California, United States.
Amazon.com, Inc. is an American multinational technology company which focuses on e-commerce, cloud computing, digital streaming, and artificial intelligence. It has been referred to as “one of the most influential economic and cultural forces in the world”, and is one of the world’s most valuable brands.
Yahoo! (/ˈjɑːhuː/, styled yahoo! in its logo) is an American web services provider. It is headquartered in Sunnyvale, California and operated by the namesake company Yahoo Inc., which is 90% owned by investment funds managed by Apollo Global Management and 10% by Verizon Communications.
Speechly provides accurate real-time Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) tools and APIs. They are ideal for extracting information from spoken language or creating voice-enabled multimodal apps.