Multilingual video conferencing platform KUDO recently raised $21 million. But as many platforms build automatic captions and translation into their services, where is that investor confidence coming from?
Both established event tech companies and new startups have experienced a lot of growth and funding throughout the pandemic, and the momentum hasn't stopped. KUDO, a video conferencing platform that provides live interpretation in over 100 languages and 147 sign languages, closed out a $21 million Series A funding round last month.
KUDO was founded in 2017, before Covid forced virtual events onto the industry, and previously raised $6 million in seed funding in July 2020. This most recent $21 million round ended up oversubscribed: according to KUDO CEO Fardad Zabetian, the company initially intended to raise only $15 million. Oversubscription is relatively uncommon and speaks to the interest that the company has drummed up.
KUDO is not the only company combining technology and human interpretation; Interprefy and Interactio both offer app-based solutions. However, with automatic captions and translation being built into more and more virtual event platforms, what does this investment say about the utility of human interpretation?
Human Interpretation vs. Automatic Translations
Multilingual features such as AI-generated captions and real-time human interpretation matter for online content because they help make events more accessible to both diverse audiences and attendees with disabilities.
Automated captioning features have been available on platforms like YouTube for years, and many out-of-the-box virtual event platforms now include or are adding these capabilities. However, human interpretation like that offered by KUDO still has important use cases.
For one thing, human interpreters remain the only option for attendees who want to truly listen to a live session in another language or watch sign language interpretation. And while any type of captioning or translation feature is better than nothing, AI-powered features are not as precise as human interpreters (at least for the time being).
This is an issue particularly for events in specialized industries, such as medical conferences, that use a lot of jargon or technical terms. However, as automated offerings inevitably continue to improve, human interpreters may not always have a leg up in this area.
For example, many auto-generated captioning tools currently allow users to input keywords that will be used throughout the event to help the AI recognize them and get them right. Some platforms also enable the AI to learn what terminology is being used over multiple events by the same organizer and improve itself for the next ones.
Automation also wins when it comes to scalability — human interpreters inevitably require much more overhead, even in a remote setting. For events that only require translation or interpretation into a couple of languages, hiring human interpreters may be preferred. However, for large, multi-day events with content being translated into many languages, human interpretation can quickly become prohibitively expensive and logistically complicated.
That said, investors are clearly betting that demand for human interpreters at virtual and hybrid events will continue. It likely will, but to what extent remains to be seen, as human interpretation will increasingly have to compete with automated features from other virtual platforms.
KUDO's most recent investment round suggests that there is a lot of confidence in the long-term viability of human interpretation services for online events, even as automated translations and captioning become more common offerings on virtual event platforms.
There are limits to having human interpreters for events, but there are also benefits that automation can't (yet) provide. While automation will undoubtedly become more appealing for certain use cases, investors are still confident about the future growth potential of live interpretation. Just as nothing can replace face-to-face interaction, there is an undeniable advantage to having a human speak or sign to you instead of reading captions that may not be accurate.