Breaking Language Barriers With AI: How Alexander Konovalov And Vidby Are Reimagining Global Communication

by Annetta Benzar
May 21, 2025

Can a business operate globally without relying on English? Alexander Konovalov thinks so, and he’s building the technology to prove it. As the founder of Vidby, a Swiss-based AI translation platform, Konovalov is challenging long-held assumptions about language translation, communication, and the reliability of AI-powered software.

What began as an experiment in real-time voice translation has grown into a full-spectrum service used by enterprise clients, institutions, and global content creators. Along the way, Vidby was named a recommended vendor by YouTube and contributed to the early information efforts during the war in Ukraine.

In this exclusive interview with The Future Media, Konovalov reflects on the lessons of the past year, the role his team played during Ukraine’s time of crisis, and why the future of AI translation is not about eliminating barriers but about making every language count.

How did the idea for Vidby come about? What prompted your interest in automatic video translation?

In 2011, I started thinking about going global. I wanted to build a business that wouldn’t be tied to any one country, which naturally led to the idea of learning foreign languages. I considered learning English, but quickly realized that wouldn’t be enough. There are too many languages in the world, and I had neither the time nor the desire to learn them all.

After buying an Android smartphone and installing Skype (there wasn’t even an official Skype app for Android yet!), I thought: what if you could place an online translator between the microphone and Skype to enable real-time call translation? Back then, in 2011, it was just an idea.

The ideas grew. One of them was voice-controlled devices. I even filed a patent for it, which was recognized by Forbes Ukraine as one of the country’s top five patents. But it wasn’t until 2013 that I decided to put everything aside and fully commit to launching my own startup. I chose to begin with the idea of an automatic call translator. The first version was developed specifically for Skype on Android, bringing things full circle to how it all started.

Was there a defining moment when you realized Vidby had the potential to scale?

Yes, I think it was summer 2023. That’s when Vidby became the first company in the world to be officially recommended by YouTube for AI-powered video translation. If you check YouTube’s official support page, you’ll see they initially listed only five companies, and Vidby appeared twice, once directly and once through AIR, our distribution partner. 

What made it even more significant was that YouTube had its own internal translation project, Aloud, which didn’t make the list. But we did. That recognition was a clear sign that we were on the right path. And, of course, it confirmed the high quality and reliability of our technology.

By the end of 2023, we realized that our original target market, YouTube creators, was not ready, and that the market we had built the project for effectively no longer existed. In fact, YouTube creators who use multilingual audio tracks end up penalized by the algorithm. They publish a video, the translation isn’t ready, users click away, and YouTube interprets that as low engagement. Then, when the translation is uploaded, there are no new notifications. The system punishes creators for even trying. Until YouTube fixes this, it simply isn’t a viable market for sustainable growth. At some point, YouTube will probably address this, but as a startup, we couldn’t afford to just sit back and wait for that day.

So, we turned our focus to the enterprise segment, where the market is larger and more mature. We began developing complementary solutions around video translation, for example, real-time interpretation and multilingual video conferencing. That led us to position Vidby as an all-in-one solution.

The arrival of tools like ChatGPT has disrupted many AI companies, but you mentioned this led to new opportunities for Vidby. Can you further comment on this?

We prefer to think in terms of augmented intelligence: not artificial intelligence replacing humans, but tools that make people more effective. As the saying goes, “AI won’t replace you, but someone using AI will.”

Before ChatGPT and the broader AI boom, we were facing two major challenges. First, the general perception of machine translation was, honestly, overwhelmingly negative. Even many of our potential clients would say, “We don’t even want to try it because we’re sure it is as bad as Google Translate.” Second, investors were skeptical because of the lack of competition. In their minds, if there were very few others working in the space, then maybe there wasn’t a real market for the service.

But once OpenAI came into the picture and proved that AI can be both useful and, most importantly, of high quality, and new competitors started to pop up, almost ten a day, both challenges started to resolve themselves. That removed long-standing resistance and accelerated our growth.

It also gave us the momentum to improve our own product. We started to integrate new technologies, both proprietary and third-party, and this raised the quality of our translations to a level that we believed was out of reach before. In translation, quality is everything. Some translators might talk about the “98–99% accuracy” rule, but in translation, a 1% error rate means 1.5 to 2 mistakes per minute. Across that 98–99% range, that’s 15–40 mistakes in a 10-minute video, which is completely unacceptable.
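The arithmetic behind those figures can be sketched as follows. This is an illustration only: the speaking rates of 150 and 200 words per minute are an assumption inferred from the quoted "1.5 to 2 mistakes per minute" at a 1% error rate, errors are assumed to be counted per word, and the function name `expected_errors` is hypothetical.

```python
def expected_errors(words_per_minute: float, error_rate: float, minutes: float) -> float:
    """Expected number of mistranslated words, assuming one error per wrong word."""
    return words_per_minute * error_rate * minutes


# Assumed speaking rates (not stated in the interview).
SPEAKING_RATES_WPM = (150, 200)

# Error rates implied by "98-99% accuracy".
ERROR_RATES = (0.01, 0.02)

for rate in ERROR_RATES:
    for wpm in SPEAKING_RATES_WPM:
        per_minute = expected_errors(wpm, rate, 1)
        per_video = expected_errors(wpm, rate, 10)
        print(
            f"{rate:.0%} error rate at {wpm} wpm: "
            f"{per_minute:.1f} mistakes/min, {per_video:.0f} per 10-minute video"
        )
```

Under these assumptions, the low end (1% at 150 words per minute) gives 15 mistakes in a 10-minute video and the high end (2% at 200 words per minute) gives 40, which matches the range quoted above.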

Our enterprise clients won’t accept even a single error in 10 minutes. That’s why we continue to offer human quality control as an option.

Your collaboration with the Ukrainian President’s Office drew attention. Did that experience prepare you for today’s challenges?

It actually happened by chance, and we went into the collaboration without commercial intentions. It was our way to support Ukraine. I’m from Ukraine, and most of our team is based there. When the war began, and President Zelensky started delivering daily addresses in Ukrainian, I felt it was vital for people around the world to hear his message in their own languages.

For the first three to four months, we translated the President’s daily videos ourselves and distributed them to media outlets around the world. We tried to establish contacts with journalists directly, but it was difficult to gain any traction. Even within Ukraine, it took time to connect with the media, understandably, given that in the early days and weeks of the war, people had other priorities. 

It wasn’t until four or five months in that we finally managed to establish a connection with the President’s Office, through the Ministry of Foreign Affairs. From then on, we began providing translations on request, always free of charge. That continued for about seven to eight months, until they set up an internal system as part of a more coordinated state communication strategy, and our involvement came to an end. 

This was simply our small contribution; we did what we could, and I hope it helped people around the world access information during some of the most difficult moments of the war.

What is Vidby today? How have your structure, team, and focus evolved?

Today, Vidby has transformed from a video translation service into a comprehensive, all-in-one solution for translation. We handle videos, documents, live interpretation, customer support, live streams, and calls. Some competitors show off slick demos, but their tools often fail in real-world conditions. We hear this from clients all the time. Many come to us after trying solutions that even big-name companies promised but failed to deliver. That’s why we focus on what we call “hybrid quality”: AI at the center, with human oversight.

What do you see as Vidby’s key competitive advantage?

Our biggest advantage is that we offer a complete, all-in-one translation solution. As far as I am aware, no other company provides a full suite of AI translation services like Vidby.

Some focus on video, others on calls, or documents, but offering all of these in a single platform is something only we do. 

What makes us different is our sole focus: translation. ChatGPT, for example, handles one-on-one dialogue translation, but it’s just a slightly improved version of Google Translate, and its main function is really chat-based interactions and search. ElevenLabs specializes in voice, HeyGen in avatars, and so on. In all these cases, translation is an additional feature. For us, it’s the main event. It’s where we concentrate all our efforts and energy.

What is your pricing model?

Our pricing depends primarily on two factors: the volume of content and the required quality level. So, we offer four tiers of quality, depending on what a client needs. The price per minute of video translation can range from as low as 0.90 to around 15. In subscription models or bulk packages, that can drop to around 0.50 per minute.

For live interpretation and call translations, we typically charge about 10 per hour. That can cover multiple languages at once. It’s important to us that even enterprise-grade solutions are cost-effective, especially compared to traditional human interpreters, who can cost $150 an hour or more.

We also offer monthly subscription pricing for specific industries. For example, in hospitality, we are developing a solution where hotels can install a simple room service QR code that connects guests with reception, restaurants, or gyms, in the guest’s own language. That service is expected to cost around 3–5 per room, per month.

We’re trying to be thoughtful with pricing rather than take a one-size-fits-all approach.

Where do you see the sectors of tech, video, and language heading in the next five years?

I believe we will see mass adoption. There’s still quite a bit of mental resistance when it comes to communication today. Americans often assume the world speaks English. And around the world, people tend to believe that to communicate or collaborate, everyone needs to speak the same language.

You can test that yourself. Look at the contacts on your phone or email list. You will most likely not find anyone with whom you don’t share a common language. But I have many contacts with whom I don’t have a common language. This reveals a major social problem but also a massive opportunity.

The most transformative companies from around the world are not necessarily the ones that build new technologies. They are the ones that change the behaviors and habits of people, and even their vocabulary.

Vidby is about creating a new habit: allowing people to communicate with the world in their native language. Our mission isn’t to remove language barriers, as people often assume. It’s the opposite. We want to give everyone the ability to speak to the world in their native language, preserving linguistic diversity across the globe. In this sense, our technology doesn’t erase, it protects. The ones eroding diversity are those pushing everyone to speak English. 

I’ve recently proposed to Swiss authorities a pilot project to allow expats to take language exams using AI-assisted tools. The point isn’t to cheat the system. It’s to modernize how we define language proficiency in a world where digital fluency can be just as valuable as memorization.

We’ve seen strong early adoption in sectors where communication across languages is essential. For example, migration centers in Switzerland use our tools to help newcomers access services. Some hospitals are also testing our live translation for patient interaction. Even high-end hair salons have shown interest. The need for seamless communication is everywhere.

We believe that if everyone can communicate in their own language, there is no need to learn a single dominant one. Vidby is already a ready-to-use solution for that.
