Service

AI live translation for conferences.

Real-time captions on the room screen and voice translation for online viewers. One speaker, many languages for the audience, one team on event day. Runs on our own econf.ai platform, built for conferences.

Two delivery modes

1. Captions on the room screen

Text is translated live and shown on the main screen or a second screen next to the speaker. Useful for mixed audiences where some delegates listen in the source language and others want to follow in their own. The same caption stream supports attendees with hearing impairments and helps the event meet European Accessibility Act (EAA) requirements.

2. Voice translation for online viewers

Synthetic voice is delivered to the online stream only. The viewer picks a language from a menu, while the main video feed stays clean. Voice translation is not played in the room because it would overlap with the speaker. Two, three, or six languages can be added per event without changing the rest of the production.

What to expect on event day

  • Latency from word to caption: roughly 6 to 9 seconds. Online, the video stream is usually delayed by a similar amount, so captions stay in sync with the picture; in the room, captions trail the spoken sentence the way a human interpreter does.
  • Accuracy depends on audio: a close-talk mic gives a better result than a room mic. The clearer the speaker and the better the mic, the cleaner the translation.
  • On-site monitoring: our technician watches the stream during the event, checks that captions are flowing, and intervenes if anything looks off on screen.
  • Failover stream: if the primary translation system fails, a secondary one takes over automatically without operator intervention.
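
The automatic failover described above can be pictured with a minimal sketch. It is illustrative only: the names `StreamStatus` and `pick_active_stream`, the heartbeat mechanism, and the 5-second threshold are assumptions made for this example, not a description of how the econf.ai platform is built.

```python
from dataclasses import dataclass

# Hypothetical threshold: how long a system may go without a heartbeat
# before it is treated as failed. The real value is not stated here.
STALE_AFTER_SECONDS = 5.0


@dataclass
class StreamStatus:
    name: str
    last_heartbeat_at: float  # time (in seconds) of the most recent heartbeat


def is_healthy(stream: StreamStatus, now: float) -> bool:
    """A stream counts as healthy if it sent a heartbeat recently."""
    return (now - stream.last_heartbeat_at) <= STALE_AFTER_SECONDS


def pick_active_stream(primary: StreamStatus, secondary: StreamStatus, now: float) -> str:
    """Prefer the primary; switch to the secondary without operator input."""
    if is_healthy(primary, now):
        return primary.name
    if is_healthy(secondary, now):
        return secondary.name
    # Neither is healthy: keep the primary selected and leave it to the technician.
    return primary.name


# The primary has been silent for 12 seconds, the secondary for 1 second.
active = pick_active_stream(
    StreamStatus("primary", last_heartbeat_at=88.0),
    StreamStatus("secondary", last_heartbeat_at=99.0),
    now=100.0,
)
print(active)  # secondary
```

In this sketch the switch needs no operator action, which mirrors the behaviour promised above; the on-site technician remains the fallback if both systems go quiet.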

Languages

For Lithuanian events the most common pair is EN ↔ LT, often with RU added. International events typically include PL, DE, UK, FR, ES, and others on request. The viewer menu shows only the languages enabled for the specific event.

  • EN ↔ LT
  • RU ↔ LT
  • PL ↔ LT
  • DE ↔ LT
  • UK ↔ LT
  • FR ↔ LT
  • ES ↔ LT
  • others on request

When it fits, when it does not

Good fit

  • Hybrid conference with online viewers in different countries.
  • Training, presentations, professional conferences with structured delivery.
  • Events where attendees with hearing impairments are present, or EAA compliance is needed.
  • Events where the budget cannot cover interpreter booths for two or three languages.
  • Multilingual audience that should not be split into separate rooms.

Not a fit

  • High-stakes diplomatic events where every reference and nuance carries weight.
  • Fast panel discussions where speakers cut in over each other.
  • Narrow technical domains with novel terminology and no context.
  • Single-language audience with a single-language speaker, where it adds nothing.

What is included with ProConf

  • Audio feed into the translation system, taken from the same desk that already runs room sound.
  • Caption output to the room screen, second screen, or a strip on the main screen.
  • Online stream language menu, configured for your event's language set.
  • Monitoring on event day: our technician watches the stream and reacts if something drifts.
  • Failover stream: takes over automatically if the primary fails.
  • Caption archive: CSV and SRT, per language, after the event.

Pricing

AI translation is part of the event package, not a separate subscription. A hybrid event with filming, streaming, and econf.ai platform services (AI translation, Q&A, registration) starts from:

€1,200 excl. VAT

The final price depends on the language set, camera configuration, mic count, and scenography. LED screens and stage are sourced via partners and added separately. A standalone AI translation rate (for integrating into another team's AV setup) will be published with the pricing calculator in the coming months.

Frequently asked

How does AI live translation work at a conference?

Audio from the speaker mic is continuously fed into the translation system. Text is translated in real time and sent both to the room screen and to the online stream. Each online viewer picks their language independently from a menu, so adding more languages does not slow the main video stream.

How is this different from a human simultaneous interpreter?

A booth with two human interpreters is the better choice for high-stakes events where nuance, irony, and rhetorical turns matter. AI translation is the better choice when you need many languages at once (e.g. 200 online viewers in nine countries), when the budget cannot fit interpreter booths, or when speaker delivery is structured (presentations, training, moderated panels). In practice we often combine both: human interpreters for the room, AI captions for online viewers.

What languages are supported?

For Lithuanian events, EN ↔ LT is the most common pair, often with RU added. International events typically include PL, DE, UK, FR, ES, and others on request. Only the languages relevant to your event appear in the viewer menu. Adding more languages does not require additional infrastructure.

What is the latency?

Roughly 6 to 9 seconds from spoken word to caption on screen. The online video stream is usually delayed by a similar amount, so captions appear in sync with the picture. In the room, captions naturally lag the spoken sentence, the same way a human interpreter does.
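
The sync arithmetic can be sketched in a few lines. The function name and the sample values are illustrative assumptions; only the 6 to 9 second caption latency comes from the figures above, and the 7-second online stream delay is a made-up example value.

```python
def perceived_caption_lag(caption_latency_s: float, video_delay_s: float) -> float:
    """Seconds by which a caption trails what the viewer sees and hears.

    In the room the speaker is heard live, so video_delay_s is 0.
    Online, the video itself arrives late, which offsets the caption latency.
    """
    return caption_latency_s - video_delay_s


# Room audience: live speaker, caption latency inside the 6 to 9 second range.
room_lag = perceived_caption_lag(caption_latency_s=7.5, video_delay_s=0.0)

# Online viewer: assumed stream delay of 7 seconds (hypothetical value).
online_lag = perceived_caption_lag(caption_latency_s=7.5, video_delay_s=7.0)

print(f"room: {room_lag:.1f} s, online: {online_lag:.1f} s")
# room: 7.5 s, online: 0.5 s
```

With these example numbers the room audience sees captions several seconds after the words, while the online viewer sees them within about half a second of the picture.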

How accurate is it?

Accuracy depends on three things: mic position (close-talk beats room mic), speaker clarity (speed, accent, pauses), and terminology (narrow domain with novel terms is harder). For typical English or Lithuanian talks with good audio, the output is good enough for presentations, training, and most professional conferences. For high-stakes speeches where every reference matters, we recommend human interpreters.

Does this help with European Accessibility Act (EAA) compliance?

Yes. Real-time captions make the event accessible to attendees with hearing impairments, one of the core EAA requirements for public training and conference events. The post-event caption archive (CSV/SRT) is available for documentation.

Do we need extra equipment in the room?

No. The caption signal comes from the same control desk that drives the main screen. We only need one spare HDMI or NDI output to a screen, typically a second screen or a strip along the bottom of the main screen. The online stream needs nothing extra: captions and voice are added in software.

What do we get after the event?

A full caption transcript in CSV and SRT, per language. If you order an Event Intelligence Page, the transcript becomes the basis of a post-event page with a Q&A summary and per-speaker session excerpts. Publish it openly or keep it private to attendees.
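
For teams that want to process the archive themselves, the sketch below shows what the two formats look like. SRT is the standard subtitle layout (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line). The CSV column names, the function names, and the sample captions are assumptions made for this example; the delivered files may be laid out differently.

```python
import csv
import io


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(captions):
    """Render (start_s, end_s, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(captions, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


def to_csv(captions):
    """Render the same tuples as CSV with a header row (assumed columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["start_s", "end_s", "text"])
    writer.writerows(captions)
    return buffer.getvalue()


# Sample captions, invented for illustration.
captions = [
    (0.0, 2.5, "Good morning and welcome."),
    (2.5, 6.0, "Today we will talk about hybrid events."),
]

print(to_srt(captions))
print(to_csv(captions))
```

One file per language keeps each transcript usable on its own, for example as a subtitle track for the event recording.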

How much does it cost?

AI translation is bundled into the event package, not sold as a separate subscription. A hybrid event with filming, streaming, and econf.ai platform services (AI translation, Q&A, registration) starts from €1,200 excl. VAT. A standalone AI translation rate (for integrating into another team's AV setup) will be published with the pricing calculator in the coming months.

Talk through your event.

Send us the event description, the language set, and an estimate of online viewers. We will include AI translation in the quote with concrete numbers.

kristijonas@proconf.lt

+370 601 45 516 · Kaunas · Reg. 303469970