Google has unveiled Gemini 3.5 Live Translate, a new model for real-time voice translation. The neural network automatically recognizes more than 70 languages, which makes it a versatile tool for international communication. A key feature of the model is that it preserves the speaker's original intonation, tempo and pitch, which makes the translation sound more natural.
Gemini 3.5 Live Translate may prove useful in business, education and travel, wherever instant and accurate translation is needed. The technology can ease communication between people who speak different languages and foster deeper understanding between cultures. Google continues to develop its AI technologies with the aim of making them available to a wide audience.
Google has introduced a new model, Gemini 3.5 Live Translate, designed for voice translation in real time. The neural network automatically recognizes more than 70 languages and generates a translation, preserving the original intonation, tempo and pitch of the speaker’s voice.
Unlike systems that wait for the end of a phrase, the new model processes the audio stream continuously as it arrives. The delay between the original utterance and its translation is a few seconds, which helps avoid unnatural pauses in the dialogue. The model is adapted to noisy environments and requires no manual tuning of additional parameters. For security and to combat disinformation, all generated audio is marked with an invisible SynthID digital watermark.
The tool is already available to developers in public preview through the Gemini Live API and Google AI Studio. Integrations with specialized platforms such as Agora, LiveKit and Vision Agents let developers create applications without building their own complex media streaming infrastructure. The technology is also being trialled by Grab, the Southeast Asian ride-hailing and delivery service, which handles more than 10 million calls per month, to enable communication between drivers and customers.
In the corporate segment, the rollout begins this month as a closed test for Google Workspace subscribers in the Google Meet video conferencing service. The update will allow conversations to be translated across more than 2,000 language combinations within a single meeting, whereas the previous version of the system supported only five languages and required English as one of them. A broad release for business customers is planned for the end of the year.
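To put the "more than 2,000 language combinations" figure in context, the short Python sketch below counts language pairs. The article does not say how Google counts combinations or how many languages Meet supports, so both counting conventions (unordered pairs and directed source-to-target pairs) and the function names are assumptions made purely for illustration.

```python
from math import comb


def unordered_pairs(n: int) -> int:
    """Language pairs where A<->B counts once (assumed convention)."""
    return comb(n, 2)


def ordered_pairs(n: int) -> int:
    """Directed pairs where A->B and B->A count separately (assumed convention)."""
    return n * (n - 1)


def min_languages(target: int, ordered: bool) -> int:
    """Smallest number of languages whose pair count exceeds `target`."""
    count = ordered_pairs if ordered else unordered_pairs
    n = 2
    while count(n) <= target:
        n += 1
    return n


if __name__ == "__main__":
    print(unordered_pairs(70))                  # 2415
    print(ordered_pairs(70))                    # 4830
    print(min_languages(2000, ordered=False))   # 64
    print(min_languages(2000, ordered=True))    # 46
```

Under these assumptions, 70 languages would yield 2,415 unordered pairs, which is consistent with "more than 2,000". If directed pairs were counted instead, 46 languages would already be enough.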
Consumers will get the feature through a global update to the Google Translate app on iOS and Android. With any headphones connected, the system will broadcast translated speech directly to the interlocutor. Owners of Android devices also get a special “listening mode” that plays the translation through the phone’s speaker. As Google explains in a blog post, this feature can be useful in situations where "you need to quickly hear the translation without attracting the attention of others, and you don't have headphones handy."