HAIKO leads project on Automatic Transcription and Translation using Artificial Intelligence

STREAMS PROJECT – Live Streaming with Automated Multilingual Subtitling

In line with HAIKO Technologies' strategy of becoming a reference in Artificial Intelligence-based services and solutions, our R&D department has taken part in the STREAMS – Live Streaming with Automated Multilingual Subtitling project consortium, within the scope of SPRI’s HAZITEK program, during 2021, 2022 and 2023.

HAIKO has acted as leader of this project in a consortium formed by the companies GOIENA, JARKATZA, MIXER, MONDRAGON LINGUA, NOTICIAS DE GIPUZKOA and ONDA VASCA, with the technical collaboration of the research center Fundación Vicomtech, member of the Basque Research and Technology Alliance (BRTA) and RVCTI.

Artificial intelligence technologies, machine learning, neural networks and Masked Language Models have been used in the execution of the project.
As tangible results of the project, a technological solution with the following capabilities has been implemented:

  • Transcription of audio and video content in Spanish, Basque, English and French.
  • Automatic translation between the above-mentioned languages.
  • Automatic generation of multilingual subtitles.
  • Speech synthesis, i.e. the creation of audio from text using Artificial Intelligence models.

Use cases

The results of the project will allow the member companies of the consortium to generate new business opportunities. In the case of HAIKO Technologies, the technological advance that STREAMS represents will allow us to integrate Transcription (“Speech-To-Text”) and Automatic Translation in corporate environments, for use cases such as:

  • Recording and analysis of meetings.
  • Customer service.
  • Support for internal teams.
  • Document management systems.
  • Internal and external training.
  • On-boarding processes.
  • Training of teams in multi-cultural and geographically distributed environments.
  • Deployment of multilingual services.
  • Groups with visual and/or cognitive disabilities.

Privacy and security

STREAMS allows us to run these services in private (“on-premise”) environments, without exposing our clients’ data to external cloud service providers, thus keeping their content private.

Scientific publications

As a result of the work of the scientific team, the researchers David Ponce, Thierry Etchegoyhen and Victor Ruiz, members of the Vicomtech Foundation, Basque Research and Technology Alliance (BRTA) and the University of the Basque Country UPV/EHU, have published the following paper: Unsupervised Subtitle Segmentation with Masked Language Models.

It describes a novel unsupervised method for subtitle segmentation based on pretrained Masked Language Models, in which line endings and subtitle breaks are predicted from the probability that a punctuation mark occurs at each candidate segmentation point.

The method obtained competitive results in terms of segmentation accuracy in all metrics, while fully preserving the original text and meeting length constraints. Although supervised models trained on in-domain data and with access to source audio information can provide higher segmentation accuracy, this method is highly portable across languages and domains and can be a robust solution for subtitle segmentation.
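
To make the idea of the publication more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a scoring function that returns, for each candidate point, the probability that a punctuation mark would occur there. In the paper that probability comes from a Masked Language Model; here `toy_score` is a hypothetical rule-based stand-in so that the example runs without any model. The greedy strategy, the name `segment` and the 42-character limit are also assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical scorer type: given the tokens and an index i, return the
# probability that a punctuation mark would occur right after tokens[i].
PunctScorer = Callable[[List[str], int], float]


def toy_score(tokens: List[str], i: int) -> float:
    """Rule-based stand-in for a masked language model (assumption)."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
    if tokens[i][-1] in ",.;:?!":
        return 1.0  # existing punctuation: very likely break point
    if nxt in {"and", "but", "or", "because", "which"}:
        return 0.5  # a clause boundary often precedes a conjunction
    return 0.1


def segment(tokens: List[str], score: PunctScorer, max_chars: int = 42) -> List[str]:
    """Greedy sketch: among the break points that keep the line within
    max_chars, pick the one with the highest punctuation probability."""
    lines: List[str] = []
    start = 0
    while start < len(tokens):
        candidates: List[int] = []
        length = 0
        for end in range(start, len(tokens)):
            length += len(tokens[end]) + (1 if end > start else 0)
            if length > max_chars:
                break
            candidates.append(end)
        if not candidates:
            candidates = [start]  # single over-long token: emit it alone
        if candidates[-1] == len(tokens) - 1:
            best = candidates[-1]  # the remaining text fits on one line
        else:
            best = max(candidates, key=lambda i: score(tokens, i))
        lines.append(" ".join(tokens[start:best + 1]))
        start = best + 1
    return lines


text = ("The method preserves the original text, meets length "
        "constraints and is portable across languages and domains.")
lines = segment(text.split(), toy_score)
for line in lines:
    print(line)
```

Because the sketch only inserts breaks between existing words, joining the lines with spaces gives back the original text, and every line respects the length limit. Replacing `toy_score` with probabilities obtained from a real Masked Language Model would bring the sketch closer to the approach described in the paper.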

All the information on this scientific publication can be found at: https://aclanthology.org/2023.acl-short.67/
