Automatic Generation of Audio Descriptions for People with Visual Impairments

Description:
Sub-project 4 (SP4) of the Flagship IICT (PFFS-21-46) focuses on the automatic generation of audio descriptions to make televised content more accessible to people with visual impairments. AI algorithms extract visual information from scenes without dialogue and generate structured text descriptions, which a synthetic voice then reads aloud.
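
As an illustration only, the first step of such a pipeline (locating the stretches without speech where a description can be placed) might be sketched as follows. The source does not say how SP4 detects these scenes. This sketch assumes that speech timings are already available, for example from subtitle cues, and every name in it (Cue, find_silent_gaps, fits_gap, min_gap, words_per_second) is hypothetical rather than taken from the project.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cue:
    """A spoken segment (e.g. one subtitle cue), in seconds. Hypothetical type."""
    start: float
    end: float


def find_silent_gaps(
    cues: List[Cue], program_end: float, min_gap: float = 2.0
) -> List[Tuple[float, float]]:
    """Return (start, end) intervals of at least min_gap seconds with no speech."""
    gaps = []
    cursor = 0.0  # end of the latest speech seen so far
    for cue in sorted(cues, key=lambda c: c.start):
        if cue.start - cursor >= min_gap:
            gaps.append((cursor, cue.start))
        cursor = max(cursor, cue.end)  # tolerate overlapping cues
    if program_end - cursor >= min_gap:
        gaps.append((cursor, program_end))
    return gaps


def fits_gap(
    description: str, gap: Tuple[float, float], words_per_second: float = 2.5
) -> bool:
    """Rough check that a description can be spoken within the gap.

    The speaking rate is an assumed constant, not a measured value.
    """
    duration = gap[1] - gap[0]
    return len(description.split()) / words_per_second <= duration


if __name__ == "__main__":
    cues = [Cue(0.0, 4.0), Cue(4.5, 9.0), Cue(15.0, 20.0)]
    for gap in find_silent_gaps(cues, program_end=25.0):
        print(gap, fits_gap("A woman walks along the lake shore", gap))
```

In a complete system, each gap would then be passed to a visual description model and a text-to-speech engine. Both are left out here because the source does not name the components used.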

Experts/Researchers/Institutions:

Prof. Julien Torrent (iCARE)
Swiss TXT

Why it matters:
This project goes beyond traditional manual audio description by making audiovisual content accessible to audiences who are often underserved. AI makes it possible to produce descriptions in real time, which could not be done manually at large scale, and so improves the quality of life and autonomy of blind and visually impaired people.

Funding:
Innosuisse