Lingua Libre

From Wikipedia, the free encyclopedia
Overview of the website's homepage in December 2020
Type of site: Language recording tool, online linguistic media library
Available in: Multilingual
Owner: Wikimedia France
Created by: Wikimedia France and the Wikimedia community
URL: lingualibre.org
Advertising: No
Commercial: No
Registration: Optional, but required for recording
Launched: August 2016
Current status: Active
Content license: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Lingua Libre is an online collaborative project and tool run by the Wikimedia France association. It aims to build a multilingual, audiovisual corpus under a free license.

Description

Lingua Libre makes it possible to record words, phrases, or sentences in any language, whether spoken (audio recording) or signed (video recording).

Words are presented to the speaker as a list, which can be created on the spot, prepared in advance, or drawn from an existing Wikimedia category. The speaker simply reads the word displayed on the screen, and the software moves on to the next word when it detects silence after the word has been read.[1] This principle, borrowed from open-source recording software with the help of its creator, Nicolas Vion, makes it possible to record several hundred words per hour. The recordings are then uploaded automatically from the web client to the Wikimedia Commons media library.
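The silence-based advance described above can be illustrated with a minimal sketch. This is not Lingua Libre's actual code: the function name `split_on_silence`, the amplitude `threshold`, and the `min_silence` count are all hypothetical choices made for illustration, and the input is assumed to be a plain sequence of amplitude values rather than a real audio stream.

```python
from typing import Iterable, List


def split_on_silence(
    samples: Iterable[float],
    threshold: float = 0.05,   # hypothetical loudness cutoff
    min_silence: int = 3,      # hypothetical number of quiet samples that ends a word
) -> List[List[float]]:
    """Split amplitude samples into one segment per spoken word.

    A segment is closed once `min_silence` consecutive samples fall
    below `threshold` after some speech has been heard.
    """
    segments: List[List[float]] = []
    current: List[float] = []
    quiet_run = 0
    for s in samples:
        if abs(s) >= threshold:
            # Loud sample: part of a word; any short pause so far is kept.
            current.append(s)
            quiet_run = 0
        elif current:
            # Quiet sample after speech: count it toward the closing silence.
            current.append(s)
            quiet_run += 1
            if quiet_run >= min_silence:
                # Enough silence: drop the trailing quiet samples and move on.
                segments.append(current[:-quiet_run])
                current = []
                quiet_run = 0
        # Quiet samples before any speech are simply skipped.
    if current:
        # Stream ended mid-word: keep it, minus any trailing quiet samples.
        segments.append(current[: len(current) - quiet_run])
    return segments


if __name__ == "__main__":
    words = ["bonjour", "merci"]
    stream = [0.0, 0.0, 0.5, 0.6, 0.01, 0.4, 0.0, 0.0, 0.0, 0.0, 0.7, 0.8, 0.0]
    for word, segment in zip(words, split_on_silence(stream)):
        print(word, segment)
```

Each detected segment is paired with the next word in the list, which is why the speaker never has to press a key between words. A single quiet sample inside a word (the 0.01 above) does not end the segment, because it is shorter than `min_silence`.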

In spring 2021, Lingua Libre was offline because of a fire at the OVHcloud data center in Strasbourg,[2] but no audio recordings were lost.[3]

Use of the recordings

The recordings can be consulted either on Lingua Libre or on Commons. They are mainly used on other Wikimedia projects, for example to illustrate entries on Wiktionaries or proper nouns in Wikipedia articles.[1]

Reuse of the recordings in language teaching is also envisaged.

The recordings are also reused in natural language processing projects, for example to train models for Mozilla's DeepSpeech speech recognition engine.[4]

Versions

Lingua Libre was initiated on January 23, 2015[5] and has had three main versions:

Lingua Libre v.1 (2016)

As part of the Languages of France project, which aims to document and promote the regional languages of France on Wikimedia projects and on the Internet in general, design work on Lingua Libre started in November 2015, partly funded by the DGLFLF (General Delegation for the French Language and the Languages of France). The first version was launched in August 2016. Suitable only for audio recording, it was shown during a workshop on the Occitan language in December 2016,[6][7] and then presented to the online Wikimedia community[8] and at international events in 2017.

Lingua Libre v.2 (2018)

A complete rebuild began at the end of 2017. The new version of Lingua Libre is based on MediaWiki and uses Wikibase and OAuth to integrate better with the Wikimedia environment. The interface is translated via Translatewiki.net so that the project can be used by a large number of communities. The new version of the site was ready in June 2018[9] and opened to the public in August 2018.

Lingua Libre v.2.2 (2020)

In 2020, major changes were made to the platform: a new look was developed specifically for the site, and the .org domain replaced the .fr domain used until then.[10] Lingua Libre now supports sign languages through video recording.

Statistics

A recording session with a speaker of the Atikamekw language in 2017 in Montreal.

In the first two years after the project's launch, approximately 10,000 recordings were made. The transition to v.2 was accompanied by a sharp increase in contributions: the number of recordings grew tenfold in less than a year, passing the 100,000 mark in May 2019. These recordings were made by 127 speakers in almost 50 languages.[11] By September 2020, the platform had more than 300,000 recordings in 90 languages from more than 350 speakers. The 500,000-recording milestone was reached in June 2021, thanks to 540 speakers of 120 languages.[12]

References

  1. Sabine Buchwald (2019-08-04). "Wie Wikipedia Bairisch lernt". Süddeutsche Zeitung (in German).
  2. "France : un incendie se déclare au datacenter OVHcloud de Strasbourg". Wikinews French (in French). 2021-03-11.
  3. "Lingua Libre 2.3 - Phoenix Edition !". Meta-wiki. 2021-03-19.
  4. "Modèle français 0.4 pour DeepSpeech v0.6". Mozilla Discourse. 2020-03-10.
  5. Rémy Gerbet (2018-05-14). "Lingua Libre : un nouvel outil collaboratif pour le public et les chercheurs". Culture et Recherche (in French) (137): 52. ISSN 1950-6295.
  6. French Ministry of Culture (2016-11-16). "Oc-a-thon 2016 : deux journées contributives sur l'occitan les 9 et 10 décembre" (in French).
  7. Mathieu Denel (2016-12-21). "L'oc-a-thon, un edit-a-thon pour enrichir les projets Wikimedia et Lingua Libre en langue occitane". Wikimédia France Web Blog (in French). Retrieved 2020-12-03.
  8. French-speaking Wiktionarists (2017-08-01). "Lingua Libre". Actualités du Wiktionnaire (in French). Retrieved 2020-12-02.
  9. French-speaking Wiktionarists (2018-07-01). "Lingua Libre". Actualités du Wiktionnaire (in French). Retrieved 2020-12-02.
  10. Sara Krichen (2020-06-02). "Lingua Libre fait peau neuve !". Wikimédia France Web Blog (in French). Retrieved 2020-12-02.
  11. Miguel Trancozo Trevino (2020-04-15). "The many languages missing from the internet". BBC.com.
  12. Lingua Libre's statistics page.
