Tutorial_WebMAUS_wav2vec2_whisper

This repository contains all the material needed for the tutorial by Jalal Al-Tamimi (2025), Arabic Forced Alignment: From WebMAUS to Whisper and wav2vec2, delivered at the 11th RJCP (Rencontres Jeunes Chercheurs en Parole), Workshop TAL (LLF), on 5 November 2025.

The slides of the talk can be found here.

To cite this repository and the material in it, use:

Al-Tamimi, J. (2025). Arabic Forced Alignment: From WebMAUS to Whisper and wav2vec2 [TAL workshop Rencontres des Jeunes Chercheurs en Parole 2025 (11e édition)]. https://jalalal-tamimi.github.io/Tutorial_WebMAUS_wav2vec2_whisper/. DOI: https://doi.org/10.5281/zenodo.17485787

Please also cite the original sources of the following:

The forced-alignment code for wav2vec2, which comes from https://docs.pytorch.org/audio/stable/tutorials/forced_alignment_tutorial.html
The audio file used with whisper, which comes from https://github.com/bnosac/audio.whisper/tree/master/inst/samples

Here is a list of all required files:

Material for WebMAUS

Arabic WebMAUS.pdf: This is the PDF of the Al-Tamimi et al. (2022) LREC paper
example.wav: This is the wav audio file that accompanies the transcription in example.txt, from the Al-Tamimi et al. (2022) LREC paper
example.txt: This is the text file with the transcription of the Arabic text, from the Al-Tamimi et al. (2022) LREC paper

Forced alignment with wav2vec2

The original material for forced alignment with wav2vec2 comes from this website. It was adapted to install the required modules locally and to generate TextGrids with boundaries for words and sounds. Here are the required files:

forced_alignment_tutorial_JAT.Rmd: This is the Rmd file using python from within RStudio
forced_alignment_tutorial_JAT.html: This is the html output
forced_alignment_tutorial_JAT.py: This is the Python script to be used
forced_alignment_tutorial_JAT.ipynb: This is the Python notebook in ipynb
merged.TextGrid: This is the Praat TextGrid file generated with two tiers; one for words and one for segments
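
As an illustration of what a two-tier file such as merged.TextGrid contains, here is a minimal sketch that writes word and segment intervals in Praat's long TextGrid text format. It is not the code used in the tutorial scripts: the function name `textgrid_from_tiers`, the tier names, and the example times and labels are all hypothetical, and the sketch assumes the intervals of each tier are already contiguous and cover the whole file from 0 to `xmax`.

```python
def textgrid_from_tiers(tiers, xmax):
    """Return a long-format Praat TextGrid as a string.

    tiers: dict mapping tier name -> list of (start, end, label) tuples.
    Assumption: each tier's intervals are contiguous and span 0..xmax.
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        'xmin = 0',
        f'xmax = {xmax}',
        'tiers? <exists>',
        f'size = {len(tiers)}',
        'item []:',
    ]
    for i, (name, intervals) in enumerate(tiers.items(), start=1):
        lines += [
            f'    item [{i}]:',
            '        class = "IntervalTier"',
            f'        name = "{name}"',
            '        xmin = 0',
            f'        xmax = {xmax}',
            f'        intervals: size = {len(intervals)}',
        ]
        for j, (start, end, label) in enumerate(intervals, start=1):
            # Praat escapes a double quote inside a label by doubling it.
            escaped = label.replace('"', '""')
            lines += [
                f'        intervals [{j}]:',
                f'            xmin = {start}',
                f'            xmax = {end}',
                f'            text = "{escaped}"',
            ]
    return "\n".join(lines) + "\n"


# Invented example values, for illustration only.
example_tiers = {
    "words": [(0.0, 0.4, ""), (0.4, 1.0, "kitab"), (1.0, 1.2, "")],
    "segments": [
        (0.0, 0.4, ""), (0.4, 0.5, "k"), (0.5, 0.65, "i"),
        (0.65, 0.75, "t"), (0.75, 0.9, "a"), (0.9, 1.0, "b"),
        (1.0, 1.2, ""),
    ],
}
textgrid_text = textgrid_from_tiers(example_tiers, xmax=1.2)

# Arabic labels need UTF-8 so that Praat can read them back.
with open("sketch.TextGrid", "w", encoding="utf-8") as handle:
    handle.write(textgrid_text)
```

The resulting sketch.TextGrid can be opened in Praat together with the matching audio file to inspect the word and segment boundaries.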

Automatic transcription with whisper + TextGrid

This section contains original material to run whisper for automatic transcription. The files are written in R, with Python linked to RStudio. Here are the required files:

whisper_R.Rmd: This is an Rmd file that runs code in both R and Python (the Python part comes towards the end)
whisper_R.nb.html: This is an R notebook with the output of results
jfk.wav: This is the audio file used to run the whisper code. The audio file came originally from here
words_whisper.TextGrid: This is the Praat TextGrid file generated with one tier for words
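
A Praat interval tier has to cover the whole recording without gaps, whereas word-level timestamps from a speech recogniser usually leave silences between words. The sketch below shows one way to pad such timestamps with empty intervals before writing a single word tier. It is an assumption-laden illustration, not the code in whisper_R.Rmd: the helper `fill_gaps`, the `(start, end, label)` tuple layout, and the example words and times are hypothetical.

```python
def fill_gaps(words, total_duration, tolerance=1e-6):
    """Pad word intervals with empty intervals so they cover 0..total_duration.

    words: list of (start, end, label) tuples, sorted by start time.
    A start that falls before the previous end is clipped to that end.
    """
    intervals = []
    cursor = 0.0  # end time of the last interval written so far
    for start, end, label in words:
        if start - cursor > tolerance:
            # Silence before this word: insert an empty interval.
            intervals.append((cursor, start, ""))
            cursor = start
        intervals.append((cursor, end, label))
        cursor = end
    if total_duration - cursor > tolerance:
        # Trailing silence after the last word.
        intervals.append((cursor, total_duration, ""))
    return intervals


# Invented timestamps, for illustration only.
words = [(0.5, 1.0, "one"), (1.0, 1.4, "two"), (1.9, 2.3, "three")]
filled = fill_gaps(words, total_duration=3.0)

for start, end, label in filled:
    print(f"{start:.2f}\t{end:.2f}\t{label!r}")
```

Each interval now starts exactly where the previous one ends, which is the shape a one-tier word TextGrid expects.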
