Tutorial_WebMAUS_wav2vec2_whisper
This repository contains all necessary material for the tutorial:
Jalal Al-Tamimi (2025): Arabic Forced Alignment: From WebMAUS to Whisper and wav2vec2, delivered at the 11th RJCP (Rencontres Jeunes Chercheurs en Parole) - Workshop TAL (LLF) on 5 November 2025.
The slides of the talk can be found here.
To cite this repository and the material in it, use:
Al-Tamimi, J. (2025). Arabic Forced Alignment: From WebMAUS to Whisper and wav2vec2 [TAL workshop Rencontres des Jeunes Chercheurs en Parole 2025 (11e édition)]. https://jalalal-tamimi.github.io/Tutorial_WebMAUS_wav2vec2_whisper/.
DOI: https://doi.org/10.5281/zenodo.17485787 
Please also cite the original sources of the adapted material:
- The forced alignment with wav2vec2 is adapted from https://docs.pytorch.org/audio/stable/tutorials/forced_alignment_tutorial.html
- The audio file used with whisper comes from https://github.com/bnosac/audio.whisper/tree/master/inst/samples
Here is a list of all required files:
Material for WebMAUS
- Arabic WebMAUS.pdf: This is the PDF of the Al-Tamimi et al. (2022) LREC paper
- example.wav: This is the wav (audio) file of the Arabic text, from Al-Tamimi et al. (2022) LREC paper
- example.txt: This is the text file with the transcriptions of the Arabic text, from Al-Tamimi et al. (2022) LREC paper
Forced alignment with wav2vec2
The original material for the forced alignment with wav2vec2 comes from the PyTorch audio tutorial (https://docs.pytorch.org/audio/stable/tutorials/forced_alignment_tutorial.html). It was adapted to include the modules to install locally and to generate TextGrids with boundaries for words and sounds. Here are the required files:
- forced_alignment_tutorial_JAT.Rmd: This is the Rmd file that runs Python from within RStudio
- forced_alignment_tutorial_JAT.html: This is the html output
- forced_alignment_tutorial_JAT.py: This is the Python script to be used
- forced_alignment_tutorial_JAT.ipynb: This is the Python notebook in ipynb
- merged.TextGrid: This is the Praat TextGrid file generated with two tiers: one for words and one for segments
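For readers unfamiliar with the TextGrid format, the following is a minimal sketch of how a two-tier file (words and segments) could be written in Praat's long text format. It is not the code from forced_alignment_tutorial_JAT.py; the function names `format_tier` and `build_textgrid` and the example intervals are hypothetical and only illustrate the layout.

```python
from typing import Dict, List, Tuple

# One interval: (start in seconds, end in seconds, label)
Interval = Tuple[float, float, str]


def format_tier(index: int, name: str, intervals: List[Interval], xmax: float) -> List[str]:
    """Return the lines of one IntervalTier in Praat's long text format."""
    lines = [
        f"    item [{index}]:",
        '        class = "IntervalTier"',
        f'        name = "{name}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, label) in enumerate(intervals, start=1):
        # Praat escapes a double quote inside a label by doubling it.
        escaped = label.replace('"', '""')
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    return lines


def build_textgrid(tiers: Dict[str, List[Interval]], xmax: float) -> str:
    """Assemble a TextGrid with one IntervalTier per dictionary entry."""
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        f"size = {len(tiers)}",
        "item []:",
    ]
    for index, (name, intervals) in enumerate(tiers.items(), start=1):
        lines += format_tier(index, name, intervals, xmax)
    return "\n".join(lines) + "\n"


# Made-up boundaries for illustration; real values come from the aligner.
example_tiers = {
    "words": [(0.0, 0.5, "hi"), (0.5, 1.0, "")],
    "segments": [(0.0, 0.2, "h"), (0.2, 0.5, "i"), (0.5, 1.0, "")],
}
textgrid_text = build_textgrid(example_tiers, xmax=1.0)
print(textgrid_text)
```

The resulting string can be saved with a `.TextGrid` extension (UTF-8 encoding) and opened in Praat alongside the audio. Each tier is expected to cover the whole duration without gaps, which is why the example ends both tiers with an empty interval.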
Automatic transcription with whisper + TextGrid
This section contains original material to run whisper for automatic transcription. The files are written in the R programming language, with Python linked to RStudio. Here are the required files:
- whisper_R.Rmd: This is an Rmd file that runs code in both R and Python (towards the end)
- whisper_R.nb.html: This is an R notebook with the output of results
- jfk.wav: This is the audio file used to run the whisper code. The audio file originally came from https://github.com/bnosac/audio.whisper/tree/master/inst/samples
- words_whisper.TextGrid: This is the Praat TextGrid file generated with one tier for words
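Word timestamps from an automatic transcription usually leave silences between words, whereas a Praat interval tier has to cover the whole recording without gaps. The sketch below shows one possible way to pad word timings with empty intervals before writing a words tier. It is an illustration only, not the code in whisper_R.Rmd; the function name `fill_gaps`, the timings, and the labels are made up.

```python
from typing import List, Tuple

# One interval: (start in seconds, end in seconds, label)
Interval = Tuple[float, float, str]


def fill_gaps(words: List[Interval], total_duration: float) -> List[Interval]:
    """Insert empty intervals so the tier spans 0..total_duration contiguously."""
    filled: List[Interval] = []
    cursor = 0.0  # end time of the last interval added so far
    for start, end, label in sorted(words):
        if start > cursor:
            # Silence before this word becomes an empty interval.
            filled.append((cursor, start, ""))
        # If a word starts before the previous one ended, clip its start.
        start = max(start, cursor)
        if end > start:
            filled.append((start, end, label))
            cursor = end
    if cursor < total_duration:
        # Trailing silence up to the end of the recording.
        filled.append((cursor, total_duration, ""))
    return filled


# Hypothetical word timings (seconds) for a 2-second recording.
word_timings = [(0.5, 1.0, "ask"), (1.2, 1.5, "not")]
contiguous = fill_gaps(word_timings, total_duration=2.0)
for interval in contiguous:
    print(interval)
```

Clipping overlapping words to the previous end time is a simplifying assumption; other strategies, such as splitting the overlap at its midpoint, may suit some data better.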