I was recently looking for French podcasts and had trouble finding ones that were interesting yet still understandable. It seemed like a fun programming project, so I wrote a script that assesses how difficult French podcasts are for oral comprehension.
How the script works
- It takes the Spotify URL of a French podcast
- It downloads a 30-60 second excerpt of the podcast using the Spotify API
- It transcribes the audio using the Google Cloud Speech API (French words only)
- It checks what percentage of the words are uncommon, meaning they are not in a list of the 1000 most common French words. Both the word list and the transcript are lemmatized so that conjugated verbs and plurals are still identified as the same word, e.g., so that “suis” and “es” both match “être”.
- It estimates the speaking speed in words per minute (number of French words in the transcript / excerpt duration in seconds * 60)
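As a rough illustration of the last two steps, here is a minimal Python sketch of the two metrics. It is not the script's actual code: the tiny `LEMMAS` lookup and `COMMON_LEMMAS` set are made-up stand-ins for a real lemmatizer and the 1000-word list, and the function names are hypothetical.

```python
import re

# Hypothetical toy data. The real script uses a proper lemmatizer and a list
# of the 1000 most common French words; these small stand-ins only illustrate
# the idea.
LEMMAS = {"suis": "être", "es": "être", "est": "être", "chats": "chat"}
COMMON_LEMMAS = {"je", "tu", "être", "un", "le", "chat"}


def lemmatize(word):
    """Map a word to its lemma, falling back to the word itself."""
    return LEMMAS.get(word, word)


def tokenize(transcript):
    """Lowercase the transcript and split it into runs of letters.

    Note: this naive split also breaks elisions such as "j'ai" into two tokens.
    """
    return re.findall(r"[^\W\d_]+", transcript.lower())


def pct_uncommon(words, common_lemmas):
    """Percentage of words whose lemma is not in the common-word list."""
    if not words:
        return 0.0
    uncommon = [w for w in words if lemmatize(w) not in common_lemmas]
    return 100 * len(uncommon) / len(words)


def words_per_minute(word_count, duration_seconds):
    """Speaking speed: words / duration in seconds * 60."""
    return word_count / duration_seconds * 60


words = tokenize("Je suis un chat. Tu es un philosophe")
print(f"{pct_uncommon(words, COMMON_LEMMAS):.1f}% uncommon words")
print(f"{words_per_minute(len(words), 4):.0f} words per minute")
```

Lemmatizing before the lookup is what lets “suis” and “es” count as common words via “être”; without it, most conjugated forms would be flagged as uncommon.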
Output
The output of the script, run against 28 popular French podcasts, is below. Note that the script only transcribes French words, so podcasts that also contain English (e.g., the Duolingo podcast) will show a low number of French words per minute. That still reflects a podcast that is easier to understand, so I kept it as is.
To make the podcasts easier to compare, I added a rank for each metric: the podcast with the lowest percentage of uncommon words (or the fewest words per minute) has rank 1 in that metric, and the one with the highest has rank 28. I then combined the ranks for the two metrics into a combined score/rank, which makes it easier to see which podcasts are easier to understand when both metrics are taken into account.
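The ranking idea can be sketched as follows. This is only an assumption about the approach: the post does not say exactly how the two ranks are combined or how ties are handled, so this sketch simply sums them and lets ties share the lower rank, and it may not reproduce the numbers in the table. The podcasts and values here are invented for illustration.

```python
def rank(values):
    """Return 1-based ranks: the lowest value gets rank 1.

    Tied values share the lowest rank among them (an assumption).
    """
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def combined_scores(podcasts):
    """Sum the per-metric ranks for each podcast (assumed combination rule)."""
    pct_ranks = rank([p["pct_uncommon"] for p in podcasts])
    wpm_ranks = rank([p["wpm"] for p in podcasts])
    return {
        p["name"]: pct_rank + wpm_rank
        for p, pct_rank, wpm_rank in zip(podcasts, pct_ranks, wpm_ranks)
    }


# Invented example data, not taken from the results table.
podcasts = [
    {"name": "Podcast A", "pct_uncommon": 20, "wpm": 100},
    {"name": "Podcast B", "pct_uncommon": 30, "wpm": 90},
    {"name": "Podcast C", "pct_uncommon": 25, "wpm": 150},
]

scores = combined_scores(podcasts)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(name, score)
```

Using ranks rather than raw values puts the two metrics on the same scale, so neither the percentage nor the words-per-minute figure dominates the combined score.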
Overall, this was a fun experiment. I might keep building it out with better logic in the future.
All the code is available here
Name | Combined score/rank | Pct uncommon words | French words per min | URL |
Duolingo French Podcast | 3 | 15% | 47 | Link |
One Thing In A French Day | 13 | 21% | 126 | Link |
InnerFrench | 16 | 25% | 117 | Link |
Hondelatte Raconte – Christophe Hondelatte | 19 | 26% | 115 | Link |
L’Heure du Monde | 19 | 22% | 140 | Link |
La société de minuit | 20 | 28% | 89 | Link |
Coffee Break French | 21 | 30% | 77 | Link |
Pépites d’Histoire | 23 | 26% | 130 | Link |
Mythes et Légendes | 24 | 25% | 148 | Link |
La Story | 27 | 30% | 124 | Link |
Easy French: Learn French through authentic conversations / Conversations authentiques pour apprendre le français | 27 | 27% | 130 | Link |
Learn French by Podcast | 28 | 35% | 50 | Link |
Podcast Français Authentique | 28 | 26% | 167 | Link |
HVF – Histoires Vraies et Flippantes | 30 | 34% | 109 | Link |
French Through Stories | 32 | 46% | 75 | Link |
Ces questions que tout le monde se pose | 32 | 22% | 245 | Link |
Les Baladeurs | 33 | 27% | 175 | Link |
Entrez dans l’Histoire | 33 | 25% | 189 | Link |
Transfert | 33 | 23% | 200 | Link |
Le Précepteur | 36 | 35% | 127 | Link |
Le Podkatz | 39 | 33% | 144 | Link |
Canapé Six Places | 41 | 32% | 174 | Link |
Passe le plaid | 41 | 31% | 178 | Link |
BURGER RING | 41 | 29% | 191 | Link |
French with Jeanne | 43 | 37% | 141 | Link |
Little Talk in Slow French : Learn French through conversations | 43 | 31% | 181 | Link |
J’ai peur, donc j’y vais | 47 | 31% | 197 | Link |
Les actus du jour – Hugo Décrypte | 48 | 35% | 177 | Link |