Audio Rife with Errors: Needs a Native Speaker to Check Each Sentence

I love the Russian course and I think it’s great overall, but the audio is terrible. I’m not particularly good at speaking Russian, but even I, at my very elementary level, have spotted a ton of errors in the pronunciation.

The most common errors are mispronouncing “е” as “ё”, such as the word “все” being mispronounced as “всё”, and words that are accented on the wrong syllable, which typically changes them into a different word that is often grammatically incorrect and turns the sentence into nonsense. For example, in one exercise it recently mispronounced the adjective “друго́м” (masculine or neuter prepositional case, meaning “other”) as “дру́гом” (masculine instrumental case, meaning “friend”), and it does this sort of thing all the time.
Another, “пора́” (a noun meaning “time”, often used to mean “it’s time”), is mispronounced as “по́ра” (a noun meaning “pore”).

I am virtually certain that there are far more errors than I have reported or even noticed, because I’m pretty terrible at Russian myself and I am only able to detect the errors in the most common words that I have already mastered.

The prevalence of the errors also makes me extremely skittish about continuing in the Russian course (or about recommending it to others) because I absolutely want to avoid creating a bad habit of hearing and internalizing a word with the wrong pronunciation. Learning things wrong and having to unlearn them is terribly inefficient, and I think it is important to avoid. It always takes me longer to unlearn something than to learn it. And it would probably take far less time for one person to fix all these things than it would take for everyone taking the course to unlearn them after the fact.

I would strongly prefer that the audio be gone over, sentence by sentence, by a native speaker and corrected, so that the course can actually be useful for us to practice with and won’t be setting us up to have to unlearn things later.

3 Likes

Thanks for letting us know! Could you please send a short list of sentences with pronunciation errors? Even just 10 or so would be helpful. We’ll have one of our native Russian speakers check them out and we can try out some other TTS voice options.

1 Like

Is it more helpful for me to report using the in-exercise Flag button (what I’ve been doing), or some other way?

My main worry here though is that I don’t know Russian very well at all, so I doubt I am catching most of the errors.

1 Like

No worries - we’re just looking for a sample to double check + compare with other voices and to make sure we’re aligned with what you’re hearing :slight_smile: If you wanted to post them here or email me at mike@clozemaster.com, whichever’s easiest. We’ll of course also run through and check a bunch on our end as well.

Hi Mike,

Not to derail the conversation, but which languages would you have native speakers available for review? It would be cool to have natives have their input/review. Just a thought however :slight_smile:

I reported at least eight more errors today. I notice the audio is consistently pronouncing То́му wrong: it’s pronouncing it as тому́, i.e. it’s accenting the second syllable, which makes it the pronoun, rather than the first syllable, which makes it the name.

I also caught the word происходит being accented wrong in two different sentences, on the last syllable instead of the second-to-last. And another two sentences where все was mispronounced as всё. The last vowel of the word радио was also mispronounced. The expression другие дела had the second word accented on the wrong syllable too.

I am getting a little frustrated because I took a short break from the Russian course because there were so many errors like this, and I just came back today and I don’t see any evidence of progress. 8 errors in 100 sentences is frustrating and really slows me down. More importantly, because I’m a beginner myself, it seems highly likely that there are just as many errors that I am not catching, and thus I am learning the wrong pronunciation and will get confused and have to unlearn it later, which is a huge waste of my time.

Is there a way I could donate more money so this could progress faster?

Otherwise this course is pretty useless to me.

1 Like

Thanks for the reports and apologies for the frustration! We’re using the best text-to-speech available for Russian (as far as we know). If we come up with better voices we’ll of course use them. We’ll also try out some alternative voices for the sentences you reported to see if the pronunciation with those is any better. If you know of other apps/services with better text-to-speech please let us know. :slight_smile:

Perhaps a confidence boost - we’re working with two native Russian speakers who’ve both said separately that though there’s the occasional mispronunciation like you’ve mentioned (все being mispronounced as всё is perhaps the worst offender, primarily because всё isn’t always written with the umlauts when it could be), the text-to-speech is actually quite good.

For the most accurate pronunciations, the Cloze-Listening feature available for Russian might be the best option since it uses native speaker recordings.

Thanks again for letting us know and for the reports!

Sorry I missed this @datsunking1! Our moderator team currently covers Spanish, French, German, Polish, Italian, Russian, Ukrainian, Portuguese, Chinese, and Japanese. Agreed it’d be cool to get them more involved. We’re considering some options discussed here, and at the moment we’re thinking we’ll go with a way for mods to associate sentences with explanations, something like a Q&A database.

2 Likes

Is there currently no “manual control” over the Russian TTS? I.e. no way to send it more specific signals if it’s pronouncing something wrong, such as telling it which syllable of a word to accent?

It seems like this would be a basic feature that most people using a TTS would want and I would find it surprising to learn that such a feature wasn’t available.
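For illustration only: one standard way to give a TTS engine this kind of hint is the SSML `<phoneme>` element, which attaches an explicit IPA pronunciation to a word. Whether the engine Clozemaster uses accepts SSML is an assumption here, and the function name, the per-sentence override table, and the IPA string in the usage note are all hypothetical. A minimal Python sketch:

```python
import re
from xml.sax.saxutils import escape, quoteattr


def ssml_with_overrides(sentence, overrides):
    """Wrap words that have a pronunciation override in an SSML <phoneme> tag.

    `overrides` maps a lowercase word to an IPA string and is meant to be
    supplied per sentence, because the same spelling can need a different
    stress in a different sentence.
    """
    # Splitting on a capture group keeps the separators: words land at odd
    # indices, spaces and punctuation at even indices.
    pieces = re.split(r"(\w+)", sentence)
    out = []
    for i, piece in enumerate(pieces):
        ipa = overrides.get(piece.lower()) if i % 2 == 1 else None
        if ipa is None:
            out.append(escape(piece))
        else:
            out.append(
                f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(piece)}</phoneme>'
            )
    return "<speak>" + "".join(out) + "</speak>"
```

Calling `ssml_with_overrides("Мне пора.", {"пора": "pɐˈra"})` would wrap only the word пора and leave the rest of the sentence for the engine to pronounce as usual. The IPA string is an illustrative transcription, not one checked by a native speaker.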

1 Like

@mike, the ability to distinguish between “true” е and “dotless” ё, and to determine where the accent falls in words, is precisely what a Russian learner needs from a TTS. Since there are word pairs that are distinguished only by a difference between е and ё (like все/всё, mentioned above), and others that are distinguished only by emphasis (like пора́/по́ра, also mentioned above, which ordinarily are written without an accent mark), a TTS engine worth anything to us at all cannot rely solely on either dictionary lookup or algorithms that pertain solely to words in isolation. However, I don’t know anything about the TTS engines available, so I can’t say whether TTS engines like the ones we would need are even available, let alone at a reasonable price.
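To make the ambiguity concrete, here is a small Python sketch (the function names are hypothetical) that reduces stress-marked words to the way they are ordinarily written, by dropping the combining acute accent and folding ё into е, and then reports which written forms correspond to more than one pronunciation. It only illustrates the problem; it does not decide which reading a given sentence needs.

```python
import unicodedata
from collections import defaultdict

ACUTE = "\u0301"  # combining acute accent, used as the stress mark


def written_form(word):
    """Return the word as it is usually printed: no stress mark, ё written as е."""
    decomposed = unicodedata.normalize("NFD", word)
    without_stress = decomposed.replace(ACUTE, "")
    # Recompose so that letters such as й and ё are single characters again.
    recomposed = unicodedata.normalize("NFC", without_stress)
    return recomposed.replace("ё", "е").replace("Ё", "Е")


def ambiguous_groups(marked_words):
    """Group stress-marked words by written form; keep forms with 2+ readings."""
    groups = defaultdict(set)
    for word in marked_words:
        groups[written_form(word)].add(unicodedata.normalize("NFC", word))
    return {form: sorted(words) for form, words in groups.items() if len(words) > 1}
```

Given the words пора́, по́ра, все, всё and дом, `ambiguous_groups` would report the written forms пора and все as ambiguous and leave дом out, which is why a lookup on the written word alone cannot pick the right audio.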

I wonder whether the native Russian speakers are evaluating the TTS engines on criteria that might be more important to a native speaker (such as naturalness of pronunciation) than to a learner (such as placement of stress in a word). Would it be possible for them to comment on the criteria they’re using? Also, in a discussion about the pronunciation field for Russian (“"Pronunciation" field useless for Russian, but could be made useful”), you linked to a list of 100 sample sentences that your native Russian speakers marked for errors. Could they do something similar for audio?

In the meantime, although I tend to use audio much less than text, I’ll listen to it when I can and report errors that I hear.

1 Like

This is exactly why I bring this up. To me, it makes the difference between Clozemaster being an outstanding tool and a mediocre tool.

And right now the Russian course is not up to the level of, say, the German and Spanish courses because these errors are so frequent. It’s more of an issue in Russian too because, in Spanish at least, the pronunciation (including accented syllables) is mostly notated. In Russian, it is not, so often there is no way for a learner to know whether or not a pronunciation is correct without consulting external tools.

This places a learner in the unfortunate position of either learning a lot of wrong pronunciations (which will hinder their progress and make them unlearn stuff later) or putting a lot of effort into checking each pronunciation manually using external tools. Either way, it wastes the user’s time, when it would be more efficient to fix all this stuff once and for all, centrally, before people start doing the courses.

And like I said, this stuff has a value for me. I’m willing to pay for it. To give you a sense of this…I am a paying subscriber, but it’s primarily because I’m focusing on the German course right now, and to a lesser degree, Spanish. I do not think the Russian course is good enough currently to justify me paying for it, because of this issue. I don’t even think it’s really good enough to justify me working on it, because of these concerns.

I keep dabbling every few weeks to see if enough of these errors have been fixed that I feel comfortable proceeding, and they haven’t, and it leaves me feeling continually frustrated.

1 Like

Thanks for the feedback! This is all helpful to know, and we’re working on figuring out possible solutions. Getting 250k+ sentences (and the many thousands more newer sentences yet to be imported from Tatoeba) for Russian proofread for pronunciation and then listened to with TTS corrections made using IPA (let alone if we eventually want to support multiple voices) is the challenge. A fix will likely take some time, but we love a good challenge and we’re working on it! We’ll follow up here as we have any updates. In the meantime please do continue to report the issues as you come across them - it’s super helpful as we check possible automated solutions and potentially better TTS options. Thanks again!

2 Likes

This makes sense, but I wouldn’t expect you all to be able to do this for 250k+ sentences in a timely manner.

Couldn’t you prioritize the early sentences in the Fluency Fast Track, like the most common 500, 1000 words, etc?

I can work on a course for months and still only be working in the first few thousand sentences. And the Fluency Fast Track itself is only 20k sentences.

To give you an idea, in Russian I am barely over 1,100 sentences in what I’ve covered. These errors are frequent in the sentences that teach the most frequently used words.

If there were some errors persisting in stuff above the 5000 word mark, I would be much less concerned. By the time anyone gets to this point, they’re not going to be a beginner, they’re approaching some degree of fluency and are able to learn by immersive learning in other sources, and thus they’re able to notice any mistakes and aren’t going to be as held back by them.

It’s mostly an issue in the early ones: I’d say the first 2000 or so sentences, possibly up to 5000, but probably not a huge deal past that point.
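As a rough sketch of that prioritization in Python: assuming each sentence comes with the frequency rank of the word it teaches (a hypothetical data shape, not something confirmed about Clozemaster’s database), the review queue for a native speaker could be built like this:

```python
def review_queue(sentences, cutoff=2000):
    """Return sentence texts whose word rank is within `cutoff`, most common first.

    `sentences` is an iterable of (text, frequency_rank) pairs, where rank 1
    is the most frequently used word.
    """
    early = [(rank, text) for text, rank in sentences if rank <= cutoff]
    return [text for rank, text in sorted(early)]
```

With a cutoff of 2000, anything ranked above that simply stays out of the queue until the early sentences have been checked.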

2 Likes

Your first mistake was expecting to learn perfect pronunciation from text to speech.

1 Like

You should be immersing from day one. The most common words are… the most common words. You’ll hear natives say them a gazillion times. So why exactly is it a big deal up to the first 2,000 or so sentences, but not after that?

cazort never claimed to expect to learn perfect pronunciation from TTS. cazort wants the accent to be placed on the correct syllable, which is important in any language. It may be that there are no Russian TTS engines available at the moment that can do it well, but there’s no point in giving up on the search prematurely.

As for the dictum “You should be immersing from day one”, isn’t that what cazort is trying to do? Not everyone has access to native speakers, so cazort is attempting to figure out if Clozemaster’s TTS infrastructure can be modified to help close the gap.

2 Likes

I have nothing against wanting tools to be better, but he is saying he won’t do Russian because the TTS isn’t perfect. If you have the internet, you have access to native speakers. Netflix has Russian dubs for a lot of shows. There are Russian soap operas available for free on YouTube.

The topic of this thread is whether Clozemaster is a sufficiently good tool for serving one’s fundamental Russian-learning needs. Other resources are relevant to this discussion only if they can serve as auxiliary tools for addressing gaps encountered during a Clozemaster session, rather than distinct/competing modes of learning. For me, a Clozemaster session is actually a Clozemaster-with-forays-into-Wiktionary-and-Tatoeba session. But even though I run the risk of distraction each time I visit one of the other sites, I am generally able to come back to Clozemaster with some of my momentum intact.

For instance, I use Wiktionary to look up the accented form of words whose stress pattern I don’t know. I’ve gotten it down to a system:

  • select and copy the word in Clozemaster
  • click on “Wiktionary”
  • paste the word into the search field
  • hit enter
  • copy the accented form to the clipboard
  • go back to Clozemaster
  • edit the sentence
  • paste the accented form into the “Notes” field
  • press Save

That’s laborious and time-consuming enough, but there are additional steps when:

  • the word happens to start with a capital letter in the sentence but not in the dictionary form
  • the word does not have its own page at Wiktionary (in which case one needs to find another form and then make one’s way to a declension table, or a secondary listing, or a related page)
  • the word is not present at Wiktionary at all
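The manual steps above, including the capital-letter and missing-word cases, could be sketched in Python roughly as follows. This assumes a locally available table of accented forms keyed by lowercase word; the table, its contents, and the function name are hypothetical, and nothing here queries Wiktionary.

```python
import re


def stress_notes(sentence, accented_forms):
    """Look up the accented form of each word in a sentence.

    Returns (notes, missing): `notes` holds the accented forms that were
    found, with the sentence's capitalization restored, and `missing` holds
    the words that still need a manual lookup.
    """
    notes, missing = [], []
    for word in re.findall(r"\w+", sentence):
        accented = accented_forms.get(word.lower())
        if accented is None:
            missing.append(word)  # not in the table at all
            continue
        if word[0].isupper():
            # The table stores dictionary (lowercase) forms.
            accented = accented[0].upper() + accented[1:]
        notes.append(accented)
    return notes, missing
```

A table like this cannot resolve words such as пора, whose stress depends on which word is meant, so those would still need a human decision.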

I’m willing to do this extra work, but not everyone has the motivation, time, preexisting knowledge of Russian, and/or computer skills for it. So for some people who are learning Russian, Clozemaster won’t make the cutoff in terms of “Is it worth using?” or “Is it worth paying for?”

Watching Russian soap operas does not even belong in the same discussion. Even if I had the combination of motivation, interest, language skills, and time required by this very different activity, I couldn’t dip into a soap opera to resolve a question I had about the stress pattern of a word I found in Clozemaster. Furthermore, the apparent tautology “The most common words are… the most common words” doesn’t even hold. The most common words in the soap opera would probably differ significantly from the most common words at Clozemaster. Even people who could watch Russian soap operas every day would end up with questions about the accent patterns of words at Clozemaster. So that’s not a solution.

3 Likes

I think you may have misunderstood what I was saying and why I created this thread.

I’ve never had this expectation. I have been immersing from day one, and I didn’t even start doing Clozemaster until I had learned a fair amount. (It’s not the best tool to start from zero in any language.) Also, Clozemaster has never been the only tool I use: I’ve been watching videos in Russian, interacting with Russians on social media, and conversing with some of my friends who are native speakers.

But the mistakes are jarring. They slow me down even when I’ve already caught them. And like I said, I’m not convinced I’ve caught all of them. And when I do catch them, I sometimes catch them only after I’ve temporarily “learned something wrong”.

It makes me do more work, unnecessarily.

And it makes Clozemaster specifically less useful. I haven’t stopped learning Russian entirely, I’ve just stopped using Clozemaster (and only for that particular language.)

And it’s frustrating, because it is a loss: in other languages, Clozemaster is a very efficient learning tool, probably the single most useful tool I’ve been able to find. Like, during this time I’ve been plowing through the German course and I’m now at a level where I can sit down and watch a talk show where they are talking about something highly abstract, something that would have been far out of grasp a couple years ago. And I think I have Clozemaster to thank for it.

But the Russian course over here is not quite at that level. It requires me to do more supplemental work on my own and is thus more burdensome and less efficient.

I made this thread because I love language learning and love Clozemaster and I want it to be the best it can possibly be for the biggest possible group of people. And I also think this would be a “low cost, high gain” thing. The amount of time and effort required for native speakers to manually check enough of the sentences to address my concerns here would be absolutely negligible relative to the amount of time learners already waste due to these clumsy pronunciations.

3 Likes

I did Russian, and became a paying customer to do it. I started from literally zero. I only did part of Memrise’s Russian 1, to have an idea of what the letters are supposed to sound like. I only did listening with multiple choice exercises. I only checked the standard sentence and keyword translations provided by Clozemaster, fully aware that they weren’t always correct. I watched a teen soap dubbed in Russian with Portuguese subtitles, and sometimes without the subtitles. I was so shocked by how quickly I learned that I decided to delete my Russian and go back to Japanese, which I had stopped because it is a pain in the ass to read.

I notice in Japanese sometimes the TTS pronounces words in ways where I’m not sure if they are completely wrong or are just alternatives that I’m not familiar with. I’ve also noticed the definition of “word” is kind of fluid, so I use rikaichamp so I can just hover over stuff and see if the “word” is actually part of a larger “word”, and that helps provide context. I doubt the pitch accent of the TTS is correct. I don’t try to look up the pitch accent of words.

3 seconds after I answer, I have already forgotten the word. But then I see it again, and it looks kind of familiar. And then I see it again, and it looks even more familiar. And then I hear a native say it somewhere and go “don’t I know that word?”. And then I hear it within a context that links the word to the meaning and go “oh yeah, I know that word”. And then I forget again. But then I hear it again and it is kind of familiar. That is how you learn. You are focusing too much on individual sentences/words, trying to nail them down, when you could just keep pushing through.

You are never going to consciously learn accent patterns. You can’t stop to think about how you are going to pronounce a word. You can’t slow down speech to decipher individual words based on your conscious knowledge of accent patterns. You are not supposed to shadow the TTS. Chinese people learn pitch before they know what pitch even is. Those things are instinctual, a byproduct of repeated listening, and then, repeated speaking. The purpose of Clozemaster is to expose you to a variety of sentences/words in a context that allows you to become familiar enough with them that you recognize them in the wild. I can tell you with certainty that you’ll sound funny if you try to learn Portuguese exclusively through Clozemaster; I checked. This isn’t a Russian problem. This is a language learning problem. Yes, it’s a problem. But Clozemaster absolutely does work. You are letting the perfect be the enemy of the good.

3 Likes

Can I offer you a suggestion? Stop trying to master sentences. Set the max reviews per round when playing new sentences to zero, and then don’t do reviews, at all.