If you’re confused about why you can’t currently download Ubuntu 23.10 even though it’s been released (and blogs like mine are telling you it’s out), there is a reason.
[From Twitter]: “We have identified hate speech from a malicious contributor in some of our translations submitted as part of a third party tool outside of the Ubuntu Archive. The Ubuntu 23.10 image has been taken down and a new version will be available once the correct translations have been restored.”
Now, I’m not 100% certain, but from poking around the Ubuntu Desktop Installer GitHub (I know, I’m nosey), it appears to have been (sadly) the Ukrainian translation file that was hijacked. I ran the text through a translator and… honestly, I wish I hadn’t.
It’s a broad range of offensive sentences touching on politics, sexuality, and current events. Though shocking, none of it is particularly coherent in scope. It seems to be written to be provocative for provocation’s sake, the sort of stuff people post on X to farm likes from far-right bots.
I mean honestly though, if there are code reviews, how hard would it be to just make a quick “translation review”, putting the stuff through a translator program, and verifying it’s not obvious bullshit? Especially for new/unknown contributors. Of course it’s additional work, again, but a sanity check should easily be possible.
Quite hard. We have had open-source-ish LLMs for only around six months. Whether they are even up to the task of verifying a translation is another issue, and whether they meet Debian’s open source guidelines is yet another. This is obviously going to be the long-term solution, but the tech for it has simply not been around for very long.
And of course once you have translation tools good enough for the task, you might just skip the human translator altogether and just use machine translations.
I meant more that if something contains “fucking kill all ukrainians and trans people”, which is what this sounded like, that should be possible to spot even with bad translation tools.
It wasn’t, by the way. Though it could have been flagged by the dumbest of online translators (or even by anyone who can read Cyrillic, since some of it uses English loanwords like “sex” and “gay”). It should never have made it into a release, but I disagree with categorizing it as “hate speech”. I feel comfortable posting it here, even though it’s pretty crude and #3 in particular is very vulgar. If anyone’s curious, here are the Google Translate translations of the vandalized parts (except for one of them, fullInstallationSubtitle, which I think is too offensive to be repeated here. It references the Israel-Palestine war):
I mean yeah, I was speculating. But what you posted also seems easily detectable :D
That’s a lot of pant removal.
But anyway, it should be quite possible to automatically screen the translation for something this blatant.
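For what it’s worth, here is a rough sketch of the kind of screening I have in mind. It is only an illustration, not how the installer project actually reviews translations: I’m assuming the translation file can be loaded as a flat mapping of key to string (with metadata keys starting with `@`), and the watchlist, the function name `flag_entries`, and the sample keys are all made up. A hit doesn’t prove vandalism, it just tells a human reviewer where to look first.

```python
import re

# Placeholder watchlist (an assumption): words that have no business
# appearing in installer strings. A real list would be curated per language.
WATCHLIST = {"sex", "секс"}

# Runs of letters in any script (Latin, Cyrillic, ...), no digits or underscores.
WORD_RE = re.compile(r"[^\W\d_]+")


def flag_entries(translations, watchlist=WATCHLIST):
    """Return {key: [matched words]} for entries that contain watchlist words."""
    flagged = {}
    for key, text in translations.items():
        # Skip metadata entries and anything that is not a plain string.
        if key.startswith("@") or not isinstance(text, str):
            continue
        words = {w.casefold() for w in WORD_RE.findall(text)}
        hits = sorted(words & watchlist)
        if hits:
            flagged[key] = hits
    return flagged


# Hypothetical sample data, not taken from the real translation file.
sample = {
    "welcomeTitle": "Ласкаво просимо",
    "installSubtitle": "Це Секс",
    "@installSubtitle": {"description": "metadata, ignored"},
}
result = flag_entries(sample)
print(result)
```

A word list like this only catches the blatant cases and will throw false positives, so it works best as a gate that forces a second pair of eyes on submissions from new or unknown contributors.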