As part of the Imagining the Ordinary City project, I’m working with interviews in languages I don’t speak. South Africa has twelve official languages and I understand only one, English. I need to work with interviews in Afrikaans and isiXhosa.
Adobe Premiere has a suite of features that Adobe calls ‘text-based editing,’ and one of these is automatic machine transcription. It’s not perfect, but reading and rearranging the transcription makes it quicker for me to edit interview audio to make it more concise, lucid and emotionally engaging. Captioning becomes a process that takes hours rather than days. It’s a handy time-saver for interviews conducted in English, at least.
I’ve also been using it to help my academic colleagues with interviews they are planning to publish in book form. When I offered to auto-transcribe an interview that transitions smoothly between English and isiXhosa, the results were… unusual. Premiere’s auto-transcription supports only incredibly widely-spoken languages (and Danish). The algorithm recognises some English, assumes everything else in the interview is in English, and goes to some very strange places.
“But know that you climb a ladder to get a pot roast lamb might make your, your bit.”
“Anyway, Robert’s pattern in the tattoo of Ram and we have four letter to 12, panel five, number four Liverpool took I just I always found it so jarring, but oh hello mama mama mama mama mama.”
And “I got lost because England ends as well,” which sounds like a description of my last visit to Cornwall.
There is something to be said here about the inherent biases of the tech industry, the growth in LLM-generated text, the boost these technologies give to widely-spoken languages and the pressures they will exert on languages that are more regionally specific. Maybe this point can be made in surreal cut-up isiXhosa poetry mistranscribed into English?