Anonymous on Sat Apr 21 00:06:41 2001


No Subject


ESSAY | 7.25.00

After Babelfish

Random acts of senseless beauty? FEED columnist Julian Dibbell takes the
wonderful translation machine out for a spin.


LATE LAST MONTH IN THE SWISS CITY OF STUDEN, something very grave took
place. I'm not sure what, exactly. My only source -- a German-language
Associated Press account rendered into English by Babelfish, the popular
online translation engine hosted by Alta Vista -- reports that on June 23,
at a "zoo-logical garden" in Studen, a "22-jaehrige attendant" was
"resulted" by a "bear nut/mother." According to the article, "two attendants
fed the eight-year old bear nut/mother and its two six month's young animals
in the enclosure of the zoo-logical garden sea-devil at 16.00 o'clock. The
22-jaehrige coworker approximated the animals obviously too." In response,
apparently, to this obvious approximation, "the bear nut/mother attacked and
bit the attendant into the legs, levers and the basin area." The worker was
seriously injured, said a spokesman for the zoo-logical garden: "It suffered
deep meat wounds."

What's going on here? Was the unfortunate attendant born without a gender,
or was that too a casualty of the attack? Did the poor thing lose its
levers? The use of its basin area? And what of an earlier incident the
article refers to, a mishap at the same zoo-logical garden in 1997, "when
wild an uranium property on female become broke out and hurt two employees"?
What really happened there? And in whose fever dream? In the German version
of this story -- the one composed by an actual human being -- there may be
answers to these questions. But not here. For here we find ourselves
immersed in the world according to Babelfish, a place where meaning
sometimes seems to show up only by coincidence, and information frequently
declines to show its face at all. This chronic leakage of sense and
certainty is often held to be a failing of Babelfish's, and a grievous one.
It's an unfair criticism, I will argue, but it's certainly got pedigree:
So-called machine translation has a long and fabled history of disappointing
those who look to it for, of all things, the reliable conveyance of meaning
from one language into another.

Almost as old as digital computers themselves, the dream of fully automated,
high-quality translation (or FAHQT) sprouted from the rich soil of Cold War
imperatives and fifties techno-hubris. Convinced at first that machine
translation was nothing more than a fancy version of the code-breaking
problems that computers had made relatively short work of during World War
II, early MT hackers soon began to realize that they were out of their
depth. Ciphers and languages, it turned out, were not at all the same
things, and on closer examination it appeared that thoroughly cracking the
latter would pretty much require explaining -- definitively and
mathematically -- what it means to be human. Not that techno-hubris, even
today, considers such a task beyond its reach, but, as Eduard Hovy,
president of the Association for Machine Translation in the Americas, puts
it: "It's going to be a long problem."

In the meantime, MT hackers have largely abandoned the ideal of FAHQT (say
that acronym out loud and you get a good idea of its prospects) and learned
to speak more pragmatically of their aims and accomplishments. "Machine
translation is an imperfect science," says Aston Fallen, vice president of
Systran, the company that developed Babelfish and also markets other, more
advanced translation programs. Capitalizing on the fact that even a very
murky automated first pass can give a human translator a leg up, Systran and
a few other machine-translation companies have built a small industry
selling their wares to governments and other high-volume translators. And
now, as the Web becomes less and less the exclusive domain of English
speakers, Systran stands poised -- via Babelfish and its less-celebrated
contracts with chat-room providers and online role-playing games -- to lord
it over a burgeoning consumer market in quick-and-dirty,
better-than-nothing, real-time translation.

Humility, in short, is paying off for the machine-translation biz. But where
exactly is the line between being humble and selling oneself short? Consider
Fallen's estimation of his own products' capabilities as literary machines.
Given the right kind of source text, he says -- a simply and precisely
written technical manual, for instance -- a Systran product loaded with the
appropriately specialized vocabulary can spit out translations of up to
ninety-nine percent accuracy. But anything as open-ended as a news report
remains a challenge, and never mind more nuanced texts. "If you take
Shakespeare and put it into the product as you take it out of the box,
you're going to get garbage," says Fallen. "You're going to get twenty-five
or thirty percent, or you're going to get some sort of word analysis that is
going to have little to do with the prose and the elegance, et cetera, of
what Shakespeare is all about."

But suppose, now, that Fallen has it exactly backwards. Suppose that the
unhinged flights of Babelfish at its nuttiest are in some sense very much
what Shakespeare is about -- or at least what translations of Shakespeare
ought to be about. Suppose, that is, that Walter Benjamin in fact had
something very much like Babelfish in mind when he wrote that translation
has but one true task: to catch a fleeting glimpse for us of that "higher
and purer" language of which all languages, after Babel, are mere fragments.
"In this pure language...," wrote Benjamin, "all information, all sense, and
all intention finally encounter a stratum in which they are destined to be
extinguished." And now suppose we want to do more than suppose. What would
it take to test the proposition that machine translation, far from muddling
along imperfectly, in fact comes closer to perfection at its task than any
human translator ever has?

The standard test has always been poetry. From Samuel Johnson to Roman
Jakobson, theorists of translation have taken verse to be the limit case of
the translatable. With its close interweavings of sound and sense, of rhythm
and reference, the well-wrought poem all but defies the translator to
reproduce its essence in another language. That's not to say that other
sorts of text don't throw up similar challenges -- just that poetry takes
those challenges to their definitive extremes. The long history of arguments
against the translatability of poetry, notes George Steiner in _After Babel:
Aspects of Language and Translation_, can thus be read as "simply the barbed
edge of the general assertion that no language can be translated without
fundamental loss." Which is to say, perhaps, that if poetry can't be
translated, nothing can.

The obvious corollary being: If Babelfish can prove itself adept at
rendering poetry as poetry, what else does it have to prove?


LET'S PICK A POEM, then. Any poem should do, so for today's experiment we'll
use a verse selected on the following random basis: I'm fond of it. It was
written by William Butler Yeats; it is called "When You Are Old"; and it
goes, in its entirety, like this:

When you are old and grey and full of sleep,
And nodding by the fire, take down this book,
And slowly read, and dream of the soft look
Your eyes had once, and of their shadows deep;

How many loved your moments of glad grace,
And loved your beauty with love false or true,
But one man loved the pilgrim soul in you,
And loved the sorrows of your changing face;

And bending down beside the glowing bars,
Murmur, a little sadly, how Love fled
And paced upon the mountains overhead
And hid his face amid a crowd of stars.

Read it over, if you like. You'll see there's nothing particularly edgy
about the poem. You'll see that it rhymes, and that its rhymes and rhythms
flow with ease across the one long sentence that makes it up. You'll see
that its diction is as plain as water, except where flavored with the odd
archaism or hint of Irish vernacular. You'll see the choreography of its
narrative: its slide from the quiet domesticity of the first stanza down
through the second stanza's lively recollection of romantic youth, and then
its lulling, momentary pause back at the hearth before it makes the
startling leap into the mythic, troubled imagery of the last two lines.

But Babelfish sees none of this. Pasted into the program's text-entry
window, Yeats's poem becomes a data set -- an ordered collection of inputs
to be examined without reference to rhyme or flow or anything like meaning.
The first thing Babelfish does with these inputs is pass them to its
English-language analysis engine (also sometimes called a parser). The
analysis engine, a kind of automated sentence diagrammer, runs the data set
through a complex algorithm designed to sort words into nouns, verbs,
prepositions, and so on, establishing their syntactical relationship to one
another as it goes. Encountering the datum _when_, the engine looks it up in
an internal word list, calls it a conjunction, and notes its place at the
beginning of the sentence. Next, the input _you_ gets tagged as a pronoun and
as the subject of a dependent clause beginning with _when_. After that, _are_
is marked as a verb form modifying _you_, and so on to the full stop after
_stars_, where Yeats's long glide of a sentence comes to an end.
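
For the curious, the lookup-and-tag step just described can be caricatured in
a few lines of Python. This is a toy sketch and nothing more: the word list,
the tag names, and the function tag() are all invented for illustration, and
nothing here is drawn from Systran's actual software, which also has to work
out how the words relate to one another.

```python
# Toy illustration only: a hypothetical word list and tagger, not Systran's code.
import re

# Hypothetical internal word list: each known word maps to one part of speech.
WORD_LIST = {
    "when": "conjunction",
    "you": "pronoun",
    "are": "verb",
    "old": "adjective",
    "and": "conjunction",
    "grey": "adjective",
    "full": "adjective",
    "of": "preposition",
    "sleep": "noun",
}

def tag(text):
    """Return (position, word, part of speech) for each word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    # Words missing from the list are labeled "unknown" rather than guessed at.
    return [(i, w, WORD_LIST.get(w, "unknown")) for i, w in enumerate(words)]

tagged = tag("When you are old and grey and full of sleep,")
for position, word, part_of_speech in tagged:
    print(position, word, part_of_speech)
```

Even this caricature makes the point: the tagger records that _when_ is a
conjunction sitting at position zero, and it has no idea that anyone is
growing old.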

Next the program passes the marked-up data on to a dictionary module that
matches English words with their likeliest counterparts in the target
language. An analysis engine on the target end reads the syntax map
generated by the first parser and uses it to reorder and inflect the words
as necessary -- moving verbs to the ends of clauses in German, making sure
in French that _are_ gets conjugated as second-person singular rather than
third-person plural. That done, the new sentence emerges as output, and
Babelfish has completed a translation.
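
The dictionary-and-inflection step can be sketched the same way. Again, this
is a hedged toy under stated assumptions: the two small tables and the
function translate() are hypothetical, they cover only a clause of the shape
subject + _are_ + adjective, and they stand in for machinery that in a real
system is vastly larger.

```python
# Toy illustration only: hypothetical tables, not Systran's dictionaries.

# Hypothetical bilingual dictionary: English word -> (French word, person/number).
DICTIONARY = {
    "you": ("tu", "2sg"),
    "they": ("ils", "3pl"),
    "old": ("vieux", None),
}

# Present-tense forms of the French verb "être" for the two cases at issue.
ETRE = {"2sg": "es", "3pl": "sont"}

def translate(tagged):
    """Translate a tagged English clause of the form subject + 'are' + adjective."""
    subject_person = None
    output = []
    for word, role in tagged:
        if role == "subject":
            french, subject_person = DICTIONARY[word]
            output.append(french)
        elif role == "verb":
            # Pick the verb form from what the analysis recorded about the subject.
            output.append(ETRE[subject_person])
        else:
            output.append(DICTIONARY[word][0])
    return " ".join(output)

print(translate([("you", "subject"), ("are", "verb"), ("old", "adjective")]))
print(translate([("they", "subject"), ("are", "verb"), ("old", "adjective")]))
```

The same English _are_ comes out as "es" after "tu" and as "sont" after
"ils", which is all the syntax map is there to guarantee.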

But first we have to tell it to. And before we do that, we'll have to pick a
target language from the fairly Eurocentric range of options Babelfish
presents: French, German, Italian, Spanish, or Portuguese. We'll go with
Portuguese, again partly for random personal reasons (it happens to be my
second language, more or less because I wanted to be able to sing "The Girl