Jonathan Prince on 24 Feb 2001 01:12:42 -0000
<nettime> The Rosetta Project
http://www.rosettaproject.org

Fifty to ninety percent of the world's languages are predicted to disappear in the next century, many with little or no significant documentation. Much of the work that has been done, especially on smaller languages, remains hidden away in personal research files or poorly preserved in under-funded archives.

As part of the effort to secure this critical legacy of linguistic diversity, the Long Now Foundation is working to develop a contemporary version of the historic Rosetta Stone. In this updated iteration, our goal is a meaningful survey and near-permanent archive of 1,000 languages.

We have three overlapping motivations in the project:

- To create a uniquely valuable platform for comparative linguistic research and education.
- To develop and widely distribute a functional linguistic tool that will help with decipherment and recovery of lost languages in distant futures.
- To offer an aesthetic object that suggests the great diversity of human languages as well as the very real threats to the continued survival of this diversity.

Our 1,000-language corpus expands on the parallel text structure of the original Rosetta Stone by archiving seven distinct components for each of the 1,000 languages. We have selected these components as the "minimum representation" most likely to be useful for future linguistic archaeology as well as contemporary comparative research. This sketch should be understood as a modest frame that is possible to complete for a very large number of languages - a frame on which more will hopefully be hung later.

The seven components are:

1. Meta-data/description for each language: Origin and current distribution of the language, number of speakers, family, typology, history, etc.
2. Main parallel text: We are using translations of Genesis Chapters 1-3, as Biblical texts are the most widely and carefully translated writings on the planet.
3. Vernacular origin story with interlinear gloss: A culturally specific counterpoint to the Genesis text, with grammatical analysis. We will substitute other vernacular texts if a glossed origin story is unavailable or culturally inappropriate.
4. Swadesh 100-word vocabulary list: A core word list typically collected in linguistic field work.
5. Orthography: The writing system(s) of the language with pronunciation guide.
6. Inventory of Phonemes: The basic sound units of the language.
7. Audio file: Sample of spoken language with transcription and ideally a translation.

We have finished collecting Genesis translations for 1,000 languages and have parsed the Ethnologue for the corresponding language descriptions. We now need text contributions for all the remaining components and invite you to submit in your area of expertise. We also encourage suggestions for languages that are not currently on the list but should be, given interesting structural features, genetic relationships, isolate status, etc.

--
..
Jonathan Prince
jonathan@killyourtv.com

http://KillYourTV.com - it's bad for you
http://GWBushSucks.com - he's bad for everyone
http://USoutofColombia.org - stupid wars are bad
........................................................
"More than any time in history, mankind faces a crossroads. One path leads to despair and utter hopelessness. The other, to total extinction. Let us pray we have the wisdom to choose correctly." - Woody Allen

#  distributed via <nettime>: no commercial use without permission
#  <nettime> is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: majordomo@bbs.thing.net and "info nettime-l" in the msg body
#  archive: http://www.nettime.org contact: nettime@bbs.thing.net