The Language Map

Story Margaret Kwan

Like people, languages die. Unlike people, however, languages do not grow in number. The number of speakers of a given language may increase over time, but the total number of languages in the world keeps decreasing. “Within perhaps two generations, most of the languages in the world will die out,” says David Crystal in his book Language Death. To preserve the diversity and history associated with each language, the Rosetta Project seeks to archive all documented languages.

The Death Of Languages

Some languages die because of physical disasters, such as the 2004 tsunami in the Indian Ocean. We witnessed the loss not only of over 225,000 people, but also of a variety of languages spoken across Asia. According to Ethnologue, a language inventory that has developed a unique code for every language, Indonesia, one of the areas affected by the tsunami, has 742 languages. According to Crystal, eight languages in the world have over 100 million speakers: English, Mandarin, Spanish, Bengali, Hindi, Portuguese, Russian, and Japanese. Approximately 96 per cent of the world’s population speaks just four per cent of the world’s languages.

When people die, they take the languages they speak with them. Crystal says fewer than 50 per cent of the world’s languages are effectively passed on to the next generation. Some languages die without any documentation. Crystal cites the linguist Bruce Connell, who witnessed the stark reality of endangered languages. During fieldwork in Cameroon in 1994–95, Connell came across Kasabe, a language with only one remaining speaker in the world. He went back to Cameroon in 1996 hoping to gather more linguistic data about Kasabe, but upon arrival discovered that its last speaker had passed away.

The Rosetta Project

The Rosetta Project attempts to prevent losses like that of Kasabe. The Project is part of the Long Now Foundation, a private San Francisco-based organization that aims to foster long-term thinking. The Long Now Foundation writes years using five digits, such as 02007, instead of the conventional four.

“The Rosetta Project started as a desire to maintain long-term storage data using analog instead of digital,” explained Rosetta Project researcher and archivist JD Ross Leahy. Leahy said the Project team got together to brainstorm how they could use this idea, and decided that linguistics was a fitting subject. The team wanted to create a modern version of the Rosetta Stone, an ancient stone discovered in 1799 with inscriptions in Egyptian and Greek. The Rosetta Project team had initially thought there would be a database of languages available to begin with, but soon realized they had been wrong. “No one had ever tackled anything like this before,” said Leahy.


When the Rosetta Project started in 2000, its initial goal was to archive 1,000 of the 7,000 documented languages in the world. Linguists, researchers, and organizations from around the world provide all of the language information. According to Leahy, the Rosetta Project has a core team of 50 people and 2,500 volunteers around the world. It also teams up with organizations such as the Endangered Language Fund, the National Science Digital Library, and the Linguist List. When the team surpassed 1,000 archived languages, it expanded its vision, aiming to document all of the world’s languages. The archive currently covers over 2,300 languages from around the world, making it the largest collection of linguistic data. Approximately 500 of these languages are from Pacific Rim countries.

The project aims to provide a space in which documents can be stored safely. The first priority of those involved with the Rosetta Project is to store documents in the online archive. Once they receive the original documents, they create digital copies and make several back-ups, which are stored in either the Stanford Library or the University of Michigan Library. The archive, which is accessible from the Project’s website, organizes languages by country, language family, name, and data type. When the archive is searched by language family, nodes let users navigate up and down the family tree. For example, clicking on the language family Sino-Tibetan brings up the nodes Chinese and Tibeto-Burman. Further down, clicking on the Chinese node reveals more information on several Chinese languages. This makes it easy for linguists and curious minds alike to navigate within a language family and to discover the similarities and differences between related languages.
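To make the idea of navigating up and down a family tree concrete, here is a minimal sketch in Python. It is not the Rosetta Project's actual software: the Node class and its methods are hypothetical names chosen for illustration. Sino-Tibetan, Chinese, and Tibeto-Burman come from the example above; placing Mandarin and Cantonese under the Chinese node is an assumption made only to give the tree some leaves.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One node in a language family tree (hypothetical structure)."""
    name: str
    # Exclude parent from repr/compare to avoid infinite recursion,
    # since parent and children refer to each other.
    parent: Optional[Node] = field(default=None, repr=False, compare=False)
    children: list[Node] = field(default_factory=list)

    def add(self, name: str) -> Node:
        """Create a child node under this one and return it."""
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        """Walk up to the root; return names from the root down to this node."""
        names = []
        node: Optional[Node] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


# Family and first-level nodes as named in the article.
sino_tibetan = Node("Sino-Tibetan")
chinese = sino_tibetan.add("Chinese")
tibeto_burman = sino_tibetan.add("Tibeto-Burman")

# Assumed leaves, for illustration only.
mandarin = chinese.add("Mandarin")
cantonese = chinese.add("Cantonese")

# "Clicking" a node corresponds to listing its children.
print([child.name for child in sino_tibetan.children])
# Navigating up from a language recovers its place in the family.
print(" > ".join(mandarin.path()))
```

Listing a node's children plays the role of clicking down the tree, and path() plays the role of navigating back up from a language to its family.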

According to the Rosetta Project’s website, the project strives to maintain an online archive in which each language contains meta-linguistic data, such as information on its orthography, and a map to show where the language is spoken. It aims to include grammatical descriptions, vocabularies, translations of texts, and media such as audio files.

In addition to the online archive, the Rosetta Project keeps the archive on a physical disk known as the Rosetta Disk, which uses micro-etching technology to store 15,000 pages from the archive. Micro-etching refers to the etching of metal samples for examination under a microscope. The idea behind the Rosetta Disk is similar to using microform images to look at past newspaper articles in libraries. For the articles to be legible to the reader, the flat sheets of microfiche need to be magnified about 25 times. Microfiche has an estimated shelf life of 500 years. In comparison, the micro-etching technology used to produce the Rosetta Disk can last twice as long.

Side of Rosetta Disk container.

Archiving Languages

The Rosetta Disk is uniquely designed: it is a flat nickel disk only three inches wide. Etched in a spiral, the outer part of the disk holds a message written in the world’s eight major languages. In English, it reads, “Languages of the world: This is an archive of over 1,000 human languages assembled in the year 02000 C.E. Magnify 1,000 times to find over 15,000 pages of language documentation.” Each message begins at a size legible to the naked eye and becomes smaller as it spirals inward.

At the centre of the disk is a globe, and the pages, each 0.019 inches wide, form rings around it. Each page contains information on individual languages, just like the online archive. “All digital information is etched onto a small disc as analog information, and people can see it using a microscope instead of relying on a computer device,” says Leahy. A four-inch spherical container separated into two hemispheres protects the disk. The top hemisphere is made from optical glass, which magnifies six times. The bottom hemisphere, made of stainless steel, has a cylinder that contains a ribbon. On the ribbon, the disk owner may have names and personal messages etched. This feature adds to the uniqueness of the disk, especially when it is passed down through generations. Copies of the Rosetta Disk can be obtained by making a $25,000 donation to the Rosetta Project.

The Rosetta Project is also interested in producing a reference book to archive the world’s languages. According to Leahy, however, this part of the project is still on the back burner. Recently Alan Lomax, a well-known ethnomusicologist, kindly donated his audio collection to the Rosetta Project. The collection contains 250 tapes of recordings of various languages. The Project will keep a digital version of the recordings, while the original tapes will be sent to the Library of Congress.


The Rosetta Disk and its container can be publicly viewed at the Long Now Interactive Museum and Gallery at the Fort Mason Center in San Francisco. In the museum, visitors can also see examples of text from the archive and audio recordings that are not yet available on the internet. Readers who would like to contribute to the Rosetta Project should, Leahy suggests, contact the Project by email for further guidelines.

It is hard to imagine that all the endangered languages will be saved and revived, but hopefully with the help of the Rosetta Project and the micro-etching technology, we can preserve much of the flavour and uniqueness each language brings to the world.