My preceding post was about a collection of poems by an 18th-century Vietnamese concubine, Ho Xuan Huong, who wrote in an ideographic script a thousand years old that’s now nearly extinct in Vietnam. This script, Nôm, was replaced by a Latin-based script, quoc ngu, created by the Jesuit missionary Alexander de Rhodes in the 17th century. John Balaban’s Spring Essence is the first book in history to have Nôm printed as type, and each poem appears with quoc ngu and English translations.
“From about the tenth century and into the twentieth,” John writes, “[Nôm] was the repository of Vietnamese literature, political essays, and philosophy, as well as religious and medical treatises…Today, out of eighty-four million Vietnamese, perhaps only a few dozen can read this thousand-year heritage….”
In 1999, John, with two Vietnamese IT experts, founded the Vietnamese Nôm Preservation Foundation (VNPF) to encode the ideograms of the ancient script so that this 1000-year heritage could be displayed across the Internet and printed in the modern sense. The VNPF, with the further help of volunteers around the world, published the first dictionary of Nôm, provided scholarships for its study, and organized international conferences to bring attention to the importance of Nôm writing. The Foundation is a leader in standardizing Nôm computer script and digitally preserving Nôm texts, in which centuries of Vietnamese philosophy, history, royal decree, music, drama, medicine, poetry, and religious discourse still largely remain beyond the grasp of modern readers.
Today it’s my great pleasure to post my interview, done by e-mail, with Ngô Trung Viet, Vice President of the VNPF since 2002.
Welcome, Viet, and many thanks for your time. How did you get involved with the Vietnamese Nôm Preservation Foundation?
I was invited by John Balaban and Ngô Thanh Nhan, my old friends and officers of the Foundation. In fact, I have been interested in Nôm since 1993, when one of my other friends, James Do Ba Phuoc, an ex-vice president of the Foundation, told me about the project on Nôm standardization for computers and invited me to take part in the standardization of Nôm script into Unicode.
Can you tell us about your background?
I graduated from the Hanoi Pedagogic University in 1973 and have been a programmer in COBOL, FORTRAN, BASIC, C, C++, Java, etc., and since 1990 a systems analyst for information systems, participating in computerization projects at many organizations. I have been an editor of national standards on Vietnamese Latin-based script and ideographs and am now a member of the ISO/IEC JTC1/SC2/WG2/IRG. I am also a lecturer on programming languages, programming methodologies, information systems, software engineering, e-learning, project management, knowledge management, and enterprise architecture at many universities and training centers in Vietnam. I am married, with a daughter and a son.
What is your main role in the current Nôm project, and what are the technical challenges for you?
My role is to run the Nôm Na office of the Foundation in Hanoi and participate in the Foundation's projects in Vietnam. We are working on introducing Nôm script to the Internet. The technical challenge for me is to build a bridge between the traditional characters of Vietnamese scripts and modern information technology. The challenges are real, because most companies in Vietnam, as well as state organizations, lack specialists who know both Nôm script and IT. And it is necessary that Nôm scripts should be available on computers so the young generation can access them. That will be the main tool for keeping Nôm alive with the rest of the country.
Your bio mentions standardization. What is that, exactly?
It is encoding Nôm characters into the international multilingual character sets, like Unicode and ISO 10646, so that computers can process them and people all over the world can access Nôm materials.
We select Nôm characters from many sources and propose to put them into the IRG repertoires. By standardization, these characters would be unified and checked by IRG before having their own codepoints in Unicode and ISO 10646. It means that every existing Nôm character could be collected. We preserve their multiple forms, but we do not standardize Nôm script itself.
Is part of the standardization effort to ask Unicode for recognition of Nôm fonts?
No, Unicode focuses only on assigning an ideograph a codepoint. The font issue belongs to vendors. So we work with Unicode for their acceptance of Nôm characters, which means we propose characters that do not exist in previous Unicode character sets.
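For readers curious what "assigning an ideograph a codepoint" looks like in practice, here is a minimal Python sketch. It is an illustration only, not part of the Foundation's tooling, and the function name `describe` is hypothetical. It uses U+5583 (喃), the character commonly written for "Nôm," and U+21A38, a code point in the CJK Unified Ideographs Extension B block that is assumed here to be the character commonly written for "chữ." Whether either one displays on your screen depends on the fonts installed, which is the vendor side of the problem.

```python
def describe(ch: str) -> dict:
    """Report the Unicode code point of one character and its encoded size."""
    return {
        "codepoint": f"U+{ord(ch):04X}",                  # the number assigned by the standard
        "utf8_bytes": len(ch.encode("utf-8")),            # bytes needed in UTF-8
        "utf16_units": len(ch.encode("utf-16-be")) // 2,  # 16-bit units needed in UTF-16
    }


# U+5583 lies in the Basic Multilingual Plane.
nom = describe("\u5583")

# U+21A38 lies outside the Basic Multilingual Plane (assumed example, see above),
# so UTF-16 needs a surrogate pair and UTF-8 needs four bytes.
chu = describe("\U00021A38")

print(nom)
print(chu)
```

The point of the sketch is that the code point is only a number. Software that handles text as 16-bit units, or a font that lacks the glyph, can still fail on the second character even though it is fully standardized.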
Literature is being digitized all over the world. What’s the urgency with regard to Nôm texts?
Digitization of Nôm script is urgent because:
- Original documents are being ruined by time and weather. Technology could help slow the loss.
- People who can read Nôm script are fewer and fewer, so if we have nothing to encourage young people to learn Nôm, it won’t be very long until the Vietnamese cannot read what their forefathers wrote.
- Putting Nôm scripts on the Internet is very important. It lets a larger number of people have access to them.
Can you describe the physical process of digitizing the Nôm texts at the national library at Hanoi?
We are developing and documenting the process, so it is not yet fixed. In general, we have to photograph all books and documents, build a database for the pictures along with their titles and content, and put them on the Internet so everyone can access the collection.
How long is it expected to take?
Where do the staff and the money come from to do this?
We have four staff members in the Hanoi office, and of course the project will invite more experts from around the world to join in. Money for the project comes in the form of donations from many people and organizations in the US and around the world.
You use cameras, then, not scanners, to image the texts in the library?
Cameras, in most cases, because the old books are fragile. I am afraid we could damage them if we used normal scanners.
What resolution are the files?
We start with the highest possible resolution, so we can reproduce the books. By this, I mean we can print out the content on "do" paper, a kind of Vietnamese paper, so average people can read books that actually look and feel like the originals. Second, we lower the resolution to a mid-range, so that researchers can access these pictures for their studies. (We cannot easily send a big file of tens of megabytes via the Internet.) And third, we make smaller, low-resolution pictures of them, so that everybody can access them on a computer without waiting a long time for downloading.
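The three tiers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names and the pixel limits (3000 and 1000 on the long edge) are hypothetical, not figures from the project, and the code computes target dimensions only, leaving the actual image resizing to whatever tool is chosen.

```python
# Hypothetical tiers: None means "keep the full-resolution master as photographed".
TIERS = {"master": None, "research": 3000, "web": 1000}


def scale_to_long_edge(width: int, height: int, long_edge: int) -> tuple:
    """Shrink (width, height) so the longer side is at most long_edge, keeping proportions."""
    longest = max(width, height)
    if longest <= long_edge:
        return width, height  # never enlarge a small image
    factor = long_edge / longest
    return max(1, round(width * factor)), max(1, round(height * factor))


def derivative_sizes(width: int, height: int) -> dict:
    """Return the pixel dimensions for each tier of one photographed page."""
    return {
        name: (width, height) if edge is None else scale_to_long_edge(width, height, edge)
        for name, edge in TIERS.items()
    }


# Example: a page photographed at 6000 x 4000 pixels.
print(derivative_sizes(6000, 4000))
```

Keeping the master untouched and deriving the smaller versions from it means the lower tiers can be regenerated later at different sizes without photographing the fragile books again.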
Is there a parallel effort to catalog those texts (as print texts now recognized as being written in Nôm) with traditional library methods, or do those texts return to their previous place on the shelves?
Yes, of course. We use both traditional library catalog methods and methods that allow for Internet searches, so we record information about the texts, as well as capture images.
Who maintains the resulting digital files and their servers? And where are those servers?
Are these texts being transcribed? (Forgive me if this is not the correct term. I mean the difference between taking a digital picture of a book page, and having the text in a word-processing document for manipulation.)
At the moment we have no Nôm optical character recognition software, so we cannot convert pictures into documents for manipulation. It takes significant work just to list a few words of content with the pictures for cataloging purposes.
Are these texts being translated as they’re being captured, so someone other than the 80 Nôm scholars left can read and use them as they come on-line?
We don’t have the resources currently to translate them into modern scripts. Our job now is to get them in their original form on-line and add the information to permit searching.
Did you develop your own software for this project, buy it, or use open source?
Up to now we have not decided about this. I think we will use open source.
Is anyone working on the design of interactive media to allow the sharing and easy use of these digitized texts?
We’re thinking already about this. We have some main goals for the project: photography, database, interface, website, but we do not yet have the people for all these things.
Is the Nôm Foundation working actively with those other libraries around the world that own Nôm texts? (Will your staff go to Paris, for instance, to the Bibliothèque nationale?)
It is our plan to visit other libraries, but we must wait until enough people become interested in Nôm that the resources are available to do so.
Who will own the copyrights to the digitized material you’re working with now?
We must leave these issues to the National Library of Vietnam to decide. At present, the policy of NLV is not to keep the copyrights for digitized materials, because they are cultural and belong to the people.
Who will be allowed to access the materials when they’re ready to be used? Will you charge?
In general, the public will be allowed to view them on the Internet. As you can imagine, there will be different kinds of users, and I do not know about the future policies of the NLV. We’re leaving the issue for the future.
Your work for the Nôm Foundation is on a volunteer basis. Can you tell us about your paid jobs with the Institute of Information Technology (within the Vietnam Academy for Sciences and Technologies), and the DTT Technology Group?
At the Institute of Information Technology, I am a researcher at the Department of Programming and Databases. My projects at the Institute focus on standardizing Vietnamese scripts for many ethnicities. Vietnam has three main script groups: Indic scripts (Cham, Thai, Khmer), ideographs (Han Nôm script), and Latin scripts (quoc ngu). I have participated in standardizing quoc ngu, Cham, Thai and Nôm. I also teach on project management and IT applications in enterprises.
At DTT Technology Group, I work as a consultant and senior teacher on enterprise architecture.
How would you describe the state of technology—especially computer technology—in Vietnam at this time? At what pace is change occurring?
In Vietnam, computers have come to every company and organization, even to families. Many people can use computers and the Internet. But IT application in organizations is still a great problem. Management has not yet shifted completely to IT infrastructure and has difficulty with it. I think the change will be faster in the next few years. With young companies, the change for a new and dynamic management is a must.
It must be incredibly exciting to be a part of a project so important to Vietnamese cultural heritage and to be using technology to make that a global effort. Is that effort being recognized in Vietnam, by the people, government, corporations, or others?
No, it’s not recognized adequately. Some individuals understand, but most official institutions do not pay attention. There are many reasons for this.
First, the public has forgotten Nôm script, nearly a century after it was replaced by quoc ngu, the modern Vietnamese script adapted from the Latin alphabet used in the West. All Vietnamese can read quoc ngu but think Nôm script is dead and do not think of it as their heritage. So they don't see why it’s necessary to put much effort into Nôm script or texts. Unless the mindset changes, Nôm will not return to Vietnamese life.
Second, IT corporations cannot see a market for Nôm script. Since so few can read it, who will buy the software, and how will they profit? They pay little to develop the software.
Third, there is a distance between the world and Vietnam, and many Vietnamese cannot see the urgent need for the project. It is somehow a new thing to them, quite strange, and they hold the traditional attitudes of the administrative system. Only when the results of the project are shown will there be a change, and public recognition. Everything new in Vietnam is the same way.
We require a new vision to become an information society, a new mindset, much knowledge and technology, as well as expertise. These are not present in Vietnamese organizations, whose leaders cannot conceive of technology in that way. Their mindsets are very limited, thinking only of traditional media such as books and papers. Many older people, especially those who run the organizations, have little access to the Internet and are afraid of such things, which are outside their grasp.
Some institutions want this change, but they confront another difficulty: no budget for such projects, which would not be accepted by managers unless they understood the issues. But managers will not see the problem unless they can access and use new technologies, and those don’t exist if there is no one to create them. Can you see the vicious circle and the answers to your question?
Further (fascinating) reading:
“A Look at the Status of Vietnamese Nôm Studies,” by Dr. Ngô Thanh Nhan, a computational linguist at the Courant Institute of Mathematical Sciences of New York University, a visiting research scholar at the Center for Vietnamese Philosophy, Culture & Society of Temple University, and a former Vice President of the Vietnamese Nôm Preservation Foundation. Dr. Nhan designed—for John Balaban’s Spring Essence—the first typeface to represent Nôm. Posted at Viplok, the Vietnamese Public Library of Knowledge.