Researcher, Dept. of Linguistics, UC Berkeley
Both the ''Seven Dimensions of Portability for Language Documentation and Description'' and the EMELD Best Practices guide advocate the use of Unicode for character encoding, since this standard will ultimately make it possible to transmit electronic documents without error. Still, linguists can find Unicode problematic to use, both because there is no ''Best Practices'' Unicode guide geared specifically to linguists and because many linguists still have documents that rely on non-Unicode fonts.
The kinds of questions that arise include:
(a) How do I tell if my font is Unicode-compliant? If it isn't, how can I make it compliant?
(b) How do I check whether the symbol (or script) I need is in Unicode? How can I be sure I am using the correct symbol?
(c) If the symbol (or script) isn't in Unicode, what should I do in the meantime?
(d) How should I mark uncertain readings using Unicode?
This talk will address each of the questions above in turn: it identifies test pages for font testing; advocates the use of mapping tables for commonly used fonts (to be hosted on a widely accessible site, such as LinguistList or Unicode); explains how to find characters on the Unicode website, with preliminary guidelines for the use of specific Unicode characters; gives an update on the latest TEI guidelines on handling missing Unicode characters; and recommends appropriate markup for uncertain readings. The ultimate goal is to lay the groundwork for a Unicode best practices guide for linguists.
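As a rough illustration of two of the ideas above (a mapping table for a legacy font, and a check of what a given character is in Unicode), here is a minimal Python sketch. The table LEGACY_TO_UNICODE and the helper names convert_legacy, describe, and status are hypothetical and invented for this example; they are not taken from any actual font or from the mapping tables the talk proposes. Only the standard-library unicodedata module is assumed.

```python
import unicodedata

# Hypothetical mapping table for an imaginary legacy (non-Unicode) font
# in which ASCII keys were redrawn as phonetic symbols.
LEGACY_TO_UNICODE = {
    "N": "\u014B",  # LATIN SMALL LETTER ENG
    "S": "\u0283",  # LATIN SMALL LETTER ESH
    "?": "\u0294",  # LATIN LETTER GLOTTAL STOP
}


def convert_legacy(text, table):
    """Replace each legacy character via the table; leave others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def describe(ch):
    """Return the code point and the official Unicode name of a character."""
    name = unicodedata.name(ch, "<no name>")
    return f"U+{ord(ch):04X} {name}"


def status(ch):
    """Classify a code point by its Unicode general category."""
    category = unicodedata.category(ch)
    if category == "Cn":
        return "unassigned"
    if category == "Co":
        return "private use"
    return "assigned"


if __name__ == "__main__":
    converted = convert_legacy("siN", LEGACY_TO_UNICODE)
    for ch in converted:
        print(describe(ch), "-", status(ch))
```

Printing the official character name next to each converted symbol is one way to confirm that the intended symbol, rather than a look-alike, was chosen. A "private use" result indicates a code point whose meaning depends on a private agreement and will not transmit reliably between systems.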