Internationalized Domain Names (IDNs): Building a Sustainable Framework for a Multilingual Internet
Yale A2K2 Conference, New Haven, USA, April 27, 2007
Ram Mohan, rmohan@afilias.info
Agenda
• Role of IDNs in Access to Knowledge
• Multi-lingual Internet Basics
  – A special case study in India
• Current State of Technical Readiness
• An IDN Policy Framework
  – Basics
  – Technical Principles
  – Policy Principles
April 27, 2007 Yale Access 2 Knowledge Conference Ram Mohan
The Language Barrier Limits Access to the Internet
• The language barrier limits Internet usage
• Domain names are the single most important way to locate resources on the Internet
• 65% of the world’s Internet users don’t speak English
• In China, 90% of Internet users prefer to access content in their local languages [1]
• Software applications now integrate websites and email seamlessly
http://glreach.com/globstats/
[1] CNNIC Statistical Survey, 2005
Technology can help
[Chart with timeline markers: German IDN launch; Additional IDNs launch; IE7 B1 (27-Jul-05); IE7 B2 (24-Apr-06); IE7 B3 (29-Jun-06). Source: PIR ICANN Reports]
Utility of IDNs
• Make the Internet friendlier to non-English speakers
• Make applications such as email and FTP more accessible
• Are the most effective way to popularize Internet use in non-English-speaking communities
• Guarantee cultural diversity and protect the special interests of people in different regions
• Allow national cultures and under-represented languages to stay alive
Multi-lingual Internet Basics
• Internationalized Domain Names (IDNs)
  – Domain names represented in characters used in local languages
• Allow an entire domain name to be represented in a local-language character set
  – example.日本, or 日本.日本
• These names have to…
  – Work everywhere
  – Be backwards compatible
  – Not break application software
  – Support languages appropriately
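As a rough illustration of the backwards-compatibility requirement, the sketch below uses Python's built-in `idna` codec (which follows the IDNA 2003 rules of RFC 3490) to turn the example names above into the ASCII-only form that existing DNS software already handles, and back again for display. The helper names `to_wire_form` and `to_display_form` are illustrative, not part of any standard.

```python
def to_wire_form(domain: str) -> str:
    """Encode every label of a domain name into its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def to_display_form(wire: str) -> str:
    """Decode an ASCII-compatible domain name back to Unicode for display."""
    return wire.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ("example.com", "example.日本", "日本.日本"):
        wire = to_wire_form(name)
        # Plain ASCII names pass through unchanged; IDN labels gain an "xn--" prefix.
        print(f"{name} -> {wire} -> {to_display_form(wire)}")
```

Because the wire form is plain ASCII, resolvers and applications that know nothing about IDNs keep working; only software that displays names to users needs the decoding step.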
Access for India’s billion people
• Total Population: 1.002 billion (2001 Census)
• 22 Official Languages
  – Devanagari script based (North Indian): Hindi, Marathi, Sanskrit, Kashmiri, Sindhi, Nepali, Manipuri
  – Dravidian script based (South Indian): Tamil, Telugu, Kannada, Malayalam, Konkani
  – Arabic script based: Urdu
  – Some languages representable in more than one script
  – Other script basis: Bengali, Oriya, Gujarati, Punjabi, Assamese
• Worldwide Audience:
  – Hindi: 400 million speakers
  – Bengali: 200 million speakers
  – Tamil: 60 million speakers
  – Telugu: 70 million speakers
• Movies released in 15 languages
• Schools teach in 58 different languages
• Radio programs broadcast in 71 languages
• Newspapers publish in 87 languages
And … one Internet
Building Indian IDNs
• China has 1.6 billion Chinese language speakers … with two major scripts and shared characters with Japanese & Korean language communities
• About 12 Indian languages are based on Devanagari scripts … leading to potential variant issues
• A “many-to-many” problem for India
  – Multiple languages share common scripts
  – Multiple scripts used in multiple languages
• Other Challenges:
  – Bi-directional text
  – Multiple diacritic positioning
  – Word breaking
Adequate Standards Exist
• IETF open standards published 2002-2004:
  – RFC 3490, 3491, 3492
  – RFC 3454
  – RFC 3743
• ICANN IDN Guidelines
• China-Japan-Korea (CJK) common CDNC language tables provide an example of how to build community support
• Indic scripts: standards creation effort provides new learning
• Successful root server tests for IDNs at the top level in Dec 2006
Much of the “technology” & “protocol” part of internationalizing domain names is complete
Domain Name Technical Constraints Must Be Considered
• Normal Unicode-Punycode conversion
  – flod18häst: .xn--flod18hst-12a
• Performance with a 63-character long TLD string
  – .hippo18potamushippo18po
• Right to left, embedded characters with opposing directional properties
• Left to right script with sophisticated shaping properties
• Non-alphabetic script
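The Unicode-to-Punycode step and the 63-character ceiling can be sketched with Python's built-in `punycode` codec (RFC 3492). This is a simplified illustration: it skips the Nameprep normalization that full IDNA processing applies first, and the function names are made up for this example. It shows that the limit applies to the encoded label, so a non-ASCII label runs out of room sooner than its visible length suggests.

```python
ACE_PREFIX = "xn--"     # prefix that marks an IDNA-encoded label (RFC 3490)
MAX_LABEL_LENGTH = 63   # DNS limit per label, measured on the encoded form


def to_ace_label(label: str) -> str:
    """Convert one Unicode label to its ASCII-compatible (Punycode) form."""
    if label.isascii():
        ace = label  # pure ASCII labels are used as-is
    else:
        ace = ACE_PREFIX + label.encode("punycode").decode("ascii")
    if len(ace) > MAX_LABEL_LENGTH:
        raise ValueError(
            f"encoded label is {len(ace)} characters; limit is {MAX_LABEL_LENGTH}"
        )
    return ace


def from_ace_label(ace: str) -> str:
    """Convert an ASCII-compatible label back to Unicode."""
    if ace.lower().startswith(ACE_PREFIX):
        return ace[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return ace


if __name__ == "__main__":
    print(to_ace_label("flod18häst"))
    print(from_ace_label("xn--flod18hst-12a"))
```

A label of 62 ASCII letters followed by a single "ä" is only 63 visible characters, yet `to_ace_label` rejects it because the prefix, delimiter and encoded suffix push the wire form past 63.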
Introducing an IDN Policy Framework
• Provides clarity on IDN issues
• Ensures that registries give due consideration to key elements of IDN policies
• Allows governments or other authorities to follow and evaluate the process for IDN deployment at the ccTLD
  – Involvement of government / authorities at different stages of the process (e.g. List of Valid Characters, Contextual Rules, Variants)
• Gives language communities, civil society and businesses input into policy creation prior to rollout
• IDN Policies & Registry System
  – Policy decisions may have profound technical implications for the registry system
Founding technical principles in IDN implementation
• Build a Character Inclusion Table (list of valid characters) … governments, linguists, technologists needed
• Variant Mapping Consideration … allow only one form of character set for IDN
• Contextual Rules
  – Minimum & maximum length
  – Prohibited prefixes or suffixes
  – Potential contextual rules: prohibited character sequences
• Register and operate the Internationalized TLD in the root DNS server in the form of IDNA Punycode
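One way a registry might combine the inclusion table, variant mapping and contextual rules is sketched below. Everything here is an assumption made for illustration: the `LabelPolicy` structure, the tiny character table and the rule values are hypothetical, not an actual registry policy. The 國/国 pair is a real traditional/simplified variant of the kind the CJK tables address.

```python
from dataclasses import dataclass, field


@dataclass
class LabelPolicy:
    allowed: frozenset                            # character inclusion table
    variants: dict = field(default_factory=dict)  # variant char -> preferred char
    min_length: int = 1                           # counted in Unicode characters
    max_length: int = 63
    prohibited_prefixes: tuple = ("-",)
    prohibited_suffixes: tuple = ("-",)
    prohibited_sequences: tuple = ("--",)


def preferred_form(label: str, policy: LabelPolicy) -> str:
    """Map every variant character to the single form the registry allows."""
    return "".join(policy.variants.get(ch, ch) for ch in label)


def check_label(label: str, policy: LabelPolicy) -> list:
    """Return a list of policy problems; an empty list means the label passes."""
    problems = []
    outside = sorted({ch for ch in label if ch not in policy.allowed})
    if outside:
        problems.append("not in inclusion table: " + " ".join(outside))
    if label != preferred_form(label, policy):
        problems.append("variant form used; preferred form is "
                        + preferred_form(label, policy))
    if not policy.min_length <= len(label) <= policy.max_length:
        problems.append(f"length {len(label)} outside "
                        f"{policy.min_length}-{policy.max_length}")
    if label.startswith(policy.prohibited_prefixes):
        problems.append("prohibited prefix")
    if label.endswith(policy.prohibited_suffixes):
        problems.append("prohibited suffix")
    for seq in policy.prohibited_sequences:
        if seq in label:
            problems.append(f"prohibited sequence {seq!r}")
    return problems


# Hypothetical toy policy, for illustration only.
TOY_POLICY = LabelPolicy(
    allowed=frozenset("中国國文字0123456789-"),
    variants={"國": "国"},
    min_length=2,
)

if __name__ == "__main__":
    for candidate in ("中国", "中國", "中", "-中国", "中国x"):
        print(candidate, check_label(candidate, TOY_POLICY))
```

Keeping the policy as data rather than code is one design choice that would let governments, linguists and technologists review the table and rules without reading the registry software.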
Founding policy principles in IDN implementation
1. Avoid ASCII-Squatting
2. Consult with government on the geo-political impact of a new top-level domain
3. Actively solicit language community input for evaluation of new IDN gTLD strings
4. One string per new IDN gTLD
5. Limit variant confusion and collision
6. Limit confusingly similar strings
Founding policy principles (cont.)
7. (No) priority rights for new gTLD strings and new domain names
8. Approach aliasing as a policy matter
9. Adhere to a single script (ASCII exception, other restrictions)
10. UDRP sufficient for dispute resolution in new IDN TLDs
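Principle 9 (single script, with an ASCII exception) can be approximated in code. The sketch below is a heuristic only: Python's standard library does not expose the Unicode Script property, so it guesses the script from the first word of each character's Unicode name. The function names and the way the ASCII exception is interpreted (ASCII characters are ignored when counting scripts) are assumptions made for this example.

```python
import unicodedata

# Name prefixes treated as script-neutral in this sketch (digits and hyphen).
NEUTRAL = {"DIGIT", "HYPHEN-MINUS"}


def script_of(ch: str) -> str:
    """Guess a character's script from the first word of its Unicode name."""
    name = unicodedata.name(ch, "")
    first_word = name.split(" ")[0] if name else "UNKNOWN"
    return "NEUTRAL" if first_word in NEUTRAL else first_word


def scripts_in(label: str, ignore_ascii: bool = False) -> set:
    """Collect the scripts used in a label, leaving out neutral characters."""
    chars = (ch for ch in label if not (ignore_ascii and ch.isascii()))
    return {script_of(ch) for ch in chars} - {"NEUTRAL"}


def is_single_script(label: str, allow_ascii_exception: bool = False) -> bool:
    """True if the label draws on at most one script."""
    return len(scripts_in(label, ignore_ascii=allow_ascii_exception)) <= 1


if __name__ == "__main__":
    for label in ("example", "пример", "नमस्ते", "नमस्ते-india"):
        print(label, is_single_script(label),
              is_single_script(label, allow_ascii_exception=True))
```

The heuristic is crude: scripts with multi-word names are cut to their first word, and languages such as Japanese legitimately combine several scripts. The ASCII exception also lets a lookalike mix such as Latin letters with a Cyrillic "а" through, which is one reason the principle mentions other restrictions; a production check would use real script data and per-language rules.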
General IDN TLD Rollout Principles
• Retain global uniqueness of the TLD system
  – Domain names remain unique and unambiguous
• Maintain interoperability of the TLD system
  – Domain names work the same way regardless of the geography they are accessed from
  – An IDN TLD needs to point applications and users to the same place regardless of whether the domain is accessed from India, the UK or Greece
• Promote “Future-Proof” solutions
  – Define the Unicode characters to be allowed
  – Provide the ability to add new languages and new characters far in the future
• Avoid User Confusion
• Promote multi-stakeholder involvement
Let’s make it happen
Building a Sustainable Framework For A Multilingual Internet
Ram Mohan, rmohan@afilias.info