18th International Unicode Conference: Documentum and UTF-8


18th International Unicode Conference
Documentum and UTF-8: Converting Content Management Software Product Line to Unicode
27 April 2001
Donald Ziff
Documentum Proprietary

Agenda
• What is Documentum?
• Documentum’s I18N Problem
• How Unicode UTF-8 Saved the Day
• Other Success Factors
• Demo

About Documentum
• Documentum: NASDAQ “DCTM”
• The Leader in Web and Enterprise Content Management Solutions
• > $128M in revenue in 1999; > 800 employees
• 900+ Global 2000 customers with strong vertical focus
• Over 25 offices in 10+ countries

DCTM’s I18N Problem
• Everyone agrees: we need I18N to fuel growth – especially in Asia
• Asian-certified product much more important than multi-lingual
  – Although demand for multi-lingual is growing…
• So why not I18N?

I18N Perception Problems
• Too Difficult – won’t fit into a development cycle
• Too much Overhead – multiplies QA and Support
• Not Sexy – no new functionality
Let’s look at these problems…

“I18N is too difficult”
Product Layers:
• Server (built on RDBMS + Verity)
• DMCL: Client Library (C++)
• DFC: Foundation Classes (Java)
• DTC: Desktop Client – Win32 end-user client
• WDK: Web Development Kit
• RightSite: Legacy Web-Server Integration
• Web Publisher: Web Content Management App
• Legacy clients: Workspace (Win32), Intranet

History Lesson
• Server v3.1.6.INT, created by consultants for Japanese market, was expensive and time-consuming
  – 3.1.6.INT attempted to internationalize all the layers in the DCTM architecture at once
• 4.0 was released without I18N changes
• 4.1 followed, the deltas from 3.1.6 to 3.1.6.INT became hard to apply…

“I18N requires too much overhead”
• The DCTM server requires pharmaceutical-strength certification
• Dimensions of certifications:
  – 3 RDBMS platforms: Oracle, Sybase, SQL Server
  – 4 Server OS’s: NT, Solaris, HPUX, AIX
• The 3.1.6.INT architecture introduced new dimensions, leading us to…

Certification Hell!
• New certification dimensions:
  – 5 DCTM Server code-pages
  – 5 RDBMS code-pages
• Market requires another dimension:
  – 5 Server OS Localizations
• 125 new times 12 old = 1500 certs!
• Exaggeration, of course… But still…
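The certification count on this slide is a product of independent dimensions. A minimal sketch of that arithmetic follows; the dictionary keys are labels taken from the slides, and the variable names are illustrative only.

```python
from math import prod

# Existing certification dimensions (from the previous slide).
old_dimensions = {"RDBMS platforms": 3, "server operating systems": 4}

# Dimensions added by the 3.1.6.INT architecture and by market demand.
new_dimensions = {
    "DCTM Server code-pages": 5,
    "RDBMS code-pages": 5,
    "server OS localizations": 5,
}

old_cells = prod(old_dimensions.values())
new_cells = prod(new_dimensions.values())
total_cells = old_cells * new_cells

print(f"{new_cells} new x {old_cells} old = {total_cells} certifications")
```

Dropping any one five-way dimension divides the total by five, which is why fixing a single internal code-page matters so much to the matrix.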

“I18N not sexy”
• DCTM is a growth company, needs sizzle as well as steak
• I18N grows markets, but doesn’t add much to marketing message
• To be fair: new functionality is not just “sexy” – it is essential to DCTM’s continued survival
• Other priorities will move to the top…

DCTM’s I18N Requirements
• Crucial need: support Asia from the main code-line. One binary for the world
• Backward compatibility essential
• Multi-lingual features would be a side-benefit. High on the wish list for a few key customers
• I18N project must be scoped down to be achievable

How UTF-8 Saved the Day
• UTF-8 moves safely through the server because anything that looks like ASCII actually is
• Standardizing on UTF-8 as the only supported internal code-page cuts down certification matrix

Lessons from DoubleByte Experiments
• EUC-KR: 4.1 server works (basically)
• SJIS: problems! double-byte characters whose second bytes are ASCII: ` |
• Lessons:
  – Non-ASCII moves through the server safely
  – String handling need not be double-byte aware, if ASCII always means ASCII
• Solution: UTF-8!
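The difference between the three encodings can be checked directly. The sketch below is not from the talk: the function name and the sample characters are chosen for illustration, and Python's built-in codecs stand in for whatever the server actually saw. It reports non-ASCII characters whose encoded form contains a byte in the ASCII range.

```python
def ascii_trail_bytes(text, encoding):
    """Return (char, encoded bytes) pairs for non-ASCII characters whose
    encoding contains at least one byte below 0x80."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # genuine ASCII is never a problem
        encoded = ch.encode(encoding)
        if any(b < 0x80 for b in encoded):
            hits.append((ch, encoded))
    return hits

japanese = "ソポ"   # katakana whose Shift-JIS second bytes are 0x5C and 0x7C
korean = "한글"

sjis_hits = ascii_trail_bytes(japanese, "shift_jis")
utf8_hits = ascii_trail_bytes(japanese, "utf-8")
euckr_hits = ascii_trail_bytes(korean, "euc_kr")

for ch, encoded in sjis_hits:
    print(ch, encoded.hex())
```

Under these assumptions only Shift-JIS produces hits: EUC-KR and UTF-8 keep every byte of a non-ASCII character at 0x80 or above, which matches the slide's observation that the EUC-KR server basically worked while SJIS did not.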

UTF-8: ASCII is ASCII
• No need for special string handling
  – Server 3.1.6.INT replaced all standard C string handling with calls to 3rd-party library
  – With UTF-8, we stick with standard – yacc and other legacy tools work fine
• Greatly improved perception (and reality) of how difficult I18N would be
  – Now, it’s relatively low-impact
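Why byte-oriented code keeps working can be shown with a delimiter split. This is a hedged illustration in Python rather than the server's C code; `split_fields` is a hypothetical stand-in for any routine that scans a buffer for an ASCII delimiter without knowing the encoding.

```python
def split_fields(raw, delimiter=b"|"):
    """Byte-level split that knows nothing about multi-byte characters."""
    return raw.split(delimiter)

fields = ["ソフト", "ポート"]   # two fields joined by an ASCII '|'

part_counts = {}
for encoding in ("utf-8", "shift_jis"):
    raw = "|".join(fields).encode(encoding)
    part_counts[encoding] = len(split_fields(raw))

# With UTF-8 the byte-level split recovers the original fields exactly.
utf8_parts = split_fields("|".join(fields).encode("utf-8"))
roundtrip = [part.decode("utf-8") for part in utf8_parts]

print(part_counts)
print(roundtrip)
```

In the Shift-JIS case the split finds an extra "delimiter" inside a character and cuts it in half; in the UTF-8 case the unmodified routine is already correct, which is the sense in which standard string handling can stay.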

It’s UTF-8, dummy!
• Use UTF-8 everywhere, cut down on certification dimensions
• Provides safe character-handling for Asia
• Even though multi-lingual is not a requirement
• Easier to support

Other Success Factors
• Rely on RDBMS services to translate between RDBMS code-page and UTF-8
• Market research cut back on OS localization constraints
• Transcoding infrastructure

RDBMS transcodes to/from UTF-8
• Oracle and Sybase transcode automatically
  – SQL Server is a problem
• No need for new transcoding calls between Server and RDBMS – lower impact
• Upgrade customers have non-Unicode RDBMS – no need for them to convert
• One less certification dimension!

Cut back on Localized OS certs
• Limit RDBMS for Asia – for 4.2, just Oracle
• Localized OS certification not necessary for Europe

Transcoding Infrastructure
• Server must be aware of interface codepages
• Transcoding done at the interfaces
• 3rd party transcoding used: Uniscape’s GlobalC
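Transcoding at the interfaces means the server core sees only UTF-8, while each client connection declares its own code-page. The sketch below assumes that shape; it uses Python's built-in codecs, not Uniscape's library, and the names `to_server`, `to_client` and `SERVER_ENCODING` are hypothetical.

```python
SERVER_ENCODING = "utf-8"  # the single internal code-page

def to_server(raw, client_codepage):
    """Convert bytes arriving from a client into the internal encoding."""
    return raw.decode(client_codepage).encode(SERVER_ENCODING)

def to_client(raw, client_codepage):
    """Convert internal bytes back to the client's code-page.
    Characters the client cannot represent become '?' (an assumed policy)."""
    return raw.decode(SERVER_ENCODING).encode(client_codepage, errors="replace")

# A legacy client that speaks Shift-JIS sends a request.
request = "文書".encode("shift_jis")
internal = to_server(request, "shift_jis")
response = to_client(internal, "shift_jis")

# Korean text stored by another client is not representable in Shift-JIS.
lossy = to_client("한".encode(SERVER_ENCODING), "shift_jis")

print(internal, response == request, lossy)
```

Keeping the conversion in two boundary functions leaves everything between them free of code-page logic, which is the design point the slide makes; how unrepresentable characters were actually handled is not stated in the source.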

New I18N Architecture
[Architecture diagram. Labels, in the order they survive in the transcript: Desktop Client, Custom WebApp, Web Publisher, WDK (Unicode), Intranet Client, Administrator, RightSite (NCS), WorkSpace, DFC (Unicode), Web Cache, ARP (NCS), (UTF8), DMCL (4.2), DMCL ≤ 4.1 (NCS), e-Content Server (UTF8), File System, RDBMS (Unicode), Verity. Legend entries: National Character Set (NCS), Unicode.]

Demo
• Demo – multilingual WDK
• If there’s time, a quick look at localized Desktop Client (Win32 Client)

Conclusion
UTF-8 was a crucial technology in DCTM’s I18N strategy:
• Provided an easy path for legacy C++
• Supported specific Asian languages consistently, minimizing certifications
• Prepared infrastructure for multi-lingual requirements