
SHAWEL: Sharable and Interactive Web-Lexicon
Greg Gulrajani - Max-Planck-Institute
in collaboration with David Harrison & Peter Wittenburg
Max Planck Institute for Psycholinguistics

Introduction
• the paradigm of lexicon creation is changing:
  > collaborators work at different locations
  > different knowledge and languages are involved
  > the lexicon is subject to continuous change
• the wish came up to test a sharable and interactive lexicon: SHAWEL
• different lexical structures are used in DOBES
• for the purposes of this study we chose a simple table-like lexicon

CELEX Paradigm
[Diagram labels]
• users (read only); users could create temporary sub DBs
• Production DBs
• complex transfer and generation procedure
• Development DBs
• developer team; several sub DBs

Goals
• extremely simple user interface: spreadsheet-like look & feel, therefore selection of a simple lexicon as example
• editing in a multi-user environment, requiring consistency checks
• functioning across unstable Internet lines, requiring transaction capability
• audit trail to keep an overview of changes
• only a few should be able to write, all should be able to read (but reading only in limited chunks??)
• creation of accounts by the leading researcher
• participation of several teams, i.e. support for several lexica
• immediate context-based startup
• UNICODE support: input methods & writing systems
• should finally also work on an unconnected notebook
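The consistency-check, transaction and audit-trail goals above can be illustrated with a minimal sketch. It is not SHAWEL's implementation: it uses Python's built-in sqlite3 module as a stand-in for the database system, and the table names (`entry`, `audit`), the `version` column and the function `edit_cell` are hypothetical. A version check (optimistic concurrency) stands in for whatever consistency mechanism the tool actually uses.

```python
import sqlite3

EDITABLE_COLUMNS = ("lexeme", "gloss")  # hypothetical lexicon columns


def open_lexicon():
    """Create an in-memory stand-in for the lexicon database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entry (
            id INTEGER PRIMARY KEY,
            lexeme TEXT,
            gloss TEXT,
            version INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE audit (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            entry_id INTEGER,
            author TEXT,
            column_name TEXT,
            old_value TEXT,
            new_value TEXT,
            changed_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    return conn


def edit_cell(conn, entry_id, column, new_value, author, seen_version):
    """Apply one cell edit atomically; refuse it if the record changed meanwhile."""
    if column not in EDITABLE_COLUMNS:
        raise ValueError(f"not an editable column: {column}")
    # "with conn" commits if the block succeeds and rolls back on an exception,
    # so the cell update and its audit row are stored together or not at all.
    with conn:
        row = conn.execute(
            f"SELECT {column}, version FROM entry WHERE id = ?", (entry_id,)
        ).fetchone()
        if row is None or row[1] != seen_version:
            return False  # consistency check failed: someone else edited first
        conn.execute(
            f"UPDATE entry SET {column} = ?, version = version + 1 WHERE id = ?",
            (new_value, entry_id),
        )
        conn.execute(
            "INSERT INTO audit (entry_id, author, column_name, old_value, new_value) "
            "VALUES (?, ?, ?, ?, ?)",
            (entry_id, author, column, row[0], new_value),
        )
    return True
```

A client would remember the `version` it read together with the record and pass it back as `seen_version`. If the connection drops before the transaction commits, neither the edit nor the audit row is stored.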

Design: 1st Version
• currently a Java thin client (reading/writing/administration); download and launch with JNLP
• choice of a database system (roll-back mechanism, record locking, UNICODE also for searching & indexing, access rights handling)
• 27 different keyboard layouts available
• others easy to add
• also writing systems (Bengali)
• web client for reading by naïve users
• choice of ORACLE since it was the easiest option
[Diagram labels: Java client; web client; Web server with CGI script; Database system]

UI and Operation 1
1. launch SHAWEL and select lexicon (except when started from context)
  > structure is displayed (better: show some records immediately)

UI and Operation 2
2. issue global (or column) search and select input method

UI and Operation 3
3. enter query string; the result is a list of hits
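The difference between a global and a column search over a table-like lexicon can be sketched as follows. This is only an illustration under stated assumptions: the records are plain dictionaries, the function names `normalize` and `search` are hypothetical, and substring matching after Unicode normalization is an assumed matching rule, not necessarily the one SHAWEL applies.

```python
import unicodedata


def normalize(text):
    """Bring text into one Unicode form (NFC) and ignore case for matching."""
    return unicodedata.normalize("NFC", text).casefold()


def search(records, query, column=None):
    """Return (row index, column name) pairs whose cell contains the query.

    column=None means a global search over all columns of each record;
    otherwise only the named column is inspected.
    """
    needle = normalize(query)
    hits = []
    for row_index, record in enumerate(records):
        columns = [column] if column is not None else list(record)
        for name in columns:
            if needle in normalize(str(record.get(name, ""))):
                hits.append((row_index, name))
    return hits
```

For example, with records holding the hypothetical columns `lexeme` and `gloss`, `search(records, "wat")` inspects both columns, while `search(records, "wat", column="lexeme")` only reports hits in the `lexeme` column.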

UI and Operation 4
4. select modification: click on cell, select input method & edit

Experiences
• first tests by researchers
• bootstrapping by converting Excel into a table
• multi-user functionality tested by the developers
• an extensive test is still missing
• UI easy enough: researchers seem to find it OK
• tabbed interface with easier selection
• HTML interface
• tool ready to be used/tested (www.mpi.nl/tools)

Perspectives
• how to go further?
• simple technical design issues:
  > replace CGI by Java Server Pages (better multi-user performance)
  > extend protocol mechanism to SOAP (opening as a service)
• add automatically updated author column
• local operation, but how to achieve synchronization?
• researcher bootstrap option (which formats? Shoebox?)
• extension to more complex structures (UI still simple? CELEX?)
• has this already been done in a satisfying manner?
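One possible reading of the "automatically updated author column" item is that the tool, not the user, fills in who last changed a record. The sketch below assumes records are dictionaries and that the bookkeeping columns are called `author` and `modified`. These names and the function `apply_edit` are hypothetical and are not taken from SHAWEL.

```python
from datetime import datetime, timezone

BOOKKEEPING_COLUMNS = ("author", "modified")  # assumed names, maintained by the tool


def apply_edit(record, column, new_value, author, now=None):
    """Return a copy of the record with the edit applied and the author stamped."""
    if column in BOOKKEEPING_COLUMNS:
        raise ValueError("bookkeeping columns are maintained automatically")
    updated = dict(record)  # leave the caller's record untouched
    updated[column] = new_value
    updated["author"] = author
    timestamp = now if now is not None else datetime.now(timezone.utc)
    updated["modified"] = timestamp.isoformat()
    return updated
```

Because users cannot write to the bookkeeping columns directly, the author shown for a record always reflects the account that made the last edit.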