Emerging technologies: 2010 Censuses Challenges
Shoshani Eli, Managing Director Asia Pacific
UN Workshop, Thailand 2008

Agenda
§ Introduction
§ Who we are?
§ Data capture methods
§ eFLOW Platform
§ Summary

“Counted” by eFLOW worldwide: 1,374,026,304

TIS’s Experience in Census Projects
India 2001, Turkey 1997, Brazil 2000, South Africa 2001, Ireland 2001, Germany (DP) 1999, Cyprus 2002, Turkey 2000, Kenya 2001, Slovak Republic 2001, Hong Kong 2001, Italy 2002, Slovak Republic 2006, Hong Kong 2006, South Africa Pilot 2007, Ireland 2006
Largest market share worldwide in information capture for census projects

Projects won in 2008: Belarus, Argentina, Thailand

Overview - Top Image Systems
§ Founded 1991
§ Data extraction and workflow solutions. Specialized in census projects
§ Since 1996, traded on NASDAQ (TISA)
§ ~250 employees

Local Offices in the Region:
Asia: Shanghai, Japan, Singapore, Hong Kong, Guangzhou (R&D) and Australia
Europe: United Kingdom, Germany, Italy, Spain, France, Benelux
Americas: Boston, Rio de Janeiro
§ Present in approx. 40 countries
§ Strong partner network worldwide
§ Around 800 installed systems worldwide

The evolution of data capture in census projects
From OCR into IDR Solution
Stages shown: Key From Paper, OMR, Key From Image, Automated Data Capture, eFLOW Intelligent Data Capture

The evolution of data capture in census projects
Manual data entry (key from paper):
– Slow
– High error rate in the data entry process
– Recruitment, training and management of personnel
Key from image:
– Archive
– Approx. 30-40% faster than key from paper

The evolution of data capture in census projects
OMR (hardware readers for checkboxes)
– Requires specially printed forms and special scanners
– Cannot handle handwritten/printed data
– Forms are not user-friendly
– OMR requires more answers => more space => increased paper expenditures => more handling and printing costs
– Not flexible, difficult to adjust to other applications once the census is over
– No possibility to add business rules: computation, validations, coding

TIS’s Experience in Census Projects
India 2001, Turkey 1997, Brazil 2000, South Africa 2001, Ireland 2001, Germany (DP) 1999, Cyprus 2002, Turkey 2000, Kenya 2001, Slovak Republic 2001, Hong Kong 2001, Italy 2002, Slovak Republic 2006, Hong Kong 2006, South Africa Pilot 2007, Ireland 2006
Largest market share worldwide in information capture for census projects

The evolution of data capture in census projects
Automated data capture (eFLOW)
– Requires less human intervention and allows census data capture to be completed much faster (less space, fewer salaries, less hardware)
– Ensures data integrity: enables the use of automatic and manual online validations, exception handling and coding
– The most advanced and proven technology for censuses, recommended by the UN and used by all modern countries for census projects
– Full flexibility in the type of data gathered (checkbox, handwritten, alpha and numeric, barcode…)
– Provides all the capabilities of OMR and much more
– Creates a correlation between the image and the actual form
– Remote capabilities enable all forms to be scanned locally and then sent to a central site for processing

The evolution of data capture in census projects
Intelligent data capture platform using OCR/ICR/barcode/PDA/Web/email:
– Automated data capture, plus:
– Smart: automatic classification of documents
  § Understands and differentiates between various types of documents and languages, based on state-of-the-art machine learning algorithms
– Freedom
  § Artificial intelligence algorithms that provide enough information for the system to find the location of the fields on its own

Unified Content Platform
Census database
Suggestion: a single platform for all enterprise content

Lessons learned
India 2001, Turkey 1997, Brazil 2000, South Africa 2001, Ireland 2001, Germany (DP) 1999, Cyprus 2002, Turkey 2000, Kenya 2001, Slovak Republic 2001, Hong Kong 2001, Italy 2002, Slovenia 2006, Hong Kong 2006, South Africa Pilot 2007, Ireland 2006

The customer says it best…
Saving of 25%
Saving of 12%
(Source: CSO – Central Statistics Office Ireland)

The customer says it best…
(Source: CSO – Central Statistics Office Ireland)

The customer says it best…
Benefits of the eFLOW Technology
(Source: CSO – Central Statistics Office Ireland)

First, several general lessons…
§ Invest in creating the right application for the project
  – System Design
    § High level business process
    § Functional design
    § Technical/Detailed design
    § Code guidelines, conventions
    § Technical DR, with the R&D
  – Development
    § Project DR
    § Code review
  – Budget control
  – Bi-weekly reports
  – …

First, several general lessons…
§ Spend time on getting the form right
  – Meet organization standards
  – Form Design
§ Prepare and optimize with a pilot
§ Training & support

Indian Census 2001
TIS partners with CMC, an Indian governmental agency with years of experience and offices all over India.
Form Processing Technology:
§ Around 500 million A3 images
§ More than 2 million enumerators
§ The technology was implemented at 15 processing centers in major state capitals
§ Data was captured using only 25 high-end Kodak 7520 DS scanners
§ 16 languages
§ The advanced technology in 2001: eFLOW ver. 1.0
§ Two phases

Presenting new advanced technologies to meet 2010 census challenges
eFLOW 5.0 – Next Generation…

Main improvements in eFLOW to meet Census Challenges
§ Architectural changes
§ Core changes
§ Recognition technologies
§ Modules
§ Features

eFLOW Architectural Improvements
§ Core redesigned, built in .NET technology
  – Microsoft .NET is the Microsoft strategy for connecting systems, information, and devices through Web services so people can collaborate and communicate more effectively
§ Customization by .NET Embedded
  – Speeds up runtime: 200x faster
§ Custom code now part of CAB: no need to manage DLLs separately
§ Debug inside eFLOW
  – No need to install a development environment

.NET allows an object-oriented design approach
Objects shown: Batch, House, Person
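The Batch, House and Person objects named on this slide suggest a containment hierarchy. A minimal sketch of such a model follows, written in Python rather than .NET for brevity. The field names (`batch_id`, `house_id`, `name`, `age`) and the `person_count` helper are assumptions made for illustration and are not taken from eFLOW.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    # Hypothetical fields; a real census form carries many more.
    name: str
    age: int


@dataclass
class House:
    house_id: str
    persons: List[Person] = field(default_factory=list)


@dataclass
class Batch:
    batch_id: str
    houses: List[House] = field(default_factory=list)

    def person_count(self) -> int:
        """Total number of persons across all houses in the batch."""
        return sum(len(house.persons) for house in self.houses)


# Usage: one batch holding two houses.
batch = Batch(
    "B-001",
    [
        House("H-1", [Person("A", 34), Person("B", 31)]),
        House("H-2", [Person("C", 70)]),
    ],
)
```

Modelling the data this way lets validations be written against a house or a person rather than against raw field positions on a page.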

eFLOW Architectural Improvements
§ Improved flexibility
  – Multiple active applications on the same server (run phases in parallel)
    § Balance workload and personnel
    § Ensure ongoing work for all team members
  – Multiple sites
  – Support for multiple servers and clusters

New eFLOW Architecture - Sites
FormID, Export

Monitoring and Management

Architectural Improvements (cont.)
§ Easier management of the application:
  – Control all stations from any location
    § Automatic stations similar to Windows Services
  – Remote activation of stations, no need to physically access the server room
  – Restart/Start/Control of stations from a centralized place (remotely) using the eFLOW Controller and Enterprise Manager

Controller

Architectural Improvements
§ Handling huge batches:
  – Ability to handle huge batches of 300-3000 pages each
  – Ability to process many batches in parallel
  – A stable, robust platform
(Picture from eFLOW’s performance test)

Architectural Changes (cont.)
§ Load balancing
  – Load balancing between stations (automatic notifications and better allocation of employees)
  – Automatic load balancing according to the number of batches in a queue
  – Priority handling: using the eFLOW capabilities for automatic prioritization by code (for example according to county, region, etc.)
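The two ideas on this slide, prioritizing batches by region and balancing by queue length, can be sketched as follows. This is an illustrative sketch only, not the eFLOW API: the `BatchQueue` class, the `least_loaded` helper and the `REGION_PRIORITY` table are all hypothetical names and values.

```python
import heapq
from itertools import count
from typing import Dict

# Hypothetical priority table: a lower number is processed first.
REGION_PRIORITY = {"capital": 0, "north": 1, "south": 2}


class BatchQueue:
    """Queue of batch IDs ordered by region priority, then by arrival."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # tie-breaker that keeps arrival order

    def push(self, batch_id: str, region: str) -> None:
        priority = REGION_PRIORITY.get(region, 99)  # unknown regions go last
        heapq.heappush(self._heap, (priority, next(self._seq), batch_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def least_loaded(queues: Dict[str, BatchQueue]) -> str:
    """Return the name of the station whose queue holds the fewest batches."""
    return min(queues, key=lambda name: len(queues[name]))
```

A dispatcher would call `least_loaded` to choose the station for each incoming batch, and each station would call `pop` to take its most urgent batch.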

Architectural Changes (cont.)
Improved security mechanism

Advanced approaches
§ Automatic EFI Matching
  – Improving template recognition station speed via the “Force EFI” mechanism, a unique barcode placed on each page

Advanced approaches (cont.)
§ Auto Coding
  – Coding tasks and data validations performed on the data capture platform: a ‘cost-effective’ solution
  – Use one of the statistical software packages on the market, such as ACTR (Canadian statistical software for coding some fields)
  – Use approximate search tools to improve results via DB (Exorbyte)
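To illustrate the idea of auto coding with approximate search, the sketch below maps a recognized text value to a code and tolerates small recognition or spelling errors. It uses Python's standard `difflib` as a stand-in, not ACTR or Exorbyte. The `OCCUPATION_CODES` table and its code values are invented for the example, and the 0.8 cutoff is an arbitrary assumption.

```python
import difflib
from typing import Optional

# Hypothetical coding dictionary; the codes are made up for illustration.
OCCUPATION_CODES = {"teacher": "2330", "farmer": "6111", "nurse": "2221"}


def auto_code(raw: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the code for a captured value, or None if no close match exists."""
    text = raw.strip().lower()
    if text in OCCUPATION_CODES:  # exact hit, no fuzzy search needed
        return OCCUPATION_CODES[text]
    matches = difflib.get_close_matches(
        text, list(OCCUPATION_CODES), n=1, cutoff=cutoff
    )
    return OCCUPATION_CODES[matches[0]] if matches else None
```

Values that come back as `None` would be routed to a manual coding station instead of being guessed.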

Advanced approaches (cont.)
§ Dynamic dictionary update
  – Lookups and dictionaries via DB (and not txt files)
§ Export
  – Reconstruct the original form according to the template
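Keeping lookups in a database rather than in text files means new dictionary entries become visible to the next lookup without redeploying files. A minimal sketch with Python's built-in `sqlite3` module follows. The `lookup` table, its columns and the sample values are assumptions for illustration and do not reflect the eFLOW schema.

```python
import sqlite3
from typing import Optional

# In-memory database standing in for the shared dictionary DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup (term TEXT PRIMARY KEY, code TEXT NOT NULL)")
conn.execute("INSERT INTO lookup VALUES (?, ?)", ("bangkok", "TH-10"))
conn.commit()


def lookup(term: str) -> Optional[str]:
    """Return the code stored for a term, or None if the term is unknown."""
    row = conn.execute(
        "SELECT code FROM lookup WHERE term = ?", (term.lower(),)
    ).fetchone()
    return row[0] if row else None


def add_term(term: str, code: str) -> None:
    """Add or update a dictionary entry; later lookups see it immediately."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO lookup VALUES (?, ?)", (term.lower(), code)
        )
```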

Advanced features (cont.)
§ Splitting & Merging: using the built-in eFLOW 4 splitting/merging mechanism
§ Handling problematic batches through improved split/merge abilities
  – Physically taking out bad pages (or a bad household) and continuing to work with the rest of the batch
  – Split/merge automatically without the need to build a specific station for merging data
§ Additional powerful interfaces exposed in the CSM for faster development time
  – Priority (for example according to county, region, etc.)
  – Load balancing between stations (automatic notifications and better allocation of employees)
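The split/merge idea described above can be sketched as two small functions: one pulls every page of a household that contains an unreadable page out of the batch, and the other merges the repaired pages back in page order. The page representation (dicts with `page_no`, `household` and `readable` keys) and the function names are assumptions for illustration, not the eFLOW mechanism.

```python
from typing import Any, Dict, List, Tuple

Page = Dict[str, Any]


def split_bad_households(pages: List[Page]) -> Tuple[List[Page], List[Page]]:
    """Split a batch into (pages that can continue, pages held back).

    A household is held back as a whole if any of its pages is unreadable.
    """
    bad = {p["household"] for p in pages if not p["readable"]}
    good = [p for p in pages if p["household"] not in bad]
    held = [p for p in pages if p["household"] in bad]
    return good, held


def merge(good: List[Page], repaired: List[Page]) -> List[Page]:
    """Merge repaired pages back into the batch, restoring page order."""
    return sorted(good + repaired, key=lambda p: p["page_no"])
```

With this split, the good part of the batch keeps moving through the workflow while the held pages are rescanned or keyed manually.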

Modules
§ Statistical report
  – Statistical report to monitor the daily, weekly, monthly rate per user/station
  – Quality checking using
§ Licenses
  – Flexible license policy
    § Per station
    § Per number of pages processed
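A per-user daily rate of the kind this report monitors can be computed by summing processed pages per (user, day) pair. The sketch below assumes a simple event log of `(user, day, pages)` tuples. That log format and the `daily_rate` name are hypothetical and not part of eFLOW.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

Event = Tuple[str, date, int]  # (user, day, pages processed)


def daily_rate(events: Iterable[Event]) -> Dict[Tuple[str, date], int]:
    """Sum the pages processed per (user, day) pair."""
    totals: Dict[Tuple[str, date], int] = defaultdict(int)
    for user, day, pages in events:
        totals[(user, day)] += pages
    return dict(totals)
```

Weekly or monthly rates follow the same pattern with the grouping key changed to the week or month of each event.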

Statistic Reporter (e.g. Crystal Reports)

Recognition technologies: OCR/ICR Engines

Custom stations approach

eFLOW Receives Everything
§ Mobile Devices
§ MNIC
§ Web Completion
§ Remote scanning

Web Completion

eFLOW 4.x Web Completion

Summary
§ Data capture and IDR platform (paper, electronic, mobile) and not a recognition product
§ Proven solution in census data capture: no need to invest time and money in a new technology and vendor, which minimizes the risk
§ Extensive experience in the design, development and implementation of real census and other high-volume form processing projects. Largest market share worldwide in the processing of census projects.
§ Extensive experience based on long research into the special needs of the Indian Census
§ Maximum flexibility, redundancy and a robust platform, ensuring you meet the project timetable for releasing census results

Summary
India 2001, Turkey 1997, Brazil 2000, South Africa 2001, Ireland 2001, Germany (DP) 1999, Cyprus 2002, Turkey 2000, Kenya 2001, Slovak Republic 2001, Hong Kong 2001, Italy 2002, Slovenia 2006, Hong Kong 2006, South Africa Pilot 2007, Ireland 2006

Thank you