Tools and Libraries for Document Analysis and Recognition
Tools
• Image manipulation tools
  – Viewers
  – Conversion tools
  – Drawing tools
• Image data preparation tools
  – Image selection tools
  – Truthing tools
• System development tools
  – HWAI demo
  – Word segmentation tool
  – Digit string recognition
Libraries
• General image libraries
  – TIFF, bitmap
  – Hips
• Libraries for Document recognition
  – Binarization
  – Chaincode
  – Connected component
  – Others
Languages for making tools
• Tcl/Tk
• Java
• VC++
Tools: Image manipulation tools
• Viewers
  – On Solaris: Chips, xv
    • Chips -i image.hips
  – On Windows:
    • xpaint: basic, handy for capturing screenshots.
    • IrfanView: my favorite; includes some basic image processing functions.
Tools: Image manipulation tools
• Conversion tools
  – On Solaris:
    • Convert in_image.tif out_image.gif
    • tifftohips
    • Hipstotiff
  – On Windows:
    • IrfanView: open an image and save it in the format you want.
    • CxImage library for your own implementation. Reference: www.codeproject.com, search for CxImage.
• Drawing tools
  – For preparing presentations – needed?
Tools: Image data preparation tools
• Image selection tools
  – Need: select training images from a large set of images.
  – Check_digits imageFileName.list > tselected.list
    • Image1.hips
    • Image6.hips
    • …
Tools: Image data preparation tools
• Truthing tools
  – Truther image.list > out
    • Image1 1505
    • Image2 4237
    • Image3 111
    • …
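The truthing output above looks like one image name and one truth string per line. A minimal sketch of reading such a file is below, assuming the two fields are separated by whitespace. The function name `parse_truth_lines` is hypothetical and is not part of the Truther tool.

```python
def parse_truth_lines(lines):
    """Map image name -> truth string from 'name label' lines.

    Assumption: each non-blank line holds exactly two whitespace-separated
    fields, as in the sample output on the slide.
    """
    truth = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("expected 'name label', got: %r" % raw)
        name, label = parts
        # Keep the label as text so leading zeros in digit strings survive.
        truth[name] = label
    return truth


sample = ["Image1 1505", "Image2 4237", "Image3 111"]
truth = parse_truth_lines(sample)
```

Storing the label as a string rather than an integer is a deliberate choice here: a digit string such as `0123` would lose its first digit if converted to a number.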
Tools: System development tools
• Word segmentation tool
Tools: System development tools
• Digit string recognition
Tools: System development tools
• HWAI demo
Libraries: General image libraries
• TIFF, bitmap
  – Available free for both Solaris and Windows
• Hips
  – CEDAR internal library
  – Hipl_format.h
Libraries: General image libraries
• Libraries for Document recognition
  – Binarization
    • A-thresh: Otsu's global thresholding algorithm, code available
    • E-thresh: a local-adaptive thresholding algorithm, code available
    • Oathresh: another global thresholding algorithm
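For reference, global thresholding in the style of Otsu's method (which the A-thresh entry appears to refer to) can be sketched as below. This is the textbook formulation, not the A-thresh source code. The function names, the flat list of 8-bit gray values, and the convention that dark pixels are ink are all assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Class 0 is pixels <= t, class 1 is pixels > t.
    Assumption: pixels is a flat list of ints in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of gray values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue        # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break           # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Mark pixels at or below t as foreground (1): dark ink on light paper."""
    return [1 if p <= t else 0 for p in pixels]


gray = [10] * 50 + [200] * 50   # synthetic two-level image
t = otsu_threshold(gray)
ink = binarize(gray, t)
```

A single global threshold like this works when the whole page has even lighting. A local-adaptive method such as the E-thresh entry would instead choose a threshold per neighborhood.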
Libraries: General image libraries
• Libraries for Document recognition
  – Chaincode
  – Connected component
  – Others
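To make the two items above concrete, here is a generic sketch of connected component labeling and Freeman chain coding. It does not reflect the CEDAR library interfaces, which are not shown in these slides. The function names, the 8-connectivity choice, the direction numbering, and the assumption that contour points are already ordered are all illustrative.

```python
from collections import deque


def label_components(grid):
    """Label 8-connected foreground pixels (value 1) with ids 1..count.

    Assumption: grid is a list of equal-length rows of 0/1 values.
    Returns (labels, count), where labels has the same shape as grid.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1
                                and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Freeman 8-direction codes keyed by (dx, dy), with y growing downward:
# 0 = east, 2 = north, 4 = west, 6 = south, odd codes are the diagonals.
FREEMAN = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
           (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}


def chain_code(points):
    """Encode ordered (x, y) contour points as a list of Freeman codes.

    Assumption: consecutive points are 8-neighbors of each other.
    """
    return [FREEMAN[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


grid = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
labels, count = label_components(grid)
codes = chain_code([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
```

In a recognition pipeline the components would typically be found on the binarized image first, and a contour tracer (not shown here) would supply the ordered boundary points for chain coding.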