Off-Line Cursive Word Recognition, Tal Steinherz, Tel-Aviv University
- Slides: 49
Cursive Word Recognition
- Preprocessing
- Segmentation
- Feature Extraction
- Recognition
- Post Processing
Preprocessing
- Skew correction
- Slant correction
- Smoothing
- Reference line finding
Segmentation Motivation
- Given a 2-dimensional image and a model that expects a 1-dimensional input signal, one needs to derive an ordered list of features.
- Fragmentation is another alternative, where the resulting pieces have no literal meaning.
Segmentation Dilemma
- To segment or not to segment? That’s the question!
- Sayre’s paradox: “To recognize a letter, one must know where it starts and where it ends; to isolate a letter, one must recognize it first”.
Recognition Model
- What is the basic (atomic) model?
  - word (remains identical through training and recognition)
  - letter (concatenated on demand during recognition)
- What are the training implications?
  - specific = total cover (several samples for each word)
  - dynamic = brick cover (samples of various words that together include all possible characters, i.e. letters)
Basic Word Model
- 1st letter sub-model ... i-th letter sub-model ... last letter sub-model
Segmentation-Free
- In a segmentation-free approach, recognition is based on measuring the distance between observation sequences.
Segmentation-Free (cont.)
- The most popular metric is Levenshtein’s Edit Distance, where one sequence is transformed into another by atomic operations (insertion, deletion and substitution), each associated with its own cost.
- Implementations: dynamic programming, HMM
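The dynamic-programming implementation mentioned above can be sketched as follows. This is a minimal illustration, not the system described in the slides: the function name `edit_distance` and the default unit costs are assumptions, and the sequences here are plain strings rather than feature-vector sequences.

```python
def edit_distance(source, target, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Weighted Levenshtein distance between two sequences (dynamic programming)."""
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j]: cheapest way to turn source[:i] into target[:j]
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost   # delete every source symbol
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost   # insert every target symbol
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            substitution = dist[i - 1][j - 1] + (0.0 if same else sub_cost)
            deletion = dist[i - 1][j] + del_cost
            insertion = dist[i][j - 1] + ins_cost
            dist[i][j] = min(substitution, deletion, insertion)
    return dist[-1][-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

With observation sequences of feature vectors, the equality test would presumably be replaced by a distance between vectors; that choice is not specified in the slides.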
Segmentation-Free (demo)
- Each column was translated into a feature vector.
- Two types of features:
  - number of zero-crossings
  - gradient of the word’s curve
1) The gradient of the word’s curve at a given pixel column
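A rough sketch of the two per-column features may help. The slides do not define the "word's curve", so this example assumes it is the mean ink row of each column and takes the gradient as the difference from the previous column; the function name `column_features` and the 0/1 image encoding are also assumptions.

```python
def column_features(image):
    """Return one (zero_crossings, gradient) pair per pixel column.

    image: list of rows, each a list of 0 (background) or 1 (ink).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    crossings, curve = [], []
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        # Zero-crossings: background/ink transitions going down the column.
        crossings.append(sum(1 for a, b in zip(column, column[1:]) if a != b))
        ink_rows = [y for y, value in enumerate(column) if value]
        # Assumed curve: mean ink row; None when the column holds no ink.
        curve.append(sum(ink_rows) / len(ink_rows) if ink_rows else None)
    features = []
    for x in range(width):
        previous = curve[x - 1] if x > 0 else None
        current = curve[x]
        if current is None or previous is None:
            gradient = 0.0  # no neighbor to compare with
        else:
            gradient = current - previous  # rows grow downward
        features.append((crossings[x], gradient))
    return features


if __name__ == "__main__":
    diagonal = [
        [0, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]
    print(column_features(diagonal))
```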
Letter sub-HMM components [figure legend: normal transition, null transition]
Letter sub-HMM [figure legend: normal transition, null transition]
Segmentation-Based
- In a segmentation-based approach, recognition is based on complete bipartite match-making between blocks of primitive segments and letters of a word.
Segmentation-Based (cont.)
- The best match is found by the dynamic programming Viterbi algorithm.
- An implementation by an HMM is very popular and enhances the model capabilities.
Segmentation-Based (demo)
- First the word is heuristically segmented.
- It is preferable to over-segment a character. Nevertheless, a character must not span more than a predefined number of segments.
- Each segment is translated into a feature vector.
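The match between blocks of segments and letters can be illustrated with a Viterbi-style dynamic program over costs. This is a simplified sketch rather than the HMM described in the slides: `best_alignment`, the `block_cost` callback and the `max_span` default are hypothetical names, and the cost function is a toy stand-in for a real letter model.

```python
import math


def best_alignment(num_segments, word, block_cost, max_span=4):
    """Split segments 0..num_segments into one consecutive block per letter.

    block_cost(letter, start, end) scores segments [start, end) as that letter.
    Returns (total cost, list of (start, end) blocks), or (inf, []) if no
    split respects the max_span limit.
    """
    letters = len(word)
    # best[k][j]: cheapest way to explain the first j segments with k letters
    best = [[math.inf] * (num_segments + 1) for _ in range(letters + 1)]
    back = [[0] * (num_segments + 1) for _ in range(letters + 1)]
    best[0][0] = 0.0
    for k in range(1, letters + 1):
        for j in range(1, num_segments + 1):
            for span in range(1, min(max_span, j) + 1):
                previous = best[k - 1][j - span]
                if previous == math.inf:
                    continue
                cost = previous + block_cost(word[k - 1], j - span, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = span
    if best[letters][num_segments] == math.inf:
        return math.inf, []
    # Trace the chosen spans back from the last letter to the first.
    blocks, j = [], num_segments
    for k in range(letters, 0, -1):
        span = back[k][j]
        blocks.append((j - span, j))
        j -= span
    blocks.reverse()
    return best[letters][num_segments], blocks


if __name__ == "__main__":
    ideal = {"a": 2, "b": 1}  # toy model: preferred number of segments per letter
    toy_cost = lambda letter, start, end: abs((end - start) - ideal[letter])
    print(best_alignment(3, "ab", toy_cost))
```

Running the same routine for every lexicon word and keeping the cheapest total would give a word-level decision, under the assumption that costs are comparable across words.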
Features in Segments (demo)
- Global features:
  - ascenders, descenders, loops, i-dots, t-strokes
- Local features:
  - X crossings, T crossings, end points, sharp curvatures, parametric strokes
- Non-symbolic features:
  - pixel moments, pixel distributions, contour codings
Letter sub-HMM (maximum 4 segments per character) [figure: state diagram with labels 1, 2, 3, 4, 1]
Two-Letter joined sub-HMM (0.5-3 segments per character) [figure: state diagram with labels L, M, R, L]
Pattern Recognition Issues
- Lexicon size:
  - small (up to 100 words)
  - limited (between 100 and 1000 words)
  - infinite (more than 1000 words)
Word Model Extension
- A new approach to performing recognition?
  - path discriminant (a single general word model, with one path, i.e. one hypothesis, per word)
[Figure: ‘a’ sub-HMM ... ‘m’ sub-HMM ... ‘z’ sub-HMM]
Online vs. Off-Line
- Online: captured by pen-like devices. The input format is a two-dimensional signal of pixel locations as a function of time, (x(t), y(t)).
- Off-line: captured by scanning devices. The input format is a two-dimensional image of gray-scale values as a function of location, I(m*n). Strokes have significant width.
Online vs. Off-Line (demo)
Online vs. Off-Line (cont.)
- In general, online classifiers are superior to off-line classifiers because some valuable strokes are blurred in the static image. Sometimes temporal information (stroke order) is also required to distinguish between similar objects.
Online Weaknesses
Sensitivity to variations in stroke order, stroke number and stroke characteristics:
- Shapes that look alike in the image domain may be produced by different sets of strokes.
- Many redundant strokes (consecutive superfluous pixels) arise as byproducts of the continuous nature of cursive handwriting.
- Incomplete (open) loops are more frequent.
Off-Line can improve Online
- Sometimes the off-line representation enables one to recognize words that are not recognized given the online signal.
- Consequently, an optimal system would combine online and off-line based classifiers.
The desired integration between online and off-line classifiers
- A single word recognition engine should process both the online and the off-line data.
- This requires an off-line to online transformation that extracts an alternative list of strokes, one that preserves off-line like features while being consistent in order.
[Diagram: integration of online and pseudo-online classification. Labels, in extraction order:]
- Online signal
- Projection to image domain
- Bitmap image (stroke width = 1)
- Online signal
- “Painting” (thickening the strokes)
- Real static image (stroke width > 1)
- The “pseudo-online” transformation
- Pseudo-online representation
- Online recognition engine
- Classification: online classifiers, pseudo-online classifiers
- Online classification outputs, pseudo-online classification outputs
- Integration by some combination scheme
- Recognition results
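The projection and "painting" steps named in the diagram can be sketched as a rasterization followed by a dilation. This is only an illustration under assumptions: pen samples are taken to be integer pixel coordinates, consecutive samples are joined by straight lines, painting is assumed to follow projection, and the names `rasterize`, `thicken` and `radius` are hypothetical.

```python
def rasterize(strokes, width, height):
    """Project an online signal onto a bitmap with stroke width 1.

    strokes: list of strokes, each a list of (x, y) integer pen positions.
    """
    image = [[0] * width for _ in range(height)]
    for stroke in strokes:
        # A single-sample stroke (a dot) is paired with itself.
        points = stroke if len(stroke) > 1 else stroke * 2
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for s in range(steps + 1):
                x = round(x0 + (x1 - x0) * s / steps)
                y = round(y0 + (y1 - y0) * s / steps)
                if 0 <= x < width and 0 <= y < height:
                    image[y][x] = 1
    return image


def thicken(image, radius=1):
    """'Paint' the strokes: set every pixel within `radius` of an ink pixel."""
    height = len(image)
    width = len(image[0]) if height else 0
    painted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        painted[ny][nx] = 1
    return painted


if __name__ == "__main__":
    thin = rasterize([[(0, 1), (3, 1)]], width=5, height=3)
    for row in thicken(thin):
        print(row)
```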
Cursive Handwriting Terms
- Axis: the main subset of strokes that assemble the backbone, which is the shortest path from left to right, including loops on several occasions.
- Tarsi: the other subsets of connected strokes that produce branches, which hang above (in the case of ascenders) or below (in the case of descenders) the axis.
The Pseudo-Online Transformation
- Follow the skeleton of the axis from the leftmost pixel until reaching the first intersection with a tarsus.
- Surround the tarsus by tracking its contour until returning to the intersection point where tracking started.
- Continue along the axis to the next intersection with a tarsus, and so on until the rightmost pixel is reached.
- Loops that are encountered along the axis are also surrounded completely.
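The traversal order described above can be sketched on an abstract representation. The hard parts (computing the axis skeleton and tracking contours) are not shown; the example assumes they have already produced an ordered list of axis points and, for some axis positions, the contour points of the attached tarsi or loops. The name `pseudo_online_sequence` and the `detours` mapping are hypothetical.

```python
def pseudo_online_sequence(axis, detours):
    """Order the points of a static word image as one pen-like trajectory.

    axis: skeleton points of the axis, ordered from left to right.
    detours: {axis index: [contour, ...]}, where each contour lists the points
             visited while surrounding a tarsus or loop met at that position.
    """
    sequence = []
    for index, point in enumerate(axis):
        sequence.append(point)
        for contour in detours.get(index, []):
            sequence.extend(contour)   # surround the tarsus or loop
            sequence.append(point)     # return to the intersection point
    return sequence


if __name__ == "__main__":
    axis = [(0, 5), (1, 5), (2, 5)]
    ascender = [(1, 4), (1, 3), (1, 4)]  # up and back down
    print(pseudo_online_sequence(axis, {1: [ascender]}))
```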
Computing the axis’s skeleton
Computing the axis’s skeleton (cont.)
Computing the axis’s skeleton (cont.)
Processing the tarsi
Processing the tarsi (cont.)
Handling i-dots
Experimental Setup
- The online word recognition engine of Neskovic et al., which satisfies trainability and versatility.
- A combination of 6 or 12 online and pseudo-online classifiers.
- Several combination schemes: majority vote, max rule, sum rule.
- An extension of HP’s dataset, which can be found in the UNIPEN collection.
Experimental Setup (cont.)
- Different training sets of 46 writers.
- Disjoint validation sets of 9 writers.
- Disjoint test set of 11 writers.
- The lexicon contains 862 words.
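The three combination schemes named above can be sketched as follows. The slides do not give the output format of the recognition engine, so this example assumes each classifier returns a dictionary of per-word scores; the function names and the toy scores are hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Each classifier votes for its top-scoring word; the most voted word wins."""
    votes = Counter(max(scores, key=scores.get) for scores in outputs)
    return votes.most_common(1)[0][0]


def max_rule(outputs):
    """Pick the word with the highest single score given by any classifier."""
    words = sorted(set().union(*outputs))
    return max(words, key=lambda w: max(s.get(w, 0.0) for s in outputs))


def sum_rule(outputs):
    """Pick the word with the highest total score across all classifiers."""
    words = sorted(set().union(*outputs))
    return max(words, key=lambda w: sum(s.get(w, 0.0) for s in outputs))


if __name__ == "__main__":
    outputs = [
        {"cat": 0.60, "cut": 0.40},
        {"cat": 0.55, "cut": 0.45},
        {"cat": 0.05, "cut": 0.95},
    ]
    print(majority_vote(outputs), max_rule(outputs), sum_rule(outputs))
```

In this toy case the schemes disagree: two weakly confident classifiers outvote one strongly confident classifier under majority vote, while the max and sum rules follow the confident one.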
Experimental Results for 6 Classifiers
Experimental Results for 12 Classifiers
Result Analysis
- Word level: in 110 word classes (12.8%), at least 7 word samples (10.6%) were correctly recognized only by the combination with the pseudo-online classifiers.
- Writer level: for 12 writers (18.2%), at least 65 of the words they produced (7.5%) were correctly recognized only by the combination with the pseudo-online classifiers.
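The percentages above appear consistent with the experimental setup if one assumes the denominators are the 862-word lexicon and a total of 66 writers (46 + 9 + 11). The slides do not state this explicitly, so the check below is only a plausibility sketch; `percent` is a hypothetical helper.

```python
LEXICON_SIZE = 862        # from the experimental setup
WRITERS = 46 + 9 + 11     # assumption: training, validation and test writers together


def percent(part, whole):
    """Share of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print(percent(110, LEXICON_SIZE))  # word classes
    print(percent(7, WRITERS))         # samples per word class, one per writer
    print(percent(12, WRITERS))        # writers
    print(percent(65, LEXICON_SIZE))   # words per writer
```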
Result Analysis (cont.)
- 909 of the input words (5.9%) were correctly recognized by at least one pseudo-online classifier and by none of the 12 online classifiers.
- 357 of the input words (2.3%) were correctly recognized by at least 4 of the 12 pseudo-online classifiers and by none of the 12 online classifiers.
- For 828 of the input words (5.3%), the difference between the number of pseudo-online and online classifiers that correctly recognized them was 6 or more.
Conclusions
- The pseudo-online representation adds information that cannot be obtained by optimizing or extending a combination of online classifiers alone.