KI 2 - 2: Lambert Schomaker, Kunstmatige Intelligentie / RuG
2 Outline
Date | 1st hour | 2nd hour
6 nov | Planning, N&R #11-13 (LS) | idem
13 nov | Knowledge-based symbolic methods (LS) #19.6, #21 | Example: geometric modeling & matching (MB)
20 nov | Statistical symbolic methods 1 (LS) #17 | Example: spam filter
27 nov | Statistical symbolic methods 2 (LS) | Example: autoclass
4 dec | Heterogeneous-information integration | Example: writer identification, satellite images
11 dec | Grammar induction | Articles
18 dec | Misc. topics | Misc. applications
jan | (exam) |
3 Knowledge-based symbolic methods
§ Assumption: the Turing/von Neumann computer is a universal computation engine…
§ …therefore it can be used at all levels of information processing,
  § provided an appropriate algorithm can be designed
  § which operates on appropriate representations
5 Knowledge-based symbolic methods
§ …provided an appropriate algorithm can be designed…
§ mechanisms: recursion, hierarchical procedures
§ search algorithms
§ parsers
§ matching algorithms
§ string manipulation
§ numerical computing
§ signal processing
§ image processing
§ statistical processing
7 Knowledge-based symbolic methods
§ …which operates on appropriate representations…
§ stacks
§ linear strings and arrays
§ matrices
§ linked lists
§ trees
This approach is indeed successful in many information-processing problems.
8 Example: double spiral problem
In the inner or the outer spiral?
§ Difficult for, e.g., neural nets
§ Answer: outside
11 Example: double spiral problem
In the inner or the outer spiral? How?
- flood-fill algorithm?
- other?
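The flood-fill option can be sketched as follows. This is a minimal illustration, not the lecture's implementation: it assumes the spiral has been rasterized into a character grid in which '#' marks a wall and '.' a free cell, and it uses 4-connectivity. The grid `SPIRAL` and the function name `reaches_border` are hypothetical.

```python
from collections import deque

def reaches_border(grid, start):
    """Flood-fill the free cells from `start` (breadth-first, 4-connected).

    Returns True if the fill reaches the edge of the grid, i.e. the start
    cell is connected to the outside; False if it is enclosed by walls.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] == '#':
        raise ValueError("start cell is a wall")
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        # A free cell on the grid edge counts as "outside".
        if r in (0, rows - 1) or c in (0, cols - 1):
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (nr, nc) not in seen and grid[nr][nc] != '#':
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

# Hypothetical toy spiral: one wall that curls inward, leaving an open corridor.
SPIRAL = [
    ".......",
    ".#####.",
    ".#...#.",
    ".#.#.#.",
    ".#.###.",
    ".#.....",
    ".#####.",
    ".......",
]

print(reaches_border(SPIRAL, (3, 4)))  # True: the innermost cell still connects to the outside
```

The cost grows with the area that has to be filled, which is one reason to look for a representation that needs less work.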
12 Example: double spiral problem
In the inner or the outer spiral? Count edges: find the right representation!
§ Result: outside
§ The odd/even count is not sensitive to shape variations of the spiral: a general solution
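One plausible reading of the edge-counting idea is the even-odd (ray-crossing) rule: walk in a straight line from the query point to far outside the figure and count how many boundary edges are crossed. The sketch below assumes the shape is available as a closed polygon (a list of vertices), which the slides do not specify; the names `crossings`, `is_inside` and the U-shaped test polygon are hypothetical.

```python
def crossings(point, polygon):
    """Count the polygon edges crossed by a horizontal ray going right from `point`."""
    x, y = point
    count = 0
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # The edge straddles the ray's height; the half-open comparison
        # prevents a shared vertex from being counted twice.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge meets the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                count += 1
    return count

def is_inside(point, polygon):
    """Even-odd rule: an odd number of crossings means the point is inside."""
    return crossings(point, polygon) % 2 == 1

# Hypothetical U-shaped polygon; the notch between the two arms is outside.
U_SHAPE = [(0, 0), (6, 0), (6, 6), (4, 6), (4, 2), (2, 2), (2, 6), (0, 6)]

print(crossings((3, 4), U_SHAPE), is_inside((3, 4), U_SHAPE))  # 2 False: in the notch
print(crossings((1, 4), U_SHAPE), is_inside((1, 4), U_SHAPE))  # 3 True: in the left arm
```

Only the parity of the count matters, so bending or stretching the boundary does not change the answer as long as the topology stays the same.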
14 Culture
§ If it doesn’t work, you didn’t think hard enough
§ You have to know what you are doing
§ You have to prove that it works, and why
§ Even neural networks run on top of the Turing/von Neumann engine (it will always win)
§ If you’re smart, you can often avoid NP-completeness
§ Use of probabilities is a sign of weakness
15 Strong points
§ Scalability is often possible
§ Convenience: little context dependence, no training
§ Reusability
§ Transformability (compilation)
§ Algorithmic refinement once it is known how to do a trick (e.g., graphics cards and DSPs in mobile phones: ugly code but highly efficient)
16 Challenges
§ Knowledge dependence is expensive
  – not a problem in “IT” application design
  – a challenge to AI
§ Uncertainty
§ Noise
§ Brittleness
17 Solutions
§ More and more representational weight (UML, Semantic Web, XML solves everything)
§ Symbolic learning mechanisms:
  – induction: version spaces, grammar inference
  – decision tree learning
  – rewriting formalisms
§ Active hypothesis testing (what if…, assume X…)
18 Example
§ In reading systems (optical character recognition), only a small part of the algorithm concerns image processing and character classification
§ Most of the code is concerned with the structure of the text image:
  – where are the blobs?
  – are these blobs text, photo or graphics?
  – how to segment into meaningful chunks: characters, words?
  – what is the logical organization (reading order) in the physical organization of pixels?
Knowledge-based approaches are a necessity!
22
§ Name of conference
§ Brief description of conference
§ Programme committee
§ Submission details
23 Example of layout analysis
§ Knowing the type of a text block strongly reduces the number of possible interpretations
§ Example: “address block”
§ Address:
  – name of person
  – street, number
  – postal code, city
24
Amsterdam 7/7/2003

prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland
25 address:
prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland
26 address
§ person name: prof dr. L.R.B. Schomaker
§ street: Grote Appelstraat 23
§ codes+city: 9712 TS Groningen
§ country: Nederland
27 address
§ titles, initials, surname: prof dr. L.R.B. Schomaker
§ street, digits: Grote Appelstraat 23
§ 4 digits, 2 upper case, city name: 9712 TS Groningen
§ country name: Nederland
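Once a block is known to be an address, each line only has to be tested against a narrow template. The sketch below illustrates this with regular expressions. The patterns are assumptions: the slides only state "4 digits, 2 upper case" for the postal code, and the title list and the name `parse_address` are hypothetical.

```python
import re

# Hypothetical templates for the three structured lines of an address block.
PERSON = re.compile(
    r"^(?P<titles>(?:(?:prof|dr|ir|mr)\.?\s+)*)"   # zero or more academic titles
    r"(?P<initials>(?:[A-Z]\.\s*)+)"               # initials such as L.R.B.
    r"(?P<surname>.+)$"
)
STREET = re.compile(r"^(?P<street>.+?)\s+(?P<number>\d+)$")
POSTAL_CITY = re.compile(r"^(?P<digits>\d{4})\s+(?P<letters>[A-Z]{2})\s+(?P<city>.+)$")

def parse_address(lines):
    """Return the address fields if the four lines fit the template, else None."""
    if len(lines) != 4:
        return None
    person = PERSON.match(lines[0])
    street = STREET.match(lines[1])
    postal = POSTAL_CITY.match(lines[2])
    if not (person and street and postal):
        return None
    return {
        "titles": person["titles"].split(),
        "initials": person["initials"].replace(" ", ""),
        "surname": person["surname"],
        "street": street["street"],
        "number": street["number"],
        "postal_code": f'{postal["digits"]} {postal["letters"]}',
        "city": postal["city"],
        "country": lines[3].strip(),
    }

block = [
    "prof dr. L.R.B. Schomaker",
    "Grote Appelstraat 23",
    "9712 TS Groningen",
    "Nederland",
]
print(parse_address(block))
```

A block that does not fit the template, such as a conference announcement, yields None, so the same function doubles as a crude block-type test.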
29 Content (helps text classification)
<address>
  <person>
    <title></title>
    <initials or first name></initials or first name>
    <surname></surname>
  </person>
  <home>
    <street name></street name>
    <number>
  </home>
  <city>
    <postal code>
      <four digits></four digits>
      <white space></white space>
      <two upper-case letters> ….
    </postal code>
  </city>
  <country>
</address>
etc.

Layout (helps text segmentation)
(address
  (title is-left-of initials is-left-of surname)
  is-above (street name is-left-of number)
  is-above (city)
  is-above (country))
etc.

prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland
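The layout description can be checked against segmented blobs with two spatial predicates. The sketch below is an illustration under assumptions: blobs are given as axis-aligned bounding boxes in image coordinates (y grows downward), and the class `Box`, the predicates `is_left_of` and `is_above`, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned bounding box of a text blob; y grows downward."""
    label: str
    left: float
    top: float
    right: float
    bottom: float

def is_left_of(a, b):
    """a ends before b starts horizontally, and both share a vertical range (same text line)."""
    return a.right <= b.left and a.top < b.bottom and b.top < a.bottom

def is_above(a, b):
    """a ends before b starts vertically."""
    return a.bottom <= b.top

def union(label, boxes):
    """Smallest box that encloses all the given boxes."""
    return Box(label,
               min(b.left for b in boxes), min(b.top for b in boxes),
               max(b.right for b in boxes), max(b.bottom for b in boxes))

def matches_address_layout(rows):
    """rows: lists of boxes, top to bottom. Checks the is-left-of chain
    inside each row and the is-above chain between consecutive rows."""
    for row in rows:
        if not all(is_left_of(a, b) for a, b in zip(row, row[1:])):
            return False
    lines = [union("line", row) for row in rows]
    return all(is_above(a, b) for a, b in zip(lines, lines[1:]))

# Hypothetical blob coordinates for the four lines of the address block.
rows = [
    [Box("title", 0, 0, 40, 10), Box("initials", 45, 0, 80, 10), Box("surname", 85, 0, 150, 10)],
    [Box("street name", 0, 15, 110, 25), Box("number", 115, 15, 130, 25)],
    [Box("city", 0, 30, 120, 40)],
    [Box("country", 0, 45, 70, 55)],
]
print(matches_address_layout(rows))  # True
```

A candidate segmentation that violates the relations, for example with the street line above the name line, is rejected, which is how the layout description constrains segmentation.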