Learning-Based Indexing of Works of Art
Kurt Grieb

Presentation Overview
• Research divided into 2 parts
• Parallel upgrade of ALIP
  – Structure of parallelization
  – Results
• EMPEROR database tests
  – Setup of tests
  – Results

Reasons for Parallelization
• ALIP statistical computations are computationally expensive
• Corel Image Library comparison:
  – 15-20 minutes
  – Unacceptable for Web and other applications

Parallelization Concept
• One server receives the request and divides the workload among the clients.
[Diagram: the server hands each client a range of concepts: 1-30, 31-60, ..., 541-570, 571-600]
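The division of work on this slide can be sketched as follows. This is a minimal illustration, not ALIP's actual code: the function name `split_concepts` is hypothetical, and the chunk size of 30 is simply read off the ranges shown in the diagram.

```python
def split_concepts(total_concepts, chunk_size):
    """Return inclusive (first, last) concept ranges, one per client.

    Hypothetical helper: it only illustrates how a server could carve
    a numbered list of concepts into contiguous blocks.
    """
    if total_concepts < 0 or chunk_size <= 0:
        raise ValueError("total_concepts must be >= 0 and chunk_size > 0")
    ranges = []
    for first in range(1, total_concepts + 1, chunk_size):
        # The last block may be shorter if the total is not a multiple.
        last = min(first + chunk_size - 1, total_concepts)
        ranges.append((first, last))
    return ranges


# 600 concepts in blocks of 30, matching the ranges on the slide.
ranges = split_concepts(600, 30)
for client_id, (first, last) in enumerate(ranges, start=1):
    print(f"client {client_id}: concepts {first}-{last}")
```

Contiguous blocks keep the request sent to each client small: only the first and last concept number need to be transmitted.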

Parallelization Structure
[Diagram labels: PERL GUI, Request With URL, Server, Range of Concepts, CLIENTS, Likelihoods, Best Fit]
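The final step in this structure, where the server turns the likelihoods returned by the clients into a best fit, might look like the sketch below. The slides do not describe the data format or how many concepts are returned, so the function `best_fit`, the dictionary layout, the `top_k` parameter, and the sample numbers are all assumptions made for illustration.

```python
import heapq


def best_fit(client_results, top_k=5):
    """Merge per-client {concept_id: likelihood} dicts and return the
    top_k (concept_id, likelihood) pairs, highest likelihood first.

    Hypothetical helper: ALIP's real message format is not given.
    """
    merged = {}
    for result in client_results:
        # Clients cover disjoint concept ranges, so keys do not collide.
        merged.update(result)
    return heapq.nlargest(top_k, merged.items(), key=lambda item: item[1])


# Made-up log-likelihoods from two clients, for illustration only.
example = [
    {1: -120.5, 2: -98.0},    # client covering concepts 1-30
    {31: -75.2, 32: -140.0},  # client covering concepts 31-60
]
top = best_fit(example, top_k=2)
print(top)
```

Because each client only reports scores for its own range, the server's merge step is cheap compared with the likelihood computation itself.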

Results
• 600 concepts can now be computed in roughly 40 seconds over 30 processors.
• Roughly ideal speedup
• Using more processors on a smaller problem size reduces the efficiency of the speedup
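The "roughly ideal" claim can be checked with the usual definitions, speedup = serial time / parallel time and efficiency = speedup / processors. The sketch below assumes the 15-20 minute figure quoted earlier is the single-processor time for the same 600-concept comparison; the slides do not state this explicitly.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of single-processor time to parallel time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, processors):
    """Speedup divided by processor count (1.0 means ideal)."""
    return speedup(serial_seconds, parallel_seconds) / processors


PARALLEL_SECONDS = 40  # from the slide
PROCESSORS = 30        # from the slide

# Assumed serial baseline: the 15-20 minute range quoted earlier.
for minutes in (15, 20):
    serial = minutes * 60
    s = speedup(serial, PARALLEL_SECONDS)
    e = efficiency(serial, PARALLEL_SECONDS, PROCESSORS)
    print(f"{minutes} min serial: speedup {s:.1f}x, efficiency {e:.2f}")
```

Under that assumption the speedup lies between 22.5x and 30x on 30 processors, an efficiency between 0.75 and 1.0.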

The EMPEROR Library
• 1700 images
• Chinese historical images

The Testing
• 2 sets of tests (9 and 20 concepts)
• 4 runs per set (best, worst, 2 random)
• 4 sizes per run (3, 6, 9, 12)
[Diagram: Set 1 and Set 2 each branch into four runs (Best Sub, Worst Sub, Random 1, Random 2), and each run into four sizes (Size 3, Size 6, Size 9, Size 12)]
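The test layout above amounts to a full grid of 2 sets x 4 runs x 4 sizes. A minimal sketch that enumerates it is shown below; the dictionary keys and the function name `build_grid` are hypothetical, and pairing Set 1 with 9 concepts and Set 2 with 20 is an assumption based on the order in which the slide lists them.

```python
from itertools import product

# Assumed pairing of set name to number of concepts (slide order).
SETS = {"Set 1": 9, "Set 2": 20}
RUNS = ("Best Sub", "Worst Sub", "Random 1", "Random 2")
SIZES = (3, 6, 9, 12)


def build_grid():
    """Enumerate every (set, run, size) combination as a dict."""
    return [
        {
            "set": set_name,
            "concepts": SETS[set_name],
            "run": run,
            "training_size": size,
        }
        for set_name, run, size in product(SETS, RUNS, SIZES)
    ]


grid = build_grid()
print(len(grid), "test configurations")
for config in grid[:4]:
    print(config)
```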

Motivation For Test Structure
• Effects of more specific classes
• Effects of different training classes
• Determine reasonable training sizes

Results Set 1
[Chart: Total Percentages. % Correct (axis from 0 to 0.8) versus Sample Size (3, 6, 9, 12). Series: Worst Case, Random 2, Random 1, Best Case, Random Generation]

Interesting Cases / Notable Trends
• Set One vs. Set Two
• The Black and White Sketches
• General Trends vs. Specific Classes
• Weak Classes
• Misclassification of Similar Objects
  – Black & White Images vs. Text
  – All Faces vs. Color/BW
  – Faces and Upper Bodies

The Black and White Sketches
• Performed the best of all classes
• Accuracies of 99% over all tests
• Due to the difference between this class and most other classes

Interesting Cases / Notable Trends
• The overall accuracy of all classes went up with more training
• In certain classes, the accuracy went down as all concepts were trained with more images

Weak Classes
• In certain concepts a weak class outperformed other classes
• Could be due to openness of concept spaces

Misclassification of Similar Objects
• Pictures with more than one concept in them can sometimes confuse ALIP

Further Work
• Overlapping of concepts
• 3-D representations of objects
• Improved accuracy of ALIP
• Current results are promising

ABSTRACT
Digital images are widely and readily in use. Text-based indexing of these images is becoming tougher as the number of digital images grows. Therefore, Content-Based Image Retrieval is becoming a more viable alternative because it can automate this process. Dr. Wang’s Automatic Linguistic Indexing of Pictures (ALIP) shows great promise as a Content-Based Image Retrieval system. Our lab is looking to extend this indexing to artistic and historical pictures, which are harder to classify due to certain characteristics of these pictures. Additionally, some upgrades need to be made to ALIP to turn it into a more user-friendly, mainstream program. I present the results of the upgrades to ALIP and the experiments conducted on a historic image database.