INTRODUCTION TO INFORMATION RETRIEVAL CS 4323 0910 1
CS 4323 / 0910-1, YFA
Available online at http://www.ittelkom.ac.id/staf/yanuar
CS 4323 S1/IT/IR/E6/0910, Institut Teknologi Telkom
References
• Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval. Cambridge University Press, 2008.
• Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval. Addison Wesley, 1999.
• William B. Frakes and Ricardo Baeza-Yates, Information Retrieval: Data Structures and Algorithms. Prentice Hall, 1992.
• Amy Langville and Carl Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, 2006.
• G. Salton and M. J. McGill, Introduction to Modern Information Retrieval. McGraw-Hill, 1983.
References
• Sample of Examination: http://www.infosci.cornell.edu/Courses/info4300/2009fa/sample-exam.html
• Sample of Test Data: http://www.infosci.cornell.edu/Courses/info4300/2009fa/testData.html
Information Science brings together faculty, students and researchers who share an interest in combining computer science with the social sciences of how people and society interact with information. This course is intended for both Computer Science and Information Science students.
Discussion Class
What is Information Retrieval?
Course Description
This course studies techniques and human factors in discovering information in online information systems. Methods covered include techniques for indexing, searching, browsing and filtering information; descriptive metadata; and the use of classification systems and thesauruses, with examples from Web search systems.
Definition
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers). Information retrieval can also cover other kinds of data and information problems beyond that specified in the core definition above.
The Field of IR
• Now the world has changed, and hundreds of millions of people engage in information retrieval every day when they use a web search engine or search their email.
• Information retrieval is fast becoming the dominant form of information access, overtaking traditional database-style searching (the sort that is going on when a clerk says to you: “I’m sorry, I can only look up your order if you can give me your order ID”).
• Information retrieval can also cover other kinds of data and information problems beyond that specified in the core definition above.
The Field of IR (cont’d)
• The field of IR also covers supporting users in browsing or filtering document collections or further processing a set of retrieved documents.
  – Given a set of documents, clustering is the task of coming up with a good grouping of the documents based on their contents.
An Example IR Problem
• A fat book that many people own is Shakespeare’s Collected Works. Suppose you wanted to determine which plays of Shakespeare contain the words Brutus and Caesar and not Calpurnia.
• The simplest form of document retrieval is for a computer to read through every play looking for these words, that is, a linear scan through the documents.
  – The way to avoid linearly scanning the texts for each query is to index the documents in advance.
• The problems a linear scan does not solve:
  – To process large document collections quickly
  – To allow more flexible matching operations
  – To allow ranked retrieval
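The Brutus/Caesar/Calpurnia query above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular system: the `plays` dictionary holds invented one-line stand-ins for the real texts, and the function names `build_index`, `linear_scan` and `boolean_query` are hypothetical. It contrasts a linear scan with a lookup in an index built in advance.

```python
from collections import defaultdict

# Toy stand-ins for the plays; the "texts" are invented for illustration only.
plays = {
    "Julius Caesar": "brutus caesar calpurnia",
    "Hamlet": "brutus caesar hamlet",
    "Antony and Cleopatra": "antony brutus caesar cleopatra",
    "Macbeth": "macbeth duncan",
}

def build_index(docs):
    """Map each term to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for term in text.lower().split():
            index[term].add(name)
    return index

def linear_scan(docs, required, excluded):
    """Baseline: re-read every document for every query."""
    hits = set()
    for name, text in docs.items():
        words = set(text.lower().split())
        if all(t in words for t in required) and not any(t in words for t in excluded):
            hits.add(name)
    return hits

def boolean_query(index, required, excluded):
    """Answer the same query from the index using set operations."""
    if not required:
        return set()
    # Intersect the document sets of all required terms (AND) ...
    hits = set.intersection(*(index.get(t, set()) for t in required))
    # ... then drop documents containing any excluded term (AND NOT).
    for t in excluded:
        hits -= index.get(t, set())
    return hits

index = build_index(plays)
result = boolean_query(index, ["brutus", "caesar"], ["calpurnia"])
print(sorted(result))  # ['Antony and Cleopatra', 'Hamlet']
```

The index is built once; each later query touches only the entries for its own terms instead of re-reading every document, which is the point of indexing in advance.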
Discussion Class
Describe this Picture:

Searching and Browsing: The Human in the Loop
[Figure labels: Search index, Return hits, Browse repository, Return objects]
Definitions
• Information retrieval: Subfield of computer science that deals with automated retrieval of documents (especially text) based on their content and context.
• Searching: Seeking specific information within a body of information. The result of a search is a set of hits.
• Browsing: Unstructured exploration of a body of information.
• Linking: Moving from one item to another following links, such as citations, references, etc.
The Basics of Information Retrieval
• Query: A string of text, describing the information that the user is seeking. Each word of the query is called a search term. A query can be a single search term, a string of terms, a phrase in natural language, or a stylized expression using special symbols.
• Full text searching: Methods that compare the query with every word in the text, without distinguishing the function of the various words.
• Fielded searching: Methods that search on specific bibliographic or structural fields, such as author or heading.
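The difference between full text searching and fielded searching can be shown with a small sketch. Everything here is assumed for illustration: the two records are invented, and `tokenize`, `full_text_search` and `fielded_search` are hypothetical helper names, not part of any library.

```python
# Invented records; each has an author, a title and a body field.
records = [
    {"author": "Ada Smith", "title": "Indexing Basics",
     "body": "how an index speeds up search"},
    {"author": "Ben Jones", "title": "Ranking Hits",
     "body": "a reply to smith on ranking"},
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def full_text_search(records, term):
    """Match the term against every word of every field, ignoring field roles."""
    term = term.lower()
    return [i for i, r in enumerate(records)
            if any(term in tokenize(value) for value in r.values())]

def fielded_search(records, field, term):
    """Match the term only against the words of one named field."""
    term = term.lower()
    return [i for i, r in enumerate(records)
            if term in tokenize(r.get(field, ""))]

everywhere = full_text_search(records, "smith")
as_author = fielded_search(records, "author", "smith")
print(everywhere)  # [0, 1]
print(as_author)   # [0]
```

The full text search returns both records because "smith" appears in the body of the second one; the fielded search on `author` returns only the record actually written by Smith.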
Sorting and Ranking Hits
When a user submits a query to a search system, the system returns a set of hits. With a large collection of documents, the set of hits may be very large. The value to the user depends on the order in which the hits are presented.
Three main methods:
• Sorting the hits, e.g., by date
• Ranking the hits by similarity between query and document
• Ranking the hits by the importance of the documents
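The three orderings can be compared on a toy hit list. This is a sketch under stated assumptions: the hits are invented, the `importance` values are hypothetical scores (in a real system they might come from link analysis), and Jaccard term overlap is used only as a simple stand-in, since the slide does not name a similarity measure.

```python
from datetime import date

# Invented hits; "importance" is a hypothetical precomputed score.
hits = [
    {"id": "d1", "date": date(2008, 2, 1),
     "text": "information retrieval on the web", "importance": 0.2},
    {"id": "d2", "date": date(2009, 8, 1),
     "text": "web search engines", "importance": 0.9},
    {"id": "d3", "date": date(2007, 5, 1),
     "text": "information retrieval and web information", "importance": 0.5},
]

def similarity(query, text):
    """Jaccard overlap between the query's terms and the document's terms."""
    q = set(query.lower().split())
    d = set(text.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0

def ids(ordered):
    """Return just the hit ids, in order."""
    return [h["id"] for h in ordered]

query = "information retrieval"

# 1. Sorting by date, newest first.
by_date = sorted(hits, key=lambda h: h["date"], reverse=True)
# 2. Ranking by similarity between query and document.
by_similarity = sorted(hits, key=lambda h: similarity(query, h["text"]), reverse=True)
# 3. Ranking by document importance, independent of the query.
by_importance = sorted(hits, key=lambda h: h["importance"], reverse=True)

print(ids(by_date))        # ['d2', 'd1', 'd3']
print(ids(by_similarity))  # ['d3', 'd1', 'd2']
print(ids(by_importance))  # ['d2', 'd3', 'd1']
```

The same three hits come back in three different orders, which is why the choice of ordering matters so much when the hit set is large.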
Examples of Search Systems
• Finding files on a computer system (Spotlight for Macintosh).
• Library catalog for searching bibliographic records about books and other objects (Library of Congress catalog).
• Abstracting and indexing system for finding research information about specific topics (Medline for medical information).
• Web search service for finding web pages (Google).
General Applications of IR
Domain Specific Applications of IR
Quiz and Game
[Puzzle slides (images not preserved): Puzzle D, Puzzle A, "Empowering Analysis"]
Answer: Puzzle 6
Answer: Puzzle 9
YFA, August 2008. Adapted from cs.cornell.edu and cambridge.edu (2nd Edition), February 2008.