Data Mining and Intelligent Agents

Outline
• Data Mining Overview

Proliferation of Data
• Indexes
  – PAC-INFO Public Records Online
  – Florida gun licenses (look up John Smith)
  – Lee County property records (look up John F. Smith)
  – Death index
  – Investigative Resources
  – National STR82U
  – Allegheny County Property
  – Online Public Records (fosson.com)
• Pay services
  – uspublicinfo.com
  – USsearch.com

Data Mining
“The key in business is to know something that nobody else knows.” – Aristotle Onassis
“To understand is to perceive patterns.” – Sir Isaiah Berlin
PHOTO: LUCINDA DOUGLAS-MENZIES
PHOTO: HULTON-DEUTSCH COLL

Data Mining
• Extracting previously unknown relationships from large datasets
  – discover trends, relationships, dependencies
  – make predictions
  – target customers
• In eCommerce, data comes from
  – customers themselves
  – cookies
  – external databases
  – data matching
  – DoubleClick, etc.
  – Digital rights management tools (what we read and how much)
  – library records

Taxonomy of Data Mining Methods
• Predictive Modeling: Decision Trees, Neural Networks, Naive Bayesian, Branching criteria
• Database Segmentation: Clustering, K-Means
• Link Analysis
• Text Mining
• Deviation Detection
• Semantic Maps
• Rule Association
• Visualization
SOURCE: WELGE & REINCKE, NCSA
20-751 ECOMMERCE TECHNOLOGY, SUMMER 2001. COPYRIGHT © 2001 MICHAEL I. SHAMOS

Predictive Modeling
• Objective: use data about the past to predict future behavior
• Sample problems:
  – Will this (new) customer pay his bill on time? (classification)
  – What will the Dow-Jones Industrial Average be on October 15? (prediction)
• Technique: supervised learning
  – decision trees
  – neural networks
  – naive Bayesian
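
The classification problem above (will a new customer pay on time?) can be sketched with a small naive Bayesian classifier. This is a minimal illustration, not a method taken from the slides: the two features (prior late payment, income band), the six training rows, and the add-one smoothing are all assumptions made up for the example.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count how often each label and each (label, feature, value) occurs."""
    label_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> number of training rows
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return label_counts, value_counts, vocab

def predict(model, row):
    """Return the label with the highest (log) posterior score for row."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # prior P(label)
        for i, value in enumerate(row):
            count = value_counts[label][i][value]
            # add-one smoothing so unseen values do not zero out the product
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical past customers: (had_prior_late_payment, income_band)
ROWS = [("no", "high"), ("no", "low"), ("no", "high"),
        ("yes", "low"), ("yes", "low"), ("yes", "high")]
LABELS = ["on_time", "on_time", "on_time", "late", "late", "late"]

MODEL = train_naive_bayes(ROWS, LABELS)
print(predict(MODEL, ("no", "high")))
print(predict(MODEL, ("yes", "low")))
```

The training step is the "data about the past"; `predict` applies it to a customer who has not been seen before.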

Neural Networks
• Networks of processing units called neurons. This is the j-th neuron:
  – n INPUTS x1, …, xn
  – n WEIGHTS w1j, …, wnj
  – Neuron computes a linear function of the inputs
  – 1 OUTPUT yj depends only on the linear function
• Neurons are easy to simulate
SOURCE: CONSTRUCTING INTELLIGENT AGENTS WITH JAVA
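
To show how little is needed to simulate a neuron, here is a minimal sketch. The slide says only that the output depends on the linear function of the inputs; the choice of `tanh` as the default output function is an assumption for illustration.

```python
import math

def neuron_output(inputs, weights, activation=math.tanh):
    """Simulate one neuron: weighted sum of the inputs, then an output function."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    linear = sum(x * w for x, w in zip(inputs, weights))  # x1*w1j + ... + xn*wnj
    return activation(linear)  # yj depends only on the linear function

# With the identity as output function the neuron returns the weighted sum itself.
identity = lambda s: s
LINEAR = neuron_output([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], activation=identity)
print(LINEAR)
```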

Neural Networks
Learning through back-propagation
1. Network is trained by giving it many inputs whose output is known
2. Deviation is “fed back” to the neurons to adjust their weights
3. Network is then ready for live data
SOURCE: CONSTRUCTING INTELLIGENT AGENTS WITH JAVA
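
The three steps above can be sketched for the simplest possible case, a single sigmoid neuron. This is not full multi-layer back-propagation (which passes the deviation backwards through hidden layers using the chain rule); it only illustrates how a deviation adjusts weights. The learning rate, epoch count, zero initial weights, and the logical-OR training set are assumptions chosen for the example.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def predict(weights, bias, inputs):
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)

def train_neuron(samples, epochs=5000, rate=0.5):
    """Step 1: show known input/output pairs. Step 2: feed the deviation back."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = predict(weights, bias, inputs)
            deviation = target - output
            # gradient of the squared error through the sigmoid
            grad = deviation * output * (1.0 - output)
            weights = [w + rate * grad * x for w, x in zip(weights, inputs)]
            bias += rate * grad
    return weights, bias

# Training data whose output is known: logical OR of two inputs.
OR_SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
WEIGHTS, BIAS = train_neuron(OR_SAMPLES)

# Step 3: the trained neuron is applied to inputs.
for inputs, _ in OR_SAMPLES:
    print(inputs, round(predict(WEIGHTS, BIAS, inputs), 3))
```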

Neural Network Demos
• Demo: Notre Dame football
• Financial applications:
  – Churning: are trades being instituted just to generate commissions?
  – Fraud detection in credit card transactions
  – Kiting: isolate float on uncollected funds
  – Money Laundering: detect suspicious money transactions (US Treasury's Financial Crimes Enforcement Network)
• Insurance applications:
  – Auto Insurance: detect a group of people who stage accidents to collect on insurance
  – Medical Insurance: detect professional patients, rings of doctors, and rings of references

Database Segmentation (Clustering)
• “The art of finding groups in data” (Kaufman & Rousseeuw)
• Objective: gather items from a database into sets according to (unknown) common characteristics
• Much more difficult than classification since the classes are not known in advance (no training)
• Examples:
  – Demographic patterns
  – Topic detection (words about the topic often occur together)
• Technique: unsupervised learning

Clustering Example
• Are there natural clusters in the data (36, 10), (12, 8), (38, 42), (13, 6), (36, 38), (16, 9), (40, 36), (35, 19), (37, 7), (39, 8)?

Clustering
• K-means algorithm
• To divide a set into K clusters
• Pick K points at random. Use them to divide the set into K clusters based on nearest distance
• Loop:
  – Find the mean of each cluster. Move the point there.
  – Redefine the clusters.
  – If no point changes cluster, done
• K-means demo
• Agglomerative clustering: start with N clusters & merge
• Agglomerative clustering demo
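
The K-means loop above can be sketched as follows, using the ten points from the Clustering Example slide. The function name, the optional `centers` argument, and the seeded random choice are assumptions for illustration; the example passes fixed starting points so the run is repeatable.

```python
import math
import random

def k_means(points, k, centers=None, seed=0):
    """Return (centers, assignment) once no point changes cluster."""
    if centers is None:
        centers = random.Random(seed).sample(points, k)  # pick K points at random
    centers = [tuple(c) for c in centers]
    assignment = None
    while True:
        # Divide the set into K clusters based on nearest distance.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # no point changed cluster: done
            return centers, assignment
        assignment = new_assignment
        # Find the mean of each cluster and move its center there.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))

POINTS = [(36, 10), (12, 8), (38, 42), (13, 6), (36, 38),
          (16, 9), (40, 36), (35, 19), (37, 7), (39, 8)]

CENTERS, ASSIGNMENT = k_means(POINTS, 3, centers=POINTS[:3])
for j, center in enumerate(CENTERS):
    members = [p for p, a in zip(POINTS, ASSIGNMENT) if a == j]
    print(center, members)
```

With these starting points the data splits into three groups: low x and low y, high x and low y, and high x and high y. A different random start can converge to a different grouping, which is a known property of K-means.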

Rule Association Demos
• Magnum Opus (RuleQuest, free download)
• See5/C5.0 (RuleQuest, free download)
• Cubist numerical rule finder (RuleQuest, free download)

Text Mining
• Objective: discover relationships among people & things from their appearance in text
• Generation of “knowledge map”, a graph representing terms/topics and their relationships
• SemioMap demo (Semio Corp.)
  – Phrase extraction
  – Concept clustering (through co-occurrence) not by document
  – Graphic navigation (link means concepts co-occur)
  – Processing time: 90 minutes per gigabyte
• Semio Taxonomy available for legal documents
• Automatic summarization (Extractor demo)
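
A knowledge map in which a link means that two concepts co-occur can be sketched as a weighted edge list. This is a toy illustration and not how SemioMap works internally; the passages, the term list, and the function name are invented for the example.

```python
from collections import Counter
from itertools import combinations
import re

def cooccurrence_graph(passages, terms, min_count=1):
    """Map each pair of terms to the number of passages containing both."""
    edges = Counter()
    for passage in passages:
        words = set(re.findall(r"[a-z]+", passage.lower()))
        present = sorted(t for t in terms if t in words)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1  # one link per passage in which both appear
    return {pair: n for pair, n in edges.items() if n >= min_count}

# Hypothetical passages and terms of interest.
PASSAGES = [
    "The customer bought fries and ketchup.",
    "Ketchup goes well with fries.",
    "The truck has eighteen wheels.",
]
TERMS = {"customer", "fries", "ketchup", "truck", "wheels"}

GRAPH = cooccurrence_graph(PASSAGES, TERMS)
for pair, count in sorted(GRAPH.items()):
    print(pair, count)
```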

Visualization
• Objective: produce a graphic view of data so it becomes understandable to humans
• Hyperbolic trees (inxight.com): grocery, UTC
• Table Lens (inxight.com)
• SpotFire (free download from www.spotfire.com)
• OpenViz
• Internetivity

Intelligent Agents

Outline
• What is an agent?
• Why do we need them?
  – Important tasks are too time-consuming, not economical
  – Too much information (filtering)
• What kinds of agents are there?
• How do they work?

What is an Agent?
• In real life, a person who acts on your behalf
• In ecommerce, a computer program that acts on your behalf
• Agents often perform tasks usually associated with humans
• But: there is no magic
• An agent is just a computer program
• Synonyms: bot, daemon (a supernatural being of Greek mythology intermediate between gods and men)

Sample Shopping Agent
[Figure: shopping agent diagram; surviving labels: User, 0 Communicate needs]
SOURCE: DAVID ELLIMAN

Agent Properties
• Autonomous
  – Acts by itself (independent of user)
• Reactive
  – Responds to its environment, initiates actions
• Communicative
  – Communicates with people and other agents
• Goal-driven
  – Acts until it accomplishes its purpose or learns that it can’t

Examples of Agents
• Search agents
  – Find web pages. FastSearch, Google, NorthernLight
  – Find search engines. Searchenginecollosus.com
• Metacrawlers
  – Search multiple indexes. LEXIBOT
• Text agents
  – Summarization. Extractor demo
• News agents
  – Locate relevant news stories. TotalNEWS

Information Agents
• Monitors, update agents
  – Notify user when events occur, e.g. page is modified: Mind-it, javElink, CyberAlert (company news), Enfish tracker (tracks email, web pages, files), EoMonitor, MorningPaper
  – eWatch, CyberAlert
• Web intelligence. NetCurrents
• Addresses, phone numbers, reverse directories
  – AT&T AnyWho, BigYellow, InfoSpace (by address!)
• Stock bots (financial information, charts, news)
  – StockPoint, StreetEYE, Yahoo
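
A monitor agent of the "page is modified" kind can be sketched by remembering a digest of each page and comparing it on the next visit. This is a generic illustration, not a description of any product listed above; the `PageMonitor` class and the `fetch` callable are hypothetical, and a real agent would pass in a function that downloads the page.

```python
import hashlib

class PageMonitor:
    """Remember a digest per URL and report when the content changes."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable: url -> page content as a string
        self._digests = {}     # url -> digest seen at the previous check

    def check(self, url):
        """Return True if the page changed since the previous check."""
        content = self._fetch(url)
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._digests.get(url)
        self._digests[url] = digest
        return previous is not None and previous != digest

# Stand-in for the web: a dictionary of page contents.
PAGES = {"http://example.com/": "version 1"}
monitor = PageMonitor(lambda url: PAGES[url])

print(monitor.check("http://example.com/"))  # first visit, nothing to compare
PAGES["http://example.com/"] = "version 2"
print(monitor.check("http://example.com/"))  # content differs from last visit
```

Storing only a digest keeps the agent's memory small, at the cost of not being able to say what changed.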

Shopping Agents

Shopping Agents
• Price bots
  – BestBookBuys, BottomDollar, PriceGrabber, StoreRunner (CBS)
• Sale locators
  – ShoppingList (brick & mortar), ValueFind
• Auction notification
  – AuctionWatch, BidFind
• Browser buttons
  – ValueSpeed
• Recommenders
  – ActiveBuyersGuide, ProductReviewNet

Travel Agents
• Information about flights, trains, purchase tickets
  – Orbot, USAirways, Travelocity
• Discount Hotels
  – hoteldiscount!com
• Price auctions
• Where is the human travel agent going?
• Airplanes in flight
  – FlightTracker
  – JFK Tower audio
• CMU Bot List

Agent Technologies
• Table-driven (data lookup)
• Rule-based
• Goal-directed
• Utility-based
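
Two of these technologies are simple enough to contrast in a few lines. The sketch below is an illustration under assumed names: a table-driven agent looks its action up from the current input, while a utility-based agent scores every available action and picks the best. The traffic-light table and the store prices are invented for the example.

```python
def table_driven_agent(table, percept, default="do_nothing"):
    """Table-driven: the action is a plain data lookup on the input."""
    return table.get(percept, default)

def utility_based_agent(actions, utility):
    """Utility-based: choose the action with the highest utility score."""
    return max(actions, key=utility)

# Hypothetical lookup table for a driving agent.
DRIVING_TABLE = {"light_red": "stop", "light_green": "go"}
print(table_driven_agent(DRIVING_TABLE, "light_red"))

# Hypothetical offers for a price bot: utility is the negated price.
PRICES = {"store_a": 24.95, "store_b": 19.99, "store_c": 22.50}
print(utility_based_agent(PRICES, utility=lambda store: -PRICES[store]))
```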

Rule-Based Agents
• Condition-action rule: if car-in-front-is-braking then start-braking
SOURCE: ANDREAS GEYER-SCHULZ

Rule-Based Agents
• Businessmen are not programmers
• Need natural rule specification language + rule follower
• Need memory modified and accessed by rules
• Example: classifying a vehicle
    IF wheels < 1 THEN vehicle = NOT land_vehicle
    IF wheels == 1 THEN vehicle = unicycle
    IF wheels > 2 AND wheels < 4 THEN vehicle = cycle
    IF wheels > 4 THEN vehicle = truck
    IF wheels > 3 AND weight < 2400 AND length < 8 THEN vehicle = car ; logic incomplete here
    IF wheels > 12 THEN vehicle = semi
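
A rule follower for the vehicle example can be sketched as a list of (condition, conclusion) pairs applied to a small memory. The conditions are copied from the slide unchanged, including its gaps. The firing strategy (apply every rule in order, with later rules overwriting earlier conclusions) is an assumption, since the slide does not say how conflicts are resolved, and the units of weight and length are not given in the source.

```python
# Each rule is (condition on memory, value to store in memory["vehicle"]).
RULES = [
    (lambda m: m["wheels"] < 1, "NOT land_vehicle"),
    (lambda m: m["wheels"] == 1, "unicycle"),
    (lambda m: 2 < m["wheels"] < 4, "cycle"),
    (lambda m: m["wheels"] > 4, "truck"),
    (lambda m: m["wheels"] > 3 and m["weight"] < 2400 and m["length"] < 8, "car"),
    (lambda m: m["wheels"] > 12, "semi"),
]

def classify(wheels, weight, length):
    """Run every rule in order against the memory; the last match wins."""
    memory = {"wheels": wheels, "weight": weight, "length": length, "vehicle": None}
    for condition, vehicle in RULES:
        if condition(memory):
            memory["vehicle"] = vehicle  # rules modify the shared memory
    return memory["vehicle"]

print(classify(wheels=1, weight=10, length=3))
print(classify(wheels=4, weight=2000, length=6))
print(classify(wheels=18, weight=30000, length=50))
print(classify(wheels=2, weight=30, length=5))  # no rule covers two wheels
```

Running the rules makes the slide's "logic incomplete" remark concrete: a two-wheeled vehicle matches nothing, and a light six-wheeled vehicle is first called a truck and then a car.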

Business Rules
• Grocery store example
    IF inBasket(french_fries) AND NOT asked(ketchup)
    THEN ask(ketchup) ; ask “Would you care for ketchup to go
                      ; with your french fries?”
• Rules that learn
    IF inBasket(french_fries)
    THEN prob(want_ketchup) = SQL( <sql_query> ) ; query might involve customer data and
                                                 ; demographics
    IF prob(want_ketchup) > 0.3 AND NOT asked(ketchup)
    THEN ask(ketchup)
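
The "rules that learn" pair can be sketched as below. The slide leaves the SQL query unspecified, so this example replaces it with a stand-in: the share of past baskets containing french fries that also contained ketchup. That estimate, the function names, and the sample baskets are assumptions; a real system would run its own query over customer data and demographics.

```python
def ketchup_probability(past_baskets):
    """Stand-in for the SQL query: P(ketchup | french_fries) from past baskets."""
    with_fries = [b for b in past_baskets if "french_fries" in b]
    if not with_fries:
        return 0.0
    return sum("ketchup" in b for b in with_fries) / len(with_fries)

def should_ask_ketchup(basket, asked, past_baskets, threshold=0.3):
    """IF inBasket(french_fries) AND prob > threshold AND NOT asked(ketchup)."""
    if "french_fries" not in basket or "ketchup" in asked:
        return False
    return ketchup_probability(past_baskets) > threshold

# Hypothetical purchase history.
HISTORY = [{"french_fries", "ketchup"}, {"french_fries"}, {"milk"}]

if should_ask_ketchup({"french_fries", "soda"}, asked=set(), past_baskets=HISTORY):
    print("Would you care for ketchup to go with your french fries?")
```

Because the probability is recomputed from the history, the rule's behavior changes as purchase data accumulates, without anyone editing the rule itself.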

Goal-Directed Agents
• Actions are evaluated with respect to goals
• Will this action get me closer to the goal state?
SOURCE: ANDREAS GEYER-SCHULZ
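
The question "will this action get me closer to the goal state?" can be sketched as a greedy loop that tries each action and keeps the one whose result is nearest the goal. The grid world, the four moves, and the Manhattan distance are assumptions chosen to keep the example small; greedy selection can get stuck where obstacles exist, so this is only the simplest form of goal-directed behavior.

```python
def choose_action(state, goal, actions, distance):
    """Return the name of the action that gets closest to the goal, or None."""
    best = min(actions, key=lambda name: distance(actions[name](state), goal))
    if distance(actions[best](state), goal) < distance(state, goal):
        return best
    return None  # no action brings the agent closer

def run_agent(state, goal, actions, distance, max_steps=100):
    """Act until the goal is reached or no action helps."""
    path = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        name = choose_action(state, goal, actions, distance)
        if name is None:
            break
        state = actions[name](state)
        path.append(state)
    return path

# Hypothetical grid world: states are (x, y) positions.
MOVES = {
    "north": lambda s: (s[0], s[1] + 1),
    "south": lambda s: (s[0], s[1] - 1),
    "east": lambda s: (s[0] + 1, s[1]),
    "west": lambda s: (s[0] - 1, s[1]),
}
manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])

PATH = run_agent((0, 0), (2, 1), MOVES, manhattan)
print(PATH)
```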

Static versus Mobile Agents
[Figure: Static Agent System compared with Mobile Agent System]
SOURCE: MITSUBISHI

Cooperating Agents
SOURCE: PETER FINGAR

Applications
• Intelligent freight planning
• TeleTruck (DFKI GmbH, Saarbrücken)

SOURCE: K. FISCHER

Key Takeaways
• Agents are the wave of the future
  – laziness + information overload = agents
• Agent systems are object-oriented and distributed
• Agents are mobile
• Agents negotiate with and talk to other agents

Q&A