Web Mining
Web Mining Web mining is the application of data mining techniques to find interesting and potentially useful knowledge from web data.
What is Web Data?
Web data is:
• Web content: text, images, records, etc.
• Web structure: hyperlinks, tags, etc.
• Web usage: HTTP logs, app server logs, etc.
Web Mining Taxonomy
Web mining
• Web content mining
• Web structure mining
• Web usage mining
Web Content Mining
Discovery of useful information from web contents, data, and documents.
Web data contents: 1. text, 2. image, 3. audio, 4. video, 5. metadata, and 6. hyperlinks
Web Structure Mining
• It deals with discovering and modeling the link structure of the web.
• Work has been carried out to model the web based on the topology of the hyperlinks.
Helps in:
• Discovering similarities between sites
• Discovering important sites for a particular topic
• Discovering web communities
Web Usage Mining
It deals with understanding user behaviour when interacting with the web or with a website.
Aim: to obtain information that may help web sites reorganize or adapt to better suit the user.
To understand the user's behaviour:
• Clicking patterns
• Browsing time
• Transactions
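As a rough illustration of how clicking patterns and browsing time might be derived from usage data, the sketch below groups requests into sessions. The log records, the `(user_id, timestamp, url)` layout, and the 30-minute inactivity cutoff are all assumptions made for this example, not something taken from the slides.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records: (user_id, timestamp_seconds, url).
LOG = [
    ("u1", 0, "/home"),
    ("u1", 30, "/products"),
    ("u1", 90, "/cart"),
    ("u2", 10, "/home"),
    ("u2", 25, "/about"),
    ("u1", 4000, "/home"),
]

SESSION_TIMEOUT = 1800  # assumed inactivity cutoff (30 minutes)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Split each user's requests into sessions at long inactivity gaps."""
    by_user = defaultdict(list)
    for user, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(hit)
        sessions.append((user, current))
    return sessions


def click_path(session_hits):
    """Ordered list of pages requested in one session (the clicking pattern)."""
    return [url for _, url in session_hits]


def browsing_time(session_hits):
    """Seconds between the first and last request of one session."""
    return session_hits[-1][0] - session_hits[0][0]


if __name__ == "__main__":
    for user, hits in sessionize(LOG):
        print(user, click_path(hits), browsing_time(hits))
```

Note that the time spent on the last page of a session cannot be recovered from request timestamps alone, so `browsing_time` is a lower bound.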
Applications
1. Target potential customers for electronic commerce
2. Enhance the quality and delivery of Internet information services to the end user
3. Improve Web server system performance
4. Identify potential prime advertisement locations
5. Facilitate personalization and adaptive sites
6. Improve site design
7. Detect fraud and intrusion
8. Predict the user's actions (allows prefetching)
Web Mining Taxonomy
Web mining
• Web content mining: text, image, audio, video, structured records
• Web structure mining: hyperlinks (intra-document hyperlinks, inter-document hyperlinks), document structure
• Web usage mining: web server logs, application server logs, application-level logs
Web Usage Mining and E-commerce
E-commerce is the killer application of web mining:
• Keep former customers and attract new customers
• Provide better service and be more interactive
Web usage mining is the best way to analyse customers' behaviour:
• Discover customers' needs or interests
• Analyse customers' behaviour
The KDD Process for E-commerce
A cycle: Data collection and pre-processing → Mining (pattern discovery and analysis) → Recommendations → Action → back to data collection
Pattern Discovery and Analysis
Pattern discovery: using mining algorithms to discover patterns.
Pattern analysis: filtering out uninteresting or meaningless rules or patterns from the set found in the pattern discovery phase, using:
• Information filters
• OLAP (online analytical processing)
• Visualization
• Knowledge query mechanisms (SQL)
Technologies for Web Usage Mining
• Statistical analysis: the most common method, using measures such as frequency, mean (average), and median
• Classification: mapping a data item into one of several predefined classes
• Clustering: grouping together a set of items that have similar characteristics
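A minimal sketch of the statistical measures named above, using only Python's standard library. The page views, the durations, and the threshold-based `classify_visit` rule are invented for illustration; a real classifier would normally be learned from labelled data rather than hard-coded.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data: one entry per page view, with time spent in seconds.
page_views = ["/home", "/products", "/home", "/cart", "/products", "/home"]
durations = [12, 45, 8, 120, 30, 10]

# Statistical analysis: frequency, mean, and median.
frequency = Counter(page_views)
most_visited = frequency.most_common(1)[0]  # (page, count) of the top page
average_duration = mean(durations)
median_duration = median(durations)


def classify_visit(duration, threshold=30):
    """Map a page view into one of two predefined classes (assumed rule)."""
    return "engaged" if duration >= threshold else "brief"


if __name__ == "__main__":
    print("most visited:", most_visited)
    print("mean:", average_duration, "median:", median_duration)
    print([classify_visit(d) for d in durations])
```

The gap between the mean and the median here comes from a single long visit, which is one reason to report both.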
Technologies for Web Usage Mining
• Association rules: can be used to relate pages or products that are most often referenced or purchased together
• Sequential patterns: a set of items is followed by another item in time order
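To make the association-rule idea concrete, the sketch below computes support and confidence for page pairs that occur in the same session and drops rules below chosen thresholds. The sessions and both thresholds are assumptions for this example. It only considers pairs, so it is a simplification of a full frequent-itemset algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: the set of pages each visitor requested.
SESSIONS = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/about"},
    {"/products", "/cart"},
]


def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) over pairs of pages."""
    n = len(sessions)
    item_count = Counter()
    pair_count = Counter()
    for pages in sessions:
        item_count.update(pages)
        pair_count.update(combinations(sorted(pages), 2))

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of sessions containing both pages
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: of the sessions with lhs, how many also have rhs
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in pair_rules(SESSIONS):
        print(f"{lhs} -> {rhs}  support={support:.2f} confidence={confidence:.2f}")
```

Raising `min_confidence` is a simple form of the filtering done in pattern analysis: fewer, stronger rules survive.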
E-commerce Business Objectives
• Personalization: web site personalization (content or layout), personalized advertisement, personalized product recommendation
• Marketing strategy: marketing rules, changing the marketing strategy
• Web site design: web site evaluation, reorganization, improving the hypertext structure, optimization
Web Usage Mining for E-commerce
• Many applications in different areas of e-commerce have already been proposed
• However, most research focuses only on the first two steps of the KDD process
• Data mining is meaningless if we do not take action in e-commerce
Possible work in the area of web usage mining for e-commerce. Also in the area of web search, employing web crawlers or algorithms like HITS (Hypertext Induced Topic Search), web warehousing, etc.
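For readers unfamiliar with HITS, here is a minimal sketch of its hub and authority iteration on a tiny, made-up link graph. The graph, the fixed iteration count, and the function name `hits` are assumptions for this example. In practice HITS is usually run on a query-focused subgraph of the web rather than on a hand-written dictionary.

```python
import math


def hits(graph, iterations=50):
    """Compute hub and authority scores.

    graph: dict mapping each page to the list of pages it links to.
    """
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to a page.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in graph.items():
            for target in targets:
                new_auth[target] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub score: sum of authority scores of the pages a page links to.
        new_hub = {p: sum(auth[t] for t in graph.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}

    return hub, auth


# Hypothetical link graph: "a" and "b" both link to "c"; "a" also links to "d".
GRAPH = {"a": ["c", "d"], "b": ["c"], "c": [], "d": []}

if __name__ == "__main__":
    hub_scores, auth_scores = hits(GRAPH)
    print("best hub:", max(hub_scores, key=hub_scores.get))
    print("best authority:", max(auth_scores, key=auth_scores.get))
```

A page is a good authority when good hubs link to it, and a good hub when it links to good authorities, which is why the two scores are updated in turn.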