A Cocktail Approach for Travel Package Recommendation
Qi Liu, Enhong Chen, Senior Member, IEEE, Hui Xiong, Senior Member, IEEE, Yong Ge, Zhongmou Li, and Xiang Wu
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 26, NO. 2, FEBRUARY 2014
2015/12/1 kei

Author
Qi Liu
• Received Ph.D. degree from the University of Science and Technology of China (USTC) in 2013
• Research interests
  • Data Mining
  • Machine Learning
  • Recommender Systems
  • Social Network
• http://staff.ustc.edu.cn/~qiliuql/

Introduction
• Travel companies provide online services
• Tourists have to choose from a large number of travel packages to satisfy their personalized needs

Introduction
• Recommender systems have been successfully applied to enhance the quality of service in a number of fields
• How to leverage the unique features that distinguish personalized travel package recommendation from traditional recommender systems remains an open problem

Introduction
• Technical and domain challenges
1. Travel data are much fewer and sparser than data for traditional items
2. Every travel package
  • Consists of many landscapes
  • Has intrinsic complex spatio-temporal relationships
3. Traditional recommender systems usually rely on explicit user ratings, which are not conveniently available for travel data
4. Traditional items for recommendation usually have a long period of stable value, whereas travel packages only last for a certain period

Proposal
• Cocktail approach on personalized travel package recommendation

Proposal
Cocktail approach on personalized travel package recommendation
• Analyze the key characteristics of the existing travel packages
  • Travel time and travel destinations are divided into different seasons and areas
• Develop a tourist-area-season topic (TAST) model
  • Represent travel packages and tourists by different topic distributions
  • In the TAST model, the extraction of topics is conditioned on both the tourists and the intrinsic features of the landscapes, such as locations and travel seasons
• Based on this TAST model, a cocktail approach is developed for personalized travel package recommendation by considering some additional factors, including the seasonal behaviors of tourists, the prices of travel packages, and the cold start problem of new packages

Concepts and data description

Definition
Landscapes
• Travel places of interest and attractions, which are usually located in nearby areas
Travel package
• General service package provided by a travel company for an individual or a group of tourists based on their travel preferences
• Consists of the landscapes and some related information (price, travel period, transportation means)
Travel topics
• Themes designed for the package

Example of travel package
• Name of package: from STA Travel (http://www.statravel.com/)

Concepts and data description
• Different packages may include the same landscapes, and each landscape can be used for multiple packages
• The tourists for each individual package are often divided into different travel groups
• Each package has a travel schedule
• Most of the packages are traveled only in a given time (season) of the year
  • Strong seasonal patterns
  • e.g., “Maple Leaf Adventures” → Fall

Data set
Real-world travel data set
• Provided by a travel company in China
• Nearly 220,000 expense records
• From January 2000 to October 2010
Extract 23,351 useful records
• 7,749 travel groups for 5,211 tourists from 908 domestic and international packages
• Contain 1,065 different landscapes located in 139 cities from 10 countries
• Each tourist has traveled at least two different packages
• Each package has 11 different landscapes and each tourist has traveled 4.4 times on average

Unique characteristics of the travel data
1. Very sparse: each tourist has only a few travel records
2. Strong time dependence
3. Landscapes have some intrinsic features, like the geographic location and the right travel seasons
4. Tourists consider both time and financial costs before they accept a package
5. People often travel with their friends, family, or colleagues
6. Few tourist ratings are available for travel packages

Unique characteristics of the travel data
2. Strong time dependence
• Travel packages
  • Only last for a certain period
• Most of the landscapes
  • Remain active
  • Form new packages together with some other landscapes
The landscapes are more sustainable and important than the package itself

Challenges
The characteristics of the travel data bring major challenges
• How to compare the interests of tourists and the content of the travel package
• How to make package recommendations for each tourist
• How to capture the tourist relationships to form a travel group

The TAST model

Mathematical notations

Topic model representation
Designing a travel package (by people in the travel company)
<Process>
1. Determine the set of target tourists, the travel seasons, and the travel places
2. Choose one or multiple travel topics based on the category of target tourists and the scheduled travel seasons
3. Determine landscapes according to the travel topics and the geographic locations
4. Include some additional information

Topic model representation
Abstracting the previous slide, package generation is a What-Who-When-Where (4W) problem
• What: travel topics
• Who: target tourists
• When: seasons
• Where: areas in which the corresponding landscapes are located
The four factors are strongly correlated

Topic model representation
• Recast the generation of a package in a topic model style, and treat it mainly as a landscape drawing problem
• A topic mentioned in TAST: a latent factor extracted by the topic model
• A real topic: an explicit travel theme identified in the real world
• Latent topics are used to simulate real topics

Topic model representation

Model inference
Gibbs sampling method
• Easy to implement and provides a relatively efficient way of extracting a set of topics from a large set of travel logs

Model inference
• After Gibbs sampling, all the tourists and packages are represented by Z-entry topic distribution vectors
  • Z: the number of topics

Model inference
Example
• A tourist has traveled “Tour in Disneyland, Hongkong” and “Christmas day in Hongkong”
• The tourist's topic distribution then has high probabilities on the topics “Amusement parks” and “Hongkong”

Area segmentation
• Divide the entire location space in the data set into seven big areas
  • South China
  • Center China
  • North China
  • East Asia
  • Southeast Asia
  • Oceania
  • North America

Seasons segmentation
• Assume that most packages are seasonal
• Use an information gain-based method to get the season splits
• The information entropy of a season S is Ent(S) = -Σ_i p_i log(p_i), where p_i is the proportion of package P_i among the records in S
• To find the best split, use the weighted average entropy (WAE) of the two sub-seasons that the split produces
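
The entropy-based split can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each travel record is a (month, package_id) pair, treats months as a linear axis, and searches for a single split point. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(packages):
    """Shannon entropy of the package distribution inside one season."""
    total = len(packages)
    if total == 0:
        return 0.0
    counts = Counter(packages)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def weighted_average_entropy(records, split):
    """WAE of cutting records into months <= split and months > split."""
    left = [p for m, p in records if m <= split]
    right = [p for m, p in records if m > split]
    n = len(records)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def best_split(records):
    """Return the month after which to cut, or None if no cut is possible."""
    months = sorted({m for m, _ in records})
    candidates = months[:-1]  # cutting after the last month leaves one side empty
    if not candidates:
        return None
    return min(candidates, key=lambda s: weighted_average_entropy(records, s))


# Assumed toy data: a winter package and a summer package.
records = [(1, "ski"), (2, "ski"), (3, "ski"),
           (7, "beach"), (8, "beach"), (9, "beach")]
split_month = best_split(records)
```

Applying `best_split` recursively to each side, and stopping when the entropy reduction becomes small, would yield more than two seasons; the stopping rule is left open here.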

Related topic models
The tourist topic (TT) model
• Does not consider the travel area and travel season factors
The tourist-area topic (TAT) model
• Only considers the travel area
The tourist-season topic (TST) model
• Only considers the travel season
All these methods can also be used for package and tourist representation

Cocktail recommendation approach

Cocktail recommendation approach
Personalized travel package recommendation based on the TAST model
• Use the output topic distributions of TAST to find the seasonal nearest neighbors for each tourist
• Collaborative filtering is used for ranking the candidate packages
• New packages are added to the candidate list by computing their similarity with the candidate packages
• Use collaborative pricing to predict the possible price distribution of each tourist and reorder the packages
• After removing the packages that are no longer active, the final recommendation list is obtained

Cocktail recommendation approach

Seasonal collaborative filtering for tourists
• Generates the personalized candidate package set for each tourist by collaborative filtering
• Compute the similarity between tourists from the similarity of their topic distributions
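
A minimal sketch of neighbor-based candidate ranking over topic distributions. Cosine similarity is an assumption made for illustration; the slides do not state which similarity measure is used. The topic vectors are assumed to be the tourists' distributions for the target season, and all names and data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(target, topic_vectors, traveled, k=2):
    """Rank packages traveled by the k nearest neighbors of `target`.

    Each candidate package is scored by the summed similarity of the
    neighbors who traveled it; packages the target already traveled
    are skipped.
    """
    sims = {
        t: cosine(topic_vectors[target], vec)
        for t, vec in topic_vectors.items()
        if t != target
    }
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = {}
    for n in neighbors:
        for package in traveled[n]:
            if package not in traveled[target]:
                scores[package] = scores.get(package, 0.0) + sims[n]
    return sorted(scores, key=scores.get, reverse=True)


# Assumed toy data: three tourists, three topics.
topic_vectors = {
    "u1": [0.8, 0.1, 0.1],
    "u2": [0.7, 0.2, 0.1],
    "u3": [0.1, 0.1, 0.8],
}
traveled = {"u1": {"p1"}, "u2": {"p1", "p2"}, "u3": {"p3"}}
candidates = rank_candidates("u1", topic_vectors, traveled, k=2)
```

Here `u2` is much closer to `u1` than `u3` is, so the package only `u2` has traveled ranks first.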

New package problem
• Travel packages often have a life cycle, and new packages are regularly created
• Most of the landscapes stay in use
In the data of the year 2010:
• 65 new packages
• Only 2 of them are composed completely of new landscapes
• For most of the new packages Pnew, their topic distributions can be estimated from the topics of their landscapes

How to recommend new packages
• Content-based method
  • Recommend the new packages that are similar to the ones already traveled by the given tourist
• Compute the similarity between the new package and a given number (e.g., 10) of candidate packages at the top of the recommendation list
• The new packages that are similar to the candidate packages are added to the recommendation list, and they are ranked by the average probabilities of the similar candidate packages
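
The new-package step can be sketched as below. Several choices are assumptions for illustration only: a new package's topic vector is taken as the mean of its known landscapes' topic vectors, similarity is cosine, and "similar" means exceeding a fixed threshold. All names and data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def add_new_packages(ranked, new_packages, landscape_topics,
                     package_topics, top_n=10, threshold=0.9):
    """Insert new packages into a ranked list of (package_id, score).

    A new package gets the average score of the top-N candidates it is
    similar to. Packages made only of unseen landscapes are skipped,
    because no topic vector can be estimated for them.
    """
    result = list(ranked)
    top = ranked[:top_n]
    for package_id, landscapes in new_packages.items():
        known = [landscape_topics[l] for l in landscapes if l in landscape_topics]
        if not known:
            continue
        vector = mean_vector(known)
        similar = [score for candidate, score in top
                   if cosine(vector, package_topics[candidate]) >= threshold]
        if similar:
            result.append((package_id, sum(similar) / len(similar)))
    # sorted() is stable, so existing candidates stay ahead of ties.
    return sorted(result, key=lambda item: item[1], reverse=True)


# Assumed toy data: two topics, two existing candidates, two new packages.
landscape_topics = {"disney": [0.9, 0.1], "harbor": [0.7, 0.3], "temple": [0.1, 0.9]}
package_topics = {"A": [0.8, 0.2], "B": [0.1, 0.9]}
ranked = [("A", 0.6), ("B", 0.4)]
new_packages = {"N": ["disney", "harbor"], "M": ["unseen_place"]}
final = add_new_packages(ranked, new_packages, landscape_topics, package_topics)
```

Package `N` resembles candidate `A` and inherits its score, while `M` cannot be placed because none of its landscapes has been seen before.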

Collaborative pricing
• The price factor influences the decision of tourists
• Consider the price constraint for developing a more personalized package recommender system
<Phase of method>
1. Divide the prices into different segments
2. Use the Markov forecasting model to predict the next possible price range for a given tourist

Phase 1
• Divide the prices of the packages based on the variance of prices in the travel logs
• Sort the prices of the travel logs
• Partition the sorted list PL into several sublists in a binary-recursive way
• The best split price is the one with the minimal weighted average variance (WAV)
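
A minimal sketch of variance-based price segmentation. The stopping rule (a fixed recursion depth) is an assumption made to keep the example short; the slides do not state how the recursion ends. Function names are hypothetical.

```python
def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def best_split_index(prices):
    """Index i minimizing the WAV of prices[:i] and prices[i:].

    `prices` must be sorted and contain at least two values.
    """
    n = len(prices)

    def wav(i):
        return (i / n) * variance(prices[:i]) + ((n - i) / n) * variance(prices[i:])

    return min(range(1, n), key=wav)


def segment(prices, depth):
    """Binary-recursive partition of prices into at most 2**depth segments."""
    prices = sorted(prices)
    if depth == 0 or len(prices) < 2:
        return [prices]
    i = best_split_index(prices)
    return segment(prices[:i], depth - 1) + segment(prices[i:], depth - 1)


# Assumed toy prices: a cheap cluster and an expensive cluster.
segments = segment([1050, 100, 1100, 120, 110, 1000], depth=1)
```

With one level of recursion the cheap and expensive packages fall into separate segments, since that cut leaves the least variance on both sides.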

Phase 2
• Mark each price segment as a price state and compute the transition probabilities between them
• After removing the packages that are no longer active, the final recommendation list is obtained
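
The transition probabilities between price states can be estimated by counting, as in the sketch below. It assumes a first-order Markov chain whose counts are pooled over all tourists, and it falls back to a uniform row for states with no observed outgoing transition. Both choices, and all names, are illustrative assumptions.

```python
def transition_matrix(sequences, n_states):
    """Estimate P(next price state | current price state) from state sequences.

    Each sequence lists the price states of one tourist's trips in time order.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # No outgoing transition observed: assume a uniform distribution.
            matrix.append([1.0 / n_states] * n_states)
    return matrix


def next_price_distribution(matrix, last_state):
    """Predicted distribution over the next price state for a tourist."""
    return matrix[last_state]


# Assumed toy logs: three tourists, three price states (0 = cheapest).
logs = [[0, 0, 1], [0, 1, 1], [1, 2]]
matrix = transition_matrix(logs, n_states=3)
prediction = next_price_distribution(matrix, last_state=1)
```

The predicted distribution can then be used to reorder candidate packages, favoring those whose price falls in the more probable segments.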

Related Cocktail Recommendations
TASTContent
• Content-based cocktail
Cocktail
• The topic preference of the packages in each price segment can also be inferred. Moreover, this topic model shares the same inference process as the TAST model

The TRAST model

The TRAST model
The TAST model does not consider the information of the travel group, so it is extended to the TRAST model
• Tourist-relation-area-season topic model
• Formulates the tourist relationships in a travel group

The TRAST model
• Split the TRAST model into two submodels

The TRAST1 model
• Compute and store for each pair
  • Number of landscape tokens that are assigned to topic t and have been cotraveled by tourists in season s

The TRAST2 model
• For each relationship assignment,

The TRAST model
• Each tourist’s travel relationship preference can be estimated

Experimental results

Experimental results
1. The results of the season splitting and price segmentation
2. The understanding of the extracted topics
3. A recommendation performance comparison between Cocktail and benchmark methods
4. The evaluation of the TRAST model
5. A brief discussion on recommendations for travel groups

Experimental setup
<Data set>
• Divided into
  ・Test set
    - The last expense record of each tourist in the year 2010
    - 65 new packages traveled by 269 tourists, etc.
  ・Training set (the remaining records)
• Details

Experimental setup
<Benchmark methods>
• Compare the TAST model with
  ・the TAT model
  ・the TST model
  ・the TT model

Experimental setup
<Benchmark methods>
• Three methods based on topic models: TTER, TASTContent, and Cocktail-, as described in Section 4.4
• A content-based recommendation (SContent) based on cotraveled landscapes, following [27, Eqs. (3.1)-(3.4)]
• For memory-based collaborative filtering, the user-based collaborative filtering method (UCF) [31]
• For model-based collaborative filtering, binary SVD (BSVD) [24]
• Since UCF and BSVD only use package-level information, two similar methods based on landscapes (i.e., LUCF and LBSVD) were implemented for a fair comparison
• One graph-based algorithm, LItemRank [16], where a landscape correlation graph is constructed and the packages are ranked by the expected average steady-state probabilities on their landscapes

Season Splitting and Price Segmentation

Understanding of topics
All the topics can now be classified into eight types, from 1-1-1 (packages have price, spatial, and temporal correlations) to 0-0-0 (packages have none of these correlations)

Recommendation performances
• Degree of agreement (DOA)
• Top-K
• User study
  • Volunteers were invited to rate the recommendations

Recommendation performances
DOA
• Degree of agreement (DOA)
  • Measures the percentage of item pairs ranked in the correct order with respect to all pairs
• An individual DOA is computed for each tourist Ui
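
A sketch of one common formulation of the individual DOA, stated here as an assumption rather than the exact definition used on the slide: for one tourist, every pair of a held-out test package and a package seen in neither training nor test counts as correct when the test package is scored at least as high. Names and data are hypothetical.

```python
def individual_doa(scores, train, test):
    """Degree of agreement for one tourist.

    scores: predicted score per package (higher means ranked earlier)
    train:  packages the tourist traveled in the training set
    test:   packages the tourist traveled in the test set
    Returns None when there is no pair to compare.
    """
    not_traveled = [p for p in scores if p not in train and p not in test]
    pair_count = len(test) * len(not_traveled)
    if pair_count == 0:
        return None
    correct = sum(
        1
        for j in test
        for k in not_traveled
        if scores[j] >= scores[k]
    )
    return correct / pair_count


# Assumed toy scores for five packages.
scores = {"a": 0.9, "b": 0.5, "c": 0.7, "d": 0.1, "e": 0.3}
doa_good = individual_doa(scores, train={"a"}, test={"c"})
doa_poor = individual_doa(scores, train={"a"}, test={"e"})
```

Under this formulation a random ranking would score around 0.5 and an ideal one 1.0; an overall figure can be obtained by averaging the individual values over all tourists.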

Recommendation performances
Top-K
• Recall value of the recommended top-K percent of packages
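
A minimal sketch of recall over the top-K percent of each ranked list. It assumes recall is pooled over all tourists and that the cutoff is rounded up to a whole number of packages; both details are assumptions, and the names are hypothetical.

```python
import math


def recall_at_top_k_percent(rankings, test, k_percent):
    """Fraction of test packages found in the top-K percent of each list.

    rankings: tourist -> list of package ids, best first
    test:     tourist -> set of packages actually traveled in the test set
    """
    hits = 0
    total = 0
    for tourist, ranked in rankings.items():
        cutoff = math.ceil(len(ranked) * k_percent / 100)
        top = set(ranked[:cutoff])
        relevant = test.get(tourist, set())
        hits += len(top & relevant)
        total += len(relevant)
    return hits / total if total else 0.0


# Assumed toy rankings: the test package is ranked first for u1, last for u2.
rankings = {"u1": ["a", "b", "c", "d"], "u2": ["d", "c", "b", "a"]}
test = {"u1": {"a"}, "u2": {"a"}}
recall_50 = recall_at_top_k_percent(rankings, test, k_percent=50)
```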

Recommendation performances
User study
• Mean: mean rating
• SD: standard deviation

Recommendation performances
Computational performances
• Run all the algorithms on the same platform
• For the topic model-based algorithms, the authors run Gibbs sampling for 100 iterations, since similar results are already observed at that point

Recommendation performances
Summary
• Cocktail performs better than the other methods on all the evaluation metrics
• Cocktail- and TTER have the second-best performances
• UCF and BSVD
  • Traditional collaborative filtering methods
  • Do not perform well
  • Cannot recommend new packages for tourists
  • Due to the unique characteristics of the travel data

The Evaluation of the TRAST Model
• Experiments
1. Use K-means clustering for grouping given tourists
2. Find the tourists who would like to travel with a given tourist
• Use 7,083 travel groups to train the model
• Select 76 packages from the test set (Table 3)

Recommendation for Travel Groups

Conclusion
• The authors present a study on personalized travel package recommendation. Specifically, they first analyzed the unique characteristics of travel packages and developed the TAST model, a Bayesian network for travel package and tourist representation.
• The TAST model can discover the interests of the tourists and extract the spatial-temporal correlations among landscapes.
• Furthermore, they extended the TAST model to the TRAST model, which can capture the relationships among tourists in each travel group.
• Finally, an empirical study was conducted on real-world travel data.
• Experimental results demonstrate that the TAST model can capture the unique characteristics of the travel packages and that the cocktail approach leads to better travel package recommendation performance.

Thank you.