Web Markov Skeleton Processes and Applications
Zhi-Ming Ma
10 June 2013, St. Petersburg
Email: mazm@amt.ac.cn
http://www.amt.ac.cn/member/mazhiming/index.html
• Y. Liu, Z. M. Ma, C. Zhou: Web Markov Skeleton Processes and Their Applications, Tohoku Math. J. 63 (2011), 665-695
• Y. Liu, Z. M. Ma, C. Zhou: Further Study on Web Markov Skeleton Processes, in Stochastic Analysis and Applications to Finance, World Scientific, 2012
• C. Zhou: Some Results on Mirror Semi-Markov Processes, manuscript
Web Markov Skeleton Process Markov Chain conditionally independent given
Define the WMSP by:
Simple WMSP: many simple WMSPs are non-Markov processes.
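As an illustration of the construction, here is a minimal simulation sketch. It assumes the usual jump-process form: a Markov chain X_n gives the visited states, nonnegative holding times Y_n are drawn independently of each other once the chain is given, and Z_t = X_n on [τ_n, τ_{n+1}) with τ_n = Y_0 + ... + Y_{n-1}. The function names, the transition matrix and the rate formula in `mirror_holding` are hypothetical and chosen only for the example. The holding-time sampler is allowed to look at the previous, current and next state, so one function covers the semi-Markov style (current and next state) and the mirror semi-Markov style (previous and current state).

```python
import bisect
import random


def simulate_skeleton(P, sample_holding, x0, n_jumps, rng):
    """Return (states, jump_times); Z_t = states[n] on [jump_times[n], jump_times[n+1])."""
    # Step 1: draw the Markov chain X_0, ..., X_{n_jumps} from the transition matrix P.
    states = [x0]
    for _ in range(n_jumps):
        row = P[states[-1]]
        states.append(rng.choices(range(len(row)), weights=row)[0])
    # Step 2: given the whole chain, draw the holding times Y_n one by one.
    jump_times = [0.0]
    for n in range(n_jumps):
        prev_state = states[n - 1] if n > 0 else None
        y = sample_holding(prev_state, states[n], states[n + 1], rng)
        jump_times.append(jump_times[-1] + y)
    return states, jump_times


def value_at(states, jump_times, t):
    """Evaluate the piecewise-constant, right-continuous path at time t >= 0."""
    if t < 0:
        raise ValueError("t must be nonnegative")
    # Index of the last jump time <= t; past the final jump the last state is returned.
    n = bisect.bisect_right(jump_times, t) - 1
    return states[n]


def mirror_holding(prev_state, cur_state, next_state, rng):
    # Hypothetical choice: exponential holding time whose rate depends on the
    # previous and current state only (next_state is deliberately ignored).
    rate = 1.0 if prev_state is None else 1.0 + prev_state + cur_state
    return rng.expovariate(rate)


if __name__ == "__main__":
    P = [[0.1, 0.6, 0.3], [0.5, 0.0, 0.5], [0.4, 0.4, 0.2]]  # made-up example chain
    states, times = simulate_skeleton(P, mirror_holding, 0, 10, random.Random(1))
    print(states)
    print([round(t, 3) for t in times])
```

Because the holding time in `mirror_holding` depends on where the process came from, the pair (current state, elapsed time) is not enough to predict the future of Z, which is one way to see why such processes are generally not Markov.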
[LMZ 2011 a, b]
A mirror semi-Markov process is not a Markov skeleton process in the sense of Hou and Liu, i.e., it does not satisfy
Multivariate Point Process associated with WMSP
Let
Consequently Define We can prove that where
where
Time-homogeneous mirror semi-Markov processes are all independent of n
• More properties of time homogeneity
• Renewal theory
• Contribution probability
• Staying times and first entry times
• Limit distribution for semi-Markov processes
• Limit distribution for mirror semi-Markov processes
• Reconstruction of mirror semi-Markov processes
Why is it called a Web Markov Skeleton Process?
PageRank, a ranking algorithm used by the Google search engine (1998, Sergey Brin and Larry Page, Stanford University).
From a probabilistic point of view, PageRank is the stationary distribution of a Markov chain.
A simple Markov skeleton process.
Markov chain describing surfing behavior
Web surfers usually have two basic ways to access web pages:
1. with probability α, they visit a web page by clicking a hyperlink;
2. with probability 1 - α, they visit a web page by typing its URL.
where
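The two-way surfing model can be sketched as a power iteration for the stationary distribution of the resulting Markov chain. This is a minimal sketch under stated assumptions, not the formula from the slide: the typed-URL jump is taken to be uniform over all pages, pages without out-links are treated as jumping uniformly, and the default α = 0.85 is only a conventional choice. The function name `pagerank` and the adjacency-list input format are hypothetical.

```python
def pagerank(links, alpha=0.85, tol=1e-12, max_iter=1000):
    """links[i] lists the pages that page i links to; returns the stationary distribution."""
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # With probability 1 - alpha the surfer types a URL (assumed uniform over pages).
        new = [(1.0 - alpha) / n] * n
        for i, outs in enumerate(links):
            if outs:
                # With probability alpha the surfer clicks one of the out-links of page i.
                share = alpha * rank[i] / len(outs)
                for j in outs:
                    new[j] += share
            else:
                # Assumption: a page without out-links behaves like a uniform jump.
                for j in range(n):
                    new[j] += alpha * rank[i] / n
        if sum(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
    return rank


if __name__ == "__main__":
    # Pages 0 and 1 both link to page 2, which links back to page 0.
    print(pagerank([[2], [2], [0]]))
```

In the small example, page 2 receives links from both other pages and ends up with the largest stationary probability, while page 1, which nobody links to, keeps only the mass coming from typed URLs.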
Weak points of PageRank
• It uses only the static web graph structure.
• It reflects only the will of web managers and ignores the will of users, e.g. the staying time of users on a web page.
• It cannot effectively combat spam and junk pages.
Data Mining
Browsing Process • Markov property • Time-homogeneity
Computation of the Stationary Distribution
– Stationary distribution:
– is the mean of the staying time on page i. The more important a page is, the longer the staying time on it.
– is the mean of the first re-visit time at page i. The more important a page is, the shorter the re-visit time and the higher the visit frequency.
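The relation between staying times, re-visit times and the stationary distribution can be illustrated numerically. The sketch below relies on a standard fact for ergodic continuous-time jump processes rather than on the slide's own notation: if π̃ is the stationary distribution of the embedded chain and m_i the mean staying time on page i, then π_i is proportional to π̃_i m_i, and equivalently π_i = m_i / (mean time between successive entries into i). All function names and the input numbers are hypothetical, and the power iteration assumes an aperiodic, irreducible embedded chain.

```python
def embedded_stationary(P, tol=1e-13, max_iter=100000):
    """Stationary distribution of the embedded chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def time_stationary(P, mean_stay):
    """Long-run fraction of time spent on each page: pi_i proportional to pi_tilde_i * m_i."""
    pi_tilde = embedded_stationary(P)
    weights = [p * m for p, m in zip(pi_tilde, mean_stay)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_revisit_times(P, mean_stay):
    """Mean time between successive entries into each page (continuous time)."""
    pi_tilde = embedded_stationary(P)
    mean_step = sum(p * m for p, m in zip(pi_tilde, mean_stay))  # mean time per jump
    return [mean_step / p for p in pi_tilde]


if __name__ == "__main__":
    P = [[0.5, 0.5], [0.5, 0.5]]   # made-up embedded chain
    mean_stay = [1.0, 3.0]         # made-up mean staying times
    print(time_stationary(P, mean_stay))
    print(mean_revisit_times(P, mean_stay))
```

With these numbers both pages are visited equally often, yet the page with the longer mean staying time receives three times the stationary mass, which is the effect the slide describes.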
BrowseRank: Letting Web Users Vote for Page Importance
Yuting Liu, Bin Gao, Tie-Yan Liu, Ying Zhang, Zhiming Ma, Shuyuan He, and Hang Li
July 23, 2008, Singapore, the 31st Annual International ACM SIGIR Conference on Research & Development on Information Retrieval.
Best student paper!
• "BrowseRank the next PageRank," says Microsoft
• Browsing processes will be a basic mathematical tool in Internet information retrieval
Beyond:
-- General framework of browsing processes?
-- How about inhomogeneous processes?
-- Marked point processes
-- Mobile Web: not really Markovian
ExtBrowseRank and semi-Markov processes
MobileRank and Mirror Semi-Markov Processes
Web Markov Skeleton Process
[10] B. Gao, T. Liu, Z. M. Ma, T. Wang, and H. Li: A general Markov framework for page importance computation, in Proceedings of CIKM 2009.
[11] B. Gao, T. Liu, Y. Liu, T. Wang, Z. M. Ma, and H. Li: Page Importance Computation based on Markov Processes, Information Retrieval, online first: http://www.springerlink.com/content/7mr7526x21671131
Research on Random Complex Networks and Information Retrieval: In recent years we have been involved in research on random complex networks and information retrieval. Below are some of the related outputs of our group (in collaboration with Microsoft Research Asia).
More properties of time homogeneity
right continuous, piecewise constant functions
Theorem [LMZ 2011 a] for all n Theorem [LMZ 2011 b] General case
The statistical properties of a time-homogeneous mirror semi-Markov process are completely determined by:
Reconstruction of Mirror Semi-Markov Processes Given: , , Theorem [LMZ 2011 b] We can construct such that
uniformly
Limit distribution for semi-Markov process
Limit distribution for mirror semi-Markov processes
Staying times and first entry times Staying time on the state j: Distribution Expectation First entry time into the state k: where into k Distribution Expectation
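For the expectation of the first entry time, a small numerical sketch may help. It covers only the ordinary semi-Markov case and rests on assumptions not taken from the slide: the first entry time into k is understood as the first jump time at which the process enters k, the embedded chain has transition matrix P, and m_i denotes the mean staying time on state i. Under these assumptions the expected first entry times h_i solve h_i = m_i + Σ_{j≠k} P_ij h_j for i ≠ k. The function name and all numbers are hypothetical; h_k is set to 0 purely as a convention.

```python
def mean_first_entry_times(P, mean_stay, k, tol=1e-12, max_iter=100000):
    """Expected time to first enter state k from each start state (0 for k itself)."""
    n = len(P)
    h = [0.0] * n
    for _ in range(max_iter):
        # Fixed-point iteration on h_i = m_i + sum_{j != k} P[i][j] * h_j for i != k.
        new = [
            0.0 if i == k
            else mean_stay[i] + sum(P[i][j] * h[j] for j in range(n) if j != k)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, h)) < tol:
            return new
        h = new
    raise RuntimeError("no convergence; state k may be unreachable from some state")


if __name__ == "__main__":
    P = [[0.5, 0.5], [1.0, 0.0]]   # made-up embedded chain
    mean_stay = [1.5, 2.0]         # made-up mean staying times
    print(mean_first_entry_times(P, mean_stay, k=1))
```

In the two-state example the process leaves state 0 for state 1 with probability 0.5 at each jump, so on average it spends two stays of mean length 1.5 in state 0 before entering state 1. For a mirror semi-Markov process the staying time also depends on the previous state, so this simple recursion over single states would not apply directly.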
Contribution probability from state i to state j:
Renewal Theory Proposition
Renewal Equation [LMZ 2011 a]
Renewal functional: where Below are the results on the renewal functional [LMZ 2011 a]
Thank you!
Time Homogeneous WMSP
right continuous, piecewise constant functions
More properties of time homogeneity Theorem [LMZ 2011 b] for all
Reconstruction of WMSP [LMZ 2011 b] Write is expressed as
• Ying Bao, Gang Feng, Tie-Yan Liu, Zhi-Ming Ma, and Ying Wang: Ranking Websites, a Probabilistic View, Internet Mathematics, Volume 3 (2007), Issue 3
• G. Feng, T. Y. Liu, Ying Wang, Y. Bao, Z. M. Ma et al.: AggregateRank: Bring Order to Web Sites, 29th Annual International Conference on Research & Development on Information Retrieval (SIGIR '06)