Wikipedia To trust or not, is hardly the question! Sai Moturu
[Diagram labels: Trust, Quality, Reach, Popularity]
"We're never so vulnerable than when we trust someone but paradoxically, if we cannot trust, neither can we find love or joy" - Walter Anderson
How much we can trust is the right question…
Agenda
• Review two articles
• Briefly summarize other publications
Content quality
• What are the hallmarks of consistently good information?
• Objectivity: unbiased information
• Completeness: self-explanatory
• Pluralism: not restricted to a particular viewpoint
• Define propositions of trust

Propositions of trust
UML Model for Wikipedia
Macro-areas of analysis
• Six macro-areas: quality of user, user distribution and leadership, stability, controllability, quality of editing, and importance of an article.
• Using the ten propositions, 50 sources of trust evidence are identified.
Logic conditions
• Necessary to control the meaning of each trust factor in relation to the others
• IF stability is high AND (length is short OR edit is low OR importance is low) THEN warning
• IF leadership is high AND dictatorship is high THEN warning
• IF length is high AND importance is low THEN warning
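The three conditions above can be written as plain boolean rules. This is a minimal sketch, not the authors' implementation: the `TrustFactors` class, its field names, and the warning labels are hypothetical, and how each factor is judged "high", "low", or "short" is not specified on the slide, so the flags are taken as given inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrustFactors:
    """Qualitative levels for one article (hypothetical field names)."""
    stability_high: bool = False
    length_short: bool = False
    length_high: bool = False
    edit_low: bool = False
    importance_low: bool = False
    leadership_high: bool = False
    dictatorship_high: bool = False


def trust_warnings(f: TrustFactors) -> List[str]:
    """Return one label per logic condition that fires."""
    warnings = []
    # Rule 1: high stability combined with a short, rarely edited,
    # or unimportant article.
    if f.stability_high and (f.length_short or f.edit_low or f.importance_low):
        warnings.append("stability")
    # Rule 2: strong leadership combined with high dictatorship.
    if f.leadership_high and f.dictatorship_high:
        warnings.append("leadership")
    # Rule 3: a long article of low importance.
    if f.length_high and f.importance_low:
        warnings.append("length")
    return warnings
```

For example, `trust_warnings(TrustFactors(stability_high=True, edit_low=True))` flags only the stability rule, since high stability is not reassuring when the article has simply attracted few edits.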
Calculation of Trust
Evaluation
• Featured articles vs. standard articles
Cluster Analysis
Models
• Basic
  • The better the authors, the better the article quality
• PeerReview
  • Assumption: a contributor reviews the content before modifying it, thereby approving the content that he/she does not edit
Models
• ProbReview
  • Improved assumption: a contributor may not review the entire article before modifying it
  • The farther a word is from another that the author has written, the lower the probability that he/she has read it
  • In conflicts, the higher probability is considered
  • Probability is modeled as a monotonically decaying function of the distance between the words
• Naïve
  • The longer the article is, the better its quality
  • Used as a baseline for comparison
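The review-probability idea can be sketched as follows. This is an illustration under stated assumptions: `inverse_decay` is a hypothetical decay function chosen only because it decreases monotonically with distance (it is not one of the decay schemes evaluated in the paper), positions are 0-based word indices, and an author who wrote nothing in the article is assumed to have reviewed nothing.

```python
from typing import Callable, Iterable, List


def inverse_decay(distance: int) -> float:
    """Hypothetical monotonically decaying function: 1 at distance 0."""
    return 1.0 / (1.0 + distance)


def review_probabilities(
    n_words: int,
    authored_positions: Iterable[int],
    decay: Callable[[int], float] = inverse_decay,
) -> List[float]:
    """Probability that the author has read each word of the article.

    When several authored words are nearby, the highest probability
    wins, which corresponds to using the closest authored word.
    """
    authored = sorted(set(authored_positions))
    if not authored:
        # Assumption: no authored words means nothing was reviewed.
        return [0.0] * n_words
    return [
        max(decay(abs(i - j)) for j in authored)
        for i in range(n_words)
    ]
```

With a five-word article and one authored word at position 1, the probabilities fall off on both sides of that word: 0.5, 1.0, 0.5, then lower values further away.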
Iterative computation
1. Initialize all quality and authority values equally
2. For each iteration
  • Use authority values from previous iteration to compute quality
  • Use quality values to compute authority
  • Normalize all quality and authority values
3. Repeat step 2 until convergence (alternatives: repeat until difference is very small or until maximum iterations have been reached)
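The loop above can be sketched in a few lines. The update rule here is an assumption made for illustration: article quality is taken as the contribution-weighted sum of author authority, and author authority as the contribution-weighted sum of article quality. The `contrib` matrix, the function names, and the sum-to-one normalization are likewise hypothetical; the slides do not give the exact formulas.

```python
from typing import List, Tuple


def _normalize(values: List[float]) -> List[float]:
    """Scale values so they sum to 1 (left unchanged if the sum is 0)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


def iterate_quality_authority(
    contrib: List[List[float]],
    max_iters: int = 100,
    tol: float = 1e-9,
) -> Tuple[List[float], List[float]]:
    """contrib[i][j]: amount of article i attributed to author j."""
    n_articles, n_authors = len(contrib), len(contrib[0])
    # Step 1: initialize all values equally.
    quality = [1.0 / n_articles] * n_articles
    authority = [1.0 / n_authors] * n_authors
    for _ in range(max_iters):
        # Step 2a: quality from the previous iteration's authority.
        new_quality = [
            sum(contrib[i][j] * authority[j] for j in range(n_authors))
            for i in range(n_articles)
        ]
        # Step 2b: authority from the freshly computed quality.
        new_authority = [
            sum(contrib[i][j] * new_quality[i] for i in range(n_articles))
            for j in range(n_authors)
        ]
        # Step 2c: normalize both vectors.
        new_quality = _normalize(new_quality)
        new_authority = _normalize(new_authority)
        # Step 3: stop once the largest change is very small.
        delta = max(
            max(abs(a - b) for a, b in zip(new_quality, quality)),
            max(abs(a - b) for a, b in zip(new_authority, authority)),
        )
        quality, authority = new_quality, new_authority
        if delta < tol:
            break
    return quality, authority
```

The `max_iters` and `tol` parameters correspond to the two alternative stopping criteria listed in step 3.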
Evaluation
• Use a set of articles on countries that have been assigned quality labels by Wikipedia's Editorial Team
• Preprocessing:
  • Bot revisions were removed from the analysis.
  • Consecutive edits by the same user were collapsed, keeping only the final edit.
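The two preprocessing steps could look like the sketch below. The `Revision` record and its fields are hypothetical, and the order of the steps is an assumption: bots are dropped first, so a user's edits separated only by a bot revision are also treated as consecutive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    """One entry of an article's revision history (hypothetical shape)."""
    user: str
    text: str
    is_bot: bool = False


def preprocess(revisions: List[Revision]) -> List[Revision]:
    """Drop bot revisions, then keep only the last edit of each run
    of consecutive edits by the same user."""
    humans = [r for r in revisions if not r.is_bot]
    collapsed: List[Revision] = []
    for rev in humans:
        if collapsed and collapsed[-1].user == rev.user:
            # Same user as the previous kept revision: keep the later one.
            collapsed[-1] = rev
        else:
            collapsed.append(rev)
    return collapsed
```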
Evaluation metrics
• Normalized discounted cumulative gain at top k (NDCG@k)
  • Suited for ranked articles that have multiple levels of assessment
• Spearman's rank correlation
  • Relevant for comparing the agreement between two rankings of the same set of objects
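Both metrics are short to compute. The sketch below uses one common NDCG formulation (gain 2^rel - 1, log2 position discount) and the textbook Spearman formula for rankings without ties; whether the evaluated paper used exactly these variants is an assumption. Relevance grades are assumed to be non-negative integers, for example quality labels mapped to 0, 1, 2, and so on.

```python
import math
from typing import List, Sequence


def dcg_at_k(relevances: Sequence[int], k: int) -> float:
    """Discounted cumulative gain over the top-k items, in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[int], k: int) -> float:
    """DCG@k divided by the DCG@k of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def spearman_rho(rank_a: List[int], rank_b: List[int]) -> float:
    """Spearman's rank correlation for two tie-free rankings
    of the same objects (rank_a[i] and rank_b[i] rank object i)."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A model that ranks articles in exactly the order of their assessed quality scores 1.0 on NDCG@k, and two identical rankings give a Spearman correlation of 1.0, while fully reversed rankings give -1.0.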
Results
Conclusions
• ProbReview works best with decay scheme 2 or 3.
• Article length seems to be correlated with article quality
• Adding article length to the Basic and PeerReview models showed some improvement, but ProbReview did not benefit
Summary
• Revision trust model may help address
  • Article trust
  • Fragment trust
  • Author trust
• A dynamic Bayesian network is used to model the evolution of article trust over revisions
• Wikipedia featured articles, clean-up articles and normal articles are used for evaluation
Results
Summary
• Uses revision history as well as the reputation of the contributing authors
• Assigns trust to text
Summary
• Propose the use of a trust tab in Wikipedia
• Link-ratio: ratio between the number of citations and the number of non-cited occurrences of the encyclopedia term
• Evaluation: compare link-ratio values for featured, normal and clean-up articles
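A rough way to compute the link-ratio over raw wikitext is sketched below. Several things here are assumptions rather than the authors' procedure: a "citation" is taken to be a wikilink of the form `[[Term]]` or `[[Term|label]]`, matching is case-insensitive, a "non-cited occurrence" is any other whole-word mention of the term, and the function returns `None` when the term is never mentioned without a link.

```python
import re
from typing import Iterable, Optional


def link_ratio(term: str, documents: Iterable[str]) -> Optional[float]:
    """Linked occurrences of `term` divided by its unlinked occurrences."""
    escaped = re.escape(term)
    # [[term]] or [[term|display label]]
    link_re = re.compile(
        r"\[\[\s*" + escaped + r"\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
    )
    plain_re = re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)
    linked = 0
    unlinked = 0
    for text in documents:
        linked += len(link_re.findall(text))
        # Remove the links first so they are not counted twice.
        remainder = link_re.sub(" ", text)
        unlinked += len(plain_re.findall(remainder))
    if unlinked == 0:
        # Assumption: the ratio is left undefined in this case.
        return None
    return linked / unlinked
```

A term that other articles usually link to when they mention it gets a high ratio, which is the signal compared across featured, normal and clean-up articles.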
Summary
• Propose a content-driven reputation system for authors
• Authors gain reputation when their work is preserved by subsequent authors and lose reputation when edits are undone or quickly rolled back
• Evaluation: edits by low-reputation authors have a larger-than-average probability of being of poor quality, as judged by human observers, and of being undone by later editors
Summary
• A different question: what are the controversial articles?
• Uses edit and collaboration history
• Two models: Basic and Contributor Rank
• Contributor Rank model tries to differentiate between disputes due to the article and those due to the aggressiveness of the contributors, with the former being the one that is to be measured
• Evaluation: identification of labeled controversial articles
Conclusions
• Interesting area to work on
• Different angles to consider, and different questions too
• Data is easily available and has lots of relevant features
• Articles classified by Wikipedia's Editorial Team help evaluation
• Great scope for more work in this area
• I want to look at this from the health perspective
Thank You Feb 29, 2008