Bibliographic Coupling, Co-Citation and Mapping
Peter Ingwersen
Det Informationsvidenskabelige Akademi, Denmark
University College, Oslo, Norway – 2010
Agenda
- Bibliographic coupling
- Example
- Author co-citation
- Example, old vs. newer (Dialog)
- Example, WoS, quick 'n' dirty
- Maps
Ingwersen 2008
BIBLIOGRAPHIC COUPLING
[Diagram: Doc 1, Doc 2 and Doc 3 each carry Ref X on their reference lists]
Documents 1-3 are coupled by Ref X. They can be selected in WoS/Dialog if X is known: s CR=X
Example of bibliographic coupling
- One article has 23 references on its reference list
- Another has 54 references on its list
- There are 4 references in common. Strength: 4 / (23 + 54 - 4) = 4 / 73 ≈ 0.055 (max. = 1)
- Jaccard similarity measure
- CA – CY/CU – CW could also be used
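The coupling strength in this example is a Jaccard ratio: shared references divided by the union of the two reference lists. A minimal Python sketch, in which the function name `coupling_strength` and its parameter names are illustrative assumptions rather than anything from the slides:

```python
def coupling_strength(refs_a: int, refs_b: int, shared: int) -> float:
    """Jaccard coupling strength: shared references over the union of both lists."""
    union = refs_a + refs_b - shared
    if union <= 0:
        raise ValueError("the two reference lists must not both be empty")
    return shared / union


# Slide example: 23 and 54 references, 4 of them in common.
strength = coupling_strength(23, 54, 4)
print(round(strength, 3))  # 0.055
```

Two articles with identical reference lists reach the maximum of 1; two with nothing in common score 0.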
CO-CITATION
Documents X and Y are CO-CITED twice, on the reference lists of A and B
(Doc. A is coupled bibliographically to Doc. B by X & Y)
[Diagram: Doc A and Doc B each list Ref X and Ref Y, pointing to Doc X and Doc Y]
Co-citation illustration: KNOWN CITATIONS – known cited documents

Set  Items  Description
S4   7      CA=JOSS PC(S)CY=1988(S)CW=NATURE
S8   13     CA=FABIAN AC(S)CY=1987(S)CW=NATURE
S9   6      S4 AND S8

COSINE: SIM = 6 / (√7 × √13) = 0.63
JACCARD: SIM = 6 / (7 + 13 - 6) = 0.43
These are BINARY calculations (document level)
WoS: can be done on cited author or cited work
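Both binary measures need only three counts: the number of documents citing each work and the number citing both. A short Python sketch under that assumption; the function names `jaccard` and `cosine_binary` are hypothetical helpers, not Dialog or WoS commands:

```python
import math


def jaccard(n_x: int, n_y: int, n_both: int) -> float:
    """Documents citing both X and Y, divided by documents citing X or Y."""
    return n_both / (n_x + n_y - n_both)


def cosine_binary(n_x: int, n_y: int, n_both: int) -> float:
    """Binary cosine: overlap divided by the geometric mean of the two set sizes."""
    return n_both / math.sqrt(n_x * n_y)


# Set sizes from the slide: S4 = 7 items, S8 = 13 items, S4 AND S8 = 6 items.
print(round(cosine_binary(7, 13, 6), 2))  # 0.63
print(round(jaccard(7, 13, 6), 2))        # 0.43
```

Cosine is never smaller than Jaccard on the same counts, which is why the two figures should not be compared with each other, only within one measure.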
Co-citation illustration: KNOWN AUTHORSHIPS – citing document level
AUTHOR CO-CITATION SIMILARITY - 1990:

Set  Items  Description
S1   279    CA=BELKIN NJ
S2   86     CA=INGWERSEN P
S3   955    CA=SALTON G
S4   49     S1 AND S2
S5   70     S1 AND S3
S6   18     S2 AND S3
One creates an overlap matrix (ITEMS level)

                   P Ingwersen (86)   G Salton (955)
NJ Belkin (279)    49                 70
P Ingwersen (86)                      18
Co-citation illustration – Similarity matrix: KNOWN AUTHORSHIPS – Binary, citing document level – Jaccard & Cosine

SIM(BELK/INGW): 49 / (279 + 86 - 49) = .16    Cosine: 49 / (16.7 × 9.3) = .32
SIM(BELK/SAL):  70 / (279 + 955 - 70) = .06   Cosine: 70 / (16.7 × 30.9) = .14
SIM(SAL/INGW):  18 / (955 + 86 - 18) = .02    Cosine: 18 / (30.9 × 9.3) = .06

THIS WAS IN 1990. WHAT ABOUT 2000?
One creates a similarity matrix (Jaccard)

                   P Ingwersen (86)   G Salton (955)
NJ Belkin (279)    .16                .06
P Ingwersen (86)                      .02
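Turning the overlap counts into a similarity matrix is mechanical, so it can be sketched in a few lines of Python. The dictionaries `items` and `overlap` simply hold the 1990 set sizes from the Dialog search; the names and the function `similarity_matrix` are assumptions made for this illustration:

```python
import math
from itertools import combinations

# Citing-document counts per cited author (sets S1-S3, up to 1990).
items = {"Belkin": 279, "Ingwersen": 86, "Salton": 955}

# Documents citing both authors of a pair (sets S4-S6).
overlap = {
    frozenset({"Belkin", "Ingwersen"}): 49,
    frozenset({"Belkin", "Salton"}): 70,
    frozenset({"Ingwersen", "Salton"}): 18,
}


def similarity_matrix(items, overlap):
    """Return {(a, b): (jaccard, cosine)} for every author pair, rounded to 2 decimals."""
    result = {}
    for a, b in combinations(items, 2):
        both = overlap[frozenset({a, b})]
        jac = both / (items[a] + items[b] - both)
        cos = both / math.sqrt(items[a] * items[b])
        result[(a, b)] = (round(jac, 2), round(cos, 2))
    return result


matrix = similarity_matrix(items, overlap)
for (a, b), (jac, cos) in matrix.items():
    print(f"{a}/{b}: Jaccard {jac:.2f}, Cosine {cos:.2f}")
```

Swapping in the counts for a later period gives the updated matrix without changing the function.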
At item level, Cosine and Jaccard are both OK: similarity in a BINARY CONTEXT
[Diagram: two overlapping sets, Ingwersen and Belkin; the overlap is labelled Agreement]
Co-citation illustration: KNOWN AUTHORSHIPS - 1990-2000

Set  Items  Citations  Name
S1   541    933        CA=BELKIN NJ
S2   258    382        CA=INGWERSEN P
S3   1365   2417       CA=SALTON G
S4   126    559        S1 AND S2
S5   175    680        S1 AND S3
S6   50     204        S2 AND S3
Co-citation illustration: KNOWN AUTHORSHIPS – Binary, citing document level; Jaccard & Cosine

SIM(BELKIN/INGWERSEN): 126 / (541 + 258 - 126) = .19    Cosine: 126 / (23.3 × 16) = .34
SIM(BELKIN/SALTON):    175 / (541 + 1365 - 175) = .10   Cosine: 175 / (23.3 × 36.9) = .20
SIM(SALTON/INGWERSEN): 50 / (1365 + 258 - 50) = .03     Cosine: 50 / (36.9 × 16) = .08
Co-citation illustration: KNOWN AUTHORSHIPS – non-binary, citation level; only Cosine (or Pearson)

Similarity is calculated with this formula, where cit_{b,d} is the number of citations document d gives to author b:

\[ \mathrm{Sim}(b, i) = \frac{\sum_d \mathrm{cit}_{b,d}\,\mathrm{cit}_{i,d}}{\sqrt{\sum_d \mathrm{cit}_{b,d}^2}\;\sqrt{\sum_d \mathrm{cit}_{i,d}^2}} \]

E.g.: (3×2 + 4×1 + 3×0) / (√(3² + 4² + 3²) × √(2² + 1² + 0²)) = 10 / (√34 × √5) = 10 / √170 ≈ 10 / 13.04 ≈ .77

For Belkin: 10 citations; Ingwersen: 3 citations
No. of documents co-citing: 2, with 3×2 + 4×1 = 10 co-citations
No. of documents citing B: 3, with 10 citations in total
No. of documents citing I: 2, with 3 citations in total
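At citation level each author becomes a vector with one entry per citing document, holding the number of citations that document gives the author. A minimal Python sketch of the cosine on the worked example; the function name `cosine` and the variable names are assumptions for illustration. With √5 ≈ 2.24 the result comes out at about 0.77:

```python
import math


def cosine(u, v):
    """Cosine similarity between two citation-count vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("both vectors must cover the same citing documents")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Citations per citing document: three documents cite Belkin, two of them also cite Ingwersen.
belkin = [3, 4, 3]
ingwersen = [2, 1, 0]

similarity = cosine(belkin, ingwersen)
print(round(similarity, 2))  # 0.77
```

If every count is reduced to 0 or 1, the same function gives the binary cosine used at document level.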
The trend 1978-2000
- The Belkin-Salton co-citation ratio has increased from 0.06 to 0.10 in the 90s
- The Belkin-Ingwersen co-citation ratio has slightly increased, from 0.16 to 0.19
- Both pairs are thus closer to one another, seen from colleagues' views!!
- About 49 % of the documents citing Ingwersen (126/258) co-cite him with Belkin after 1990. Up to 1990 this ratio was larger: 49/86 = 57 %.
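The last two figures are plain proportions: of all documents citing Ingwersen, how many also cite Belkin. A small Python sketch, with `co_citation_share` as a hypothetical helper name:

```python
def co_citation_share(co_citing: int, citing: int) -> float:
    """Percentage of an author's citing documents that also cite the other author."""
    return 100 * co_citing / citing


up_to_1990 = co_citation_share(49, 86)
after_1990 = co_citation_share(126, 258)
print(round(up_to_1990))  # 57
print(round(after_1990))  # 49
```

The share is asymmetric: the same 126 co-citing documents are a much smaller part of the 541 documents citing Belkin.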
Co-occurrence applications
- By co-citations: the perceived connection between people:
- Sharing expertise
- Check consistency by ageing measures of their work
- By bibliographic coupling: people are connected by using the same work!!
Co-occurrence applications – 2
- Data mining in databases
- Creation of maps of scientific domains
- Demonstrate collaboration between:
- People
- Countries
- Institutions
- Topical areas
- ….
Exercise in co-citation
1. Update the Belkin, Salton, Ingwersen co-citation analysis for the period 2001-2011, by carrying it out at document level (binary analysis) via WoS.
WoS: Cited Ref Search
Ingwersen published 76 papers, cited
153 items cited PI 165 times (2001-)
Belkin: 126 papers give 148 cites
Co-citations Belkin-Ingwersen 2001-11
The one record co-citing I & B
The two citations to Ingwersen
Limitations
- WoS sometimes does NOT carry out all the analyses thoroughly
- With larger sets of citing papers (and citations) it is only possible to calculate the co-occurrence at DOCUMENT LEVEL