Bibliographic Coupling, Co-Citation and Mapping (Peter Ingwersen)

Bibliographic Coupling, Co-Citation and Mapping
Peter Ingwersen
Det Informationsvidenskabelige Akademi, Denmark
University College, Oslo, Norway – 2010

Agenda
• Bibliographic coupling
• Example
• Author co-citation
• Example, old vs. newer (Dialog)
• Example, WoS, quick 'n' dirty
• Maps
Ingwersen 2008

BIBLIOGRAPHIC COUPLING
[Diagram: Doc 1, Doc 2 and Doc 3 each carry Ref X on their reference lists]
Documents 1-3 are coupled by Ref X … they can be selected in WoS/Dialog if X is known: s CR=X

Example of bibliographic coupling
• One article has 23 references on its reference list
• Another has 54 references on its list
• There are 4 references in common. Strength: 4 / (23 + 54 - 4) = 4 / 73 ≈ 0.055 (max. = 1)
• Jaccard similarity measure
• CA – CY/CU – CW could also be used
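The coupling strength in this example can be reproduced from two reference lists. The following is a minimal sketch, assuming each reference list is available as a set of reference identifiers. The identifiers and the function name `coupling_strength` are made up for illustration; only the list sizes (23, 54, 4 in common) come from the slide.

```python
def coupling_strength(refs_a: set, refs_b: set) -> float:
    """Jaccard coupling strength: shared references / all distinct references."""
    union = refs_a | refs_b
    if not union:
        return 0.0  # two empty reference lists share nothing
    return len(refs_a & refs_b) / len(union)


# Hypothetical identifiers: 4 shared references, 19 only in A, 50 only in B.
shared = {f"shared-{n}" for n in range(4)}
refs_a = shared | {f"a-{n}" for n in range(19)}  # 23 references in total
refs_b = shared | {f"b-{n}" for n in range(50)}  # 54 references in total

strength = coupling_strength(refs_a, refs_b)
print(round(strength, 3))
```

Working on sets rather than counts means the overlap is found by the code itself, which is closer to what one does with exported reference lists.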

CO-CITATION
Documents X and Y are CO-CITED twice: on the reference lists of A and B
(Doc. A is coupled bibliographically to Doc. B by X & Y)
[Diagram: DOC A lists Ref X and Ref Y; DOC B lists Ref X and Ref Y; cited documents: Doc X, Doc Y]

Co-citation illustration: KNOWN CITATIONS – known cited documents

Set  Items  Description
S4       7  CA=JOSS PC(S)CY=1988(S)CW=NATURE
S8      13  CA=FABIAN AC(S)CY=1987(S)CW=NATURE
S9       6  S4 AND S8

COSINE: SIM = 6 / (√7 x √13) = 0.63
JACCARD: SIM = 6 / (7 + 13 - 6) = 0.43
These are BINARY calculations (document level)
WoS: can be done on cited author or cited work
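Both binary measures need only three counts: the size of each set and the size of their intersection. A small sketch using the set sizes from this search (7, 13 and 6); the function names are illustrative, not part of Dialog or WoS.

```python
from math import sqrt


def jaccard(n_x: int, n_y: int, n_both: int) -> float:
    """Binary Jaccard: overlap divided by the size of the union."""
    return n_both / (n_x + n_y - n_both)


def cosine(n_x: int, n_y: int, n_both: int) -> float:
    """Binary cosine: overlap divided by the geometric mean of the set sizes."""
    return n_both / sqrt(n_x * n_y)


# Set sizes from the search: S4 = 7, S8 = 13, S9 = S4 AND S8 = 6
print(f"{cosine(7, 13, 6):.2f}")   # 0.63
print(f"{jaccard(7, 13, 6):.2f}")  # 0.43
```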

Co-citation illustration: KNOWN AUTHORSHIPS – citing document level

Set  Items  Description
S1     279  CA=BELKIN NJ
S2      86  CA=INGWERSEN P
S3     955  CA=SALTON G
S4      49  S1 AND S2
S5      70  S1 AND S3
S6      18  S2 AND S3

AUTHOR CO-CITATION SIMILARITY - 1990:

One creates an overlap matrix

ITEMS level         P Ingwersen (86)   G Salton (955)
NJ Belkin (279)            49                70
P Ingwersen (86)                             18

Co-citation illustration – Similarity matrix: KNOWN AUTHORSHIPS – Binary, citing document level – Jaccard & Cosine

SIM(BELK/INGW): 49 / (279 + 86 - 49) = .16     Cosine: 49 / (16.7 x 9.3) = .32
SIM(BELK/SAL):  70 / (279 + 955 - 70) = .06    Cosine: 70 / (16.7 x 30.9) = .14
SIM(SAL/INGW):  18 / (955 + 86 - 18) = .02     Cosine: 18 / (30.9 x 9.3) = .06

THIS WAS IN 1990 - WHAT IN 2000??

One creates a similarity matrix

                    P Ingwersen (86)   G Salton (955)
NJ Belkin (279)           .16               .06
P Ingwersen (86)                            .02
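The step from overlap counts to a similarity matrix can be sketched in a few lines. This is an illustration only, using the 1990 item counts and overlaps for the three authors and the Jaccard measure; the dictionary layout is an assumption made for the example, not a prescribed format.

```python
from itertools import combinations

# Citing-document counts per author and pairwise overlaps (1990 figures).
items = {"Belkin": 279, "Ingwersen": 86, "Salton": 955}
overlap = {
    frozenset(("Belkin", "Ingwersen")): 49,
    frozenset(("Belkin", "Salton")): 70,
    frozenset(("Ingwersen", "Salton")): 18,
}

# Upper triangle of the similarity matrix, keyed by author pair.
similarity = {}
for a, b in combinations(items, 2):
    both = overlap[frozenset((a, b))]
    similarity[(a, b)] = both / (items[a] + items[b] - both)  # Jaccard

for (a, b), sim in similarity.items():
    print(f"{a:<10} {b:<10} {sim:.2f}")
```

Keying the overlaps by `frozenset` makes the lookup independent of the order in which a pair is named, which matches the symmetry of the matrix.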

At item level: Cosine and Jaccard OK: similarity in a BINARY CONTEXT
[Venn diagram: the Ingwersen set and the Belkin set overlap; the overlap is labelled "Agreement"]

Co-citation illustration: KNOWN AUTHORSHIPS - 1990-2000

Set  Items  Citations  Name
S1     541        933  CA=BELKIN NJ
S2     258        382  CA=INGWERSEN P
S3    1365       2417  CA=SALTON G
S4     126        559  S1 AND S2
S5     175        680  S1 AND S3
S6      50        204  S2 AND S3

Co-citation illustration: KNOWN AUTHORSHIPS – Binary, citing document level; Jaccard & Cosine

SIM(BELKIN/INGWERSEN): 126 / (541 + 258 - 126) = .19    Cosine: 126 / (23.3 x 16) = .34
SIM(BELKIN/SALTON):    175 / (541 + 1365 - 175) = .10   Cosine: 175 / (23.3 x 36.9) = .20
SIM(SALTON/INGWERSEN): 50 / (1365 + 258 - 50) = .03     Cosine: 50 / (36.9 x 16) = .08

Co-citation illustration: KNOWN AUTHORSHIPS – non-binary, citation level; only Cosine (or Pearson)
Similarity is calculated following this formula:

$$\mathrm{Sim}(b, i) = \frac{\sum_d cit_{b,d} \times cit_{i,d}}{\sqrt{\sum_d cit_{b,d}^2} \times \sqrt{\sum_d cit_{i,d}^2}}$$

e.g.: (3 x 2 + 4 x 1 + 3 x 0) / (√(3² + 4² + 3²) x √(2² + 1² + 0²)) = 10 / (√(9 + 16 + 9) x √(4 + 1)) = 10 / (√34 x √5) = 10 / √170 ≈ 10 / 13.04 ≈ .77
• For Belkin: 10 citations; Ingwersen: 3 citations
• No. of documents co-citing: 2, with 3 x 2 + 4 x 1 = 10 co-citations
• No. of documents citing B: 3, with 10 citations in total
• No. of documents citing I: 2, with 3 citations in total
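At citation level each author is a vector of citation counts, one entry per citing document, and the cosine is taken over those vectors. A minimal sketch, assuming the three citing documents of the example with counts (3, 4, 3) for Belkin and (2, 1, 0) for Ingwersen; the function name is illustrative. Evaluated at full precision (√34 ≈ 5.83, √5 ≈ 2.24), these vectors give roughly .77.

```python
from math import sqrt


def citation_cosine(cit_b: list, cit_i: list) -> float:
    """Cosine over per-document citation counts for two cited authors."""
    dot = sum(b * i for b, i in zip(cit_b, cit_i))
    norm_b = sqrt(sum(b * b for b in cit_b))
    norm_i = sqrt(sum(i * i for i in cit_i))
    if norm_b == 0 or norm_i == 0:
        return 0.0  # an author with no citations is similar to nobody
    return dot / (norm_b * norm_i)


# One entry per citing document: citations to Belkin and to Ingwersen.
belkin = [3, 4, 3]
ingwersen = [2, 1, 0]

sim = citation_cosine(belkin, ingwersen)
print(round(sim, 2))
```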

The trend 1978-2000
• The Belkin-Salton co-citation ratio has increased from 0.06 to 0.10 in the 90s
• The Belkin-Ingwersen co-citation ratio has slightly increased, from 0.16 to 0.19
• Both pairs are thus closer to one another, seen from colleagues' views!!
• About 49 % of the documents citing Ingwersen (126/258) co-cite him with Belkin after 1990. Up to 1990 this ratio was larger: 49/86 = 57 %.
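The share of an author's citing documents that also cite a second author is a one-sided measure: it divides the overlap by one set size only, unlike Jaccard. A short sketch with the Ingwersen/Belkin figures for the two periods; the function name `cocitation_share` is made up for illustration.

```python
def cocitation_share(n_cocited: int, n_citing: int) -> float:
    """Fraction of an author's citing documents that co-cite a second author."""
    return n_cocited / n_citing


up_to_1990 = cocitation_share(49, 86)    # documents citing Ingwersen up to 1990
after_1990 = cocitation_share(126, 258)  # documents citing Ingwersen 1990-2000

print(f"{up_to_1990:.0%} -> {after_1990:.0%}")  # 57% -> 49%
```

The share can fall while the Jaccard value rises, because Jaccard also depends on how fast the second author's own set of citing documents grows.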

Co-occurrence applications
• By co-citations: the perceived connexion between people:
• Sharing expertise
• Check consistency by ageing measures of their work
• By bibliographic coupling: people are connected by using the same work!!

Co-occurrence applications – 2
• Data mining in databases
• Creation of maps of scientific domains
• Demonstrate collaboration between:
• People
• Countries
• Institutions
• Topical areas
• ….

Exercise in co-citation
1. Update the Belkin, Salton, Ingwersen co-citation analysis for the period 2001-2011, by carrying it out at document level (binary analysis) via WoS.

WoS: Cited Ref Search

Ingwersen published 76 papers, cited

153 items cited PI 165 times (2001-)

Belkin: 126 papers give 148 cites

Co-citations Belkin-Ingwersen 2001-11

The one record co-citing I & B

The two citations to Ingwersen

Limitations
• WoS sometimes does NOT carry out all the analyses thoroughly
• With larger sets of citing papers (and citations) it is only possible to calculate the co-occurrence at DOCUMENT LEVEL