Patent Citation Networks
• Bernard Gress
• http://student.ucr.edu/~gressb01
• Fannie Mae Inc., Washington DC
• bernard_gress@fanniemae.com
• Forthcoming in The Mathematica Journal
• http://www.mathematica-journal.com/
The Patent Citation Dataset
• Patent citations are part of the legal patent process, in which the patent applicant has a duty to disclose any knowledge of 'prior art' among previous patents.
• Some objectivity in the process is provided by the government patent examiner, who is supposed to be an expert in the area and who approves the final citations.
• The network established by patent citations allows one to trace the flow of technology through time, from patent to patent, and across fields.
• Studies of technological spillover effects, the impact or influence of individual patents, the rates of technological development, and similar issues can be assisted by considering patent citations.
The Patent Citation Dataset - continued
• Source: Hall, Jaffe, and Trajtenberg, and the National Bureau of Economic Research (NBER) (http://www.nber.org/patents/).
• The primary database (cite75_99.zip) contains 22,309,440 pairwise patent citations covering more than 3 million U.S. patents granted between January 1963 and December 2002.
• The secondary database (pat63_02f.txt) contains records for 3,414,910 patents with 25 fields each.
Structure of Primary Database (cite75_99.zip)
Structure of Secondary Database (pat63_02f.txt)
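As a rough illustration of how pairwise citation records can be turned into a network, the sketch below reads citing/cited pairs into two lookup tables. The column names `CITING` and `CITED` and the inline sample rows are assumptions made for this example; the slides do not show the actual file layout. The sample pairs are a subset of the citation links listed for patent #3858382, read on the assumption that each arrow there points from the cited patent to the citing patent.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an assumed two-column layout (CITING, CITED).
SAMPLE = """CITING,CITED
3858382,1794517
3858382,2045678
4085822,3858382
4316353,3858382
"""

def load_citations(text_stream):
    """Read (citing, cited) pairs into forward and backward adjacency maps."""
    cites = defaultdict(set)     # patent -> patents it cites (backward citations)
    cited_by = defaultdict(set)  # patent -> patents that cite it (forward citations)
    for row in csv.DictReader(text_stream):
        citing, cited = int(row["CITING"]), int(row["CITED"])
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by

cites, cited_by = load_citations(io.StringIO(SAMPLE))
print(sorted(cites[3858382]))     # patents cited by 3858382
print(sorted(cited_by[3858382]))  # patents citing 3858382
```

Keeping both directions makes it cheap to walk the network backward (to progenitors) or forward (to descendants).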
Patent Numbers Issued Serially
Two Types of Citation Networks
• A Citation Lineage: all of the progenitors and descendants by citation reference, so long as no siblings are brought into the picture.
• A Citation Neighborhood: all patents within a specified network distance of the patent of interest, regardless of relationship, including all 'siblings' and 'cousins'.
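The difference between the two network types can be sketched in a few lines of Python. This is one possible reading of the definitions above, not the author's Mathematica implementation: the lineage follows citations strictly backward (progenitors) or strictly forward (descendants), while the neighborhood follows links in either direction at every step. The toy edge list and the function names are hypothetical.

```python
from collections import deque

# Hypothetical toy network: each edge points from a cited patent to the patent citing it.
EDGES = [("A", "P"), ("B", "P"), ("P", "C"), ("A", "S"), ("C", "D"), ("E", "D")]

def _reach(start, step, generations):
    """Breadth-first search from `start`, following `step(node)` for at most `generations` hops."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == generations:
            continue
        for nxt in step(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

def lineage(edges, patent, generations):
    """Progenitors (backward only) plus descendants (forward only); siblings are excluded."""
    parents = lambda n: [a for a, b in edges if b == n]
    children = lambda n: [b for a, b in edges if a == n]
    return _reach(patent, parents, generations) | _reach(patent, children, generations)

def neighborhood(edges, patent, generations):
    """Every patent within `generations` links of `patent`, ignoring link direction."""
    either = lambda n: [a for a, b in edges if b == n] + [b for a, b in edges if a == n]
    return _reach(patent, either, generations)

print(sorted(lineage(EDGES, "P", 2)))
print(sorted(neighborhood(EDGES, "P", 2)))
```

In this toy network, "S" cites the same patent "A" that "P" cites, so it is a sibling of "P": it appears in the 2-generation neighborhood but not in the 2-generation lineage. At one generation the two sets are identical.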
There are 15 nodes (and 14 citation links) in the 1-generation lineage of patent #3858382:
• PatentLineage[3858382, 1]
– PatentsOfInterest -> {3858382}
– PrintRules -> {1 -> 3858382, 2 -> 1794517, 3 -> 2045678, 4 -> 2069266, 5 -> 2790591, 6 -> 3044233, 7 -> 3100569, 8 -> 3468100, 9 -> 3646723, 10 -> 4085822, 11 -> 4316353, 12 -> 4750694, 13 -> 4863125, 14 -> 5054646, 15 -> 6250501}
– Relations -> {3858382 -> 4085822, 3858382 -> 4316353, 3858382 -> 4750694, 3858382 -> 4863125, 3858382 -> 5054646, 3858382 -> 6250501, 1794517 -> 3858382, 2045678 -> 3858382, 2069266 -> 3858382, 2790591 -> 3858382, 3044233 -> 3858382, 3100569 -> 3858382, 3468100 -> 3858382, 3646723 -> 3858382}
– Vertexes -> {3858382, 1794517, 2045678, 2069266, 2790591, 3044233, 3100569, 3468100, 3646723, 4085822, 4316353, 4750694, 4863125, 5054646, 6250501}
– IndexPairs -> {{1, 10}, {1, 11}, {1, 12}, {1, 13}, {1, 14}, {1, 15}, {2, 1}, {3, 1}, {4, 1}, {5, 1}, {6, 1}, {7, 1}, {8, 1}, {9, 1}}
– IndexRules -> {1 -> 10, 1 -> 11, 1 -> 12, 1 -> 13, 1 -> 14, 1 -> 15, 2 -> 1, 3 -> 1, 4 -> 1, 5 -> 1, 6 -> 1, 7 -> 1, 8 -> 1, 9 -> 1}
There are 15 nodes in the 1-generation neighborhood of patent #3858382:
• PatentNeighborhood[3858382, 1]
– PatentsOfInterest -> {3858382}
– PrintRules -> {1 -> 3858382, 2 -> 1794517, 3 -> 2045678, 4 -> 2069266, 5 -> 2790591, 6 -> 3044233, 7 -> 3100569, 8 -> 3468100, 9 -> 3646723, 10 -> 4085822, 11 -> 4316353, 12 -> 4750694, 13 -> 4863125, 14 -> 5054646, 15 -> 6250501}
– Relations -> {1794517 -> 3858382, 2045678 -> 3858382, 2069266 -> 3858382, 2790591 -> 3858382, 3044233 -> 3858382, 3100569 -> 3858382, 3468100 -> 3858382, 3646723 -> 3858382, 3858382 -> 4085822, 3858382 -> 4316353, 3858382 -> 4750694, 3858382 -> 4863125, 3858382 -> 5054646, 3858382 -> 6250501}
– Vertexes -> {3858382, 1794517, 2045678, 2069266, 2790591, 3044233, 3100569, 3468100, 3646723, 4085822, 4316353, 4750694, 4863125, 5054646, 6250501}
– IndexPairs -> {{1, 10}, {1, 11}, {1, 12}, {1, 13}, {1, 14}, {1, 15}, {2, 1}, {3, 1}, {4, 1}, {5, 1}, {6, 1}, {7, 1}, {8, 1}, {9, 1}}
– IndexRules -> {1 -> 10, 1 -> 11, 1 -> 12, 1 -> 13, 1 -> 14, 1 -> 15, 2 -> 1, 3 -> 1, 4 -> 1, 5 -> 1, 6 -> 1, 7 -> 1, 8 -> 1, 9 -> 1}
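The relationship between the Relations, Vertexes, and IndexPairs fields in the output above can be reproduced with a short Python sketch. The ordering rule used here (patent of interest first, remaining patents in ascending order, index pairs sorted) is inferred from the printed output and is an assumption about how the author's function behaves; `index_network` is a hypothetical name.

```python
ROOT = 3858382
# Relations copied from the slide output, each written as (from, to).
RELATIONS = [
    (3858382, 4085822), (3858382, 4316353), (3858382, 4750694),
    (3858382, 4863125), (3858382, 5054646), (3858382, 6250501),
    (1794517, 3858382), (2045678, 3858382), (2069266, 3858382),
    (2790591, 3858382), (3044233, 3858382), (3100569, 3858382),
    (3468100, 3858382), (3646723, 3858382),
]

def index_network(root, relations):
    """Number the patents (root first, then ascending) and re-express each relation by index."""
    others = sorted({p for pair in relations for p in pair} - {root})
    vertexes = [root] + others
    position = {patent: i for i, patent in enumerate(vertexes, start=1)}
    index_pairs = sorted((position[a], position[b]) for a, b in relations)
    return vertexes, index_pairs

vertexes, index_pairs = index_network(ROOT, RELATIONS)
print(len(vertexes), len(index_pairs))  # number of nodes, number of citation links
```

Small consecutive indices are convenient for graph-plotting routines, while the index-to-patent mapping (PrintRules in the output above) lets the plot be labeled with the real patent numbers.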
• Mathematica has nice built-in graph visualization functions for unstructured graphs:
• GraphPlot3D
• ShowGraph
• But plotting graphs over time requires my own function:
• PatentPlot
Citation Networks Over Time - continued: The 2-Generation Lineage of #3858382
Citation Networks Over Time - continued: The 2-Generation Neighborhood of #3858382
GraphPlot[PatentNeighborhood[{3858382, 4597749}, 2]]
A nice illustration of the spread of technology over time.
Coloring nodes by criteria
I also add functions to color nodes and edges by different patent characteristics, e.g.
– Patent Technology Category (2- and 4-digit HJT)
– Patent Originality/Generality Index
– Total Number of Citations
GraphPlot3D[PatentNeighborhood[3858382, 7]]
GraphPlot[PatentNeighborhood[3858382, 12]], colored by technology category
Time-Constrained: The 7-Generation Neighborhood of #3858382, Colored by Technology Class
Network Statistics and Structure Analysis
• Citation Lags
• Network Curvature
• Citation Count Distributions
• HJT Technology Categories
• Originality and Generality
Distributions of Backward Lags
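A backward lag can be computed as the time between a citing patent and each patent it cites. The sketch below assumes the lag is measured in grant years; the patent labels, years, and citation pairs are hypothetical and are not taken from the NBER data.

```python
from collections import Counter

# Hypothetical grant years and citation pairs, for illustration only.
GRANT_YEAR = {"P1": 1970, "P2": 1974, "P3": 1980, "P4": 1985}
CITATIONS = [("P2", "P1"), ("P3", "P1"), ("P3", "P2"), ("P4", "P3"), ("P4", "P1")]  # (citing, cited)

def backward_lags(citations, grant_year):
    """Years elapsed between each cited patent's grant and the citing patent's grant."""
    return [grant_year[citing] - grant_year[cited] for citing, cited in citations]

lags = backward_lags(CITATIONS, GRANT_YEAR)
distribution = Counter(lags)        # lag in years -> number of citations with that lag
mean_lag = sum(lags) / len(lags)
print(sorted(distribution.items()), mean_lag)
```

Grouping the same lags by the citing patent's grant year would give one distribution per cohort, which is one way to compare lag behavior over time.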
Network Curvature: the average number of patents reached at subsequent network distances (some simple graphs and their respective curvature plots)
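One way to read the definition above: for every node, count how many other nodes are first reached at distance 1, 2, 3, and so on, then average those counts over all nodes. The Python sketch below does this with a breadth-first search. It treats links as undirected, which is an assumption; the function names and test graphs are hypothetical and this is not the author's Mathematica code.

```python
from collections import defaultdict, deque

def shell_sizes(adjacency, start):
    """Number of nodes first reached at each distance 1, 2, ... from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    counts = defaultdict(int)
    for d in dist.values():
        if d > 0:
            counts[d] += 1
    return counts

def curvature(edges):
    """Average, over all nodes, of the number of nodes reached at each network distance."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    nodes = list(adjacency)
    totals = defaultdict(int)
    for node in nodes:
        for d, n in shell_sizes(adjacency, node).items():
            totals[d] += n
    return {d: totals[d] / len(nodes) for d in sorted(totals)}

path = [(1, 2), (2, 3), (3, 4)]          # a simple chain of four nodes
star = [(0, 1), (0, 2), (0, 3), (0, 4)]  # one hub with four leaves
print(curvature(path))
print(curvature(star))
```

The chain reaches fewer nodes at each further step, while the star reaches more nodes at distance 2 than at distance 1, so the shape of the curve reflects the shape of the network.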
A much larger network of 91,000 patents over 40 years: curvature graphs for each year
Curvature graphs for each year, all together
Curvature graphs for each year, all together, different view
Patent Technological Composition
HJT Technology Category Distribution
Cumulative distribution of patents by tech category
Citation Count Distributions
Citation Count Distributions - continued
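A citation count distribution records how many patents receive 0, 1, 2, ... citations. A minimal Python sketch, using hypothetical patent labels and citation pairs rather than the NBER data:

```python
from collections import Counter

# Hypothetical (citing, cited) pairs and patent list, for illustration only.
CITATIONS = [("P2", "P1"), ("P3", "P1"), ("P4", "P1"), ("P3", "P2"), ("P4", "P3")]
PATENTS = ["P1", "P2", "P3", "P4", "P5"]

def citation_count_distribution(patents, citations):
    """Map each citation count to the number of patents receiving that many citations."""
    received = Counter(cited for _, cited in citations)
    return Counter(received.get(p, 0) for p in patents)

distribution = citation_count_distribution(PATENTS, CITATIONS)
print(sorted(distribution.items()))
```

Passing the full patent list separately matters: patents that are never cited do not appear in the citation pairs as a cited patent, yet they belong in the zero-citation bin.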
Generality and Originality
• The 'Generality' of patent i is $\mathrm{Generality}_i = 1 - \sum_{j=1}^{J} \left( N_{i,j} / N_i \right)^2$, where J is the number of patent classes, $N_i$ is the total number of forward citations for patent i, and $N_{i,j}$ is the number of forward citations in patent class j for patent i. The second term is a Herfindahl-type index.
• The 'Originality' of patent i is the same, except with backward citations (i.e. citations made).
• "Thus if a patent cites previous patents that belong to a narrow set of technologies, the originality score will be low, whereas citing patents in a wide range of fields would render a higher score."
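Both indices are one minus a Herfindahl-type sum of squared class shares, so a single helper covers them: pass the classes of the citing patents for Generality, or the classes of the cited patents for Originality. The sketch below is a minimal Python illustration; the function name and the class labels are hypothetical, and returning `None` for a patent with no citations is a choice made for this example.

```python
from collections import Counter

def herfindahl_diversity(classes):
    """Return 1 - sum_j (N_ij / N_i)^2 for the patent classes of one patent's citations."""
    total = len(classes)
    if total == 0:
        return None  # the index is undefined for a patent with no citations
    shares = Counter(classes)
    return 1.0 - sum((n / total) ** 2 for n in shares.values())

# Hypothetical class labels of the patents citing (or cited by) one patent.
print(herfindahl_diversity(["chem", "chem", "chem", "chem"]))   # all citations in one class
print(herfindahl_diversity(["chem", "chem", "drugs", "mech"]))  # citations spread over three classes
```

A patent whose citations all fall in one class scores 0, and the score rises toward 1 as the citations spread evenly over more classes.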
Generality and Originality - continued: Not very interesting. The indices show no trends over time, at least, and seemingly have no necessary relationship to the concepts they are intended to measure.
Conclusions
• Mathematica is a nice platform for network analysis.
• There is a lot of opportunity for research in this area.
• I don't know what the value of this research is to the IPI-ConfEx clientele.
References
• [1] B. Hall, A. Jaffe, and M. Trajtenberg, "The NBER Patent Citations Data File: Lessons, Insights and Methodological Tools," 2002, http://emlab.berkeley.edu/users/bhhall/pat/NBERpatdata.pdf
• [2] S. Wolfram, A New Kind of Science, 2002