Taxonomy 1-2-3
Enterprise Search Summit 2007 Tutorial, May 14, 2007
Taxonomy Strategies LLC
Copyright 2007 Taxonomy Strategies LLC. All rights reserved.

Today’s agenda
9:00-9:05      5 min   Introduction
9:05-9:10      5 min   Warm-up exercise
9:10-9:35     25 min   Building taxonomies
9:35-9:45     10 min   Taxonomy exercise
9:45-10:05    20 min   Taxonomy business case
10:05-10:20   15 min   Taxonomy & search
10:20-10:35   15 min   Coffee Break
10:35-11:05   30 min   Taxonomy ROI
11:05-11:15   10 min   ROI exercise
11:15-11:45   30 min   Taxonomy governance
11:45-12:00   15 min   Q&A
Taxonomy Strategies LLC The business of organized information

My taxonomy questions
Columns: Priority (1-5), Questions
Your title or role:
Your org or industry:
Your dept:
Your name (optional):

Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

The Taxonomy problem: How to pick from > 5,000 faucets?
By:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?

The main issue: What goes here?
• When do the things in the list change?
• How do we maintain the list?
• What rules do we follow?

What's involved in creating a taxonomy?
• Metadata Scheme. Data fields for describing content so that it can be found and used.
• Vocabularies. Collections of terms that are used to specify some of the metadata properties.
  - Relationships between content, fields or terms (hierarchical, equivalence, & associative).
  - Some vocabularies are big and hierarchical, some are small and flat.
• Application Profile. Formal representation of metadata & vocabularies.

Seven phases of taxonomy development (timeline: weeks 1-12)
1. Identify Objectives: Conduct interviews
2. Inventory Resources: Identify, gather & review resources
3. Specify Metadata: Define fields & purpose
4. Model Content: Define content chunks & XML DTDs
5. Specify Vocabularies: Compile controlled vocabularies
6. Specify Procedures: Develop workflow, rules & procedures
7. Test & Train: Manually tag small sample

Taxonomy design phases need to be iterated
Phases: 1 Identify Objectives, 2 Inventory Resources, 3 Specify Metadata, 4 Model Content, 5 Specify Vocabularies, 6 Specify Procedures, 7 Test & Train
Activities by stage:
• Plan & Prototype
  - Interview core team and stakeholders
  - Identify, gather & review resources
  - Define fields & purpose
  - Define content chunks & XML DTDs
  - Compile controlled vocabularies
  - Develop workflow rules & procedures
  - Manually tag small sample
• Alpha Dev & Test
  - Interview alpha users
  - Gather additional resources, if any
  - Revise if needed, bake into alpha CMS
  - Revise if needed, bake into alpha CMS
  - Revise, use in alpha CMS
  - Alpha workflows in CMS
  - Use alpha CMS to tag larger sample
• Beta D&T
  - Interview beta users
  - Gather additional sources, if any
  - Modify CMS for beta
  - Revise, use in beta CMS
  - Modify & extend workflows
  - Use beta CMS to tag larger sample
• Final D&T
  - Review tagged samples, default procedures
  - Modify for 1.0
  - Revise using team procedure
  - Finalize procedure materials
  - Finalize training materials & train staff

Licensing an existing taxonomy
See Factiva’s taxonomy: www.taxonomywarehouse.com
• There are usually license fees, but these will be less than the effort to develop an equivalent taxonomy.
• But pre-existing taxonomies rarely fit an organization’s needs and may require extensive customization.
Recommendation
• Adopt a faceted approach.
• Reuse existing (especially internal) vocabularies for as many of the facets as possible.
• Plan on doing full-custom “Content Type” and “Topic” taxonomies.

Free sources for 8 common taxonomies
Organization
  Definition: Organizational structure.
  Potential sources: SP 800-87, U.S. Government Manual, Your organizational structure, etc.
Content Type
  Definition: Structured list of the various types of content being managed or used.
  Potential sources: Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.
Industry
  Definition: Broad market categories such as lines of business, life events, or industry codes.
  Potential sources: SIC, NAICS, Your market segments, etc.
Location
  Definition: Place of operations or constituencies.
  Potential sources: FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.
Function
  Definition: Functions and processes performed to accomplish mission and goals.
  Potential sources: Federal Enterprise Architecture Business Reference Model, Enterprise ontology, Your business functions, etc.
Topic
  Definition: Business topics relevant to your mission & goals.
  Potential sources: Federal Register Thesaurus, NAL Agricultural Thesaurus, Your research areas, etc.
Audience
  Definition: Subset of constituents to whom a piece of content is directed or intended to be used.
  Potential sources: GEM, ERIC Thesaurus, IEEE LOM, Your psychographics or personas, etc.
Products & Services
  Definition: Names of products/programs & services.
  Potential sources: ERP system, Your products and services, etc.

Typical product catalog: A-Z, then idiosyncratic categories

How to analyze existing product catalog categories: Principles and priorities
Preparing a product catalog for facet browsing (aka Guided Navigation) requires a category hierarchy and additional attributes.
Principles
1. Categories and subcategories that could be swapped are candidates for conversion to attributes.
2. Repeated lists of subcategories signal a possible need for an attribute.
3. The number of attributes should not exceed six or seven, so not all attribute candidates should be used.
   • Avoid selecting strongly correlated attributes, such as “Weight” and “Shipping Weight”.
Priorities
1. Choose Categories that apply to many products over those that apply to few products.
2. Choose Attributes that apply to many Categories over those that apply to very few Categories.

Product categories example: Wireless carrier
Products
• Accessories
  - Batteries
  - Cases
  - Chargers
  - Data
  - Hands-Free
  - Headsets
  - Miscellaneous
• Content
  - Purchased
  - Subscription
• Phones
  - Versatile Phones
  - Smart Devices
  - Basic Phones
  - Prepaid Phones
  - International Only Phones
  - Mobile Broadband Cards
• Services
  - Conferencing
  - Internet / Data
  - Landline Phone
  - Network & Roaming
  - Relay Services
  - Solutions
  - Wireless Data

Product attributes example: Digital cameras in an electronics catalog
• Types of attributes
  - Generic attributes
    - Brand/Product Family/Model
    - Price Range
    - Usually Ships
  - Merchandising attributes
    - Usage (E-mail, Internet Browsing, Programming, …)
    - Segment (Home, Business, Education, Government, …)
    - Region & Country
    - Most Popular
    - New
    - Related Products
  - Specialized attributes
    - Capacity (Battery; Memory; MB; GB; BPS, …)
    - Resolution (DPI; Megapixels; XGA, UXGA, …)
    - Size (Display; Screen; …)
    - Standard (a, b, g, n, …; scsi, ata, sata, eide, …; dimm, simm, …)
    - Type (Camera; Battery; Display; Printer; Server; Storage; Switch; …)
Example facet panel (item counts in parentheses)
• Resolution: 3 Megapixels (4), 4 Megapixels (5), 5 Megapixels (27), 6-8 Megapixels (21)
• Brand: Canon (15), Fuji (10), Kodak (17), Nikon (8), Olympus (9)
• Type: Point & Shoot (25), Digital SLR (10), Packages (5)
• Price Range: $100-250 (5), $250-500 (16), $500-1000 (19), More than $1000 (3)

Faceted taxonomy theory & practice
• How many terms are needed to provide sufficient granularity? Not as many as you think!
• Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.

The power of faceted taxonomy
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4 = 10,000).
  - Easier to maintain
  - Easier to tag by content authors
  - Can be easier to navigate
• It’s more effective to increase the number of facets than to increase the number of terms per facet.
Example facets (10 terms each)
• Audience: Advocacy; Contractors & Grantees; Environmental Professionals; Federal Facilities; General Public; Industry; Kids; Researchers & Scientists; Small Business; Students
• Health: Advisory; Exposure; Food Safety; Health Assessment; Health Effect; Health Risk; Occupational Health; Pesticide Effects; Sun Protection; Toxicity
• Industry: Agriculture & Cattle; Automobile Repair; Chemical; Dry Cleaning; Electronics & Computer; Energy; Extractive Industries; Food Processing; Leather Tanning & Finishing; Metal Finishing
• Substance: Allergen; Biological Contaminant; Carcinogen; Chemical; Explosive; Liquid Waste; Microorganism; Ozone; Pesticide; Radioactive Waste
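The arithmetic behind this slide can be checked with a short sketch. The facet names and sizes come from the slide; the helper function and the two comparison cases (a fifth facet versus two extra terms per facet) are illustrative assumptions, not part of the original deck.

```python
from math import prod

# Facet sizes as shown on the slide: four facets of ten terms each.
facet_sizes = {"Audience": 10, "Health": 10, "Industry": 10, "Substance": 10}

# Terms the taxonomy team has to maintain: the sum of the facet sizes.
terms_to_maintain = sum(facet_sizes.values())

# Distinct tag combinations an item can receive (one term per facet):
# the product of the facet sizes.
distinct_combinations = prod(facet_sizes.values())


def combinations_for(n_facets, terms_per_facet):
    """Combinations available from n equally sized, independent facets."""
    return terms_per_facet ** n_facets


# Hypothetical comparison: grow by adding a facet vs. adding terms.
with_fifth_facet = combinations_for(5, 10)   # 50 terms to maintain
with_more_terms = combinations_for(4, 12)    # 48 terms to maintain

print(terms_to_maintain, distinct_combinations)
print(with_fifth_facet, with_more_terms)
```

With roughly the same number of terms to maintain, the extra facet multiplies the combinations by ten, while two extra terms per facet only about doubles them.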

Automatically created taxonomies
• Documents can be ‘clustered’ based on similarities and differences.
• Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate
What does “North” mean on this map?

Automatic taxonomy construction software
• Software can scan large quantities of content and extract statistically significant words and phrases.
• Example:
  - Archive of 10 publications analyzed for topics related to “copyright.”
• Software does a poor job of
  - De-duplication.
  - Turning significant words and phrases into a larger structure.
  - Discriminating between “gold” and “garbage.”
• Software is good for
  - Getting an understanding of the key noun phrases in a large collection.
  - Providing test cases for evaluating a taxonomy.
Source: Sample data courtesy of nStein.

Most popular flickr tags on 20 Feb 2007
http://www.flickr.com/photos/tags/
Sort flickr categories into 5 or fewer groups. Then label each group.

Taxonomy exercise: Facet grouping
• Universal taxonomy facets
  - By location (spatially)
  - By time (chronologically)
  - By type (genre)
  - By physical properties (size, color, shape, etc.)
  - By subject (topic)
Richard Saul Wurman. Information Architects (1996)

Taxonomy exercise: Facet grouping
Groups: Location, Time, Type, Color, Subject
Sort flickr categories into 5 or fewer groups. Then label each group.

Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

Business case and motivations for taxonomies
• How are we going to use content, metadata, and taxonomies in applications to obtain business benefits?

What technology analysts have said: Add metadata to search on!
• “Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.”
• “Enriching content with structured metadata is critical for supporting search and personalized content delivery.”
• “Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.”
• “Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.”

Fundamentals of taxonomy ROI
• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.

Product utilization: Taxonomy compared to search
• Conversion rate increases.
  - HomeDepot.com: Double-digit increase.
  - 1-800-Flowers.com: More than a 10% increase.
  - Otto Group (Kaleidoscope, Freemans, Grattan, and lookagain catalogs): 130% increase.
• Lift in average order size.

Product catalog: Taxonomy compared to search
Benefit: Increased conversion rate & revenue lift
Web sales net income                            $80,000
Increased conversion rate               30%     $24,000
Order size lift                         10%      $8,000
Potential revenue increase per year             $32,000
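The revenue figures on this slide follow from two multiplications and a sum. The sketch below reproduces them with the slide's inputs; the variable names are illustrative, and percentages are kept as integers so the results stay exact.

```python
# Inputs from the slide.
web_sales_net_income = 80_000
conversion_rate_increase_pct = 30
order_size_lift_pct = 10

# Each benefit is applied to the same base figure, as on the slide.
conversion_gain = web_sales_net_income * conversion_rate_increase_pct // 100
order_size_gain = web_sales_net_income * order_size_lift_pct // 100
potential_revenue_increase = conversion_gain + order_size_gain

print(conversion_gain, order_size_gain, potential_revenue_increase)
```

Substituting your own net income and measured lift percentages gives a first estimate for the benefit side of the business case.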

Usability research: Taxonomy compared to search
• “We found that users preferred a browsing oriented interface for a browsing task, and a direct search interface when they knew precisely what they wanted.” Marti Hearst (and others)
• “The category interface is superior to the list interface in both subjective and objective measures.” Hao Chen & Susan Dumais

Usability research: Taxonomy compared to search
[Chart: median search time in seconds, for targets in the top 20 results and not in the top 20 results. Callouts: category is 48% faster; category is 36% faster.]
Source: Chen & Dumais

Time saved: Taxonomy compared to search
1 hour per day searching x 36% faster = 22 minutes each day
22 minutes x 250 working days per year = 5,500 minutes, or 92 hours per year
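The same calculation as a small sketch, so the inputs can be swapped for your own organization's numbers. The inputs are the slide's; the variable names are illustrative, and rounding to whole minutes and hours mirrors what the slide does.

```python
# Inputs from the slide.
minutes_searching_per_day = 60
percent_faster = 36
working_days_per_year = 250

# 60 minutes x 36% = 21.6, which the slide rounds to 22 minutes.
minutes_saved_per_day = round(minutes_searching_per_day * percent_faster / 100)

# The yearly figures are built from the rounded daily figure, as on the slide.
minutes_saved_per_year = minutes_saved_per_day * working_days_per_year
hours_saved_per_year = round(minutes_saved_per_year / 60)

print(minutes_saved_per_day, minutes_saved_per_year, hours_saved_per_year)
```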

Time saved: Taxonomy compared to search
Benefit: Increase service efficiency
Number of call center calls per month                                        50,000
Average cost per call                                                           $20
Call response costs per month                                            $1,000,000
Total call response costs per year                                      $12,000,000
Percentage of self-serviced calls due to improved information browsing          30%
Service costs savings per year                                           $3,600,000
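A minimal sketch of the call-center calculation, using the slide's inputs (50,000 calls per month at $20 per call, with 30% of calls moved to self-service). The variable names are illustrative.

```python
# Inputs from the slide.
calls_per_month = 50_000
cost_per_call = 20
self_service_pct = 30

monthly_call_cost = calls_per_month * cost_per_call
yearly_call_cost = monthly_call_cost * 12

# Savings are the share of yearly call costs avoided through self-service.
yearly_savings = yearly_call_cost * self_service_pct // 100

print(monthly_call_cost, yearly_call_cost, yearly_savings)
```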

Trusted advisers: Taxonomy avoids costs
• “The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …” Sue Feldman
• Sun’s usability experts calculated that 21,000 employees were wasting an average of six minutes per day due to inconsistent intranet navigation structures. When lost time was multiplied by staff salaries, the estimated productivity loss exceeded $10M per year, about $500 per employee per year. Jakob Nielsen, useit.com

Knowledge workers spend up to 2.5 hours each day looking for information …
… But find what they are looking for only 40% of the time.
Source: Kit Sims Taylor

Knowledge workers spend more time re-creating existing content than creating new content
[Chart: re-creating existing content, 25%; creating new content, 8%]
Source: Kit Sims Taylor (cited by Sue Feldman in her original article)

Cost saved by not recreating content
Benefit: Increase in productivity
Number of employees                                           100
Average employee salary                                   $80,000
Employee costs per year                                $8,000,000
Increase in productivity from not recreating content          25%
Employee cost savings per year                         $2,000,000
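The productivity estimate is one multiplication on top of total salary cost. The sketch below uses the slide's inputs (100 employees at an average salary of $80,000 and a 25% productivity increase); the variable names are illustrative.

```python
# Inputs from the slide.
number_of_employees = 100
average_salary = 80_000
productivity_increase_pct = 25

employee_costs_per_year = number_of_employees * average_salary

# Savings are the share of salary cost no longer spent re-creating content.
savings_per_year = employee_costs_per_year * productivity_increase_pct // 100

print(employee_costs_per_year, savings_per_year)
```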

Business case summary
1. Classifications and classification-like schemes are being used to facilitate information seeking in the workplace and on the web.
2. Users take advantage of (and prefer) this type of scheme (faceted navigation) when it is made available in the user interface.
3. Hierarchical or facet navigation can be guided by the User Interface.
4. Facet navigation is best combined with keyword searching, e.g., keyword search followed by faceted navigation of results.

Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

Do taxonomies actually improve search?
• Input (Query) Side
  - “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
  - Have synonyms mapped together so searches for “car” and “automobile” return the same things.
• Output (Results) Side
  - Organize search results into groups of related items.
  - Sorting and filtering.
  - Refining search results.
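The query-side idea can be sketched in a few lines: map every variant to one preferred term, tag content with preferred terms only, and normalize queries the same way. Only the "car"/"automobile" pair comes from the slide; the document IDs, tags, and function names below are hypothetical.

```python
# Hypothetical synonym ring: every variant points at one preferred term.
PREFERRED_TERM = {"car": "car", "automobile": "car"}

# Hypothetical documents, tagged with preferred terms at indexing time.
DOCUMENTS = {
    "doc-1": {"car"},
    "doc-2": {"car"},
    "doc-3": {"faucet"},
}


def normalize(query):
    """Lower-case the query and replace it with its preferred term, if any."""
    term = query.strip().lower()
    return PREFERRED_TERM.get(term, term)


def search(query):
    """Return the IDs of documents tagged with the normalized query term."""
    term = normalize(query)
    return sorted(doc_id for doc_id, tags in DOCUMENTS.items() if term in tags)


print(search("car"))
print(search("Automobile"))
```

Because both queries resolve to the same preferred term, they return the same documents, which is the behavior the slide describes.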

Finding information should not be about “Feeling Lucky”

Google search on “pcb”: Returns > 28M items
Taxonomy could suggest “polychlorinated biphenyls”

169,169 items: Categorized results
Refine search by clicking on categories

Taxonomy in action on the results side: www.CareerBuilder.com search on IT positions
Results grouped: By Category, By Company, By City, By State

Typical search on “database”: List of ranked hits on www.oracle.com/prNavigator.jsp
Select item

Faceted search on “database”: Categorized results + Ranked list
Select item, or refine search by clicking on categories
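A minimal sketch of what "categorized results + ranked list" means on the results side: count the hits under each facet value, then narrow the ranked list when the user clicks a value. The hits, facet names, and facet values below are invented for illustration and are not taken from the screenshots in the deck.

```python
from collections import Counter

# Hypothetical ranked hits for the query "database".
HITS = [
    {"title": "Database install guide", "content_type": "Documentation", "audience": "Administrators"},
    {"title": "Database pricing", "content_type": "Product page", "audience": "Buyers"},
    {"title": "Database tuning tips", "content_type": "Documentation", "audience": "Administrators"},
    {"title": "Database release announcement", "content_type": "Press release", "audience": "Buyers"},
]


def facet_counts(hits, facet):
    """Count how many hits carry each value of one facet."""
    return Counter(hit[facet] for hit in hits)


def refine(hits, facet, value):
    """Keep only hits tagged with the chosen value, preserving rank order."""
    return [hit for hit in hits if hit[facet] == value]


# Counts shown next to each category link in the results page.
by_type = facet_counts(HITS, "content_type")

# The user clicks "Documentation"; the remaining facets are recounted.
docs_only = refine(HITS, "content_type", "Documentation")
remaining_audiences = facet_counts(docs_only, "audience")

print(by_type)
print([hit["title"] for hit in docs_only])
```

Recounting the remaining facets after each click is what lets the interface avoid offering refinements that would lead to zero results.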

Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

Key Factors in ROI (Return on Investment)
• Breadth: “How many people will metadata affect?”
• Repeatability: “How many times a day will they use it?”
• Cost/Benefit: “Is this a costly effort with little or no benefits?”
Source: Todd Stephens, Dublin Core Global Corporate Circle

Some common taxonomy ROI scenarios
Product catalog
• Increased conversions
• Increased self-service & use
• Increased productivity
Customer support
• Cutting the cost of requests for information
• Increased web statistics (page hits)
• Higher ACSI (American Customer Satisfaction Index) score
Knowledge worker productivity
• Less time searching, more time working
• Avoiding re-creating information that already exists
Compliance
• Improved regulatory compliance
• Improved enforcement
• Higher PARS (Performance & Accountability Reports)
• FDIC, SOX, HIPAA, etc. compliance

How to estimate costs: Tagging
Taxonomy Facet        Hier?  Typical CV Size  Time/Value (min)  Avg # Values/Item  $/Min   Cost/Element
Audience              N      10               0.25              2                  $0.42   $0.21
Content Type          N      20               0.25              1                  $0.42   $0.11
Organizational Unit   Y      50               0.5               2                  $0.42   $0.42
Products & Services   Y      500              1.5               4                  $0.42   $2.52
Geographic Region     Y      100              0.5               2                  $0.42   $0.42
Broad Topics          Y      400              2                 4                  $0.42   $3.36
TOTALS                       1080             5                 15                         $7.04
Inspired by: Ray Luoma, BAU Solutions
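Each Cost/Element figure in the tagging table is time per value x average values per item x cost per minute. The sketch below recomputes the table from the slide's inputs; the data layout and function name are illustrative. The unrounded total is $7.035 per item, which the slide displays as $7.04.

```python
# (facet, minutes per value, average number of values per item), from the slide.
FACETS = [
    ("Audience", 0.25, 2),
    ("Content Type", 0.25, 1),
    ("Organizational Unit", 0.5, 2),
    ("Products & Services", 1.5, 4),
    ("Geographic Region", 0.5, 2),
    ("Broad Topics", 2.0, 4),
]
COST_PER_MINUTE = 0.42  # dollars per minute of tagging labor, from the slide


def cost_per_element(minutes_per_value, values_per_item):
    """Tagging cost of one facet for one content item."""
    return minutes_per_value * values_per_item * COST_PER_MINUTE


costs = {name: cost_per_element(m, v) for name, m, v in FACETS}
minutes_per_item = sum(m * v for _, m, v in FACETS)
total_cost_per_item = sum(costs.values())

for name, cost in costs.items():
    print(f"{name}: ${cost:.3f}")
print(f"Minutes per item: {minutes_per_item}")
print(f"Total per item: ${total_cost_per_item:.3f}")
```

The large hierarchical facets (Products & Services, Broad Topics) account for most of the per-item cost, so they are the first candidates for tagging shortcuts such as defaults or auto-classification.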

How to estimate costs: Assumptions
ASSUMPTIONS
Enterprise SW License       $100,000
Maintenance/Support              15%
SW Implementation           x 200%
Legacy Content Items         100,000
Content Growth Rate              15%
Tagging/Item                   $7.04
Enterprise Taxonomy         $100,000

How to estimate costs: Total cost of ownership (TCO)
Description                 Year 1       Year 2     Year 3     Year 4     Year 5
SW Licenses                 $100,000
SW Maintenance                           $15,000    $15,000    $15,000    $15,000
Implementation              $200,000
App Tech Support                         $30,000    $30,000    $30,000    $30,000
Tagging: Legacy Content     $703,500
Tagging: Ongoing                         $105,525   $121,354   $139,557   $160,490
Taxonomy Creation           $100,000
Taxonomy Maintenance                     $15,000    $15,000    $15,000    $15,000
TOTAL                       $1,103,500   $165,525   $181,354   $199,557   $220,490
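The yearly totals can be rebuilt from the stated assumptions. This is a sketch under two assumptions of its own: the 15% maintenance/support rate applies to the license, the implementation, and the taxonomy alike, and the tagging cost per item is the unrounded $7.035 (the legacy tagging figure of $703,500 implies this rather than the displayed $7.04). Constant and function names are illustrative.

```python
# Assumptions copied from the slides.
SW_LICENSE = 100_000
MAINTENANCE_RATE = 0.15
IMPLEMENTATION_MULTIPLIER = 2.0   # implementation = 200% of the license
LEGACY_ITEMS = 100_000
CONTENT_GROWTH_RATE = 0.15
TAGGING_COST_PER_ITEM = 7.035     # unrounded; displayed as $7.04
TAXONOMY_CREATION = 100_000


def total_cost_by_year(years=5):
    """Return the total cost for each year, rounded to whole dollars."""
    implementation = SW_LICENSE * IMPLEMENTATION_MULTIPLIER
    legacy_tagging = LEGACY_ITEMS * TAGGING_COST_PER_ITEM

    # Year 1 carries all one-time costs.
    one_time = SW_LICENSE + implementation + legacy_tagging + TAXONOMY_CREATION

    # Later years carry maintenance/support plus tagging of new content.
    upkeep = MAINTENANCE_RATE * (SW_LICENSE + implementation + TAXONOMY_CREATION)
    ongoing_tagging = legacy_tagging * CONTENT_GROWTH_RATE

    totals = [round(one_time)]
    for _ in range(2, years + 1):
        totals.append(round(upkeep + ongoing_tagging))
        ongoing_tagging *= 1 + CONTENT_GROWTH_RATE  # new content grows 15% a year
    return totals


tco = total_cost_by_year()
print(tco)
```

Legacy tagging is the largest single line in year 1, which is why the per-item tagging estimate deserves the most scrutiny.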

Benefits Assumptions
Productivity Assumptions
Employee costs per year (100 employees, $75,000 per year)    $7,500,000
Increase in productivity (from not recreating content)              25%
Cost savings                                                 $1,875,000
Percentage realized in first year                                   10%
Service Efficiency Assumptions
Customer service calls cost/year                            $12,000,000
Efficiency (from customer self-service)                             30%
Cost savings                                                 $3,600,000
Percentage realized in first year                                   10%

Sample ROI Calculations
Description                       Year 1         Year 2     Year 3       Year 4       Year 5
Costs
Software Licenses/Maintenance     $100,000       $15,000    $15,000      $15,000      $15,000
Implementation/Support            $200,000       $30,000    $30,000      $30,000      $30,000
Taxonomy Creation/Maintenance     $100,000       $15,000    $15,000      $15,000      $15,000
Legacy/Ongoing Tagging            $703,500       $105,525   $121,354     $139,557     $160,490
Benefits
Productivity increases            $ -            $187,500   $1,875,000   $1,875,000   $1,875,000
Service efficiency gains          $ -            $360,000   $3,600,000   $3,600,000   $3,600,000
Yearly Net Benefits               $(1,103,500)   $381,975   $5,293,646   $5,275,443   $5,254,510
Payback period: 1.1 (years until benefits = costs)
Inspired by: Todd Stephens, Dublin Core Global Corporate Circle
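The Yearly Net Benefits row is benefits realized minus total costs for each year. The sketch below recomputes it from the slide's figures, assuming (as the table shows) no benefits in year 1, 10% of the full benefits in year 2, and full benefits from year 3 on. The names are illustrative.

```python
# Total yearly costs from the TCO slide, years 1 to 5.
YEARLY_COSTS = [1_103_500, 165_525, 181_354, 199_557, 220_490]

# Full annual benefits from the benefits assumptions slide.
FULL_PRODUCTIVITY_SAVINGS = 1_875_000
FULL_SERVICE_SAVINGS = 3_600_000

# Share of the full benefits realized each year, as shown in the table.
REALIZED_PERCENT = [0, 10, 100, 100, 100]


def yearly_net_benefits():
    """Benefits realized minus costs, per year, in whole dollars."""
    full_benefits = FULL_PRODUCTIVITY_SAVINGS + FULL_SERVICE_SAVINGS
    return [
        full_benefits * pct // 100 - cost
        for pct, cost in zip(REALIZED_PERCENT, YEARLY_COSTS)
    ]


net = yearly_net_benefits()
print(net)
```

Changing `REALIZED_PERCENT` is a quick way to test how sensitive the business case is to a slower rollout.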

ROI exercise: Why tag?
• Tagging content using a taxonomy is a cost, not a benefit.
• There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
• Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
• You need to determine those changes, and their costs, as part of the ROI.
• List the top 5 benefits from tagging content. Then, rank the benefits by priority.
Columns: Priority (1-5), Questions

ROI exercise: Potential benefits from tagging content
1. Reduce information requests
2. Reduce cost per UU (unique user)
3. Expand to new audiences
4. Improve customer satisfaction
5. Improve performance & accountability
6. Increase number of successful
7. Increase number of links (internal cross-cutting & external)
8. Reduce time to build websites
9. Increase metadata consistency & quality
10. Decrease time to create & publish marketing information
11. Improve e-commerce website searches
12. Decrease product development lifecycle
• List the top 5 benefits from tagging content. Then, rank the benefits by priority.
Columns: Priority (1-5), Questions

Why implement a taxonomy?
• Find relevant information quicker.
• Discover information you didn’t know you had.
• Avoid duplicate efforts to “reinvent the wheel.”
• Learn from mistakes.
• Create better quality work product.
• Provide overview as well as details about a subject.
• Demonstrate relationships between content.
• Reduce complexity.
[Figure label: Taxonomy & Content Classification]

Taxonomy Fundamentals: Agenda
• Building taxonomies
• Taxonomy business case
• Taxonomy & search
• Taxonomy ROI
• Taxonomy maintenance

Taxonomy requires a business process
• Taxonomies must change, gradually, over time if they are to remain relevant.
• Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions.

Taxonomy change process overview
1: External controlled vocabularies (CVs) change on their own schedule.
2: Taxonomy Team decides when to update snapshots of external CVs.
3: Team adds value to snapshots through definitions, synonyms, classification rules, training materials, etc.
4: Updated versions of CVs published to consumers.
[Diagram labels]
• CV Sources: External Standard Vocabularies; Internally Created CVs; Other CVs from other internal sources
• Taxonomy Governance Environment: Taxonomy Tool, holding working copies of CVs maintained in the tool
• CV Consumers (Tagging and Search Environment): Site Search; Portal; Web CMS; DMS; DAM; Metatagging Tool; Search UI
• Other labels: Taxonomy Facets; Subject Codes; Working Papers; Project Archives; Expertise; Competencies
CV = Controlled Vocabulary

Who should maintain the taxonomy?
• The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
• The team should plan on maintaining the taxonomy as well as building it.
  - Maintenance will not (usually) be anyone’s full-time job.
  - Exact mix of people on team will change.
• It should be built in an iterative fashion, with more content and broader review for each iteration.

Taxonomy maintenance: Generic team charter
• Taxonomy Team is responsible for maintaining:
  - The Taxonomy, a multi-faceted classification scheme.
  - Associated taxonomy materials, such as:
    - Editorial Style Guides.
    - Taxonomy Training Materials.
    - Metadata Standard.
  - Team rules and procedures for change management.
• Taxonomy Team will consider costs and benefits of suggested changes.
• Taxonomy Team will:
  - Manage relationship between providers of source vocabularies and consumers of the Taxonomy.
  - Identify new opportunities for use of the Taxonomy across the enterprise to improve information management practices.
  - Promote awareness and use of the Taxonomy.

Taxonomy team: Generic roles
• Business Lead
  - Keeps the committee on track with larger business objectives.
  - Balances cost/benefit issues to decide appropriate levels of effort.
  - Obtains needed resources if those on the committee can't accomplish a particular task.
• Technical Specialist
  - Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
  - Helps obtain data from various systems.
• Content Specialist
  - Committee's liaison to content creators.
  - Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.
• Taxonomy Specialist
  - Suggests potential taxonomy changes based on analysis of query logs and indexer feedback.
  - Makes edits to the taxonomy and installs it into the system with the aid of the IT specialist.
• Content Owners
  - Reality check on process change suggestions.

Where taxonomy changes come from
[Diagram: inside the firewall, End Users work through the Application UI and Application Logic, and Tagging Staff work through the Tagging UI and Tagging Logic; both draw on the Content and the Taxonomy. Change signals reach the Taxonomy Editor and the Taxonomy Team.]
Sources of change:
• Query log analysis (end users).
• Staff notes on 'missing' concepts (tagging staff).
• Requests from other parts of the organization (in the diagram's example, NASA).
Recommendations by the editor:
1. Small taxonomy changes (labels, synonyms).
2. Large taxonomy changes (retagging, application changes).
3. New "best bets" content.
Team considerations:
1. Business goals.
2. Changes in user experience.
3. Retagging cost.
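As an illustration of query log analysis as a source of change requests, the following Python sketch lists frequent queries that match no taxonomy label or synonym. It is a minimal example under stated assumptions, not a description of any particular tool: the function name `suggest_missing_terms`, the `min_count` threshold, and the shape of the inputs (a flat list of query strings, a list of labels, a synonym-to-label mapping) are all hypothetical.

```python
from collections import Counter


def suggest_missing_terms(queries, taxonomy_labels, synonyms=None, min_count=2):
    """Return (query, count) pairs for frequent queries with no taxonomy match.

    Assumptions (hypothetical, for illustration only):
    - queries is a list of raw query strings taken from a search log
    - taxonomy_labels is a list of preferred labels
    - synonyms maps a synonym to its preferred label
    """
    # Everything the taxonomy already covers, compared case-insensitively.
    known = {label.lower() for label in taxonomy_labels}
    known.update(s.lower() for s in (synonyms or {}))

    # Normalize the queries and count how often each one occurs.
    counts = Counter(q.strip().lower() for q in queries if q.strip())

    # Keep only frequent queries that the taxonomy does not cover yet.
    return [(q, n) for q, n in counts.most_common()
            if n >= min_count and q not in known]


log = ["Faucets", "faucets ", "taps", "taps", "taps", "sinks", ""]
print(suggest_missing_terms(log, ["Faucets", "Sinks"]))
```

A query that surfaces here would typically be a candidate for a small change (a new synonym) rather than a new category; that decision stays with the taxonomy editor and team.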

Taxonomy maintenance processes
• Different organizations will need to consider their own change processes.
  - Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.
  - Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.
  - Organization 3: Marketing reps ask for a change, the taxonomy editor makes a demo, the web representative approves it.
• The change process MUST also consider the cost of implementing the change:
  - Retagging data.
  - Reconfiguring the auto-classifier.
  - Retraining staff.
  - Changes in user expectations.
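The retagging part of that cost can be roughed out by counting the documents that carry a term about to change. The sketch below is only an illustration: the function name `retagging_cost`, the dictionary layout, and the per-document effort figure are assumptions, and a real estimate would also need to cover classifier reconfiguration and staff retraining.

```python
def retagging_cost(doc_tags, changed_terms, minutes_per_doc=2.0):
    """Estimate retagging effort for a proposed taxonomy change.

    Assumptions (hypothetical, for illustration only):
    - doc_tags maps a document id to the list of taxonomy terms on it
    - changed_terms lists the terms that would be renamed, split, or removed
    - minutes_per_doc is a guessed average manual effort per document
    """
    changed = set(changed_terms)
    # A document needs retagging if it carries at least one changed term.
    affected = sorted(doc_id for doc_id, tags in doc_tags.items()
                      if changed & set(tags))
    return {
        "affected_docs": affected,
        "estimated_hours": len(affected) * minutes_per_doc / 60,
    }


catalog = {
    "d1": ["Faucets", "Brass"],
    "d2": ["Sinks"],
    "d3": ["Faucets"],
}
print(retagging_cost(catalog, ["Faucets"], minutes_per_doc=30.0))
```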

Taxonomy maintenance workflow
[Flowchart in the Taxonomy Tool, with lanes for Analyst, Editor, Copywriter, and Sys Admin. Steps in order: Suggest new name/category; Review name; Problem? (Yes/No); Copy edit new name; Problem? (Yes/No); Add to enterprise Taxonomy.]
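The workflow above can be sketched as a small state machine. This is a hedged illustration, not the behavior of any taxonomy tool: the pairing of steps with roles follows the order of the lanes, and sending a suggestion back to the first step whenever a problem is found is an assumption, since the slide text does not say where the "Yes" branches lead. All names (`STEPS`, `Suggestion`, `advance`) are hypothetical.

```python
from dataclasses import dataclass, field

# Steps in order, each paired with the role assumed to own it.
STEPS = [
    ("suggest", "Analyst"),
    ("review", "Editor"),
    ("copy_edit", "Copywriter"),
    ("add_to_taxonomy", "Sys Admin"),
]


@dataclass
class Suggestion:
    """A proposed new name or category moving through the workflow."""
    name: str
    step: int = 0
    history: list = field(default_factory=list)

    @property
    def done(self):
        # True once the last step has been completed without a problem.
        return self.step >= len(STEPS)

    def advance(self, problem=False):
        """Complete the current step; on a problem, restart from the first step."""
        if self.done:
            raise ValueError("suggestion is already in the taxonomy")
        step_name, role = STEPS[self.step]
        self.history.append((step_name, role, "problem" if problem else "ok"))
        # Assumption: a "Yes" at a Problem? check returns the name to the analyst.
        self.step = 0 if problem else self.step + 1


s = Suggestion("Pull-down faucets")
s.advance()              # analyst suggests the name
s.advance(problem=True)  # editor finds a problem, so the name goes back
for _ in range(4):       # second pass through all four steps
    s.advance()
print(s.done, len(s.history))
```

Keeping the history as (step, role, outcome) tuples gives the team a simple audit trail for each change.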

Sample taxonomy editor: Data Harmony
[Screenshot with callouts: Hierarchy Browser; Standard Term Info]

Taxonomy editing tools vendors
[Quadrant chart: Ability to Execute (low to high) against Completeness of Vision; quadrant labels include Niche Players and Visionaries.]
• An immature area: no vendors are in the upper-right quadrant!
• The most popular taxonomy editor is MS Excel.
• High functionality/high cost products ($100K+).
• MultiTes is widely used, cheap with functionality.

Taxonomy maturity model
• Taxonomy governance processes must fit the organization.
• As consultants, we notice different levels of maturity in the business processes around content management, taxonomy, and metadata.
• Honestly assess your organization's metadata maturity in order to design appropriate governance processes.
• The following slides present results from a survey of metadata and taxonomy practices at 87 organizations. How does your organization compare?

2005 Maturity survey: Search practices (n=87)

| Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown |
|---|---|---|---|---|---|
| Search box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3) |
| Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5) |
| Spell checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8) |
| Synonym searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4) |
| Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4) |
| Queries are logged and the logs are regularly examined. | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5) |
| Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top (Best Bets). | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5) |
| Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10) |
| A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9) |
| A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7) |

2005 Maturity survey: Metadata practices (n=87)

| Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown |
|---|---|---|---|---|---|
| Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6) |
| The organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7) |
| Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7) |
| The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12) |
| Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5) |
| Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9) |
| Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9) |

Three rows have only four of their five values in the source, so their column assignment is not recoverable. Surviving values, in source order:
• An organization-wide metadata standard exists and new systems consider it during development: 37% (22), 20% (12), 0% (0), 7% (4).
• A Cataloging Policy document exists to teach people how to tag data in compliance with the organizational metadata standard: 48% (29), 20% (12), 0% (0), 12% (7).
• A centralized metadata repository exists to aggregate and unify metadata from disparate sources: 57% (34), 17% (10), 0% (0), 10% (6).

2005 Maturity survey: Taxonomy practices (n=87)

| Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown |
|---|---|---|---|---|---|
| Org Chart Taxonomy: one based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9) |
| Products Taxonomy: one based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9) |
| Content Types Taxonomy: one based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4) |
| Topical Taxonomy: one based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4) |
| Faceted Taxonomy: one which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3) |
| The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5) |
| The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6) |
| The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4) |
| The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8) |
| A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5) |

Taxonomy Strategies LLC: Questions?
Joseph A. Busch, 415-377-7912, jbusch@taxonomystrategies.com
Mike Lauruhn, 415-378-2747, mlauruhn@taxonomystrategies.com
Ron Daniel Jr, 925-368-8371, rdaniel@taxonomystrategies.com
Donna Fritzsche, 312-804-5629, dfritzsche@taxonomystrategies.com

Taxonomy 1-2-3: Webography (1)
• H. Chen, S. Dumais. "Bringing order to the web: automatically categorizing search results." Proceedings of CHI 2000. pp. 145-152. http://research.microsoft.com/copyright/accept.asp?path=http://research.microsoft.com/~sdumais/chi2001.pdf&pub=ACM
• Sue Feldman. "The high cost of not finding information." 13:3 KM World (March 2004) http://www.kmworld.com/publications/magazine/index.cfm?action=readarticle&Article_ID=1725&Publication_ID=108
• P. R. Hagen. Must search stink? Forrester Research, June 2000.
• K. Hall. Content tagging strategies. Giga Information Group, February 2001.
• M. Hearst, A. Elliott, J. English, R. Sinha, K. Swearingen & K. Yee. "Finding the flow in website search." 45 Communications of the ACM (Sept 2002) http://www.ischool.berkeley.edu/~hearst/papers/cacm02.pdf
• J. Morrison. "How to create effective taxonomy." ZDNet Asia, August 18 2004. http://www.zdnetasia.com/builder/program/dev/0,39045513,39190441,00.htm

Taxonomy 1-2-3: Webography (2)
• Jakob Nielsen. Web Design and Development.
• Eric T. Peterson. "Home Depot uses Endeca to consolidate search and navigation, dramatically increasing conversion: case study." Jupiter Research (July 11, 2005) http://www.jupiterresearch.com/bin/item.pl/research:casestudy/79/id=96483/
• S. Phillips, E. Maguire, C. Shilakes. Content management: The new data infrastructure. Convergence and divergence out of chaos. Merrill Lynch, June 2001.
• K. S. Taylor. "The brief reign of the knowledge worker," 1998. http://online.bcc.ctc.edu/econ/kst/BriefReign/BRwebversion.htm
• Taxonomy & content classification: market milestone report. Delphi Group, 2002. http://www.delphiweb.com/knowledgebase/documents/upload/pdf/2176.pdf?session=%5Bg_sid%5D
• Taxonomy Warehouse. www.taxonomywarehouse.com
• Richard Saul Wurman. Information Architects (1996)

Vendors: Taxonomy Editing Tools

| Taxonomy editing tool | URL |
|---|---|
| Knowledge Workbench | www.convera.com/solutions/retrievalware/KnowledgeWorkbench.aspx |
| Cuadra STAR/Thesaurus | www.cuadra.com/products/thesaurus.html |
| Thesaurus Master | www.dataharmony.com/products/tm.htm |
| Knowledge Engineering Workbench | www.entrieva.com/entrieva/html_site/knowworkbench.htm |
| MetaTagger | www.interwoven.com/products/content_intelligence/index.html |
| SmartDiscovery | www.inxight.com/pdfs/Taxonomy_FinalWeb.pdf |
| MS Excel | |
| Intelligent Topic Manager | www.mondeca.com |
| MultiTes Pro | www.multites.com |
| Taxonomy/Authority File Manager | www.nstein.com/epub/ncm-taxonomy.asp |
| Protégé | http://protege.stanford.edu/ |
| SchemaServer | www.schemalogic.com |
| Synaptica | www.factiva.com/products/taxonomy/synaptica.asp?node=menuElem1511 |
| Taxonomy Manager | www.teragram.com/solutions/taxonomy.htm |
| Term Tree | www.termtree.com.au |
| Enterprise Vocabulary Server | www.webchoir.com/products/wvs.html |
| Designer | www.wordmap.com/Enterprise/Taxonomy_and_metadata_management.html |