- Slides: 75
Mandala 8: A Database System that Can Start Out Small and End Up Big Gail E. Kampmeier University of Illinois Institute for Natural Resource Sustainability Division of Illinois Natural History Survey [email protected] edu Supplemental materials to live demo 27 October 2008 of Mandala database system at Biodiversity Information Standards (TDWG) meeting, Fremantle, Australia
Table of Contents What is Mandala? History Why use it? Where to begin Specimens Taxa Localities & Collecting Events People Bulk samples Loans, Deposition Museums Images Literature Journals Literature Mining …more • Navigate via buttons in upper right of each screen, home returns you here. • Click "more" above to see p. 2 of Contents • To quit from slide show, hit escape (esc)
Table of Contents Resources Help Online services Gazetteers Taxonomic Names Troubleshooting Electronic sticky notes Document structural changes Changing key fields Verifying data Building a thesaurus of people names The Payoff: Reports Phenology Taxon illustrations Image comparisons Synonymies Specimens examined & Faunal lists Loan management Mapping
What is Mandala? Cross-platform database system supporting data acquisition & management for systematics & biodiversity studies Specimens Taxonomic Names Literature Illustrations
Interrelated & Interconnected Viewed this way, it can seem intimidating, but realistically, most functions are accessed from only 3 or 4 tables dealing with TAXA, SPECIMENs, ILLUS (images), & LITERATURE
Highlights in Mandala's History Begun in 1995 for therevid (fly) NSF PEET* project Uses FileMaker® Pro 8.x/9 Relational Works on Macintosh & Windows operating systems Extensible, adaptable, & evolving Fully operational demo available by request *U.S. National Science Foundation Partnerships for Enhancing Expertise in Taxonomy DEB 95-21925 & 99-77958
So Why Use Mandala? Completely open architecture with cross-platform, well-supported, off-the-shelf database engine (FileMaker Pro) Multiuser support through FileMaker Pro Server or Server Advanced. Access your databases remotely from FileMaker & a good internet connection Web serving options using FMP Server (for PHP) or Server Advanced
So Why Use Mandala? Extensive data entry options Overwhelmed by the options & possibilities? News flash: Use only the features you need in Mandala Break up data entry tasks by putting in verbatim label data & basic specimen info on single layout in SPECIMEN or a basic taxon entry in TAXA Return later to georeference, interpret collection data in LOCALITY and COLLEVENT; add nomenclatural history to TAXA, and construct a classification hierarchy Customize layouts & pick (value) lists to your needs Extensive built-in reporting options Integrated context sensitive HELP Integrated problem & resolution tracking
Users Enter through NAVIGATION… Whether from your desktop or via a network, open the NAVIGATION file first, sign in with an account name & password (1), then your user name (2), & navigate (3) to Mandala.
Data Entry & Viewing Options Buttons allow you to navigate from a central point for data entry. Customize* this screen to rename, bring to front, or remove various navigational features Quit Mandala & FileMaker from here. *Customization should be done by the database administrator.
Illustrated Features… [Figure: numbered callouts 1-9 mark the features illustrated on the following slides; resource files are labeled separately]
1 Enter Specimen Data Enter basic information about a specimen: Unique ID Verbatim label(s) Life history Curatorial info Link to locality, collecting event & taxon ID Navigate via tabbed interface for additional data entry options
2 Minimum Taxon Name Entry Rank Specific epithet, or if in combination Status (Valid, Invalid, Unregulated)
2 Add to the minimum… Common name Valid name Parent taxon Build classification List author(s) from PEOPLE Source of name information Comments Verification
2 And history of the type…
2 And name changes, homonyms, & other conflicts…
Resource: People The PEOPLE table is used to reference, via the PeopleID, all the information stored about a person. It permits flexibility in constructing that person’s name in various ways & in determining which of the various presentations of a name is the senior synonym. This table (part of TAXA) is referenced for authors of literature; authorities of taxa; illustrators & copyright holders; and determiners, borrowers, lenders, & collectors of specimens.
3-4 Interpreting Collection Labels In retrospective capture of specimen data, often from museum specimens that may be over 100 years old & with incomplete or difficult to interpret data, it is more efficient to split the labor of recording the verbatim label(s) in SPECIMEN, which does not require much training, from their interpretation in the LOCALITY & COLLEVENT tables. In prospective data entry of information for specimens or bulk samples, it is more likely that labels will be generated from the information entered & that it will not need much additional interpretation, since, e.g., latitude, longitude, elevation, & more complete locality & collecting data will be available.
3a Serial Locality Serial LocalityIDs (layout=1) are automatically generated with new records. Locality records parse the political description as well as that of named geographic features that may cross political boundaries.
3b Custom Locality Custom locality identifiers do not rely on the automatically generated serialID (layout=2). Configure a 3-part unique identifier that may be meaningful to the collector. Ecological Community is added (can be used with serial localities). No displacement calculator appears for georeferencing here, but georeferencing can be done on the Additional Details tab.
Choosing Locality Type Generally you would choose a single locality entry style (choose serial unless having meaningful identifiers is important). Three types exist for scripting purposes, set by layout: 1=Serial locality; 2=Custom locality; 3=Bioblitz locality
3 Recording Serial Localities Because serial localities are most often used with retrospective data capture with greater interpretation needs, features were added to aid in this task: Translating elevation in feet to meters Calculating displacement from a known lat/long by distance in km & one of 16 compass directions. Translating miles to kilometers
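The three conversions listed above can be sketched in Python. This is an illustrative sketch only, not Mandala's own calculation: the function names, the spherical-Earth approximation, and the mean radius of 6371 km are assumptions made for the example.

```python
import math

FEET_TO_METERS = 0.3048        # exact, by definition of the international foot
MILES_TO_KM = 1.609344         # exact, by definition of the international mile
EARTH_RADIUS_KM = 6371.0       # assumed mean radius; spherical approximation

# 16-point compass rose, clockwise from north in 22.5-degree steps
COMPASS_16 = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
              "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def feet_to_meters(feet):
    return feet * FEET_TO_METERS


def miles_to_km(miles):
    return miles * MILES_TO_KM


def displace(lat, lon, distance_km, direction):
    """Return (lat, lon) in decimal degrees after moving distance_km
    from a known point along one of the 16 compass directions."""
    bearing = math.radians(COMPASS_16.index(direction.upper()) * 22.5)
    lat1, lon1 = math.radians(lat), math.radians(lon)
    d = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)
```

For a label reading "5 mi NNE of" a known point, one would call displace(lat, lon, miles_to_km(5), "NNE"). The sketch does not wrap longitude back into the -180 to 180 range, so results near the antimeridian need an extra normalization step.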
3 From Locality to Collecting Event Each locality may have one or more collecting events attached to it. To see what collecting events are already associated with a locality, click on the small magnifier to see these records. The LocalityID is represented by the left side of the underscore character. It is easiest to find or create a locality first, then find or create a new collecting event to associate with it. You can mix/match serial & custom identifiers, but if you don't have to, stick to serial localities & collecting events
4 Serial Collecting Event Collecting events include the collecting method, collectors, collection date range, abiotic conditions at the time of collection, & a localityID. Serial collecting events (layout=1) could be unique references to both the locality & event, but because custom collecting eventIDs may NOT be unique, the combined LocCollEventID is used as the key.
4 Custom Collecting Event Custom (non-serial) and serial collecting events (CEVs) are displayed on the same tab, but custom CEVs use layout=2. Custom CEVs are composed from the method code, trap/site#, & custom CollEventID, all of which are designated by the person doing data entry. These codes may make more sense to collectors. Custom CEVs may be combined with custom or serial localities. They are only guaranteed to be unique in combination with the localityID as LocCollEventID.
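Why the combined locality/collecting-event identifier is needed as the key can be shown with a small Python sketch. The underscore separator follows the earlier slide (locality ID to the left of the underscore); the function names and the sample event code "MT-3-A" are hypothetical and only illustrate the idea.

```python
SEPARATOR = "_"  # the slides place the locality ID to the left of an underscore


def make_key(locality_id, coll_event_id):
    """Combine a locality ID and a collecting-event ID into one key."""
    locality_id = str(locality_id)
    if SEPARATOR in locality_id:
        raise ValueError("locality ID must not contain the separator")
    return f"{locality_id}{SEPARATOR}{coll_event_id}"


def split_key(key):
    """Recover (locality_id, coll_event_id) from a combined key."""
    locality_id, sep, coll_event_id = key.partition(SEPARATOR)
    if not sep:
        raise ValueError("not a combined key: " + key)
    return locality_id, coll_event_id


# Hypothetical custom event code reused at two different localities:
# the event code alone collides, the combined keys do not.
keys = {make_key(101, "MT-3-A"), make_key(102, "MT-3-A")}
```

Splitting on the first underscore only means a custom event code may itself contain underscores, while a locality ID may not; that restriction is a choice made for this sketch.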
4 Create Records in SPECIMEN Whether from serial or non-serial collecting event recordings, you can create two kinds of new records in SPECIMEN: Single specimen records with a unique identifier for the specimen Bulk sample record that reflects the collection of many specimens that may or may not have been sorted, but whose tracking is of importance. You can see specimens or a bulk sample already attached to this LocalityCollEventID.
5 Bulk Sample Tracking Bulk samples, or samples representing >1 specimen for which you are recording a locality & collecting event, are recorded as a special type of specimen. Its CollectionCode, if generated from CEV, is composed of the localityID and first half of the CEV. The CatalogNumber is the last number of the LocCollEventID. The BulkSampleID = the LocCollEventID in this case.
5 Bulk Sample Tracking in SPECIMEN Each bulk sample may be sorted into groups that may be sent to specialists for further identification. Each taxon is: identified to group & sexed; counted or not, including any specimens removed from the sample & not sent to the specialist; given a loan number specific to the borrower for each shipment (not each bulk sample). The loan is recorded using the pencil icon next to the loanID. If there are multiple loans to the same person in the same bulk sample, this is recorded only once. Create a taxon list of the sample
6 Track Bulk Sample Loans When you click on the pencil icon, you create a DEPOSIT record based on the sampleID, but also reflecting taxa in a loan that may be accumulated by loan. Print out the loan form after filling out shipping details
6 Basic Loan & Shipping Basic loan & shipping details are common to all loans & appear on the top tab. For loans based on samples, loan forms are accessed on their respective tabs
6 Sample Loan Form Note that this loan form lists taxa from one or more Sample IDs that have accumulated under a single loan number. Shown here in print preview, you see this form as it will print. Print this form, or return to the data entry screen using the icons seen when you are in browse mode (not print preview as these do not print & would not function in this mode)
Museums & Collections The 3- or 4-letter coden (Museum Code) represents a museum or collection & is used to represent loans to & from collections & in ILLUS for physical storage of images & documents. The populated table provided is based on Arnett et al.’s 1993 Insect & Spider Collections of the World, which is kept current by the Bishop Museum’s codens-R-us site* *http://hbs.bishopmuseum.org/codens/ also accessible from NAV’s resources help layout
7 Digital Archiving of Images Store images or other documents or links to these to the ILLUS table. Give detailed description of method, medium, background, copyright, view, stage, etc. Detail archiving of both digital & physical images. Link to literature citation.
8 Enter Literature Citations Enter all types of literature Link to journals, URL Translate titles from non-target language Give location & level of curation of the literature Sample citations prebuilt by citation type New bibliographic export feature
Journal Titles & Abbreviations Journal titles & their abbreviations may be stabilized for data entry & flexibility of output in literature citations. Publisher, City, & State/Country information is also sometimes required for some citations.
9 Dissect the Literature Not likely the first task you will undertake in Mandala, mining the literature & linking (via an organizing principle) it to a taxon or specimen are activities associated with maturing databases with a specific taxonomic or geographic focus.
Eyes Glazed Over Yet? OK, so you have that sick feeling in your stomach that this is WAY more database than you bargained for…too complex, too many features, too hard to understand. STOP! DON’T PANIC! Help & troubleshooting resources are built into Mandala or accessible online, only a click or two away…
Help Resources A variety of help resources are built into Mandala, from general, file, and field context-sensitive help, to explanations of the icons & terms used in Mandala, to links to online gazetteers, taxonomic name servers, & an index to codens used for many museums & collections.
1 Glossary of Icons & Terms Find out what all of those icons mean Learn the language of databasing & conventions used throughout Mandala
2 Integrated Help File/Table & field context sensitive Help. Click ? icon in table or while in field for which you want information.
3 Access Online Gazetteers
4 Access Taxonomic Names Servers (TNS)
Tools for Troubleshooting Track & resolve problems encountered by users Document changes to database structure Open all files or verify integrity after a power outage or crash See the ramifications of making changes to or deleting unique identifiers & prevent orphaned records
1 Electronic Sticky Notes Track user problems from any file even if you never see the user Track resolutions to problems Keep track of records deleted & the reason for deletion
2 Document Structural Changes Tracks changes by table & types of changes Lists changes made & changes needed Include screen captures of changes made Documents version concerned & environment (server, web, demo, etc.) For database administrators
3 Implications of ID Changes Specimens can be tied to at least 13 other tables related through the SpecimenID. Change the related records first before changing the SpecimenID
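The order of operations recommended above (related records first, key last) can be sketched with plain Python dictionaries standing in for tables. This is not FileMaker scripting; the field name specimen_id and the function names are hypothetical, chosen only to show how orphaned records arise and how to detect them.

```python
def change_specimen_id(specimens, related_tables, old_id, new_id):
    """Rename a specimen key, updating child rows before the parent."""
    if old_id not in specimens:
        raise KeyError(old_id)
    if new_id in specimens:
        raise ValueError(f"{new_id} is already in use")
    # 1. Re-point every related record first
    for rows in related_tables.values():
        for row in rows:
            if row["specimen_id"] == old_id:
                row["specimen_id"] = new_id
    # 2. Only then change the key on the specimen record itself
    specimens[new_id] = specimens.pop(old_id)


def find_orphans(specimens, related_tables):
    """Return {table_name: [rows whose specimen_id matches no specimen]}."""
    orphans = {}
    for name, rows in related_tables.items():
        missing = [r for r in rows if r["specimen_id"] not in specimens]
        if missing:
            orphans[name] = missing
    return orphans
```

Running find_orphans after any key change gives a quick check that no related record was left pointing at an identifier that no longer exists.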
Verifying Data Many levels of verification Look for the obvious Duplicate entries Information that doesn’t agree Outliers & ocean dwellers for terrestrial organisms Nothing beats proofing from original Examination by expert who will see errors on labels: misspellings, inconsistencies
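Two of the "obvious" checks, duplicate entries and geographic outliers, lend themselves to a short script run over exported records. The sketch below is a rough illustration under stated assumptions: the field names (specimen_id, lat, lon) are hypothetical, and a simple bounding box stands in for a real land/ocean test.

```python
from collections import Counter


def duplicate_ids(records, key="specimen_id"):
    """Return identifiers that occur more than once."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def outside_box(records, south, north, west, east):
    """Return georeferenced records that fall outside the expected box."""
    flagged = []
    for r in records:
        lat, lon = r.get("lat"), r.get("lon")
        if lat is None or lon is None:
            continue  # not georeferenced yet; nothing to check
        if not (south <= lat <= north and west <= lon <= east):
            flagged.append(r)
    return flagged
```

A dropped minus sign on a longitude is a typical way for a terrestrial record to land in the wrong hemisphere, and a box drawn loosely around the study area catches it. The box test does not handle regions that cross the antimeridian, and it is no substitute for proofing against the original labels.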
4 Synonymies Among People Names Because people may be listed as authors, taxon authorities, illustrators, determiners, lenders, borrowers, collectors, etc. their names may be presented in different forms. By designating synonymic status & senior synonym ID, you can choose the format to display in various output scenarios (labels, specimens examined, literature), even if a junior synonym is used.
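The senior-synonym lookup described above amounts to following a pointer from each name variant to its preferred form. A minimal Python sketch, assuming each record carries a hypothetical senior_id field that is empty on the senior synonym itself (the names and IDs are invented sample data):

```python
def senior_record(people, person_id):
    """Follow senior_id links until reaching the senior synonym."""
    seen = set()
    while people[person_id]["senior_id"] is not None:
        if person_id in seen:
            raise ValueError("circular synonymy")
        seen.add(person_id)
        person_id = people[person_id]["senior_id"]
    return people[person_id]


# Invented sample data: three presentations of one person's name
people = {
    1: {"name": "Smith, Jane A.", "senior_id": None},
    2: {"name": "J. A. Smith", "senior_id": 1},
    3: {"name": "Smith, J.", "senior_id": 2},
}
```

With this structure, a label or specimens-examined list can print senior_record(people, 3)["name"] even though the specimen record itself cites the junior form.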
So What’s the REAL Payoff for All This Work? ? Why do we make databases to organize our data? It’s not as though we’d rather be doing this than be out in the field!! We record our data so that we can examine it in novel ways that couldn’t otherwise be seen, to identify: Changes through time (phenologies) Changes & relationships through space (distributions; mapping) Holes in our data Strategic dates & places in which to collect Correlations of our data with datasets of others Facilitate publishing our data So on to the built-in reporting features!!
Reporting Features Report examples 1 Phenology 2 Illustrations of taxon 3 Image comparisons 4 Synonymies 5 Specimens examined 6 Faunal list 7 Loan management 8 Export for mapping
1 Reports: Phenologies You can search for all records of a taxonID in a state or province to find the range of collecting dates on which this species has been found in that geographic range. The search can also specify a range of sequential dates spanning more than 1 year; a calculator is supplied for this purpose. In find mode, use “…” between two numbers to find the range.
1 Julian vs. Sequential Date Julian date is a day from 1-365 (366 in leap years) that represents a day of any year. Use this to map phenologies for the same days regardless of year. Sequential date specifies a range of days linked to specified years and will not automatically accumulate data for the same day across years. Date information is pulled from the initial collecting date information entered for a collecting event (in COLLEVENT)
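The distinction between the two date types can be illustrated in Python. How Mandala computes its sequential date is not stated here, so the sketch assumes a simple running day count (Python's date ordinal) as a stand-in; the function names are hypothetical.

```python
from datetime import date


def day_of_year(d):
    """Day number within its own year: 1-365, or 1-366 in leap years."""
    return d.timetuple().tm_yday


def sequential_day(d):
    """Running day count that keeps the year (assumed stand-in)."""
    return d.toordinal()


def phenology_counts(collecting_dates):
    """Pool collecting events by day of year, regardless of year."""
    counts = {}
    for d in collecting_dates:
        day = day_of_year(d)
        counts[day] = counts.get(day, 0) + 1
    return counts
```

Two collections made on 1 June in different non-leap years share day 152 and are pooled by phenology_counts, whereas their sequential values differ by whole years. Note that after February the same calendar date falls one day later in a leap year.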
1 Reports: Export Phenology Print or export the phenology of one or more taxa, or simply view it to check for outliers (unexpected collecting dates such as January in Illinois!). Export to plot in your own charting software such as Microsoft Excel.
1 Plot Exported Results in Excel
2 Reports: Illustration of Taxon Some “reports” are collated automatically because of a relationship specified such as that between a taxonID & illustrations associated with that taxon. Click on magnifier icon to see details about these illustrations.
3 Compare Images Image comparisons now appear beside the entry form for all images. This image comparison is filtered automatically by the field “Illustration of,” allowing you to view comparisons with other illustrations in the database. Other comparisons could be made by creating other relationships (see your database administrator)
4 Specialized Search: Synonymies Script only finds valid species or genera. You can export synonymies from here. Or use the back-curved yellow arrow to view the synonymy on the tab for the taxon (see next).
4 Reports: Mining Portals Although the portal giving the full synonymic list for this species is dynamically generated, the indented list at the right, which is often used in catalogs, is not. This list needs to be regenerated by running the script associated with the single pencil icon on individual records or by clicking on the multiple pencil icon to update all found records.
5-6 Specimens Examined & Faunal Lists Complex finds may be facilitated here to generate faunal lists for a locality or specimens examined lists for a single taxon or a group of taxa. Start the find process with the scripted button “click to start FIND”, add criteria with + button, & execute find with blue find button in status bar.
5 Specimens Examined… Although you may do all of your work in SPECIMEN to assemble your specimens examined list, behind the scenes, Mandala is actually pulling your data from many different sources! PEOPLE -> PEO_JOIN -> TAXA (taxon authorities) PEOPLE -> PEO_JOIN -> COLLEVENT (collectors) LOCALITY -> COLLEVENT MUSEUMS -> DEPOSIT (lending, borrowing, original lending, redistribute to institutions)
5 View, Print, or Export Specimens Examined List Or, after finding the records from “Illinois,” for example, click a button at the bottom of the layout on the previous slide to sort your records for specimens examined for a single taxon or multiple taxa. Note here, the taxa are output as listed, without interpolation to a valid taxon name. You could change this on your layout. Choose to print, export or simply view your results.
6 View, Print, or Export Faunal List After you’ve found the records you want (we searched for the state of “Illinois”), you can choose a button at the bottom of the layout on the previous slide to sort your records for faunal lists by the taxon listed or by the valid name. Choose to print, export or simply view your results.
7 Reports: Managing Loans Mandala now manages loans from several different perspectives, so you must choose in which you are interested. Find this button on layout to lead you here
7 Loan Management… …depends on perspective & context and is handled in DEPOSIT. Context = specimens, owned by you, or sent by or received from museums or collections. Recorded on tab in SPECIMEN table. May be the partitioning of context = taxa. Context = taxa, sent by or received by you from museums or collections; bypasses specimens & lists groups of taxa in a loan in BULKSAMPLE. Context = borrower & loanID, where subsamples (usually taxonomically based) of one or many larger samples are tagged to send to a specialist (=borrower) under a single loan (=shipment). Uses Sample entry of SPECIMEN, BULKSAMPLE, & DEPOSIT tables.
7 Sort & View, Export, or Print Once found, choose option for sorting & viewing results. Print or export if desired.
8 Finding Data to Map Narrow to place, make sure georeferenced data exist by using the buttons to initiate & extend finds, &/or narrow to taxon Once found, export data & use your favorite mapping program or map your data with DiscoverLife or GBIF.
8 Map in Desired Program Collecting sites (black dots) in Madagascar where therevid fly specimens have been collected. iMap* was used here to map latitude/longitude in decimal degrees. Other mapping programs may be used by user preference. *http://www.biovolution.com/imap
Online Mapping… Map One or More Taxa with DiscoverLife.org These data were exported from the Therevid PHP database If others had contributed therevid fly data, their data would also appear here unless databases were limited. Can map species against potential plant hosts or associates.
Zoom in Progressively Shows taxa remaining on the map Click on map to zoom in Click on point to see data or on links in list below map
Point Data Served in DL These are data exported from Mandala to a tab-delimited text file for DiscoverLife. Links to the taxonomic name & back to the originating PHP record output (by RecID)
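A tab-delimited export of point data can be sketched with Python's standard csv module. The column names below are hypothetical; the layout that the mapping service actually expects is not given in these slides, so treat this only as an illustration of the file format.

```python
import csv
import io

FIELDS = ["rec_id", "taxon", "latitude", "longitude"]  # hypothetical columns


def export_points(records, out):
    """Write georeferenced records as tab-delimited text; return rows written."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    written = 0
    for r in records:
        if r.get("latitude") is None or r.get("longitude") is None:
            continue  # skip records that have not been georeferenced
        writer.writerow(r)
        written += 1
    return written
```

Passing an open text file (or an io.StringIO buffer for testing) as out yields one header line followed by one line per georeferenced record; keeping a record identifier in each row is what allows a map point to link back to its originating record.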
PHP Searches Drill back & forth between DL & PHP versions of Mandala The database version queried by PHP is not the production database but an optimized version of data exported from Mandala for display on the web (not pretty but functional!)
Contribute to GBIF Data served on Discover. Life (DL) can now be contributed to GBIF (Global Biodiversity Information Facility) GBIF datasets will be able to be mapped in DL
Newest additions… Support for: LSIDs (life science identifiers) for taxa, specimens, & people; DOIs (digital object identifiers) for literature; unique resource identifiers for illustrations (may be DOIs); BCI GUIDs (Biodiversity Collections Index Global Unique Identifiers) for collections. Track the progress of molecular studies based on: extractions from a specimen; PCR products using specified primers; sequences derived & deposition into GenBank
Coming in Early 2009 Seminal reference for Mandala Kampmeier, G. E. & Irwin, M. E. 2009. Meeting the interrelated challenges of tracking specimen, nomenclature, and literature data in Mandala. Chapter 15 in T. Pape, D. Bickel, & R. Meier [eds.], Diptera Diversity: Status, Challenges and Tools. Koninklijke Brill NV.
Acknowledgements Michael E. Irwin F. Chris Thompson Neal Evenhuis Chris Lambkin Don Webb Mark Metz Martin Hauser Kevin Holston Steve Gaimari J. Marie Metz Amanda Buck Kristin Algmin Andy Bennett University of Illinois DiscoverLife.org Schlinger Foundation Hatch ILLU 875-380 NSF Therevid PEET DEB 95-21925 & 99-77958 NSF BSI for Fiji Bioinventory of Arthropods DEB 0425790 John Pickering Illinois Natural History Survey Shelah Morita NSF AToL for Diptera EF-0334948 FMWebschool NSF Tabanid PEET DEB 07-31528