VOTable Format, I/O Libraries & Tools
Participants:
- François Ochsenbein (CDS, Strasbourg)
- Mark Taylor (Starlink, UK)
- Pallavi Kulkarni (IUCAA, India)
- Sonali Kale (Persistent Systems, India)
12-Oct-03, ADASS VO Tutorial
Agenda
- VOTable format
- I/O libraries
- Plotting tools for VOTables
  - Mirage
  - VOPlot
  - TOPCAT
- VOTable browsers
  - Treeview
Understanding VOTable
François Ochsenbein (francois@simbad.u-strasbg.fr), on behalf of: Roy Williams, Clive Davenhall, Daniel Durand, Pierre Fernique, David Giaretta, Robert Hanisch, Tom McGlynn, Alex Szalay, Andreas Wicenec
Overview
- The VOTable format is a proposed XML standard for representing tabular data in the context of the Virtual Observatory.
- VOTable was designed to be compatible with FITS binary tables (the data part can be addressed directly).
- VOTable is designed as a flexible storage and exchange format for tabular data, with particular emphasis on astronomical tables.
Why move to VOTable?
- Interoperability is encouraged through the use of standard data structures and descriptions (metadata).
  - FITS is a widely accepted structure, but FITS keywords are rarely shared beyond the basic ones.
- XML built-ins:
  - easy validation of the input document
  - easy transformations and displays via an XSLT engine
- VOTable can cope with very large datasets:
  - does not require the huge XML overhead
  - direct access to binary files and existing FITS files
Data Model
  VOTable   = hierarchy of Metadata + Tables
  Metadata  = Infos + Descriptions + Parameters + Links + Fields
  Resource  = set of Tables
  Table     = list of Fields/Parameters + Data
  Data      = stream of Rows (or a binary stream, possibly remote)
  Row       = list of Cells
  Cell      = Primitive, or variable-length list of Primitives, or multidimensional array of Primitives
  Primitive = integer, character, float, complex
A VOTable document
Contains the following elements:
- DEFINITIONS: typically about the coordinate system(s) used
- RESOURCE: contains a DESCRIPTION and a list of tables, and possibly (recursively) other RESOURCEs
- TABLE contains:
  - a textual DESCRIPTION of the table
  - a list of FIELDs (columns) which describe the table layout, and possibly PARAMETERs which may specify some constants
  - the DATA part, which can be present in the document, remotely accessible in several formats including FITS, or absent (leaving only the description of the data structure)
  - possibly LINKs for getting details, explanations, related data, ...
Table Element in Detail
- Written as a DESCRIPTION and a collection of FIELD elements, plus possible PARAMETER and LINK elements, all representing the metadata, followed by the DATA.
- FIELDs describe the nature of the table columns.
- Table data start with the DATA element, followed by the actual rows containing the values of the fields, in the same order as their descriptions.
FIELD in detail
- Has sub-elements like DESCRIPTION, LINK, and possibly a VALUES element.
- Provides information for the corresponding cell in the DATA element.
- For identification, the FIELD has a name and also an ID:
  - name is the field designation (column header)
  - ID is an identifier which follows the XML rules (restricted character set, uniqueness within an XML document)
- The FIELD must contain a datatype attribute.
- Each table cell may contain more than one value of the specified datatype; this is specified by arraysize, which can be variable and multidimensional (e.g. 64x64x*).
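To make the arraysize notation concrete, here is a minimal Java sketch that splits a value such as 64x64x* into its dimensions. It is not part of any VOTable library: the class name ArraySizeSketch and the method parse are hypothetical. It assumes that only the last dimension may be variable (our reading of the VOTable specification) and records a variable dimension as -1, ignoring any maximum length written before the asterisk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class ArraySizeSketch {

    // Returns one entry per dimension; -1 marks a variable-length dimension.
    static List<Integer> parse(String arraysize) {
        String[] parts = arraysize.trim().split("x");
        List<Integer> dims = new ArrayList<>();
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.endsWith("*")) {
                // Assumption: a '*' is only legal in the last dimension.
                if (i != parts.length - 1) {
                    throw new IllegalArgumentException(
                            "only the last dimension may be variable: " + arraysize);
                }
                dims.add(-1);
            } else {
                dims.add(Integer.parseInt(part));
            }
        }
        return dims;
    }

    public static void main(String[] args) {
        System.out.println(parse("64x64x*")); // prints [64, 64, -1]
        System.out.println(parse("12"));      // prints [12]
    }
}
```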
Available Datatypes
[Table of VOTable datatypes shown on the slide]
FIELD (cont.)
- unit attribute: specifies the units, in a controlled vocabulary
- ucd (Unified Content Descriptor) attribute: provides information about the semantics of the field, expressed as a standardized string; UCDs were originally created at CDS and are currently being reviewed
- precision attribute: indicates the number of significant digits, for output purposes
- VALUES element: holds the domain information (min, max, null, set of available values)
- LINK element: provides pointers to other documents or data servers through a URL
Parameter definitions
- A PARAMETER is similar to a FIELD: it may contain a DESCRIPTION and the attributes unit, ucd, name, ID, ..., plus a value attribute.
- It may be considered as a constant column.
- Typical examples:
  - frequency or wavelength of flux measurements
  - statistical error of results
Data Content
- The data content of the table is in a single DATA element and is organized in reading order.
- Data can be stored or accessed in several formats:
  - TABLEDATA introduces an XML coding of the table
  - FITS refers to an external FITS file
  - BINARY indicates a binary coding of the data, used for its efficiency
  - STREAM introduces remote or encoded data
- Data can be in the input stream or stored separately.
VOTable Additions
Several new features are being considered:
- GROUP of fields: introduces a view of the fields as a structure
- utype attribute: characterizes the role of a field in the context of a data model
- encoding attribute of a cell element: allows storing e.g. images in table cells
VOTable I/O libraries
Pallavi Kulkarni (pck@iucaa.ernet.in)
I/O Libraries
- Why are parsers required?
- Why are writers required?
- Tree structured approach
- Event driven approach
- Tree structured vs. event driven approach
- VOTable parsers & writers
Why are parsers required?
- Provide a library for API-based access to VOTable files.
- These APIs can be used directly to develop VOTable applications without having to do raw VOTable processing.
Why are writers required?
- A writer provides APIs for generating syntactically correct documents.
- The user doesn't need to be aware of low-level APIs.
- Simplifies the process of writing a document.
Tree Structured Approach
The tree structured approach is a two-step process:
1. The entire XML data is loaded in memory.
2. Operations are performed on the loaded data.
[Diagram: a VOTable node containing a Resource, which contains three Tables]
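The two steps can be sketched with the generic DOM parser that ships with Java (javax.xml.parsers), used here as a stand-in for a VOTable-specific library. The class name TreeApproachSketch, its methods, and the embedded sample document are illustrative assumptions. Step 1 loads the whole document into memory; step 2 reads and then modifies the loaded tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Illustration only: generic DOM parsing, not a VOTable library.
class TreeApproachSketch {

    static final String SAMPLE =
            "<VOTABLE><RESOURCE><TABLE>"
            + "<FIELD name=\"area\" datatype=\"int\"/>"
            + "<DATA><TABLEDATA>"
            + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
            + "</TABLEDATA></DATA>"
            + "</TABLE></RESOURCE></VOTABLE>";

    // Step 1: load the entire XML document into an in-memory tree.
    static Document load(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Step 2: operate on the loaded tree, here by collecting every TD cell.
    static List<String> cells(Document doc) {
        NodeList tds = doc.getElementsByTagName("TD");
        List<String> values = new ArrayList<>();
        for (int i = 0; i < tds.getLength(); i++) {
            values.add(tds.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) {
        Document doc = load(SAMPLE);
        System.out.println(cells(doc)); // prints [2000, 35467]
        // The tree can be modified in place and traversed again.
        doc.getElementsByTagName("TD").item(0).setTextContent("2001");
        System.out.println(cells(doc)); // prints [2001, 35467]
    }
}
```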
Event Driven Approach
- Does not create a tree structure in memory.
- Single-step process.
- Data is passed from the XML document to the application on the fly.
Tree Structured vs. Event Driven
- The event driven approach is faster.
- The event driven approach is good for large documents because it takes comparatively less memory.
- With the event driven approach, one can access the data but not modify it.
- With the tree structured approach, one can modify data.
- Tree structured parsing allows back-and-forth traversal of the XML data.
- Event driven parsing can be stopped once the desired XML data has been extracted.
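The last point, stopping an event driven parse early, can be sketched with the generic SAX parser that ships with Java. This is an illustration rather than the API of any VOTable library: the class EventApproachSketch, the marker exception StopParsing, and the embedded sample document are assumptions. The handler receives events as the document is read, keeps only the first TD cell, and then aborts the parse by throwing the marker exception.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: generic SAX parsing, not a VOTable library.
class EventApproachSketch {

    static final String SAMPLE =
            "<VOTABLE><RESOURCE><TABLE>"
            + "<FIELD name=\"area\" datatype=\"int\"/>"
            + "<DATA><TABLEDATA>"
            + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
            + "</TABLEDATA></DATA>"
            + "</TABLE></RESOURCE></VOTABLE>";

    // Hypothetical marker exception used to stop the parse early.
    static class StopParsing extends SAXException {
        StopParsing() {
            super("first cell found");
        }
    }

    // Returns the text of the first TD cell without reading the rest of the document.
    static String firstCell(String xml) {
        final StringBuilder cell = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inCell = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("TD".equals(qName)) {
                    inCell = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inCell) {
                    cell.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                if ("TD".equals(qName)) {
                    throw new StopParsing(); // desired data extracted, stop here
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing done) {
            // Expected: the handler stopped the parse on purpose.
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return cell.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstCell(SAMPLE)); // prints 2000
    }
}
```

No tree is built, so memory use does not grow with the number of rows, but the second row is never seen and nothing can be modified in place.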
VOTable Parsers & Writers
- JAVOT (NVO)
- SAVOT (European VO)
- VOTable Java Streaming Writer (VO-India)
- C++ Parser (VO-India)
- VOTable Perl Modules (NVO)
- VOTable::DOM (NVO)
JAVOT
- Developed at Caltech.
- Written in Java.
- Creates a tree structure in memory.
- The current version supports reading of VOTables.
- Editing and writing of tables are not yet supported.
- More information can be found at: http://www.us-vo.org/VOTable/JAVOT/
SAVOT
- Developed at CDS.
- Written in Java.
- Supports reading, writing & editing of VOTables.
- Can work in both full (tree structured) and sequential (event driven) modes for parsing the data.
- Suitable for large VOTable files as well.
- More information can be found at: http://simbad.u-strasbg.fr/public/cdsjava.gml
Sample VOTable
  <VOTABLE>
    <RESOURCE>
      <TABLE>
        <FIELD name="area" datatype="int"/>
        <DATA><TABLEDATA>
          <TR><TD>2000</TD></TR>
          <TR><TD>35467</TD></TR>
        </TABLEDATA></DATA>
      </TABLE>
    </RESOURCE>
  </VOTABLE>
Sample Code (Sequential mode)
  // sb is the SAVOT parser instance (its creation is not shown on this slide)
  // Get the resource element
  SavotResource currentResource = sb.getNextResource();
  // Fetch all the rows from the first table
  TRSet tr = currentResource.getTRSet(0);
  // Fetch the first row from the set of rows
  TDSet theTDs = tr.getTDSet(0);
  // Print the data field-wise
  for (int k = 0; k < theTDs.getItemCount(); k++) {
      System.out.println(theTDs.getContent(k));
  }
VOTable Java Streaming Writer
- Developed at IUCAA and Persistent Systems.
- Written in Java.
- Supports only writing of data.
- Takes a streaming (i.e. event-based) approach to write the data.
- More information can be found at: http://vo.iucaa.ernet.in/~voi/votableStreamWriter.htm
VOTable to be generated
  <VOTABLE>
    <RESOURCE>
      <TABLE>
        <FIELD name="Planet" datatype="char"/>
        <DATA><TABLEDATA>
          <TR><TD>Mercury</TD></TR>
        </TABLEDATA></DATA>
      </TABLE>
    </RESOURCE>
  </VOTABLE>
Sample Code
  // Initialize the streaming writer
  VOTableStreamWriter voWrite = new VOTableStreamWriter(prnStream);
  // Create and write the VOTable element
  VOTable vot = new VOTable();
  voWrite.writeVOTable(vot);
  // Create and write the Resource element
  VOTableResource voResource = new VOTableResource();
  voWrite.writeResource(voResource);
  // Create a table element
  VOTable voTab = new VOTable();
  // Create a field and add it to the table
  VOTableField voField = new VOTableField();
  voField.setName("Planet");
  voField.setDataType("char");
  voTab.addField(voField);
Sample code contd.
  // Write the table to the output stream
  voWrite.writeTable(voTab);
  // Create a row and write it to the output stream
  String[] firstRow = {"Mercury"};
  voWrite.addRow(firstRow, 1);
  // End the respective elements
  voWrite.endTable();
  voWrite.endResource();
  voWrite.endVOTable();
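For readers without the VO-India library at hand, the same start/write/end pattern can be sketched with the generic StAX writer that ships with Java (javax.xml.stream.XMLStreamWriter). This swaps in a different API purely for illustration: the class StreamingWriteSketch and the method writePlanets are hypothetical, and no validation of the VOTable structure is performed. Each element is sent to the output as soon as it is written, so no table is held in memory.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Illustration only: generic StAX writing, not the VOTable Java Streaming Writer API.
class StreamingWriteSketch {

    // Writes a one-column VOTable with one row per planet name and returns it as a string.
    static String writePlanets(String[] planets) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("VOTABLE");
            w.writeStartElement("RESOURCE");
            w.writeStartElement("TABLE");
            // Column description, mirroring the FIELD element on the slide.
            w.writeEmptyElement("FIELD");
            w.writeAttribute("name", "Planet");
            w.writeAttribute("datatype", "char");
            w.writeStartElement("DATA");
            w.writeStartElement("TABLEDATA");
            for (String planet : planets) {
                // Each row is emitted immediately, one TR with a single TD.
                w.writeStartElement("TR");
                w.writeStartElement("TD");
                w.writeCharacters(planet);
                w.writeEndElement(); // TD
                w.writeEndElement(); // TR
            }
            w.writeEndElement(); // TABLEDATA
            w.writeEndElement(); // DATA
            w.writeEndElement(); // TABLE
            w.writeEndElement(); // RESOURCE
            w.writeEndElement(); // VOTABLE
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write document", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writePlanets(new String[] {"Mercury"}));
    }
}
```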
C++ Parser
- Developed at Persistent Systems and IUCAA.
- Written in C++.
- Available as both a streaming and a non-streaming parser.
- Runs on Windows NT 4.0, Windows 2000, and Red Hat Linux 7.1.
C++ Parser (cont.)
- Currently supports reading of VOTables and pure-XML table data.
- Currently does not support reading of binary or FITS data, or writing of VOTables.
- More information can be found at: http://vo.iucaa.ernet.in/~voi/cplusparser.htm
VOTable Perl modules
- Developed by the ClassX project at GSFC (NVO).
- Written in Perl.
- Builds a tree structure in memory.
- Includes optimizations to handle large TABLEDATA elements.
- Can be used for parsing existing VOTables and creating new ones from scratch.
- More information can be found at: http://heasarc.gsfc.nasa.gov/classx/pub/votable/
Perl formatting & printing
- Developed at NCSA (NVO).
- Written in Perl.
- Supports writing of VOTable documents.
- Takes a streaming (i.e. event driven) approach to write data.
- More information can be found at: http://monet.ncsa.uiuc.edu/~rplante/VO/VOTable-DOM.pm
VOTable Applications: VOPlot & Mirage
Sonali Kale (sonali@persistent.co.in)
Plotting Tools for VOTables
- VOPlot: developed under the Virtual Observatory India initiative.
- Mirage: developed by Lucent Technologies, Bell Labs.
- TOPCAT: developed by Starlink, UK.
VOPlot: Introduction
- Visualization toolkit developed by Persistent Systems and IUCAA in collaboration with CDS.
- Motivation: allow web-based visualization of astronomy data.
- Available as a web-based version as well as a standalone desktop application.
- The web-based version is integrated with the VizieR Catalogue Service.
VOPlot: Visualization
Has the features of a versatile graphical tool:
- Scatter plots
- Connected plots
- Histograms
- Logarithmic axes
- Overlaying
- Auto-ranging
VOPlot: Features
- Column transformations based on an expression.
- Data subset creation based on a filter condition.
- Save the graph in EPS format.
- Ability to select points on the graph.
- View metadata and data in tabular and VOTable format.
VOPlot: Features (cont.)
- Interoperable with Aladin, developed by CDS.
- This enables simultaneous visualization of a graph in VOPlot together with its representation as a sky plot in Aladin.
- Selecting a region in the graph highlights the corresponding points in the sky plot, and vice versa.
VOPlot: Features (cont.)
- VOPlot can be used for basic statistical analysis.
- Basic univariate and multivariate functions are supported.
- Displays box plots. A box plot provides a visual summary of important aspects of the data distribution.
VOPlot: Features (cont.)
- Allows overlaying data from multiple catalogues.
- Allows drawing of error bars.
- Can be integrated with any web-based catalogue service that creates output in VOTable format.
- Example: integrated with LEDAS (UK).
Launch VOPlot from VizieR
[Screenshot]
Launch VOPlot from VizieR (cont.)
[Screenshot]
Launch VOPlot from VizieR (cont.)
[Screenshot: choose Y column]
VOPlot: Inter-operation with Aladin
[Screenshot]
Inter-operation with Aladin (cont.)
[Screenshot: point highlighted in VOPlot; object selected in Aladin]
VOPlot: References
- VOPlot: http://vo.iucaa.ernet.in/~voi/voplot.htm
- VizieR Catalogue Service: http://vizier.u-strasbg.fr/viz-bin/VizieR
- Aladin: http://aladin.u-strasbg.fr/
- Virtual Observatory India: http://vo.iucaa.ernet.in/~voi/
Mirage: Introduction
- Mirage is a Java-based tool for data visualization and exploratory analysis.
- Developed at Bell Laboratories, Lucent Technologies.
- Support for VOTables was added in collaboration with Johns Hopkins University.
- Ray Plante produced an XSL stylesheet to convert from VOTable to Mirage format.
Mirage: Data Visualization Features
- Mirage provides multiple simultaneous views of data.
- Data visualization is provided through:
  - Data matrix view
  - Scatter plots
  - Histogram
  - Feature vector plot
Mirage: Operations On Graph
- Points on a plot can be selected using different selection methods (box, Bezier curve or freehand).
- A selection can be broadcast to other views.
- Automatic walk through graphs.
- A grid of plots can be created.
- Coloring of plot points.
Mirage: Command Interpreter
The command line interpreter supports:
- Creation of new columns based on arithmetic expressions.
- Addition of new columns from another data file.
- Selection of points using a logical expression.
- Color selection of points.
Mirage: Data Matrix View
- Data read from file
- One row for each entry and one column for each attribute
Mirage: Scatter Plot
- Walking through a plot
- Choose a column for plotting
Mirage: Histogram
- Change bin width
Mirage: Multiple Simultaneous Views
- Make a selection
- Broadcast selection
Mirage: Grid of plots
- Zoom in and out
- Command to add column
Mirage: References
- Mirage: http://www.bell-labs.com/project/mirage/
- VO Enabled Mirage: http://skyservice.pha.jhu.edu/develop/vo/mirage/
- XSL stylesheet to convert VOTable to Mirage format: http://www.eurovo.org/pub/articles/ScienceWithProtoVOtools/vot2mirage.xml
VOTable Applications: TOPCAT & Treeview
Mark Taylor (m.b.taylor@bristol.ac.uk)
TOPCAT
Tool for OPerations on Catalogues And Tables
- Generic table read/view/edit/analyse/plot/write
- Format-neutral: VOTable, FITS, SQL and others
- Pure Java
- Full online help
- Powerful expression language
- Supports large tables
- Under active development
TOPCAT capabilities
- View/edit table data
- View/edit table and column metadata
  - Units, UCDs, type, array shape, format-specific items, ...
- Create new columns using algebraic expressions
  - Powerful language: conditionals, nulls, string handling, ...
- Move/delete columns
- Sort rows
- Create row subsets in various ways
  - Algebraic, graphical, boolean columns, selected rows
- Plot columns against each other
  - Distinguish different defined subsets
- Calculate per-column statistics
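Table I/O
[Diagram: input formats are read into a format-neutral view provided by the Tables class library, which serves applications and output formats]
- Input formats: VOTable (TABLEDATA, FITS, STREAM?), FITS (TABLE, BINTABLE), SQL, ... others
- Tables class library: format-neutral view
- Applications: TOPCAT (view/edit GUI), Tablecopy (command line), (your code here)
- Output formats: VOTable, FITS, SQL, ASCII, LaTeX, Mirage, ... others
- Example: % tablecopy -ofmt fits my-votable.xml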
Treeview
- Find tables in deep hierarchies:
  - Directory trees
  - VOTable hierarchical structure
  - Multi-extension FITS files
  - Compression: .gz, .bz2, .Z
  - Tar, zip, jar files
  - FTP, HTTP
- Quick preview of VOTables
  - XML, table data, table/column metadata
- Launch TOPCAT
  - Recent versions of TOPCAT (v0.4) have embedded Treeview
References
- TOPCAT: http://www.starlink.ac.uk/topcat/
- Treeview: http://www.starlink.ac.uk/treeview/
- Starlink: http://www.starlink.rl.ac.uk/