Storage and Dissemination of SEGY Data in JPEG 2000 Format

- Slides: 21
Storage and Dissemination of SEGY Data in JPEG 2000 Format
Bob Courtney
Geological Survey of Canada (Atlantic)
bob.courtney@nrcan.gc.ca
2 The Context & Problems
Problem
• Marine program has been collecting digital SEGY data since the early 1990s
• Scientists use printed field records in preference to digital products
• New digital systems (e.g., 3.5 kHz chirp on multibeam vessels) are strictly digital
• Digital processing issues: data size (> 1 TB/yr), comparison to gold-standard printed records, time vs. return
• Discovery and dissemination problems: cost of copying analog records, record degradation, size of digital SEGY archives
• Database population and update issues: no validation of digital data
3 GSC Implementation of JPEG 2000
• GSC has implemented the JPEG 2000 framework to consolidate, encode, archive, interpret and disseminate digital SEGY data
• Experience suggests compression ratios between 10:1 and 40:1 are effective
• Approach applied to seismic, sidescan, and sounder data. Will (?) be extended to image trace data of multibeam sounders (water column imaging) and other gridded data sets
• All ancillary data encoded via XML schemas; metadata harvested for the database during normal processing (carrot vs. stick approach)
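A minimal sketch of what harvesting header metadata into XML during normal processing could look like. The byte offsets are the standard SEG-Y binary file header positions (sample interval, samples per trace, data format code). The element names `segyHeader`, `sampleIntervalUs`, `samplesPerTrace` and `formatCode` are hypothetical and are not taken from the GSC SEGY.xsd schema.

```python
import struct
import xml.etree.ElementTree as ET


def read_binary_header(segy_bytes):
    """Read three fields of the SEG-Y binary file header.

    The 400-byte binary header follows the 3200-byte textual header.
    Fields are big-endian 16-bit integers at 1-based byte positions
    3217-3218, 3221-3222 and 3225-3226.
    """
    if len(segy_bytes) < 3600:
        raise ValueError("file too short to contain SEG-Y file headers")
    (interval_us,) = struct.unpack_from(">h", segy_bytes, 3216)
    (n_samples,) = struct.unpack_from(">h", segy_bytes, 3220)
    (format_code,) = struct.unpack_from(">h", segy_bytes, 3224)
    return {
        "sampleIntervalUs": interval_us,
        "samplesPerTrace": n_samples,
        "formatCode": format_code,
    }


def header_to_xml(header):
    """Serialize the header dictionary as XML (hypothetical element names)."""
    root = ET.Element("segyHeader")
    for name, value in header.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

An XML fragment produced this way could be validated against a schema and loaded into a database without reopening the SEGY file.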
4 What is JPEG 2000?
• JPEG 2000 is definitely not JPEG
• Open file standard ISO/IEC 15444-1:2000
• Wavelet based, multiresolution representation
• Up to 38-bit signed data: not just images
• Up to 16,000 planes/channels
• Entropy-based (MQ) bit-plane encoding (save 20 bits instead of 32; white space costs almost nothing)
• Lossless/lossy encoding: harmonic distortion for lossy compression
• Flexible file format: XML-aware, UUID-defined boxes
• Random access to ROI, transcoding, quality layers, etc.
• Internet ready: JPIP => low-bandwidth optimized
• Industry support: e.g., Lizardtech, Adobe Photoshop
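To make the "wavelet based" and "lossless" bullets concrete, here is a small sketch of one level of the reversible 5/3 integer lifting transform, the filter JPEG 2000 Part 1 uses for lossless coding. It is a simplified 1-D illustration for even-length integer traces with symmetric boundary extension. A real codec applies the transform in two dimensions over several levels, and the function names are illustrative only.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform (integers in, integers out)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expects an even number of samples (at least 2)")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    detail = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # symmetric extension
        detail.append(odd[i] - ((even[i] + right) >> 1))
    # Update step: smooth the even samples with the neighbouring details.
    approx = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]  # symmetric extension
        approx.append(even[i] + ((left + detail[i] + 2) >> 2))
    return approx, detail


def inverse_53(approx, detail):
    """Undo forward_53 exactly by running the lifting steps in reverse order."""
    half = len(approx)
    even = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(approx[i] - ((left + detail[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.extend([even[i], detail[i] + ((even[i] + right) >> 1)])
    return x
```

The `approx` half is a half-resolution version of the trace, and repeating the transform on it gives the multiresolution pyramid. Because every step uses integer arithmetic that the inverse repeats exactly, reconstruction is bit-exact.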
5 SEGY JPEG 2000 Processing Framework
[Workflow diagram. Box labels: Tape, Scan, Harvest, Encode, Convert, QC, Register, Interpret, Archive, DVD, GIS, Internet, JPEG 2000 Viewers]
6 Harvest: Demultiplex and Combine
[Diagram: File 1, File 2, ..., File n-1, File n => Demultiplex => Combine Channels => Concatenate => Big SEGY (>200,000 pings, 2 GB)]
7 Harvest: Demultiplex and Combine
• Reduce number of files => 1 file/day rather than 50
• Database-linked nomenclature
• Composite channel files: sidescan, 2-channel high res
• Self-descriptive file names
  • Expedition_datatype_instrument_xdcr_starttime_endtime
  • 2007006_SEISMIC_KNUDSON_3.5khz_132_0007_to_132_1217.sgy
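The naming convention above can be parsed mechanically, which is what makes it useful for database linkage. The sketch below splits the example file name into its fields. Reading `132_0007` as day-of-year 132 at 00:07 is an assumption, and the `SegyName` type and `parse_name` function are hypothetical names, not part of the GSC tools.

```python
import re
from typing import NamedTuple


class SegyName(NamedTuple):
    expedition: str
    datatype: str
    instrument: str
    transducer: str
    start_day: int    # assumed to be day of year
    start_hhmm: str   # assumed to be hours and minutes
    end_day: int
    end_hhmm: str


_NAME_PATTERN = re.compile(
    r"^(?P<expedition>[^_]+)_(?P<datatype>[^_]+)_(?P<instrument>[^_]+)_"
    r"(?P<transducer>[^_]+)_(?P<start_day>\d{3})_(?P<start_hhmm>\d{4})_to_"
    r"(?P<end_day>\d{3})_(?P<end_hhmm>\d{4})\.sgy$"
)


def parse_name(filename):
    """Split a self-descriptive SEGY file name into its fields."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"name does not follow the convention: {filename}")
    fields = match.groupdict()
    return SegyName(
        expedition=fields["expedition"],
        datatype=fields["datatype"],
        instrument=fields["instrument"],
        transducer=fields["transducer"],
        start_day=int(fields["start_day"]),
        start_hhmm=fields["start_hhmm"],
        end_day=int(fields["end_day"]),
        end_hhmm=fields["end_hhmm"],
    )
```

For example, `parse_name("2007006_SEISMIC_KNUDSON_3.5khz_132_0007_to_132_1217.sgy")` yields expedition `2007006`, instrument `KNUDSON` and transducer `3.5khz`.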
8 Encode: SEGY to SGYJP2
[Diagram labels, grouped]
• XML: SEGY.xsd; 1:1 SEGY headers; summary data; GZIP; XML to Database
• SEGY signal conditioning: filter, outliers; bipolar, envelope, or half-wave
• Waveform data: zero padding of trace delays; lossless or lossy 10:1; keep only significant bit-depth; choose reduced bit-depth scaled to highest amplitude
• JPEG 2000 Compression Engine: 10:1 - 40:1
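The "reduced bit-depth scaled to highest amplitude" step can be sketched as a simple uniform quantizer. This is an illustration under assumptions, not the GSC encoder: the function names and the default of 12 bits are hypothetical, and half-wave rectification is shown as the plainest of the three conditioning options named on the slide.

```python
def half_wave(samples):
    """Half-wave rectification: keep positive amplitudes, zero the rest."""
    return [s if s > 0 else 0.0 for s in samples]


def quantize_trace(samples, bits=12):
    """Map samples to signed integer codes of the given bit-depth.

    The largest absolute amplitude maps to the largest positive code,
    so no code range is wasted. Returns (codes, scale), where
    code * scale approximates the original sample.
    """
    full_scale = 2 ** (bits - 1) - 1
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return [0] * len(samples), 1.0
    codes = [round(s * full_scale / peak) for s in samples]
    return codes, peak / full_scale


def dequantize_trace(codes, scale):
    """Recover approximate amplitudes from integer codes."""
    return [c * scale for c in codes]
```

The reconstruction error is bounded by half a quantization step (`scale / 2`), so the chosen bit-depth directly sets the accuracy retained before the JPEG 2000 engine sees the data.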
9 Encode: SEGY to SGYJP2
10 Encode: SEGY to SGYJP2
11 Encode: SEGY to SGYJP2
12 Encode: SEGY to SGYJP2
• Sample from 3.5 kHz Knudsen, Creed, St. Lawrence Estuary
• 69333 traces; 13333 samples/trace; 12 hr of data; 10:1 compression
• Lizardtech IE plugin
13 Encode: SEGY to SGYJP2
• Signal amplitudes (in this case, the envelope) are encoded in the file
• Anti-aliasing at all zoom levels
14 Encode: SEGY to SGYJP2
Comparison => 10:1 to 50:1 compression
• 50:1 / 0.3 bpp
• 10:1 / 1.6 bpp
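The ratio and bits-per-sample figures on this slide are linked by simple arithmetic. The sketch below assumes 16 stored bits per sample, which is inferred from the slide's own numbers (16 / 10 = 1.6 bpp, 16 / 50 = 0.32 bpp) rather than stated by the author. The function names are illustrative.

```python
def bits_per_sample(stored_bits, ratio):
    """Average bits per sample after compressing at the given ratio."""
    return stored_bits / ratio


def compressed_bytes(n_traces, samples_per_trace, stored_bits, ratio):
    """Approximate compressed size in bytes, ignoring headers and metadata."""
    raw_bits = n_traces * samples_per_trace * stored_bits
    return raw_bits / ratio / 8


# Figures from the Knudsen 3.5 kHz sample on slide 12, with the 16-bit assumption.
knudsen_size = compressed_bytes(69333, 13333, stored_bits=16, ratio=10)
```

Under that assumption the 12-hour Knudsen record is roughly 1.85 GB uncompressed and about 185 MB at 10:1.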
15 Encode: SEGY to SGYJP2
• Sidescan trace encoding; equally effective for MBES
16 Interpret: SGYJP2
[Diagram labels: horizons.xsd; XML to Database; XML Horizons; SGYJP2 View & Interpret; XML Markers; GZIP; SGYJP2 Automation; Shapefiles; GIS; XML Sections]
17 Interpret: SGYJP2
18 Interpret: SGYJP2
19 Ongoing Efforts: SGYJP2
• Shapefiles and ESRI automation
• Google Earth KMZ
• Drivers for Klein digital formats
• GSF, XTF encoding (XML schemas)
20 Research Efforts: SGYJP2
• Multiplane data => MBES; multichannel seismic
• Wavelet transforms: custom based, KLT?
• Web services => extend/adapt JPIP
• Multiscale methods of data cleaning, characterization
• Bathymetry gridding => 1 m, 2 m, 4 m => multiscale
• Rate versus distortion => how much accuracy do you need? 1%, 0.1%, etc.
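As a rough illustration of the multiscale gridding idea (1 m => 2 m => 4 m), the sketch below builds coarser grids by averaging 2 x 2 blocks of cells. Block averaging is an assumption chosen for simplicity; the slide does not say which multiscale method is under investigation, and the function names are hypothetical.

```python
def coarsen(grid):
    """Halve the resolution of a grid by averaging each 2 x 2 block of cells."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def pyramid(grid, levels):
    """Return [grid, coarsen(grid), ...] with `levels` coarser grids appended."""
    result = [grid]
    for _ in range(levels):
        result.append(coarsen(result[-1]))
    return result
```

Starting from a 1 m grid, `pyramid(grid, 2)` returns the 1 m, 2 m and 4 m versions, which mirrors the resolution levels a wavelet decomposition provides.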
21 Software: SGYJP2
• Tools and schemas developed to disseminate GSC data
• Tools and schemas are free
• Single-user license, no redistribution (need to measure impact)
• Email requests to bob.courtney@nrcan.gc.ca
• No support: we have limited capacity
• Research partnerships to extend and continue these efforts are welcome