Microsoft Research: SKYSERVER
Jim Gray, Distinguished Engineer, Microsoft Research San Francisco


Microsoft Research
• Organization goal: advance the state of the art
• More than 700 staff, 55 areas
• Labs in US, Europe, Asia
• Internationally recognized teams
• University organizational model
• Open research environment
• Close ties to universities
• Close working relations with development

My Research Goal
• Information at your fingertips
• Bring all scientific literature and data online
• Focus on large database issues and scalable servers
[Diagram: Experiments & Instruments, Other Archives, Literature, and Simulations supply facts; questions go in and answers come out]

World Wide Telescope
• Premise: most astronomy data is online
• The Internet is the world’s best telescope
    • It has data on every part of the sky
    • In every measured spectral band
    • As deep as the best instruments
    • It is up when you are up
    • The “seeing” is always great (no working at night, no clouds, no moons, no ...)
    • It’s a smart telescope: it links data with literature

SkyServer.SDSS.org
• Built with Johns Hopkins U.
• A modern archive
    • Raw data in file servers
    • Catalog data (derived objects) in database
    • 10 billion records, 2 TB
• Also used for education
    • 150 hours of online astronomy
• Interesting things
    • Based on web services
    • Spatial data search
    • Cloned by other surveys (a design template)
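
The "spatial data search" item can be illustrated with a cone search: find every catalog object within a given angular radius of a sky position. The slide does not say how SkyServer implements this, so the following is only a minimal sketch. The function names, the `ra`/`dec` field names, and the sample rows are assumptions made for illustration, and a real archive would use a spatial index instead of a linear scan.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form: numerically stable for the small separations typical of cone searches.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(catalog, ra, dec, radius_deg):
    """Return the rows within radius_deg of (ra, dec). Linear scan, for illustration only."""
    return [row for row in catalog
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]

# Hypothetical catalog rows; not real survey data.
catalog = [
    {"id": 1, "ra": 185.00, "dec": 0.00},
    {"id": 2, "ra": 185.01, "dec": 0.01},
    {"id": 3, "ra": 190.00, "dec": 5.00},
]
hits = cone_search(catalog, ra=185.0, dec=0.0, radius_deg=0.05)
print([row["id"] for row in hits])  # objects 1 and 2 fall inside the cone
```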

Service Oriented Architecture: Data Federations of Web Services
• Massive datasets live near their owners:
    • Near instrument software pipeline, apps
    • Near data knowledge and curation
• Each archive publishes a web service
    • Schema: documents the data
    • Methods on objects (queries)
• Uniform access to multiple archives
    • A common global schema
• Scientists get “personalized” extracts
[Diagram: several archive databases (DB)]

SkyQuery Structure
• Each SkyNode publishes
    • Schema Web Service
    • Data Query Web Service
• Portal
    • Plans query (2 phase)
    • Integrates answers
    • Is itself a web service
[Diagram labels: Image Cutout, SDSS, 2MASS, SkyQuery Portal, FIRST, INT]
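
The structure above can be sketched in a few lines of Python. This is not the SkyQuery implementation: the slide only says the portal "plans query (2 phase)", so the sketch assumes that phase 1 asks each node for a match count and that phase 2 runs the query on the nodes in ascending count order. The class names, method names, and sample rows are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SkyNode:
    """Hypothetical in-memory stand-in for one archive's web services."""
    name: str
    rows: list  # each row is a dict such as {"ra": ..., "dec": ...}

    def schema(self):
        # Schema service: report the column names this node exposes.
        return sorted({key for row in self.rows for key in row})

    def count(self, predicate):
        # Assumed phase 1 call: how many rows would match?
        return sum(1 for row in self.rows if predicate(row))

    def query(self, predicate):
        # Data query service: return the matching rows.
        return [row for row in self.rows if predicate(row)]

class Portal:
    """Plans a query across nodes and integrates their answers."""
    def __init__(self, nodes):
        self.nodes = nodes

    def run(self, predicate):
        # Phase 1 (assumed): collect match counts and order nodes, smallest first.
        counts = {node.name: node.count(predicate) for node in self.nodes}
        plan = sorted(self.nodes, key=lambda node: counts[node.name])
        # Phase 2: execute in plan order and tag each row with its source archive.
        answer = [{"archive": node.name, **row}
                  for node in plan for row in node.query(predicate)]
        return plan, answer

sdss = SkyNode("SDSS", [{"ra": 185.0, "dec": 0.0}, {"ra": 185.2, "dec": 0.1},
                        {"ra": 200.0, "dec": 3.0}])
twomass = SkyNode("2MASS", [{"ra": 185.1, "dec": 0.05}, {"ra": 10.0, "dec": -5.0}])

plan, answer = Portal([sdss, twomass]).run(lambda row: 184.5 <= row["ra"] <= 185.5)
print([node.name for node in plan], len(answer))
```

Because the portal exposes a single `run` call, it could itself be wrapped as a node of a larger federation, which is one reading of "is itself a web service".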

Federation: SkyQuery.Net
• Combines 15 archives
• Send query to portal; portal joins data from archives
• Problem: want to do multi-step data analysis (not just a single query)
• Solution: allow personal databases on the portal
• Problem: some queries are monsters
• Solution: “batch scheduler” on the portal server deposits the answer in a personal DB
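
The two solutions on this slide fit together: a batch scheduler runs long queries later and leaves each answer in the submitting user's personal database, where a later step can read it. A minimal sketch of that idea follows. The class, its methods, the user name, and the sample data are hypothetical, and the real portal's scheduler is not described on the slide.

```python
from collections import deque

class PortalBatchScheduler:
    """Hypothetical FIFO scheduler that deposits query answers in per-user storage."""
    def __init__(self):
        self.pending = deque()
        self.personal_db = {}  # user -> {table name -> rows}

    def submit(self, user, table, query_fn):
        # Queue the job instead of running it while the user waits.
        self.pending.append((user, table, query_fn))

    def run_pending(self):
        # Run jobs in submission order; store each answer under the user's name.
        while self.pending:
            user, table, query_fn = self.pending.popleft()
            self.personal_db.setdefault(user, {})[table] = query_fn()

    def read(self, user, table):
        return self.personal_db[user][table]

# Made-up rows standing in for a large catalog.
data = [{"id": 1, "mag": 14.2}, {"id": 2, "mag": 17.9}, {"id": 3, "mag": 12.5}]

scheduler = PortalBatchScheduler()
# Step 1: select bright objects into the personal table "bright".
scheduler.submit("alice", "bright", lambda: [r for r in data if r["mag"] < 15])
# Step 2: a follow-up analysis that reads the result of step 1.
scheduler.submit("alice", "ranked",
                 lambda: sorted(scheduler.read("alice", "bright"), key=lambda r: r["mag"]))
scheduler.run_pending()
print([r["id"] for r in scheduler.read("alice", "ranked")])
```

Step 2 works only because jobs run in submission order, so the "bright" table exists before the second job reads it.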

Current Status: CERN → Pasadena
• Multi-stream TCP/IP: 7.1 Gbps (~900 MBps)
• New speed record @ http://ultralight.caltech.edu/lsr-winhec/
• Single-stream TCP/IP: 6.5 Gbps (~800 MBps)
• File transfer speed: ~450 MBps
[Chart: speed in Mbps (axis 0 to 7,000) by year, 2001 to 2005]
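
The slide quotes each rate in both gigabits per second (Gbps) and megabytes per second (MBps). The conversion can be checked with a short script, assuming decimal units and 8 bits per byte; the function name is made up for this example. The slide's figures are rounded approximations of the results.

```python
def gbps_to_mbytes_per_s(gbps):
    """Convert gigabits per second to megabytes per second (decimal units, 8 bits per byte)."""
    return gbps * 1000 / 8

for label, gbps in [("multi-stream", 7.1), ("single-stream", 6.5)]:
    print(f"{label}: {gbps} Gbps = {gbps_to_mbytes_per_s(gbps):.1f} MBps")
```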

Challenge: Move Data from CERN to Remote Centers @ 1 GBps
• Disk-to-disk
• Gigabyte / second data rates
• 80 TB/day
• 30 petabytes by 2008
• 1 exabyte by 2014
[Diagram: tiered data flow from the Experiment (~PBps) through a Filter and CERN to Tier 1, Tier 2, and Tier 3 centers and institutes with a physics data cache, down to Tier 4 institute workstations. Sites shown: IN2P3, RAL, INFN, FNAL, … Link rates shown: ~5 GBps, ~1 GBps, .1 GBps. OC192 = 9.9 Gbps. Graphics courtesy of Harvey Newman @ Caltech]
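
The "80 TB/day" bullet and the 1 GBps target describe roughly the same sustained rate. A quick check, assuming decimal units (1 TB = 10^12 bytes) and a continuous 24-hour transfer:

```latex
\frac{80\ \text{TB}}{1\ \text{day}}
  = \frac{80 \times 10^{12}\ \text{B}}{86\,400\ \text{s}}
  \approx 9.26 \times 10^{8}\ \text{B/s}
  \approx 0.93\ \text{GBps}
```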

Summary
• Microsoft Research is active inside and outside Microsoft.
• World Wide Telescope is coming
    • Exemplifies service oriented architecture
    • Built with web services and databases
    • Has interesting spatial database algorithms
• 10 Gbps networking is coming, x64 is coming, and we are investing to make them real.
• Details on my website: http://research.microsoft.com/~Gray

© 2003 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.