Volunteer Computing with BOINC David P. Anderson Space Sciences Laboratory University of California, Berkeley
High-throughput computing
• Goal: finish lots of jobs in a given time
• Paradigms:
  – Supercomputing
  – Cluster computing
  – Grid computing
  – Cloud computing
  – Volunteer computing
Cost of 1 TFLOPS-year
• Cluster: $145K
  – Computing hardware; power/AC infrastructure; network hardware; storage; power; sysadmin
• Cloud: $1.75M
• Volunteer: $1K - $10K
  – Server hardware; sysadmin; web development
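To make the price gap concrete, the figures on this slide can be turned into ratios. The following is a minimal sketch, not part of the original talk: the dictionary keys and the function name `cost_ratio` are illustrative assumptions, and the dollar amounts are taken directly from the slide.

```python
# Cost of sustaining 1 TFLOPS for one year, in US dollars, as quoted on the slide.
# The key names are hypothetical labels chosen for this sketch.
COST_PER_TFLOPS_YEAR = {
    "cluster": 145_000,
    "cloud": 1_750_000,
    "volunteer_low": 1_000,
    "volunteer_high": 10_000,
}


def cost_ratio(a: str, b: str) -> float:
    """Return how many times more expensive paradigm `a` is than paradigm `b`."""
    return COST_PER_TFLOPS_YEAR[a] / COST_PER_TFLOPS_YEAR[b]


if __name__ == "__main__":
    for other in ("cluster", "cloud"):
        # Compare against both ends of the quoted volunteer cost range.
        low = cost_ratio(other, "volunteer_high")
        high = cost_ratio(other, "volunteer_low")
        print(f"{other}: {low:.1f}x to {high:.0f}x the volunteer cost")
```

Under these numbers a cluster costs roughly 14.5 to 145 times as much as volunteer computing per TFLOPS-year, and cloud roughly 175 to 1750 times as much.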
Performance
• Current
  – 500K people, 1M computers
  – 6.5 PetaFLOPS (3 from GPUs, 1.4 from PS3s)
• Potential
  – 1 billion PCs today, 2 billion in 2015
  – GPU: approaching 1 TFLOPS
  – How to get 1 ExaFLOPS:
    • 4M GPUs * 0.25 availability
  – How to get 1 Exabyte:
    • 10M PC disks * 100 GB
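The two "how to get" estimates above are simple products, and can be checked with a short sketch. It assumes each GPU delivers a full 1 TFLOPS (the slide says only "approaching 1 TFLOPS") and uses decimal units; the function names are illustrative, not from the talk.

```python
TERA = 10**12
EXA = 10**18
GIGA = 10**9


def aggregate_flops(num_gpus: int, flops_per_gpu: float, availability: float) -> float:
    """Aggregate throughput when each GPU is usable only a fraction of the time."""
    return num_gpus * flops_per_gpu * availability


def aggregate_storage_bytes(num_disks: int, bytes_per_disk: int) -> int:
    """Total storage contributed by a pool of disks."""
    return num_disks * bytes_per_disk


if __name__ == "__main__":
    # Assumption: 1 TFLOPS per GPU, 25% availability, as on the slide.
    flops = aggregate_flops(4_000_000, 1 * TERA, 0.25)
    print(f"Compute: {flops / EXA:.2f} ExaFLOPS")

    # 10M PC disks contributing 100 GB each.
    storage = aggregate_storage_bytes(10_000_000, 100 * GIGA)
    print(f"Storage: {storage / EXA:.2f} Exabytes")
```

Both products come out to exactly 10^18, which matches the ExaFLOPS and Exabyte targets on the slide.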
History of volunteer computing
[Timeline figure: 1995, 2000, 2005, now]
• Applications: distributed.net, GIMPS, SETI@home, Folding@home, Climateprediction.net, Predictor@home, IBM World Community Grid, Einstein@home, Rosetta@home, ...
• Middleware:
  – Academic: Bayanihan, Javelin, ...
  – Commercial: Entropia, United Devices, ...
  – BOINC
The BOINC computing ecosystem
[Figure: projects (LHC@home, CPDN, WCG) linked to volunteers by attachments]
• Projects compete for volunteers
• Volunteers make their contributions count
• Optimal equilibrium
What apps work well?
• Bags of tasks
  – parameter sweeps
  – simulations with perturbed initial conditions
  – compute-intensive data analysis
• Native, legacy, Java, GPU
  – soon: VM-based
• Job granularity: minutes to months
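A "bag of tasks" is a set of jobs that can run independently, in any order, on any host. The sketch below illustrates the parameter-sweep case in plain Python. It does not use any BOINC API; the function name `parameter_sweep` and the parameters `temperature` and `seed` are hypothetical, chosen only for illustration.

```python
from itertools import product


def parameter_sweep(grid):
    """Yield one independent job description per point in the parameter grid.

    `grid` maps a parameter name to the list of values to try. Each yielded
    dict is self-contained, so jobs share no state and need no communication.
    """
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    # Hypothetical sweep: three temperatures, two random seeds.
    grid = {"temperature": [280, 290, 300], "seed": [1, 2]}
    for job in parameter_sweep(grid):
        print(job)
```

Because every job description carries all of its own inputs, any job can be handed to any volunteer host, which is what makes this workload shape a good fit.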
Data size issues
• Most current projects not data-intensive
• Probably works for data-intensive also
[Figure: institution connected to volunteers over the commodity Internet]
  – Institution link: ~1 Gbps; non-dedicated, underutilized
  – Volunteer link: ~1 Mbps (450 MB/hr); possibly sporadic, non-dedicated, underutilized
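The 450 MB/hr figure follows from converting 1 Mbps to bytes and scaling to an hour. Here is a minimal sketch of that conversion using decimal units; the helper `transfer_hours` and its `utilization` parameter are assumptions added for illustration, not something stated in the talk.

```python
def mbps_to_mb_per_hour(mbps: float) -> float:
    """Convert megabits per second to megabytes per hour (8 bits per byte)."""
    return mbps / 8 * 3600


def transfer_hours(size_mb: float, mbps: float, utilization: float = 1.0) -> float:
    """Hours needed to move `size_mb` megabytes over a link of `mbps`.

    `utilization` is a hypothetical knob for the fraction of the link that is
    actually available, since the slide describes links as non-dedicated.
    """
    return size_mb / (mbps_to_mb_per_hour(mbps) * utilization)


if __name__ == "__main__":
    print(mbps_to_mb_per_hour(1))        # throughput of a 1 Mbps link in MB/hr
    print(transfer_hours(900, 1))        # hours to move 900 MB at full rate
    print(transfer_hours(900, 1, 0.5))   # same transfer with half the link available
```

A job whose input takes longer to download than it takes to compute is a poor fit, so this kind of estimate is a quick first check on whether an application is too data-intensive for a given link.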
Example projects
• Einstein@home
• Climateprediction.net
• Rosetta@home
• IBM World Community Grid
• GPUGRID.net
• PrimeGrid
Creating a volunteer computing project
• Set up a server
• Port applications, develop graphics
• Develop software for job submission and result handling
• Develop web site
• Ongoing:
  – publicity, volunteer communication
  – system, DB admin (Linux, MySQL)
How many CPUs will you get?
• Depends on:
  – PR efforts and success
  – public appeal
  – availability of internal resources
• 12 projects have > 10,000 active hosts
• 3 projects have > 100,000 active hosts
Organizational issues
• Creating a volunteer computing project has startup costs and requires diverse skills
• This limits use by individual scientists and research groups
• Better model: umbrella projects
  – Institutional
    • Lattice, VTU@home
  – Corporate
    • IBM World Community Grid
  – Community
    • AlmereGrid
Summary
• Volunteer computing is an important paradigm for high-throughput computing
  – price/performance
  – potential
• Low technical barriers to entry (due to BOINC)
• Organizational structure is critical
• Use GPUs if developing new app