Big Data Interest Group
Smart Data Analytics

Markus Götz
Jülich Supercomputing Center (JSC) // University of Iceland

Morris Riedel
Jülich Supercomputing Center (JSC) // University of Iceland

Member of the Helmholtz Association
03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort
Outline

Introduction
§ Research Group, Research Area

Smart Data Analytics Use Cases and Techniques
§ Classification, Land Cover Type, piSVM
§ Clustering, "Drunken Flies", HPDBSCAN
§ Deep Learning, Cortex Layers, pylearn CNN

Conclusion
§ Results and RDA Context

03/10/2015 | Markus Götz | Smart Data Analytics | Forschungszentrum Jülich
Introduction

Research Group
§ Jülich Supercomputing Center (HPC/HTC)
§ High Productivity Data Processing Group

Research Area
§ Smart Data Analytics Methods
§ Evaluation and Development of Scalable Tools
§ Processing Platform Requirements
§ Application in Scientific Use Cases

[Diagram labels: Parallel Data Analytics; Data Mining Methods; Machine Learning Algorithms; Smart Data Analytics; Scientific Community Application; Data Analysis Tools; Generic Data Methods]
Classification // Land Cover Type

Land Cover Type Problem
§ Collaboration with University of Iceland
§ Determine Land Cover Type in Satellite Images
§ Different Types: Road, Building, Vegetation, …

Classification
§ Supervised Learning Technique
§ Known Set of Groups or Classes
§ Determine Membership of New Items
Classification // Land Cover Type

Approach
§ Support Vector Machines (SVM)
§ Existing Solution: piSVM (MPI)
§ In-house Optimization of Parallel Code

[Figure panels: Area; Standard deviation; Inertia]
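To make the classification idea concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the hinge loss. It is not piSVM and it is not parallel; it is a serial toy for illustration only. The two standardized features, the labels (+1 for "vegetation", -1 for "road"), and all function names are assumptions invented for this example.

```python
# Toy linear SVM (hinge loss + L2 regularisation), trained serially.
# Assumption: the data below is invented; it is NOT from the land cover study.

def train_linear_svm(samples, labels, lam=0.01, eta=0.1, epochs=100):
    """Return (weights, bias) found by subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularisation term shrinks the weights on every step.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                # The sample violates the margin: move the hyperplane towards it.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Classify one new item: +1 or -1 depending on the side of the hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


# Hypothetical training pixels with two standardized features each.
train_x = [(1.0, 1.2), (1.5, 0.8), (1.2, 1.1),
           (-1.0, -1.1), (-1.3, -0.7), (-0.9, -1.4)]
train_y = [1, 1, 1, -1, -1, -1]  # +1 = "vegetation", -1 = "road" (invented)

w, b = train_linear_svm(train_x, train_y)
new_vegetation = predict(w, b, (1.1, 1.0))
new_road = predict(w, b, (-1.2, -1.0))
```

The known classes come from the labeled training set, and `predict` determines the membership of a new item, which mirrors the supervised setting described on the previous slide. Real land cover classification has more than two classes and typically uses non-linear kernels.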
Clustering // "Drunken Flies"

"Drunken Flies"
§ Collaboration with University of Cologne
§ Investigate Influence of Genetics on Alcohol Consumption
§ Literally Make Flies Drunk

Clustering
§ Unsupervised Learning Technique
§ Subdivide Database into Similar Groups
§ Similarity Metrics
Clustering // "Drunken Flies"

Approach
§ Image Processing Pipeline
§ HPDBSCAN
§ In-house Development (MPI+OpenMP)
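As a rough illustration of density-based clustering, the following is a minimal serial sketch of the classic DBSCAN algorithm. It is not the HPDBSCAN code and contains none of its MPI+OpenMP parallelization. The point coordinates, `eps`, and `min_pts` values are assumptions chosen for the example, not parameters from the fly study.

```python
from math import dist  # Euclidean distance, available since Python 3.8

NOISE = -1  # label for points that belong to no cluster


def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    return [j for j, q in enumerate(points) if dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Serial DBSCAN: return one cluster label per point (NOISE for outliers)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already processed
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


# Hypothetical 2D positions: two dense groups and one isolated point.
positions = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
labels = dbscan(positions, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The brute-force `region_query` makes this sketch quadratic in the number of points, which is one reason a scalable implementation is needed for large datasets.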
Deep Learning // Cortex Layers

Cortex Layer Problem
§ Institute of Neuroscience and Medicine (INM) at FZJ
§ Segment the Seven Layers of the Cortex
§ Images of Actual Brain Slices
§ Gigabytes per Image (60k Square Resolution)

Deep Learning
§ Supervised Learning Technique (Classification)
§ More Advanced Mathematical Models
§ Various Flavors of Neural Networks
Deep Learning // Cortex Layers

Approach
§ Convolutional Neural Networks
§ Existing Serial Toolkit
§ Pylearn2/Theano
§ Scaling Issues
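The core operation of a convolutional neural network can be sketched in a few lines. The example below is a plain-Python illustration of one convolution filter followed by a ReLU activation; it does not use Pylearn2 or Theano, and the tiny image and the hand-picked kernel are assumptions for demonstration. In a real network the kernel values are learned from labeled data.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    As in most CNN toolkits, this is technically a cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map):
    """Element-wise rectified linear unit: negative responses become zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 grayscale patch with a vertical boundary in the middle.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Hand-picked filter that responds to dark-to-bright transitions (left to right).
edge_kernel = [[-1, 1],
               [-1, 1]]

feature_map = relu(conv2d_valid(patch, edge_kernel))
print(feature_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The filter responds only at the column where the boundary lies. Assuming "60k square" means roughly 60,000 pixels per side, a single filter already has to be evaluated at about 3.6 billion positions per image, which gives a sense of why a serial toolkit runs into scaling issues.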
Conclusion

Results
§ Big Data Challenge is Real!
§ Gap between Analytics Requirements and Actual Implementations

Interest for RDA
§ Code is Open Source @ GitHub and Bitbucket
§ Data is Open and Freely Published @ B2SHARE
§ Choice of Data Formats
§ Question of Future Processing Platforms
Thank you for your attention…

Fifth Plenary Meeting
08–12 March 2015
San Diego, USA | Paradise Point Resort

Contact: m.goetz@fz-juelich.de
Slides: Big Data IG > Wiki > 5th Plenary