Scientific Data Mining: Emerging Developments and Challenges
F. Seillier-Moiseiwitsch
Bioinformatics Research Center
Department of Mathematics and Statistics
University of Maryland - Baltimore County

Bioinformatics: A View from the Trenches

Some Needed Developments: Simultaneous data mining of databases
• Different types of information in separate databases
  GenBank, PDB, HIV-Web, PubMed, …
  Data selection
  Generic solution

Some Needed Developments: Simultaneous data mining of databases
• Same information in different databases
  Meta-analysis
  e.g. Gene expression data
  Pre-processing
    different technologies
    sources of variability
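
One simple form the pre-processing step could take is to put each source on a common scale before pooling. The sketch below is only an illustration under stated assumptions: the within-source z-score is a swapped-in, generic choice rather than a method named on the slide, and the function name, platform labels, and values are hypothetical.

```python
from statistics import mean, stdev


def standardize_by_source(values_by_source):
    """Convert each source's measurements to z-scores computed within that source.

    Assumption: every source holds at least two measurements that are not all equal.
    """
    standardized = {}
    for source, values in values_by_source.items():
        mu = mean(values)
        sigma = stdev(values)  # sample standard deviation
        if sigma == 0:
            raise ValueError(f"no variability in source {source!r}")
        standardized[source] = [(v - mu) / sigma for v in values]
    return standardized


# Hypothetical expression values for the same three genes on two technologies.
example = {
    "platform_a": [2.0, 4.0, 6.0],
    "platform_b": [20.0, 40.0, 60.0],
}
scaled = standardize_by_source(example)
```

After this step the two hypothetical platforms are on the same scale, so differences in measurement units no longer dominate a pooled analysis. Sources of variability other than location and scale are not addressed by this sketch.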

Some Needed Developments: Data mining of heterogeneous databases
Many different types of information in same database
  e.g. Patient records - diagnostics
    lab results, DNA, microarray
    2D gel images
      data compression
      features

Some Needed Developments: New Algorithms
• Molecular evolution
  Phylogenetic reconstruction
  Large number of sequences
  Statistical evolutionary models
  MCMC, E-M algorithm
  Parallel processors
  Emerging models

Some Needed Developments: New Algorithms
• Proteomics
  images of 2D gels
  clean up, alignment
  group composite image
  biological vs. experimental variability
  easily updated

Some Needed Developments: New Algorithms
• Functional genomics
  microarray data
  background estimation (subjectivity)
  automation of analytical protocols
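
To make the background estimation item concrete, here is a minimal sketch of one rule that can be applied identically to every spot, which is the kind of step an automated protocol would fix in advance. It assumes the pixels of each spot have already been split into foreground and local background. The function name, the median-based rule, and the `floor` parameter are illustrative assumptions, not a protocol taken from the slide.

```python
from statistics import median


def background_corrected(foreground_pixels, background_pixels, floor=1.0):
    """Return a spot intensity with the local background subtracted.

    The background is estimated as the median of the surrounding pixels.
    The result is floored at a small positive value (an assumption) so that
    a later log transform stays defined.
    """
    background = median(background_pixels)
    signal = median(foreground_pixels) - background
    return max(signal, floor)


# Hypothetical pixel intensities for one spot and its surrounding region.
spot = background_corrected([100, 120, 110], [10, 12, 11])
```

Because the rule has no per-spot tuning, two analysts running it on the same image get the same values, which removes one source of subjectivity. The choice of rule itself remains a judgment call.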

Some Challenges
• Public domain software
• Easy implementation on any computing platform
• Incorporation of state-of-the-art statistical techniques
  clustering, classification
  longitudinal models
  spatio-temporal models