Data Quality, Data Cleaning and Treatment of Noisy Data
DIMACS Workshop, November 3-4, 2003
Organizer: Tamraparni Dasu, AT&T Labs - Research

Workshop
• Talks cover different aspects of the complex DQ issue
• Outstanding set of speakers from academia, industrial labs and industry
• Cover theoretical, methodological and applied aspects – case studies!
• From a wide range of disciplines and areas

Welcome!

Renee Miller
• University of Toronto
• Renee is an Associate Professor of Computer Science at the University of Toronto. S.B., Mathematics, MIT. S.B., Cognitive Science, MIT. Ph.D., Computer Science, U. Wisconsin-Madison.
• Heterogeneous databases, data mining, and data warehousing
• “Managing Inconsistency in Data Exchange and Integration”

Grace Zhang
• Morgan Stanley Institutional Equity Division IT. Master of Philosophy in Computer Science from Columbia University, and a Master's and B.S. in Computer Science from Zhongshan University, China.
• Develops tools to check data quality issues in equity trading data; designs and builds the standard destination referential data repository.
• “Data Quality in Trading Surveillance”

Ted Johnson
• AT&T Labs – Research
• Database Research department. B.S. in Mathematics, Johns Hopkins University; Ph.D. in Computer Science, New York University, 1990.
• Data warehousing and data mining
• “Bellman - A Data Quality Browser”

Ron Pearson
• Daniel Baugh Institute for Functional Genomics and Computational Biology, Thomas Jefferson University. B.S. in physics from the University of Arkansas at Monticello, and M.S.E.E. and Ph.D. in electrical engineering from MIT in 1982.
• Design and analysis of nonlinear digital filters, exploratory data analysis and the validation of analytical results
• “The Data Cleaning Problem -- Some Key Issues and Practical Approaches”

Dhammika Amaratunga, Javier Cabrera, Nandini Raghavan
• Johnson & Johnson, Rutgers University, Johnson & Johnson
• “Pre-processing of Microarray Data”

S. Muthukrishnan
• Rutgers University, AT&T Labs – Research
• Associate Professor of Computer Science
• Design and analysis of algorithms
• “Checks and Balances: Monitoring Data Quality Problems in Network Traffic Databases”

T. Bonates, P. Hammer, A. Kogan, and I. Lozina
• RUTCOR, Rutgers University
• Operations Research
• “Maximum Patterns and Outliers in the Logical Analysis of Data (LAD)”

Jiawei Han
• Professor, Simon Fraser University. Currently at the University of Illinois at Urbana-Champaign. Ph.D. from the University of Wisconsin, Madison, in 1985.
• Data mining (knowledge discovery in databases), data warehousing, spatial databases, multimedia databases, deductive and object-oriented databases, and logic programming
• “Data Mining: A Powerful Tool for Data Cleaning”

Jon Hill
• British Telecommunications
• Jon leads a team of information experts delivering solutions in asset management, process control and billing assurance. He uses a wide range of information quality (IQ) tools in projects and has extensive experience in investigating and solving IQ problems.
• “A $220 Million Success Story”

G. Vesonder, J. Wright & T. Dasu
• AT&T Labs - Research
• Head of Adaptive Systems research
• AI, Knowledge Engineering, Expert Systems
• “Life Cycle Datamining”

Andrew Hume
• AT&T Labs – Research
• Very large data systems, string searching, performance measurement
• Tamed many legacy systems
• “Managing Data Streams”

Bing Liu
• Associate Professor at the National University of Singapore, on leave at the University of Illinois at Chicago
• Data mining and knowledge discovery; web, text and image mining; bioinformatics
• “Web page cleaning for web data mining”

R. K. Pearson and M. Gabbouj
• Collaboration with Moncef Gabbouj of the Tampere University of Technology in Finland
• “Relational Nonlinear FIR Filters”

Thank you!