Development of DNA Profile Information Retrieval System
December 1, 2000
Hwan-Seung Yong*, Hun-Joo Lee**
* Dept. of Computer Science and Engineering, Ewha Womans University
** Chem. I. Net Inc.
Introduction
• DNA profile databases can be heavily used
  – For scientific investigation
  – For research
  – For medicine
• Trends
  – DNA profile database systems in England and the USA
  – DNA profile venture businesses are starting in Korea (IDGENE)
Purpose of Development
• Build a DNA profile database for identification at NISI (National Institute of Scientific Investigation)
  – Data insertion and retrieval
  – Web-based approach, so it can be used from NISI branches
  – Additional personal profile data should also be managed
• Asset for further research
  – Use data mining techniques for high-level analysis
System Characteristics
• Current Web database technology
• High retrieval performance
• Use of international standards
  – For image data format
  – For interoperability with other systems
System Diagram
Operating Environment
Server   OS           MS Windows 2000 Server
         DBMS         MS-SQL Server 7.0
         Web Server   MS Internet Information Server
Client   OS           MS Windows 95 or above
         Software     Internet Explorer 5.0 or above
Tools                 Rational Rose '98, MS Visual C++ 6.0, MS Visual InterDev
CE Visualizer Specification (1): CE (Capillary Electrophoresis) Visualizer and Input Interface
• Manages GENO-TYPER output for DNA locus analysis
• Use-case model of the CE Visualizer
  – Object-oriented visual tool
  – Rational Rose's Use Case View
• ActiveX component for CE data input
  – ASTM AIA ANDI (Analytical Data Interchange format) Category 3.0
  – International standard
[Figure: use-case diagram of the CE Visualizer. Actors: Analyst, ChromView Entry System. Use cases: Experiment, Select Andi.cdf file («read.cdf»), Visualize chromatogram, Register chromatogram, Visualize event, Specify ISTD peak, Identify chromatogram, Extract preliminary features. Several use cases are linked by «include» relationships.]
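The "Extract preliminary features" and "Specify ISTD peak" use cases can be illustrated with a minimal sketch. The slides do not say how the CE Visualizer detects peaks, so the local-maximum rule, the threshold, and the names `find_peaks` and `pick_istd_peak` below are all assumptions made for illustration, and the trace is made-up data.

```python
def find_peaks(signal, threshold):
    """Return (index, height) for each local maximum above threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        # A point counts as a peak if it rises from the left neighbour
        # and does not rise further to the right.
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append((i, signal[i]))
    return peaks


def pick_istd_peak(peaks, expected_index):
    """Choose the peak closest to where the internal standard is expected."""
    return min(peaks, key=lambda p: abs(p[0] - expected_index))


# Made-up chromatogram trace with three peaks (illustrative only).
trace = [0, 1, 5, 1, 0, 2, 9, 2, 0, 1, 4, 1, 0]
peaks = find_peaks(trace, threshold=3)
istd = pick_istd_peak(peaks, expected_index=6)
```

In a real tool the analyst would more likely confirm the ISTD peak interactively, as the use-case name suggests; the nearest-peak rule here only stands in for that step.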
CE Visualizer Specification (3): Object Model for CE Visualizer
• Class and class relationship model
• Uses UML (Unified Modeling Language)
• Class components
  – RawDataManager class
  – ResultDataManager class
  – ChromInfoCenterManager class
  – ReportManager class
  – ChromGraphManager class
  – SampleList class
  – FeatureExtractor class
  – ErrorManager class
[Figure: UML class diagram of the CE Visualizer. Classes shown: RawDataInfo, ResultData, ChromGraph, RawDataManager, ResultDataManager, ChromGraphManager, ChromInfoCenter, ChromInfoCenterManager, ChromReportManager, ReportManager, FeatureExtractor, ChromTypedArray, CTypedPtrArray, SampleList, Sample, ErrorManager. Several associations carry 1-to-n multiplicities.]
Developed System: Window Interface
• Starting page and login
• Input of DNA profile data
• Retrieval of DNA profiles
• Detailed information retrieval
• Report writing
• System management
System Development (1): Starting Page and Login
System Development (2): DNA Profile Data Registration
• Suspect personal information and DNA profile registration
• CE (Capillary Electrophoresis) raw chromatogram registration
• DNA locus registration based on category
  – Locus name and data
  – Probability data for the Korean population
Suspect Information and DNA Profile registration
CE(Capillary Electrophoresis) Raw chromatogram Registration
DNA Locus Registration Interface
System Development (3): DNA Profile Database Retrieval
• DNA profile retrieval from personal data
• Suspect information retrieval from unsolved-case DNA profile data
• Chromatogram retrieval using the CE Visualizer
DNA Profile data retrieval using Personal Profile data
Suspect Retrieval from Unsolved Case DNA Profile Data
Registration for Center and Branches of NISI
User Management, Compatible with LIMS in NISI
Performance Evaluation
• At present, only 1,000 DNA profiles are stored
  – This data will increase rapidly
• Example DNA profile data generation
  – Based on the probability distribution of each DNA locus
  – Number of generated DNA profiles: 500,000
  – Table size: 500,000 tuples, 61 MB
  – Index size: 9 MB for each locus
  – Number of loci: 20
• Hardware
  – PC server (dual Pentium CPU)
  – H/W cost: 5,000 Won
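Generating test profiles from per-locus probability distributions can be sketched as follows. This is only an illustration of the idea: the locus names and allele frequencies are invented placeholders (not the Korean population data the slides refer to), two alleles per locus are drawn independently, and `generate_profile` and `generate_profiles` are hypothetical helper names.

```python
import random

# Hypothetical allele frequency tables (illustrative values only, not real
# population data). Each locus maps allele labels to probabilities.
ALLELE_FREQS = {
    "LOCUS_A": {"10": 0.2, "11": 0.5, "12": 0.3},
    "LOCUS_B": {"6": 0.1, "7": 0.6, "9.3": 0.3},
}


def generate_profile(freqs, rng):
    """Draw two alleles per locus independently from that locus's distribution."""
    profile = {}
    for locus, table in freqs.items():
        alleles = list(table.keys())
        weights = list(table.values())
        pair = rng.choices(alleles, weights=weights, k=2)
        # Store the genotype in a fixed order so ("11", "10") equals ("10", "11").
        profile[locus] = tuple(sorted(pair))
    return profile


def generate_profiles(freqs, count, seed=0):
    """Generate `count` synthetic profiles with a reproducible seed."""
    rng = random.Random(seed)
    return [generate_profile(freqs, rng) for _ in range(count)]


profiles = generate_profiles(ALLELE_FREQS, 1000)
```

Scaling the same loop to 20 loci and 500,000 profiles would give a data set of the shape described on this slide, although the slides do not state how the original generator was implemented.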
Retrieval Performance Evaluation: Retrieval time for exact match on 20 loci (all under 1.4 sec)
Retrieval Performance Evaluation: Retrieval time for exact match on 19 loci (all under 1.4 sec)
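The two retrieval cases can be sketched with a small relational example. The deployed system used MS-SQL Server; the sketch below substitutes Python's built-in `sqlite3` module so that it is self-contained, and the table name, column names, genotype encoding, and the reading of the 19-locus case as "at least 19 of 20 loci agree" are all assumptions.

```python
import sqlite3

LOCI = [f"locus{i:02d}" for i in range(1, 21)]  # 20 hypothetical locus columns


def build_db(profiles, with_index=True):
    """Create an in-memory table with one text column per locus."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"{name} TEXT" for name in LOCI)
    conn.execute(f"CREATE TABLE dna_profile (id INTEGER PRIMARY KEY, {cols})")
    placeholders = ", ".join("?" for _ in LOCI)
    conn.executemany(
        f"INSERT INTO dna_profile ({', '.join(LOCI)}) VALUES ({placeholders})",
        profiles,
    )
    if with_index:
        # One index per locus, mirroring the per-locus indexes on the slide.
        for name in LOCI:
            conn.execute(f"CREATE INDEX idx_{name} ON dna_profile ({name})")
    return conn


def exact_match(conn, query):
    """Return ids of profiles that agree with the query on all 20 loci."""
    where = " AND ".join(f"{name} = ?" for name in LOCI)
    rows = conn.execute(f"SELECT id FROM dna_profile WHERE {where}", query)
    return [r[0] for r in rows]


def near_match(conn, query, min_loci=19):
    """Return ids of profiles that agree with the query on at least min_loci loci."""
    # Each comparison evaluates to 1 or 0 in SQLite, so the sum counts agreeing loci.
    score = " + ".join(f"({name} = ?)" for name in LOCI)
    rows = conn.execute(
        f"SELECT id FROM dna_profile WHERE {score} >= ?", (*query, min_loci)
    )
    return [r[0] for r in rows]


# Three made-up profiles: identical, one locus different, two loci different.
base = tuple("10/11" for _ in LOCI)
one_off = ("12/13",) + base[1:]
two_off = ("12/13", "12/13") + base[2:]
conn = build_db([base, one_off, two_off])
```

The all-loci query is a conjunction of equality tests, which a per-locus index can serve directly. The summed-comparison form is simple to write but forces a scan of every row, so a production system would probably need a different formulation for partial matches.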
Retrieval Performance Evaluation: Database size by number of indexes (under 250 MB)
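The "under 250 MB" figure is consistent with the sizes quoted on the Performance Evaluation slide, as the short calculation below shows. It assumes that total size is simply the table size plus a fixed 9 MB per locus index, which is a simplification; `estimated_size_mb` is a hypothetical helper name.

```python
TABLE_MB = 61            # table size for 500,000 tuples, from the slide
INDEX_MB_PER_LOCUS = 9   # index size per locus, from the slide


def estimated_size_mb(num_indexes):
    """Estimate total database size assuming sizes add linearly."""
    return TABLE_MB + INDEX_MB_PER_LOCUS * num_indexes


for n in (0, 5, 10, 15, 20):
    print(n, estimated_size_mb(n))
```

With all 20 loci indexed the estimate is 61 + 20 × 9 = 241 MB, just below the 250 MB bound stated on the slide.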
Conclusions
• Developed a DNA profile retrieval system
• Based on state-of-the-art Web database technology
• All H/W and S/W components are inexpensive
  – PC server, Microsoft SQL Server, Windows NT
• Extensible for later upgrades
• Achieves high performance
  – For 500,000 DNA profiles:
    without index: under 1.4 sec
    with index: 0.2 sec
Further Research and Development
• OLAP (Online Analytical Processing)
  – Multi-dimensional analysis
• DNA data mining
  – Knowledge discovery
  – New fact-finding technology
• A critical mass of DNA profile data is required