Information Management IBM NikolayKulikovru ibm com IBM Software
® Information Management Новые тенденции управления данными в продуктах компании IBM Nikolay_Kulikov@ru. ibm. com IBM Software Group © 2005 IBM Corporation
Information Management Agenda § Part I – XML: Почему и Где § Part II – Native XML в DB 2 Viper § Part III – Немного о других новых возможностях DB 2 Viper § Part III – Informix Dynamic Server 10 § Part IV – Вопросы и ответы 2
Пример 1: Финансовые данные (FIXML) Information Management §Покупка 1000 акций компании IBM. 8=FIX. 4. 2^9=251^35=D^49=AFUNDMGR^56=ABROKER^34=2 ^52=20030615 -01: 14: 49^11=12345^1=111111^63=0^64=2003 0621^21=3^110=1000^111=50000^55=IBM^48=459200101^22= 1^54=1^60=2003061501: 14: 4938=5000^40=1^44=15. 75^15=USD ^59=0^10=127 Старый протокол FIX Новый FIXML протокол §Расширяемый §Снижает затраты на поддержку и разработку 5
Information Management XML-Enabled Databases: две опции Shredding CLOB/Varchar XML DOC Извлечение определенных элементов/ атрибутов XML DOC Фиксированное отображение “Decompositon” Shredder Сторонние таблицы XML DOC (regular tables for faster lookup) varchar CLOB Обычные таблицы 7
Information Management Нарезка: Простой пример <DEPARTMENT deptid="15" deptname="Sales"> <EMPLOYEE> <EMPNO>10</EMPNO> <FIRSTNAME>CHRISTINE</FIRSTNAME> <LASTNAME>SMITH</LASTNAME> <PHONE>408 -463 -4963</PHONE> <SALARY>52750. 00</SALARY> </EMPLOYEE> <EMPNO>27</EMPNO> <FIRSTNAME>MICHAEL</FIRSTNAME> <LASTNAME>THOMPSON</LASTNAME> <PHONE>406 -463 -1234</PHONE> <SALARY>41250. 00</SALARY> </EMPLOYEE> </DEPARTMENT> 9
Information Management Нарезка: Изменение схемы… <DEPARTMENT deptid="15" deptname="Sales"> <EMPLOYEE> <EMPNO>10</EMPNO> <FIRSTNAME>CHRISTINE</FIRSTNAME> <LASTNAME>SMITH</LASTNAME> <PHONE>408 -463 -4963</PHONE> <PHONE>415 -010 -1234</PHONE> <SALARY>52750. 00</SALARY> </EMPLOYEE> <EMPNO>27</EMPNO> <FIRSTNAME>MICHAEL</FIRSTNAME> <LASTNAME>THOMPSON</LASTNAME> <PHONE>406 -463 -1234</PHONE> <SALARY>41250. 00</SALARY> </EMPLOYEE> </DEPARTMENT> Требует: • Нормализации текущих данных ! • Изменение отображения • Изменения приложения Costly! 10
Information Management Обзор Интерфейса § Определение данных create table dept(dept. ID int, deptdoc xml); § Вставка insert into dept(dept. ID, deptdoc) values (? , ? ) § Индексирование create index xmlindex 1 on dept(deptdoc) generate key using xmlpattern ‘/dept/name’ as varchar(30); § Извлечение select deptdoc from dept where dept. ID = ? § Запросы select dept. ID, xmlquery('$d/dept/name' passing deptdoc as “d") from dept where dept. ID <> “PR 27”; 14
Information Management Разбор документа XML Documents парсится в соответствии с XQuery Data Model Создаются дополнительная информация в каталоге • Строки и пути хранятся в каталоге DB 2 • Компрессия строк сохраняет место dept employee String table 0 4 1 5 2 3 dept employee name id phone office SYSIBM. SYSXMLSTRINGS employee Path table id name phone 901 John Doe 408 -555 1212 office 344 id name phone 902 Peter Pan 408 -555 9918 office 216 0 1 2 3 4 5 6 / /0 /0/4/5 /0/4/1 /0/4/2 /0/4/3 SYSIBM. SYSXMLPATHS 15
Information Management Индексирование INDEX CREATE index-name ON table-name(xml-column-name) UNIQUE GENERATE KEY USING XMLPATTERN AS SQL VARCHAR (integer) VARCHAR (HASHED) DOUBLE DATE TIMESTAMP xmlpattern: / // element-tag * / // xmlpattern XPath без предиката text() @attribute-tag @* create index idx 1 on dept(deptdoc) generate key using xmlpattern '/dept/employee/name' as sql varchar(35) 19
Information Management Поиск по документу используя XQuery § Полная поддержка XQuery и XPath 2. 0 4 Включаяя FLWOR встроенный (nested) FLWOR § XQuery по всем документам в столбце 4 FOR $d in db 2 -fn: xmlcolumn(‘deptdoc’)… § XQuery по документам используя SQL предикаты 4 FOR $d in db 2 -fn: sqlquery(“select deptdoc from dept where dept. ID LIKE ‘PR%’ ”)… 21
Information Management Выражение FLWOR § FOR: проходит по последовательности документов, § LET: привязка переменных к элементам § WHERE: фильтрация элементов итерации § ORDER: перегруппирует элементы итерации § RETURN: конструирует результаты запроса FOR $movie in db 2 -fn: xmlcolumn(‘table 1. movies’) <movie> LET $actors : = $movie//actor <title>Chicago</title> WHERE $movie/duration > 90 <actor>Renee Zellweger</actor> ORDER by $movie/@year <actor>Richard Gere</actor> RETURN <movie> <actor>Catherine Zeta-Jones</actor> {$movie/title, $actors} </movie> 22
Information Management <dept bldg=“ 101”> <employee id=“ 901”> <name>John Doe</name> <phone>408 555 1212</phone> <office>344</office> </employee> <employee id=“ 902”> <name>Peter Pan</name> <phone>408 555 9918</phone> <office>216</office> </employee> </dept> Выражение FLWOR create table dept (dept. ID char(8), deptdoc xml); for $d in xmlcolumn(‘deptdoc’)/dept where $d/@bldg = 101 return <namelist> {$d/employee/name} </namelist> for $d in xmlcolumn(‘deptdoc’)/dept where $d/@bldg = 101 return $d/employee/name <namelist> <name>John Doe</name> <name>Peter Pan</name> </namelist> for $d in xmlcolumn(‘deptdoc’)/dept where $d/@bldg = 101 return $d/employee/name/text() <name>John Doe</name> <name>Peter Pan</name> Результат не обязательно XML ! John Doe Peter Pan 23
Information Management XQuery с объединением for $book in db 2 -fn: xmlcolumn('BOOKS')/book for $entry in db 2 -fn: xmlcolumn('REVIEWS')/entry where $book/title = $entry/title return <review> {$entry/review/text()} </review>; 24
Information Management SQL/XML Dept Unit dept. ID unit. ID 100 10 10 220 20 240 20 310 30 deptdoc <dept bldg="G"> unit. ID Empcount bldg <name>Engineering</name> 10 234 G <manager>Anjul</manager> 20 123 H 30 32 H <backup>Susan</backup> <admin>Mary</admin> <chargecode>CW 345</chargecode> </dept> select d. dept. ID, xmlquery(‘$deptdoc/dept/name’ passing d. deptdoc as “deptdoc“), u. empcount from dept d, unit u where d. unit. ID = u. unit. ID and u. empcount > 200 and xmlquery(‘$deptdoc/dept/@bldg’ passing d. deptdoc as “deptdoc“) = u. bldg and xmlexists(‘$deptdoc/dept/name’ passing d. deptdoc as “deptdoc“) 26
Information Management SQL/XML – публикация XML документов FIRSTNAME LASTNAME DEPARTMENT SEAN LEE A 00 MICHAEL JOHNSON B 01 VINCENZO BARELLI A 00 SELECT XMLELEMENT (NAME "Department", XMLATTRIBUTES (e. department AS "name" ), XMLAGG ( XMLELEMENT(NAME "emp", e. firstname))) AS "department_list" FROM employee e GROUP BY e. department; department_list <Department name="A 00"> <emp>VINCENZO </emp> <emp>SEAN</emp> </Department> <Department name="B 01"> <emp>MICHAEL</emp> </Department> 27
Information Management XMLTABLE - XML->relational SELECT X. * from XMLTABLE (‘db 2 -fn: xmlcolumn(“PORDERS. PO”)//customer’ COLUMNS “CID” INTEGER PATH ‘@id’, “Name” VARCHAR(30) PATH ‘name’, “Zip. Type” CHAR(2) PATH ‘zip/@type’, “Zip” XML PATH ‘zip’ ) AS “X” CID Name Zip. Type Zip 1325 Bobby US <zip>33129<zip> 4711 null US <zip>95023<zip> 28
Information Management Функции публикации SQL/XML § Скалярные функции 4 XMLELEMENT – generate XML element 4 XMLATTRIBUTES - used within XMLELEMENT, specifies attributes 4 XMLFOREST - produces a forest of XML elements from SQL values 4 XMLCONCAT - concatenates a variable number of XML values 4 XMLNAMESPACES – produces a namespace declarations § Функции агрегации 4 XMLAGG - to group or aggregate XML data § Табличные 4 XMLTABLE – materializes a table from XML documents § Функции преобразования типов 4 XMLCAST - converts between XML data type and standard relational types 4 XMLSERIALIZE – converts XML data type to serialized XML as a char/varchar/clob/blob 29
Information Management JDBC API Enhancements for XML § Поддержка нового XML типа § Поддержка нового интерфейса com. ibm. db 2. jcc. DB 2 Xml § Поддержка XQuery § Поддержка регистрации XML Schema 4 Type Code java. sql. Types. OTHER до тех пор пока не появится новый тип JDBC java. sql. Types. XML 30
Information Management Sample Java Program create table dept (id char(8), doc xml); Prepared. Statement stmt 1 = con. prepare. Statement("Select doc from dept where id = ‘ 001’ ”); Result. Set rs = stmt 1. execute. Query(); rs. next(); String xml. String = rs. get. String(1); Input. Stream is = rs. get. Binary. Stream(1); com. ibm. db 2. jcc. DB 2 Xml xml = (com. ibm. db 2. jcc. DB 2 Xml) rs. get. Object (1); String xml. String = xml. Out. get. DB 2 String(); Input. Stream is = xml. Out. get. DB 2 Xml. Binary. Stream("ISO-10646 -UCS-2”); … = xml. Out. get. DOM() or xml. Out. get. SAX(); //future Prepared. Statement stmt 2 = con. prepare. Statement("update dept set doc = ? where id = ‘ 001”); stmt 2. set. Object(1, xml. Out); stmt 2. set. Binary. Stream(1, new File. Input. Stream(file), (int)file. length()); stmt 2. set. String(1, xml. String); stmt 2. execute. Update(); 31
Information Management Sample Java Program – XQuery create table customer (customerinfo xml); Prepared. Statement stmt=null; Result. Set rs; String sqls = "XQUERY "+ "for $info in db 2 fn: sqlquery('SELECT info FROM CUSTOMER WHERE cid > ? ') "+ "where $info/*: customerinfo/*: addr[@country=" England"] "+ "return $info/*: customerinfo/*: phone/text()"; stmt = conn. prepare. Statement(sqls); stmt. set. Int(1, cid. To. Filter); rs = stmt. execute. Query(); 32
Information Management . NET DB 2. NET XML Support § DB 2. NET Provider § DB 2 Xml - Инкапсулирует тип XML 4 Любой доступ к XML типу осуществляется через другие объекты основанные на XML (Xml. Reader, XPath. Document) 4 Методы доступа к XML столбцам § DB 2 Xml. Get. Xml. Reader § DB 2 Xml. Get. String § DB 2 Xml. Get. Bytes § DB 2 Data. Reader § DB 2 Command § XML Input и Output Parameters 4 DB 2 Type. Xml § Поддержка XQuery 4 DB 2 Xml. Command 4 DB 2 Xml. Adapter § Поддержка Xml. Schema 33
Information Management Sample. NET Program - XQuery create table employee (empinfo XML)) DB 2 Xml. Command xcmd = new DB 2 Xml. Command(); //Retrieve the name of all employees in department #100 xcmd. Command. Text = “for $e in db 2 -fn: xmlcolumn (‘EMPLOYEE. EMPINFO)/employee where $e/deptno = 100 return {$b/name}” //Set the root tag xcmd. Root. Tag = “deptlist”; //Retrieve the result of the xquery expression as an Xml. Reader //the result will be wrapped with <deptlist> </deptlist> Xml. Reader xrdr = xcmd. Execute. Xml. Reader(); //Retrieve the result of the xquery expression as a Stream xmlstream = xcmd. Execute. Stream() 34
Information Management Проверка с помощью XML Schemas § Проверка опциональна и осуществляется на уровне документа 4 Без проверки § insert into dept(deptdoc) values (? ) 4 С проверкой § insert into dept(deptdoc) values (xmlvalidate(? )) § Схема может быть перезаписана и указывать на схему в репозитории DB 2 4 insert into dept(deptdoc) values ( xmlvalidate(? according to xmlschema id “ibm. invoice”) 4 insert into dept(deptdoc) values ( xmlvalidate(? according to xmlschema uri ‘http: //my. world. com’) 35
Information Management Управление XML Schema § Репозиторий XML Schema (XSR) 4 Хранит зарегистрированные схемы 4 Управляется как часть каталога DB 2 4 Таблицы и представления создаются автоматически § SYSCAT. XSROBJECTS, SYSCAT. XSROBJECTCOMPONENTS § SYSCAT. XSROBJECTAUTH, SYSCAT. XSROBJECTHIERARCHIES 4 Интерфейс командной строки, API, Хранимые процедуры Регистрация XSR_REGISTER(rschema, name, schemalocation, content, docproperty) Add XSR_ADDSCHEMADOC(rschema, name, schemalocation, content, docproperty) Complete XSR_COMPLETE(rschema, name, schemaproperties, isusedforshred) 36
Information Management Аннотация схемы для Нарезки § Отображение из XML в реляционные таблицы 4 Отображение контролируется аннотациями XML в DB 2 XSR Аннотация Цель db 2 -xdb: default. SQLSchema The default schema for the table names db 2 -xdb: row. Set The table name the element/attribute should be mapped to db 2 -xdb: column The column name of the table the item should be mapped to db 2 -xdb: content. Handling Text value, string value or xml fragment to be treated as the content to be inserted into the database db 2 -xdb: truncate Truncate content if the size is greater than the specified size in the database db 2 -xdb: normalization White space treatment db 2 -xdb: expression Custom expressions to apply to the data before insert db 2 -xdb: condition Conditions to apply to rows before insert db 2 -xdb: table. Mapping Multiple mappings, to same or different tables, for an element/attribute db 2 -xdb: dependence Referential constraints 37
Information Management <ipo: purchase. Order …. . order. Date="1999 -12 -01“ ELECTRONICITEMS order. ID=“ 19991201 -AZFG”> ………. . <item part. Num="833 -AA"> <product. Name>Phone </product. Name> <quantity>1</quantity> <USPrice>132. 95</USPrice> <ship. Date>1999 -12 -05</ship. Date> </item> <item part. Num=“ 533 -AC"> <product. Name>Cycle</product. Name> ORDERID PARTNUM QTY PRICE 19991201 -AZFG 833 -AA 1 132. 95 <quantity>1</quantity> <USPrice>149. 95</USPrice> <ship. Date>1999 -12 -05</ship. Date> </item></items></ipo: purchase. Order> …<sequence> <attribute name="order. ID" type=“xs: string“ <element name="product. Name“ type="string“/> <element name="quantity“ db 2 -xdb: row. Set = “PURCHASE_ORDER” db 2 -xdb: column type=“ipo: derived. Positive. Integer. Type” db 2 -xdb: row. Set=“ELECTRONICITEMS” db 2 -xdb: column=“QTY”/> = “ORDERID” > <annotation> <appinfo> <db 2 -xdb: table. Mapping> ……. <!– ignoring mapping of PRICE --> …. <attribute name="part. Num" type="ipo: SKU“ db 2 -xdb: row. Set =” ELECTRONICITEMS” <db 2 -xdb: row. Set>ELECTRONICITEMS</db 2 -xdb: row. Set> <db 2 -xdb: column>ORDERID</db 2 -xdb: column> </db 2 -xdb: table. Mapping> db 2 -xdb: column=“PARTNUM” </appinfo> db 2 -xdb: expr =“udf_convert. To. Internal. Part($SELF)” </annotation> db 2 -xdb: cond=“udf_is. Electronic. Item($SELF) = </attribute> ‘true’/>……. . 38
Information Management DB 2 Development Workbench Project Explorer Server Explorer Visualization & Editors Properties & Output Area § Eclipse based § Support for XML type in tables, views, SPs and UDFs § Support for XML Index § Support for XML type in SQL Query Builder § Graphical XML Query Builder § XSD (XML Schema) Editor § XML viewer § Support for XML Schema registration § XML & XSL Parsers § XML<->Relational mapping 39
Information Management Java XQuery Builder § § Graphical XQuery builder § Can view in progress XQuery statement § § Save Statements for re-use Supports selecting from actual documents to build query Execute and view results 40
Information Management MS Visual Studio. NET – DB 2 XML add-in . NET § Server Explorer § XML Editor § XML Schema Editor § XSR Registration § XML Index Builder 41
Information Management Partner Solutions Just. System 43
Information Management STMM в действии – Удаление важного индекса TPCH Query 21 - After drop index - Average times for the 10 streams 7000 Avg = 6205 Time in seconds 6000 Reduced 63% 5000 Indexes Dropped 4000 Avg = 2285 3000 2000 Avg = 959 1000 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Order of execution 46
Information Management LBAC Hierarchy Update/Read A 255 B 250 B 1 100 C 254 A 255 D 200 A 1 254 B 2 100 B 1 -1 50 B 250 B 1 100 C 254 D 253 B 2 100 B 1 -2 50 B 1 -1 50 B 2 -3 50 B 1 -2 50 B 2 -3 50 48
Information Management Hybrid Partitioning HASH 999 Machines 32 K Partitions RANGE 64 G A-C 64 G D-M 64 G N-Q R-Z MDC 49
Information Management MDC изнутри Примеры обработки запросов Block INDEX ANDing CREATE TABLE SALES (Customer VARCHAR(80), Region CHAR(5), Year INT) ORGANIZE BY DIMENSIONS (Region, Year) Region East North South West SELECT * FROM SALES WHERE Region = 'West' AND Year = 00 5 th Block BID = 20 -0 0 th Block BID = 0 -0 Pg 20 Pg 1 Pg 22 Pg 3 Pg 23 • • • 'West' BIDs: 16 -0, 20 -0 '00 BIDs: 16 -0 Result of AND: 16 -0 Retrieve block 16 -0 in one I/O Mini-relation scan retrieves all records in block "AND"ing RID and Block INDEXes CREATE INDEX i 1 ON SALES (Customer) SELECT * FROM SALES WHERE Region='East' AND Customer='Joe' 97 98 99 Year 00 01 • 'East' BIDs: • 'Joe' RIDs: 0 -0, 4 -0 0 -3, 0 -211, 2 -97, 7 -1, 11 - 33 • Filter out RIDs not in page range indicated by BIDs 0 -3, 0 -211, 2 -97, 7 -1 • Directly fetch those records 50
Information Management Row Compression используется LZV Fred, Dept 500, 10000, Plano, TX, 24355, …John, Dept 500, 20000, Plano, TX, 24355 Компрессия множества строк в одну страницу не эффективно с точки зрения БД. 51
Information Management Row Compression Using Side Tables Fred, Dept 500, 10000, Plano, TX, 24355… 01 Dept 500 02 Plano, TX, 24355 … … Fred, (01), 10000, (02), John, (01), 20000, (02) John, Dept 500, 20000, Plano, TX, 24355, Site 3 Сторонние таблицы содержат повторяющуюся информацию из строк. 52
Information Management DB 2 - More Compression Ratios (Customer Data) Compression Type 32 KB Page Count Space Required on Disk No compression 5893888 179. 9 GB Row compression 1392446 42. 5 GB % Pages Saved: 76. 4% 53
® Information Management Informix Dynamic Server 10 IBM Software Group © 2005 IBM Corporation
Information Management Increase in Nightly Builds and Tests (IDS(4), CSDK, GLS, 4 GL) 74 4 57000 4000 56
Information Management IDS Уменьшение кол-ва ошибок 2001 2002 2003 2004 Defect Backlog IDS 9. 21 9. 30. UC 7 – 250 fixes IDS 9. 30 7. 31. UD 6 7. 31. UD 7 IDS 7. 31 9. 40. UC 3 IDS 9. 40 57
Information Management IBM Software Portfolio – использование компонентов Инвестиции В продукты Product Specific Investment Product Specific Investment Общие комопоненты Lotus DB 2/Informix Lotus Web. Sphere DB 2/Informix Web. Sphere Tivoli DB 2/Informix Tivoli Rational components DB 2/Informix Re-factor to SWG Product Offerings Общие возможности Componentization and Formation of Substrates Начальные продукты 58
Information Management 59
- Slides: 58