.NET Database Technologies: Using NoSQL databases

NoSQL – “Not only SQL”
• Alternatives to the ubiquitous relational database which may be superior in specific application scenarios
• Object-oriented databases (ODBMS)
  § They came, they saw, they...
  § ...didn’t conquer, but they are still around
• NoSQL databases
  § The new kids on the block
  § General term applied to a range of different non-relational database systems
  § Largely emerging to meet the needs of large-scale Web 2.0 applications

Object-oriented databases
• ODBMSs use the same data model as object-oriented programming languages
  § no object-relational impedance mismatch, thanks to a uniform model
• An object database combines the features of an object-oriented language and a DBMS (language binding)
  § treat data as objects
    • object identity
    • attributes and methods
    • relationships between objects
  § extensible type hierarchy
    • inheritance, overloading and overriding as well as customised types

ODBMS history
• Object Database Manifesto
  § Paper published in 1989 (Atkinson et al.)
• Some ODBMS products
  § Early 1990s: Gemstone, Objectivity
  § Late 1990s: Versant, ObjectStore, Poet, Matisse
  § 2000s: db4o, Caché
• ODMG (Object Data Management Group)
  § 1993: ODMG 1.0 standard
  § 1997: ODMG 2.0
  § 1999: ODMG 3.0, then ODMG disbanded
  § 2005: ODMG reformed, working towards new standard

ODMG
• Object Database Management Group (ODMG) founded in 1991
  § standardisation body including all major ODBMS vendors
• Define a standard to increase portability across different ODBMS products
• Mirroring the SQL standard for RDBMS
  § Object Model
  § Object Definition Language (ODL)
  § Object Query Language (OQL)
  § language bindings
    • C++, Smalltalk and Java bindings

Characteristics of ODBMS
• Support complex data models with no mapping issues
• Tight integration with an object-oriented programming language (persistent programming language)
• High performance in suitable application scenarios
• Different products scale from small-footprint embedded databases (db4o) to large-scale, highly concurrent systems (e.g. Versant V/OD)

Persistence patterns and ODBMS
• Some of Fowler’s patterns are specific to the use of a relational database, e.g.
  § Data Mapper
  § Foreign Key Mapping
  § Metadata Mapping
  § Single Table Inheritance, etc.
• Some are not specific to the data storage model and are relevant when using an ODBMS, e.g.
  § Identity Map
  § Unit of Work
  § Repository
  § Lazy Load

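To make one of the storage-independent patterns concrete, here is a minimal Identity Map sketch. It is written in Python purely as a neutral illustration and does not use any ODBMS API; the `IdentityMap` class and the `load_from_store` function are hypothetical stand-ins for whatever the database layer provides.

```python
class IdentityMap:
    """Keeps at most one in-memory object per database identity."""

    def __init__(self, loader):
        self._loader = loader   # function: object id -> freshly loaded object
        self._loaded = {}       # object id -> the single in-memory instance

    def get(self, object_id):
        # Hit the store only on first access; afterwards reuse the instance.
        if object_id not in self._loaded:
            self._loaded[object_id] = self._loader(object_id)
        return self._loaded[object_id]


# Hypothetical stand-in for a database read: builds a new dict on every call.
def load_from_store(object_id):
    return {"id": object_id, "name": f"customer-{object_id}"}


identity_map = IdentityMap(load_from_store)
first = identity_map.get(42)
second = identity_map.get(42)   # same object as `first`, not a second copy
```

Because both lookups return the same instance, a change made through `first` is visible through `second`, which is the property the pattern exists to guarantee regardless of the storage model.
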
db4o
• Open-source object-database engine
  § Now owned by Versant
  § Complements their own V/OD product
• Can be used in embedded or client-server modes
  § Embed in application simply by including DLLs
• Native object database
  § Stores .NET (or Java) objects directly with no special requirements on classes
  § Other ODBMSs (e.g. V/OD) require classes to be marked as persistent through bytecode manipulation and also store class definitions
  § Tight integration with application, but trade-off in limited ad hoc querying and reporting
  § Can replicate data to relational database if required

IObjectContainer
• IObjectContainer interface is implemented by objects which provide access to the database
  § IObjectContainer is roughly equivalent to EF ObjectContext
  § Acts as a Unit of Work if transparent persistence is enabled (see later)
• Can access DB in embedded mode (direct file access) or client-server mode (local or remote)
  § IObjectServer instance required in client-server mode
• IObjectContainer instances are created by factory classes, e.g. Db4oEmbedded
• Queries on IObjectContainer return IObjectSet (except LINQ queries)

Viewing data and ad-hoc querying
• ObjectManager Enterprise
  § Visual Studio plug-in
  § Browsing and drag-and-drop queries
• LINQPad
  § Need to include db4o DLLs and namespaces for stored classes
  § Executes LINQ queries and visualises results

db4o query APIs
• Query-by-example (QBE)
  § Very limited: no comparisons, ranges, etc.
• Simple Object Data Access (SODA)
  § Build query by navigating graph and adding constraints to nodes
• Native Queries
  § Expressed completely in programming language
  § Type-safe
  § Optimised to SODA query at runtime if possible
• LINQ
  § .NET version, not in Java (obviously)

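The limitation of query-by-example is easier to see in a small simulation. The sketch below is plain Python, not the db4o API: the `Pilot` class, `matches_example` and `query_by_example` are hypothetical names, and the rule that a field left at its default value means "no constraint" is the assumption being illustrated.

```python
from dataclasses import dataclass, fields


@dataclass
class Pilot:
    name: str = None
    points: int = 0


def matches_example(candidate, example):
    """True if candidate agrees with every non-default field of example."""
    for f in fields(example):
        wanted = getattr(example, f.name)
        if wanted == f.default:
            continue  # a default value means "no constraint on this field"
        if getattr(candidate, f.name) != wanted:
            return False
    return True


def query_by_example(store, example):
    # Only equality on non-default fields: no ranges, no comparisons.
    return [obj for obj in store
            if type(obj) is type(example) and matches_example(obj, example)]


store = [Pilot("Alonso", 99), Pilot("Vettel", 100), Pilot("Webber", 100)]
by_points = query_by_example(store, Pilot(points=100))
everyone = query_by_example(store, Pilot())
zero_points = query_by_example(store, Pilot(points=0))  # cannot express "== 0"
```

Under this rule there is no way to ask for "points greater than 99", or even "points equal to 0", because a template can only carry exact, non-default values. That gap is what SODA, Native Queries and LINQ fill.
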
Activation
• Objects are stored in DB as an object graph
• If db4o is configured to cascade-on-activate (eager loading) then retrieving one object could potentially load a large number of related objects
• Fixed activation depth limits depth of traversal of graph when retrieving objects
  § Default value is 5
• Can then explicitly activate related objects when needed
• Lazy loading can be configured with transparent activation
• Classes need to be “instrumented” at build time by running Db4oTool.exe
  § Code injected into assembly so that classes implement IActivatable interface

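A fixed activation depth can be pictured as a bounded walk over the object graph. The following Python sketch simulates that idea only; `Node`, `activate`, `build_chain` and `activated_names` are hypothetical helpers, and the real engine works on arbitrary graphs rather than a simple chain.

```python
class Node:
    """Stand-in for a persistent object; `activated` marks loaded fields."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child
        self.activated = False


def activate(obj, depth):
    """Mark obj and its descendants as loaded, down to `depth` levels."""
    while obj is not None and depth > 0:
        obj.activated = True
        obj = obj.child
        depth -= 1


def build_chain(length):
    """Build n1 -> n2 -> ... -> n<length> and return n1."""
    head = None
    for i in range(length, 0, -1):
        head = Node(f"n{i}", head)
    return head


def activated_names(head):
    names = []
    while head is not None:
        if head.activated:
            names.append(head.name)
        head = head.child
    return names


chain = build_chain(8)
activate(chain, 5)                       # depth 5, the default quoted above
after_default = activated_names(chain)

sixth = chain.child.child.child.child.child
activate(sixth, 1)                       # explicit activation of one more object
after_explicit = activated_names(chain)
```

The first call stops after five objects, so the sixth is reachable but not yet loaded until it is activated explicitly. Transparent activation automates that second step on first access.
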
Update depth
• Similar considerations apply to updates
• Storing an updated object could cause unnecessary updates to related objects
• Fixed update depth limits depth of traversal of graph when storing objects
  § Default value is 1
• Can configure transparent persistence which allows changes to be tracked
  § Only changed objects are updated in database
  § Behaves like change tracking in, for example, Entity Framework
  § Unit of Work

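Change tracking of the kind described here can be sketched as objects reporting their own modifications to a unit of work. This is an illustrative Python model, not db4o or Entity Framework code; `UnitOfWork`, `Customer` and the `written` log are hypothetical.

```python
class UnitOfWork:
    """Collects modified objects and writes only those on commit."""

    def __init__(self):
        self._dirty = []
        self.written = []   # names of objects "written" to the database

    def register_dirty(self, obj):
        if not any(existing is obj for existing in self._dirty):
            self._dirty.append(obj)

    def commit(self):
        for obj in self._dirty:
            self.written.append(obj.name)   # stand-in for a real update
        self._dirty.clear()


class Customer:
    def __init__(self, name, city, uow):
        # Bypass tracking while the object is being constructed.
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "city", city)
        object.__setattr__(self, "_uow", uow)

    def __setattr__(self, attr, value):
        object.__setattr__(self, attr, value)
        self._uow.register_dirty(self)      # every change is reported


uow = UnitOfWork()
alice = Customer("alice", "Oslo", uow)
bob = Customer("bob", "Bergen", uow)
alice.city = "Tromso"   # only alice changes
uow.commit()
```

After the commit only `alice` has been written; `bob` was never touched, so no update is issued for it, however deep the object graph around it might be.
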
PI (persistence ignorance)?
• Stores POCOs without any need for mapping, so yes
• Transparent Activation requires that classes implement a specific interface
• But this is done at build time so domain classes don’t need any specific code
• Has parallels with dynamic proxies in ORMs:
  § Objects are instances of domain classes, which have been modified ‘under the hood’ at build time
  § Compare with dynamic proxy classes, which derive from domain classes and are created ‘under the hood’ at run time

Further reading
• www.odbms.org
  § Resource portal
• db4o Tutorial
  § included in product download
• The Definitive Guide to db4o (Apress)

NoSQL databases
• New breed of databases that are appearing largely in response to the limitations of existing relational databases
• Typically:
  § Support massive data storage (petabyte+)
  § Distribute storage and processing across multiple servers
• Contrast in architecture and priorities compared to relational databases
• Hence the term NoSQL
• “Not only SQL” – absence of SQL is not a requirement

NoSQL features
• Wide variety of implementations, but some features are common to many of them:
• Schema-less
• Shared-nothing architecture
• Elasticity
• Sharding and asynchronous replication
• BASE, not ACID
  § Basically Available
  § Soft state
  § Eventually consistent

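Sharding is the easiest of these features to show in a few lines. The sketch below assigns keys to shards by hashing; it is a simplified illustration in Python, with `shard_for` and `distribute` as hypothetical helpers, and real systems typically use more elaborate schemes (for example consistent hashing or range-based partitioning) so that adding a server does not move most keys.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard deterministically from the key's hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(keys, shard_count):
    """Group keys by the shard that would store them."""
    shards = {index: [] for index in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


keys = [f"user:{n}" for n in range(1000)]
shards = distribute(keys, 4)
shard_sizes = {index: len(members) for index, members in shards.items()}
```

Each server holds only its own subset and needs no shared disk or memory, which is the shared-nothing property; any client can recompute `shard_for` to find the right server for a key.
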
MapReduce
• Algorithm for dividing a workload into units suitable for parallel processing
• Useful for queries against large sets of data: the query can be distributed to hundreds or thousands of nodes, each of which works on a subset of the target data
• The results are then merged together, ultimately yielding a single “answer” to the original query
• Example: get total word count of a large number of documents
  § Map: calculate word count of each document
    • Each node works on a subset of the overall data set
    • Results emitted to intermediate storage
  § Reduce: calculate total of intermediate results

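The word-count example can be sketched in a few lines. This Python version runs in a single process and only simulates the distribution: the `partitions` list stands in for the subsets handled by separate nodes, and the function names are illustrative.

```python
from functools import reduce


def map_word_count(document):
    """Map step: word count of a single document."""
    return len(document.split())


def reduce_total(left, right):
    """Reduce step: combine two intermediate counts."""
    return left + right


documents = ["the quick brown fox", "jumps over", "the lazy dog"]

# Each partition stands in for the subset of data held by one node.
partitions = [documents[:2], documents[2:]]

# Map phase: every node emits one intermediate result per document.
intermediate = [map_word_count(doc) for part in partitions for doc in part]

# Reduce phase: merge the intermediate results into a single answer.
total = reduce(reduce_total, intermediate, 0)
```

Because addition is associative, the reduce step could itself be applied per node first and then once more across nodes, which is what keeps the amount of data moved between servers small.
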
Brewer’s CAP theorem
• Can optimize for only two of three priorities in a distributed database:
• Consistency
  § All clients have same view of the data
  § Requires atomicity, transaction isolation
• Availability
  § Every request received by a non-failing node must result in a response
• Partition Tolerance
  § Partitions happen if certain nodes can’t communicate
  § No set of failures less than total network failure is allowed to cause the system to respond incorrectly

Implications of CAP theorem
• Any two properties can be achieved
• CP
  § If messages between nodes are lost then system waits
  § Possible that no response returned at all
  § No inconsistent data returned to client
• CA
  § No partitions, system will always respond and data is consistent
• AP
  § Response always returned even if some messages between nodes are lost
  § Different nodes may have different views of the data

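The difference between CP and AP behaviour during a partition can be shown with a toy two-replica cluster. This Python sketch is a deliberately simplified model, not how any particular database is implemented; `Replica`, `Cluster` and the `"CP"`/`"AP"` mode strings are hypothetical.

```python
class Replica:
    def __init__(self, value):
        self.value = value


class Cluster:
    """Two replicas; writes arrive at `a` and must be copied to `b`."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.a = Replica("v1")
        self.b = Replica("v1")
        self.partitioned = False    # True when a and b cannot communicate

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let replicas diverge.
                raise TimeoutError("cannot reach replica b; write refused")
            self.a.value = value    # AP: accept locally, replicas diverge
            return
        self.a.value = value
        self.b.value = value

    def read_from_b(self):
        return self.b.value


cp = Cluster("CP")
cp.partitioned = True
try:
    cp.write("v2")
    cp_result = "accepted"
except TimeoutError:
    cp_result = "refused"

ap = Cluster("AP")
ap.partitioned = True
ap.write("v2")
ap_views = (ap.a.value, ap.read_from_b())
```

The CP cluster gives no successful response but never shows inconsistent data; the AP cluster always answers, at the cost of clients seeing `v2` on one node and `v1` on the other until the partition heals.
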
Implications of CAP theorem
• Choose a database whose priorities match the application
http://blog.nahurst.com/visual-guide-to-nosql-systems

Using a NoSQL database in a .NET application
• Application typically makes connection to remote cluster
• Some (but not many) NoSQL databases are supported by native .NET clients
  § Handle “mapping” from .NET objects to data model
• Many NoSQL databases are accessed through a REST interface
  § Application must construct request and handle response format, e.g. JSON
  § Application can be written in any suitable language
• Azure Table Storage is Microsoft’s NoSQL storage for cloud-based applications
• However the data is accessed, you need to understand the data model, which will be significantly different from a typical relational database or object model

NoSQL database types and examples
• Key/value Databases
  § These manage a simple value or row, indexed by a key
  § e.g. Voldemort, Redis
• Big table Databases
  § “a sparse, distributed, persistent multidimensional sorted map”
  § e.g. Google BigTable, Azure Table Storage, Amazon SimpleDB
• Document Databases
  § Multi-field documents (or objects) with JSON access
  § e.g. MongoDB, RavenDB (.NET specific), CouchDB
• Graph Databases
  § Manage nodes, edges, and properties
  § e.g. Neo4j, sones

MongoDB
• Scalable, high-performance, open source, document-oriented database
• Stores JSON-style (actually BSON) documents with dynamic schema
• Replication, high-availability and auto-sharding
• Supports document-based queries and map/reduce
• Command line tools:
  § mongod – starts server as a service or daemon
  § mongo – client shell
• Store documents defined as JSON
• Documents retrieved from queries are displayed as JSON

MongoDB and HTTP
• Admin console at http://<server name>:28017
• REST interface on the same port, http://<server name>:28017
  § Enabled by starting server with mongod --rest
  § Server responds to RESTful HTTP requests, e.g.
    • http://127.0.0.1:28017/company/Employee/?filter_Name=Fernando
  § Response is in JSON format
  § Could be consumed by client-side code in Ajax application

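A client of this kind of interface has two jobs: build the request URL and decode the JSON that comes back. The Python sketch below follows the URL pattern in the example request; `build_query_url` is a hypothetical helper, and `sample_body` is an assumed response shape used only so the example runs without a server.

```python
import json
from urllib.parse import urlencode


def build_query_url(host, port, database, collection, **filters):
    """Build a URL of the form .../<db>/<collection>/?filter_<field>=<value>."""
    params = {f"filter_{field}": value for field, value in filters.items()}
    return f"http://{host}:{port}/{database}/{collection}/?{urlencode(params)}"


url = build_query_url("127.0.0.1", 28017, "company", "Employee", Name="Fernando")

# Assumed response body, for illustration only; a real server's JSON may differ.
sample_body = '{"rows": [{"Name": "Fernando", "Salary": 42000}], "total_rows": 1}'
names = [row["Name"] for row in json.loads(sample_body)["rows"]]
```

`urlencode` takes care of escaping, so a filter value containing spaces or ampersands still produces a valid request.
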
MongoDB .NET driver
• Can access documents as instances of Document class
  § Represents document as key-value pairs
• Or, can serialize POCOs to database format (BSON)
• Deserialize database documents to POCOs
• Supports LINQ queries
• MapReduce queries can be expressed as LINQ queries

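The round trip between plain objects and documents can be illustrated without the driver. This Python sketch uses dataclasses in place of POCOs and JSON text in place of BSON; `Employee`, `Address`, `to_document` and `from_document` are hypothetical names, and a real driver performs this mapping for you.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Employee:
    name: str
    salary: int
    address: Address


def to_document(employee):
    """Serialize: plain object -> nested key-value document."""
    return asdict(employee)


def from_document(doc):
    """Deserialize: nested key-value document -> plain object."""
    return Employee(doc["name"], doc["salary"], Address(**doc["address"]))


original = Employee("Fernando", 42000, Address("Madrid", "Spain"))
document = to_document(original)
as_json = json.dumps(document, sort_keys=True)
restored = from_document(json.loads(as_json))
```

The `document` dictionary corresponds to the key-value view of a document, while `original` and `restored` correspond to the typed view; the nested `address` becomes an embedded sub-document rather than a separate record.
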
MongoDB schema design
• Collections are essentially named groupings of documents
  § Roughly equivalent to relational database tables
• Less "normalization" than a relational schema because there are no server-side joins
• Generally, you will want one database collection for each of your top-level objects
  § Don’t want a collection for every "class"; instead, embed objects within the top-level document

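The contrast between a normalized layout and an embedded one can be sketched with plain dictionaries. The blog-post data and the `embed_comments` helper below are hypothetical and are written in Python only to show the shape of the result, not any MongoDB call.

```python
# Relational-style layout: two "tables" linked by a foreign key.
posts = [{"post_id": 1, "title": "Schema design"}]
comments = [
    {"post_id": 1, "author": "ann", "text": "Nice"},
    {"post_id": 1, "author": "bob", "text": "Thanks"},
]


def embed_comments(posts, comments):
    """Fold child rows into their parent, giving one document per post."""
    documents = []
    for post in posts:
        documents.append({
            "_id": post["post_id"],
            "title": post["title"],
            "comments": [
                {"author": c["author"], "text": c["text"]}
                for c in comments
                if c["post_id"] == post["post_id"]
            ],
        })
    return documents


post_documents = embed_comments(posts, comments)
```

With the embedded layout a single read returns the post together with its comments, so the join that the server cannot perform is never needed. The trade-off is that comments can no longer be queried independently of their post without scanning post documents.
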
Document example
• Save:
• Query:
http://www.10gen.com/video/mongosv2010/schemadesign

MongoDB in C# applications - PI?
• Up to a point
• Collection class needs Id property of a specific type (MongoDB.Oid)
• Object model needs to be designed with document schema in mind

Further reading
• http://nosql-database.org/
• http://www.nosqlpedia.com/
• http://www.mongodb.org/
• http://www.codeproject.com/KB/database/MongoDBCS.aspx
  § Nice code example for C# and MongoDB