MongoDB: Introduction, Installation & Execution
By Prof. B. A. Khivsara
Note: The material used to prepare this presentation has been taken from the internet and is provided only for students' reference, not for commercial use.
Outline
• Difference Between SQL and NoSQL
• Study of Open Source NoSQL Database
• MongoDB Installation, Basic CRUD operations, Execution
Difference Between SQL and NoSQL
09 September 2020
• SQL Databases
  • SQL Standard
  • SQL Characteristics
  • SQL Database Examples
• NoSQL Databases
  • NoSQL Definition
  • General Characteristics
  • NoSQL Database Types
  • NoSQL Database Examples
SQL Characteristics
• Data stored in tables
• Relationships represented by data
• Data Manipulation Language
• Data Definition Language
• Transactions
• Abstraction from physical layer
Data Definition Language
• Schema defined at the start
• Create Table
• Constraints to define and enforce relationships
  • Primary Key
  • Foreign Key
  • Etc.
• Triggers to respond to Insert, Update, & Delete
• Stored Modules
• Alter …
• Drop …
• Security and Access Control
Data Manipulation Language (DML)
• Data manipulated with Select, Insert, Update, & Delete
• Data Aggregation
• Compound statements
• Functions and Procedures
• Explicit transaction control
Transactions – ACID Properties
• Atomic – All of the work in a transaction completes (commit) or none of it completes.
• Consistent – A transaction transforms the database from one consistent state to another consistent state. Consistency is defined in terms of constraints.
• Isolated – The results of any changes made during a transaction are not visible until the transaction has committed.
• Durable – The results of a committed transaction survive failures.
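The atomic and consistent properties can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in SQL database; the accounts table, the account names and the transfer helper are assumptions made for illustration and do not come from the slides.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:  # commits the inserts as one transaction
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; both updates commit or neither does."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_db()
transfer(conn, "alice", "bob", 30)   # both updates are committed
transfer(conn, "alice", "bob", 500)  # debit violates the constraint, nothing changes
print(balances(conn))
```

The credit is deliberately issued before the debit so that the failed transfer has something to roll back: the constraint violation on the second statement also undoes the first.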
SQL Database Examples
• Commercial
  • IBM DB2
  • Oracle RDBMS
  • Microsoft SQL Server
  • Sybase SQL Anywhere
• Open Source (with commercial options)
  • MySQL
  • Ingres
Significant portions of the world's economy use SQL databases!
NoSQL Definition (from www.nosql-database.org)
Next Generation Databases mostly addressing some of the points:
• non-relational,
• distributed,
• open-source and
• horizontally scalable.
Often more characteristics apply, such as:
• schema-free,
• easy replication support,
• simple API,
• eventually consistent / BASE (not ACID),
• huge data amount, and more.
NoSQL Products/Projects
http://www.nosql-database.org/ lists 122 NoSQL Databases
• Cassandra
• CouchDB
• Hadoop & HBase
• MongoDB
• StupidDB
• Etc.
NoSQL Distinguishing Characteristics
• Large data volumes
  • Google's "big data"
• Scalable replication and distribution
  • Potentially thousands of machines
  • Potentially distributed around the world
• Queries need to return answers quickly
• Schema-less
• ACID transaction properties are not needed – BASE
• CAP Theorem
• Open source development
BASE Transactions
• Acronym contrived to be the opposite of ACID
  • Basically Available,
  • Soft state,
  • Eventually Consistent
• Characteristics
  • Weak consistency – stale data OK
  • Availability first
  • Best effort
  • Approximate answers OK
  • Aggressive (optimistic)
  • Simpler and faster
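The "stale data OK" and "eventually consistent" points can be made concrete with a toy simulation. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, replication is triggered by hand through sync(), and no real database works exactly this way.

```python
class EventuallyConsistentStore:
    """Toy store: writes reach the primary at once and the replicas later."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet applied to the replicas

    def write(self, key, value):
        """Accept the write immediately (availability first)."""
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, index, key):
        """Read from one replica; the answer may be stale."""
        return self.replicas[index].get(key)

    def sync(self):
        """Apply all pending writes to every replica, in order."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("city", "Nashik")
stale = store.read_from_replica(0, "city")   # replica has not seen the write yet
store.sync()
fresh = store.read_from_replica(0, "city")   # replica has now caught up
```

Between write() and sync() the replicas answer with old data, which is the weak consistency the slide describes; after sync() every replica agrees with the primary.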
Brewer's CAP Theorem
A distributed system can guarantee at most two of the following three properties at the same time:
• Consistency
• Availability
• Partition tolerance
Consistency
• all nodes see the same data at the same time – Wikipedia
• client perceives that a set of operations has occurred all at once – Pritchett
  • More like Atomic in ACID transaction properties
Availability
• node failures do not prevent survivors from continuing to operate – Wikipedia
• Every operation must terminate in an intended response – Pritchett
Partition Tolerance
• the system continues to operate despite arbitrary message loss – Wikipedia
• Operations will complete, even if individual components are unavailable – Pritchett
Open Source
• Small upfront software costs
• Suitable for large scale distribution on commodity hardware
NoSQL Database Types
• Column Store – Each storage block contains data from only one column
• Document Store – stores documents made up of tagged elements
• Key-Value Store – Hash table of keys
Other Non-SQL Databases
• XML Databases
• Graph Databases
• Codasyl Databases
• Object Oriented Databases
• Etc…
NoSQL Example: Column Store
• Each storage block contains data from only one column
• Example: Hadoop/HBase
  • http://hadoop.apache.org/
  • Yahoo, Facebook
• Example: Ingres VectorWise
  • Column Store integrated with an SQL database
  • http://www.ingres.com/products/vectorwise
Column Store Comments
• More efficient than row (or document) store if:
  • Multiple rows/records/documents are inserted at the same time, so updates of column blocks can be aggregated
  • Retrievals access only some of the columns in a row/record/document
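A minimal sketch of the difference between the two layouts, assuming every record has the same fields. The records and the to_columns helper are hypothetical; real column stores also compress and block the data on disk.

```python
def to_columns(rows):
    """Regroup row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical records stored row by row.
rows = [
    {"id": 1, "name": "Jiya", "marks": 70},
    {"id": 2, "name": "Ravi", "marks": 80},
    {"id": 3, "name": "Asha", "marks": 90},
]

columns = to_columns(rows)

# A query that needs one column only touches that column's values.
average = sum(columns["marks"]) / len(columns["marks"])
```

Computing the average reads the "marks" list alone, whereas the row layout would have to visit every full record.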
NoSQL Examples: Key-Value Store
• Hash tables of Keys
• Values stored with Keys
• Fast access to small data values
• Example – Project-Voldemort
  • http://www.project-voldemort.com/
  • LinkedIn
• Example – MemCacheDB
  • http://memcachedb.org/
  • Backend storage is Berkeley-DB
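The idea of a hash table of keys with values stored alongside can be sketched with a Python dict. The KeyValueStore class and its method names are hypothetical and are not the API of Project-Voldemort or MemCacheDB.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash table)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        """Store value under key, replacing any previous value."""
        self._table[key] = value

    def get(self, key, default=None):
        """Return the value for key, or default if the key is absent."""
        return self._table.get(key, default)

    def delete(self, key):
        """Remove key and report whether it was present."""
        if key in self._table:
            del self._table[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "jiya"})
```

Lookups go straight from the key to the value, which is why such stores give fast access to small values but offer no queries on the value's contents.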
Map Reduce
• Technique for indexing and searching large data volumes
• Two Phases, Map and Reduce
  • Map
    • Extract sets of Key-Value pairs from underlying data
    • Potentially in parallel on multiple machines
  • Reduce
    • Merge and sort sets of Key-Value pairs
    • Results may be useful for other searches
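The two phases can be sketched with the classic word-count example. This is a single-process illustration, so nothing runs in parallel; the function names are chosen for this sketch only.

```python
from collections import defaultdict


def map_phase(document):
    """Map: extract (word, 1) key-value pairs from one document."""
    return [(word.lower(), 1) for word in document.split()]


def group_by_key(pairs):
    """Collect all values emitted for the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: merge the values of one key into a single count."""
    return key, sum(values)


def map_reduce(documents):
    """Run map on every document, then reduce each key in sorted order."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = group_by_key(pairs)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


counts = map_reduce(["NoSQL and SQL", "SQL standard"])
```

In a real deployment each call to map_phase could run on a different machine, because no document depends on another; only the grouping step needs to bring matching keys together.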
Map Reduce Patent
Google granted US Patent 7,650,331, January 2010
System and method for efficient large-scale data processing
A large-scale data processing system and method includes one or more application-independent map modules configured to read input data and to apply at least one application-specific map operation to the input data to produce intermediate data values, wherein the map operation is automatically parallelized across multiple processors in the parallel processing environment. A plurality of intermediate data structures are used to store the intermediate data values. One or more application-independent reduce modules are configured to retrieve the intermediate data values and to apply at least one application-specific reduce operation to the intermediate data values to provide output data.
NoSQL Example: Document Store
• Example: CouchDB
  • http://couchdb.apache.org/
  • BBC
• Example: MongoDB
  • http://www.mongodb.org/
  • Foursquare, Shutterfly
• JSON – JavaScript Object Notation
CouchDB JSON Example
{
  "_id": "guid goes here",
  "_rev": "314159",
  "type": "abstract",
  "author": "Keith W. Hare",
  "title": "SQL Standard and NoSQL Databases",
  "body": "NoSQL databases (either no-SQL or Not Only SQL) are currently a hot topic in some parts of computing.",
  "creation_timestamp": "2011/05/10 13:30:00 +0004"
}
CouchDB JSON Tags
• "_id"
  • GUID – Global Unique Identifier
  • Passed in or generated by CouchDB
• "_rev"
  • Revision number
  • Versioning mechanism
• "type", "author", "title", etc.
  • Arbitrary tags
  • Schema-less
  • Could be validated after the fact by user-written routine
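The last point, validation after the fact by a user-written routine, might look like the following sketch. It parses a shortened version of the document with Python's json module; the list of required tags and the missing_tags helper are assumptions for illustration, not something CouchDB prescribes.

```python
import json

# Shortened version of the CouchDB document, as JSON text.
raw = """
{
  "_id": "guid goes here",
  "_rev": "314159",
  "type": "abstract",
  "author": "Keith W. Hare",
  "title": "SQL Standard and NoSQL Databases"
}
"""

doc = json.loads(raw)

# Hypothetical rule: the tags this application expects in every document.
REQUIRED_TAGS = ("_id", "type", "author", "title")


def missing_tags(document, required=REQUIRED_TAGS):
    """Return the required tags that the document does not contain."""
    return [tag for tag in required if tag not in document]


problems = missing_tags(doc)
```

Because the store itself is schema-less, a check like this runs in application code, and a document that fails it can still be stored.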
MongoDB
What is MongoDB?
• Scalable, high-performance, open-source, document-oriented database.
• Built for speed.
• Rich document-based queries for easy readability.
• Full index support for high performance.
• Replication and failover for high availability.
• Auto-sharding for easy scalability.
• Map/Reduce for aggregation.
Why use MongoDB?
• SQL was invented in the 70's to store data.
• MongoDB stores documents (or) objects.
• Nowadays, everyone works with objects (Python/Ruby/Java/etc.).
• We need databases to persist our objects, so why not store objects directly?
• Embedded documents and arrays reduce the need for joins.
• Originally there were no joins and no multi-document transactions (newer releases add $lookup and, from version 4.0, multi-document transactions).
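How embedding reduces the need for joins can be sketched with plain Python data. The student and phone records are hypothetical, and lists of dicts stand in for tables and documents.

```python
# Relational style: two "tables" linked by a foreign key.
students = [{"id": 1, "name": "Jiya"}]
phones = [
    {"student_id": 1, "number": "111-1111"},
    {"student_id": 1, "number": "222-2222"},
]


def phones_by_join(student_id):
    """Find a student's phone numbers by matching the foreign key."""
    return [p["number"] for p in phones if p["student_id"] == student_id]


# Document style: the phone numbers are embedded as an array.
student_doc = {"_id": 1, "name": "Jiya", "phones": ["111-1111", "222-2222"]}

joined = phones_by_join(1)
embedded = student_doc["phones"]
```

Both lookups return the same numbers, but the document version reads a single object instead of scanning a second table.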
What is MongoDB great for?
• RDBMS replacement for Web Applications.
• Semi-structured Content Management.
• Real-time Analytics & High-Speed Logging.
• Caching and High Scalability.
Web 2.0, Media, SaaS, Gaming, HealthCare, Finance, Telecom, Government
Not great for?
• Highly Transactional Applications.
• Problems requiring SQL.
Some Companies using MongoDB in Production
• Schema-less: the number of fields, content and size of a document can differ from one document to another.
• No complex joins
• Data is stored as JSON-style documents
• Index on any attribute
• Replication and high availability
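A schema-less collection can be pictured as a list of documents that do not share the same fields. The sketch below uses plain Python dicts; the find helper is a toy equality matcher written for this example and does not reproduce MongoDB's actual query semantics.

```python
def find(collection, query):
    """Return the documents whose fields equal every key/value in query."""
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in query.items())
    ]


# Hypothetical documents: each one has a different set of fields.
collection = [
    {"_id": 1, "Name": "Jiya"},
    {"_id": 2, "Name": "Ravi", "City": "Nashik"},
    {"_id": 3, "Name": "Asha", "City": "Nashik", "Hobbies": ["chess", "music"]},
]

in_nashik = find(collection, {"City": "Nashik"})
```

Documents without a "City" field are simply skipped; nothing forces every document to declare it.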
MongoDB Terminologies for RDBMS concepts
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
JSON
"JavaScript Object Notation"
• Easy for humans to write/read, easy for computers to parse/generate
• Objects can be nested
• Built on
  • name/value pairs
  • Ordered list of values
http://json.org/
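The two building blocks, name/value pairs and ordered lists, plus nesting, can be seen in a short round trip through Python's json module. The student record is hypothetical.

```python
import json

# Name/value pairs, an ordered list of values, and a nested object.
student = {
    "name": "Jiya",
    "marks": [70, 80, 90],
    "address": {"city": "Nashik", "state": "MH"},
}

text = json.dumps(student, indent=2)  # generate JSON text
restored = json.loads(text)           # parse it back into Python objects
```

The text form is what a person reads or edits; the parsed form is what a program works with, and the round trip preserves both the nesting and the order of the list.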
BSON
"Binary JSON"
• Binary-encoded serialization of JSON-like docs
• Embedded structure reduces need for joins
• Goals
  • Lightweight
  • Traversable
  • Efficient (decoding and encoding)
http://bsonspec.org/
BSON Example
{
  "_id": "37010",
  "City": "Nashik",
  "Pin": 423201,
  "state": "MH",
  "Postman": {
    "name": "Ramesh Jadhav",
    "address": "Panchavati"
  }
}
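To show what "binary-encoded" means, here is a deliberately tiny encoder that follows the layout in the BSON specification for three element types only: UTF-8 strings, 32-bit integers and embedded documents. It is a teaching sketch, not a replacement for a real BSON library, and the function names are chosen for this example.

```python
import struct


def _cstring(text):
    """Encode a field name as UTF-8 bytes followed by a zero byte."""
    return text.encode("utf-8") + b"\x00"


def encode_element(name, value):
    """Encode one field as: type byte, field name, then the value bytes."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02: int32 byte length (including the trailing zero), then the bytes.
        return b"\x02" + _cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, dict):
        # Type 0x03: an embedded document, encoded recursively.
        return b"\x03" + _cstring(name) + encode_document(value)
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10: little-endian int32 (struct raises an error if out of range).
        return b"\x10" + _cstring(name) + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc):
    """Encode a dict as: int32 total size, the elements, and a zero byte."""
    body = b"".join(encode_element(name, value) for name, value in doc.items())
    total_size = 4 + len(body) + 1
    return struct.pack("<i", total_size) + body + b"\x00"


pin_doc = {
    "_id": "37010",
    "City": "Nashik",
    "Pin": 423201,
    "state": "MH",
    "Postman": {"name": "Ramesh Jadhav", "address": "Panchavati"},
}
encoded = encode_document(pin_doc)
```

The leading size field is what makes BSON traversable: a reader can skip a whole document or embedded document without parsing its contents.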
Data Types of MongoDB
• Integer
• Boolean
• Date
• Double
• Binary data
• Object ID
• String
• Null
• Arrays
Data Types
• String: The most commonly used datatype for storing data. Strings in MongoDB must be valid UTF-8.
• Integer: Stores a numerical value. Integers can be 32-bit or 64-bit depending upon your server.
• Boolean: Stores a boolean (true/false) value.
• Double: Stores floating-point values.
• Min/Max keys: Used to compare a value against the lowest and highest BSON elements.
• Arrays: Stores arrays, lists or multiple values under one key.
• Timestamp: Stores a timestamp. This can be handy for recording when a document has been modified or added.
• Object: Used for embedded documents.
Data Types
• Null: Stores a null value.
• Symbol: Used identically to a string; however, it is generally reserved for languages that use a specific symbol type.
• Date: Stores the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the day, month and year into it.
• Object ID: Stores the document's ID.
• Binary data: Stores binary data.
• Code: Stores JavaScript code in the document.
• Regular expression: Stores a regular expression.
Find version of Windows
Enter the following commands in the Command Prompt or PowerShell:
wmic os get caption
wmic os get osarchitecture
Installation in Windows
• Download MongoDB from the website: https://www.mongodb.org/downloads
• Select the Windows option
• Download and run the installer
Starting MongoDB in Windows
• Create a folder (e.g. SNJB) in the bin folder of MongoDB
• Open a command prompt
• Go to the bin directory of MongoDB and run the following command:
  mongod --storageEngine=mmapv1 --dbpath SNJB
  (The server starts and listens on port 27017. The mmapv1 storage engine was removed in MongoDB 4.2, so omit --storageEngine on newer releases.)
• Open another command prompt and run:
  mongo
  (The client starts)
Installation in Ubuntu
To install:
• sudo apt-get update
• sudo apt-get install mongodb
To start:
• sudo service mongodb start
• mongo
To stop:
• exit
• sudo service mongodb stop
Starting MongoDB in Ubuntu
• Create a folder in the bin directory of mongodb
• Open a terminal
• Go to the mongodb bin folder (cd mongo….)
• Type ./mongod (the server is started)
• Open another terminal
• Go to the mongodb bin folder (cd mongo….)
• Type ./mongo (the client is started)
• Run all commands in the client terminal
Basic Database Operations
• Database
• Collection
Basic Database Operations – Database
• use <database name>: switches to the database named in the command
• db: shows the currently selected database
• show dbs: displays the list of databases
• db.dropDatabase(): drops the currently selected database
Basic Database Operations – Collection
• db.createCollection(name): creates a collection. Ex: db.createCollection("Stud")
• show collections: lists the names of all collections in the current database
• db.collectionname.insert({Key: Value}): inserts a document into a collection. Ex: db.Stud.insert({Name: "Jiya"})
  In MongoDB you do not need to create the collection first; MongoDB creates it automatically when you insert a document.
• db.collection.drop(): drops a collection from the database. Ex: db.Stud.drop()
CRUD Operations
• Insert
• Find
• Update
• Delete
References
• https://docs.mongodb.com/manual/introduction/
• http://metadata-standards.org/Document-library/Documents-by-number/WG2-N1501-N1550/WG2_N1537_SQL_Standard_and_NoSQL_Databases%202011-05.ppt
• https://www.slideshare.net/raviteja2007/introduction-to-mongodb-12246792
• https://docs.mongodb.com/manual/core/databases-and-collections/
• https://docs.mongodb.com/manual/crud/