Using Q 4 M a message queue storage








































![queue_wait(tbl_cond, [tbl_cond…, timeout]) n. Accepts multiple tables and timeout n. Data searched from leftmost queue_wait(tbl_cond, [tbl_cond…, timeout]) n. Accepts multiple tables and timeout n. Data searched from leftmost](https://slidetodoc.com/presentation_image_h2/8dfb01fde4d0926957973163f4e38a9f/image-41.jpg)












- Slides: 53
Using Q 4 M a message queue storage engine for My. SQL Cybozu Labs, Inc. Kazuho Oku
Background Apr 22 2009 Using Q 4 M 2
Who am I? n. Name: Kazuho Oku (奥 一穂) n. Original Developer of Palmscape / Xiino n The oldest web browser for Palm OS n. Working at Cybozu Labs since 2005 n Research subsidiary of Cybozu, Inc. in Japan Apr 22 2009 Using Q 4 M 3
About Cybozu, Inc. n. Japan’s largest groupware vendor n Mostly provides as software products, not as services n Some of our apps bundle My. SQL as storage Apr 22 2009 Using Q 4 M 4
About Pathtraq n. Started in Aug. 2007 n. Web ranking service n One of Japan’s largest n like Alexa, but semi-realtime, and per-page n running on My. SQL n. Need for a fast and reliable message relay n for communication between the main server and content analysis server(s) Apr 22 2009 Using Q 4 M 5
Design Goals of Q 4 M n. Robust n Do not lose data on OS crash or power failure n. Fast n Transfer thousands of messages per second n. Easy to Use n Use SQL for access / maintenance n Integration into My. SQL nno more separate daemons to take care of Apr 22 2009 Using Q 4 M 6
What is a Message Queue? Apr 22 2009 Using Q 4 M 7
What is a Message Queue? n. Middleware for persistent asynchronous communication n communicate between fixed pairs (parties) n a. k. a. Message Oriented Middleware n. MQ is intermediate storage n RDBMS is persistent storage n. Senders / receivers may go down Apr 22 2009 Using Q 4 M 8
Minimal Configuration of a MQ n. Senders and receivers access a single queue Sender Receiver Queue Apr 22 2009 Using Q 4 M 9
MQ and Relays n. Separate queue for sender and receiver n. Messages relayed between queues Relay Sender Queue Apr 22 2009 Receiver Queue Using Q 4 M 10
Merits of Message Relays n. Destination can be changed easily n Relays may transfer messages to different locations depending on their headers n. Robustness against network failure n no loss or duplicates when the relay fails n. Logging and Multicasting, etc. Apr 22 2009 Using Q 4 M 11
Message Brokers n. Publish / subscribe model n Separation between components and their integration n Components read / write to predefined queues n Integration is definition of routing rules between the message queues n Messages are often transformed (filtered) within the relay agent Apr 22 2009 Using Q 4 M 12
What about Q 4 M? n. Q 4 M itself is a message queue n. Can connect Q 4 M instances to create a message relay n. Provides API for creating message relays and brokers Apr 22 2009 Using Q 4 M 13
Performance of Q 4 M nover 7, 000 mess/sec. n message size: avg. 512 bytes n syncing to disk n. Outperforming most needs n if you need more, just scale out n Can coexist with other storage engines without sacrificing their performance see http: //labs. cybozu. co. jp/blog/kazuhoatwork/2008/06/q 4 m_06_release_and_benchmarks. php Apr 22 2009 Using Q 4 M 14
Applications of Q 4 M Apr 22 2009 Using Q 4 M 15
Asynchronous Updates n. Mixi (Japan's one of the largest SNS) uses Q 4 M to buffer writes to DB, to offload peak demands from http: //alpha. mixi. co. jp/blog/? p=272 Apr 22 2009 Using Q 4 M 16
Connecting Distant Servers n. Pathtraq uses Q 4 M to create a relay between its database and content analysis processes → Contents to be analyzed → Pathtraq DB My. SQL conn. over SSL, gzip Content Analysis Processes ← Results of the analysis ← Apr 22 2009 Using Q 4 M 17
To Prefetch Data nlivedoor Reader (web-based feed aggregator) uses Q 4 M to prefetch data from database to memcached nuses Q 4 M for scheduling web crawlers as well from http: //d. hatena. ne. jp/mala/20081212/1229074359 Apr 22 2009 Using Q 4 M 18
Scheduling Web Crawlers n. Web crawlers with retry-on-error n. Sample code included in Q 4 M dist. If failed to fetch, store URL in retry queue Store Result Spiders Read URL DB Request Queue Retry Queue Rescheduler Apr 22 2009 Using Q 4 M 19
Delayed Content Generation n. Hatetter (RSS feed-to-twitter-API gateway) uses Q 4 M to delay content generation n Source code: github. com/yappo/website-hatetter Apr 22 2009 Using Q 4 M 20
User Notifications n. For sending notifications from web services DB App. Logic SMTP Agent Queue(s) Apr 22 2009 Using Q 4 M IM Agent 21
Installing Q 4 M Apr 22 2009 Using Q 4 M 22
Installing Q 4 M n. Compatible with My. SQL 5. 1 n. Download from q 4 m. 31 tools. com n Binary releases available for some platforms n. Installing from source: n requires source code of My. SQL n. /configure && make install n run support-files/install. sql Apr 22 2009 Using Q 4 M 23
Configuration Options of Q 4 M n--with-sync=no|fsync|fdatasync|fcntl n Controls synchronization to disk n default: fdatasync on linux n--enable-mmap n Mmap’ed reads lead to higher throughput n default: yes n--with-delete=pwrite|msync n msync recommended on linux>=2. 6. 20 if you need really high performance Apr 22 2009 Using Q 4 M 24
Q 4 M Basics Apr 22 2009 Using Q 4 M 25
The Model n Various publishers write to queue n Set of subscribers consume the entries in queue Publisher Q 4 M table Publisher Subscribers Publisher Apr 22 2009 Using Q 4 M 26
Creating a Q 4 M Table n ENGINE=QUEUE creates a Q 4 M table n No primary keys or indexes n Sorted by insertion order (it’s a queue) Apr 22 2009 mysql> CREATE TABLE qt ( -> id int(10) unsigned NOT NULL, -> message varchar(255) NOT NULL -> ) ENGINE=QUEUE; Query OK, 0 rows affected (0. 42 sec) Using Q 4 M 27
Modifying Data on a Q 4 M Table n No restrictions for INSERT and DELETE n No support for UPDATE mysql> INSERT INTO qt (id, message) -> VALUES -> (1, 'Hello'), -> (2, 'Bonjour'), -> (3, 'Hola'); Query OK, 3 rows affected (0. 02 sec) mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +---------+ 3 rows in set (0. 00 sec) Apr 22 2009 Using Q 4 M 28
SELECT from a Q 4 M Table n Works the same as other storage engines n SELECT COUNT(*) is cached mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +---------+ 3 rows in set (0. 00 sec) mysql> SELECT COUNT(*) FROM qt; +-----+ | COUNT(*) | +-----+ | 3 | +-----+ 1 row in set (0. 00 sec) How to subscribe to a queue? Apr 22 2009 Using Q 4 M 29
Calling queue_wait() n After calling, only one row becomes visible from the connection mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +---------+ 3 rows in set (0. 00 sec) mysql> SELECT queue_wait('qt'); +---------+ | queue_wait('qt') | +---------+ | 1 | +---------+ 1 row in set (0. 00 sec) mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | +---------+ 1 row in set (0. 00 sec) Apr 22 2009 Using Q 4 M 30
OWNER Mode and NON-OWNER Mode n. In OWNER mode, only the OWNED row is visible n. OWNED row becomes invisible from other connections nrows of other storage engines are visible NON-OWNER Mode 1, 'Hello' 2, 'Bonjour' 3, 'Hola' Apr 22 2009 queue_wait() OWNER Mode 1, 'Hello' queue_end() queue_abort() Using Q 4 M 31
Returning to NON-OWNER mode n By calling queue_abort, the connection returns to NON-OWNER mode mysql> SELECT QUEUE_ABORT(); +--------+ | QUEUE_ABORT() | +--------+ | 1 | +--------+ 1 row in set (0. 00 sec) mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | | 2 | Bonjour | | 3 | Hola | +---------+ 3 rows in set (0. 01 sec) Apr 22 2009 Using Q 4 M 32
Consuming a Row n By calling queue_end, the OWNED row is deleted, and connection returns to NON-OWNER mode mysql> SELECT queue_wait('qt'); (snip) mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 1 | Hello | +---------+ 1 row in set (0. 01 sec) mysql> SELECT queue_end(); +-------+ | queue_end() | +-------+ | 1 | +-------+ 1 row in set (0. 01 sec) mysql> SELECT * FROM qt; +---------+ | id | message | +---------+ | 2 | Bonjour | | 3 | Hola | +---------+ 2 rows in set (0. 00 sec) Apr 22 2009 Using Q 4 M 33
Writing a Subscriber n Call two functions: queue_wait, queue_end n Multiple subscribers can be run concurrently n each row in the queue is consumed only once while (true) { SELECT queue_wait('qt'); rows : = SELECT * FROM qt; if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Apr 22 2009 # # # Using Q 4 M switch to owner mode obtain data if we have any data, then consume the row erase the row from queue 34
Writing a Subscriber (cont'd) n. Or call queue_wait as a condition n Warning: conflicts with trigger-based insertions while (true) { rows : = SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Apr 22 2009 Using Q 4 M 35
The Model – with code INSERT INTO queue. . . Publisher while (true) { rows : = SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } INSERT INTO queue. . . Publisher Q 4 M table Subscribers INSERT INTO queue. . . Publisher Apr 22 2009 Using Q 4 M 36
Three Functions in Detail Apr 22 2009 Using Q 4 M 37
queue_wait(table) n. Enters OWNER mode n 0〜 1 row becomes OWNED n Enters OWNER mode even if no rows were available n Default timeout: 60 seconds n Returns 1 if a row is OWNED (0 on timeout) n. If called within OWNER mode, the owned row is deleted Apr 22 2009 Using Q 4 M 38
Revisiting Subscriber Code n. Calls to queue_end just before queue_wait can be omitted while (true) { rows : = SELECT * FROM qt WHERE queue_wait('qt'); if (count(rows) != 0) handle_row(rows[0]); SELECT queue_end(); } Apr 22 2009 Using Q 4 M 39
Conditional queue_wait() n. Consume rows of certain condition n Rows that do not match will be left untouched n Only numeric columns can be checked n Fast - condition tested once per each row examples: SELECT queue_wait('table: (col_a*3)+col_b<col_c'); SELECT queue_wait('table: retry_count<5'); Apr 22 2009 Using Q 4 M 40
queue_wait(tbl_cond, [tbl_cond…, timeout]) n. Accepts multiple tables and timeout n. Data searched from leftmost table to right n. Returns table index (the leftmost table is 1) of the newly owned row n Returns zero if no rows are being owned example: SELECT queue_wait('table_A', 'table_B', 60); Apr 22 2009 Using Q 4 M 41
Functions for Exiting OWNER Mode nqueue_end n Deletes the owned row and exits OWNER mode nqueue_abort n Releases (instead of deleting) the owned row and exits OWNER mode n Close of a My. SQL connection does the same thing Apr 22 2009 Using Q 4 M 42
Relaying and Routing Messages Apr 22 2009 Using Q 4 M 43
The Problem n. Relay (or router) consists of more than 3 processes, 2 conns n. No losses, no duplicates on crash or disconnection Q 4 M Table (source) Apr 22 2009 Relay Program Using Q 4 M Table (dest. ) 44
Internal Row ID n. Every row have a internal row ID n invisible from Q 4 M table definition n monotonically increasing 64 -bit integer n. Used for detecting duplicates n Use two functions to skip duplicates n Data loss prevented by using queue_wait / queue_end Apr 22 2009 Using Q 4 M 45
queue_rowid() n. Returns row ID of the OWNED row (if any) n Returns NULL if no row is OWNED n. Call when retrieving data from source Apr 22 2009 Using Q 4 M 46
queue_set_srcid(src_tbl_id, mode, src_row_id) n. Call before inserting a row to destination table n Checks if the row is already inserted into the table, and ignores next INSERT if true n. Parameters: n src_tbl_id - id to determine source table (0〜 63) n mode - "a" to drop duplicates, "w" to reset n src_row_id - row ID obtained from source table Apr 22 2009 Using Q 4 M 47
Pseudo Code n. Relays data from src_tbl to dest_tbl while (true) { # wait for data SELECT queue_wait(src_tbl) => src_db; # read row and rowid row : = (SELECT * FROM src_tbl => src_db); rowid : = (SELECT queue_rowid() => src_db); # insert the row after setting srcid SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db; INSERT INTO dest_tbl (row) => dest_db; } Apr 22 2009 Using Q 4 M 48
q 4 m-forward n. Simple forwarder script n installed into mysql-dir/bin usage: q 4 m-forward [options] src_addr dest_addr example: % support-files/q 4 m-forward "dbi: mysql: database=db 1; table=tbl 1; user=foo; password=XXX" "dbi: mysql: database=db 2; table=tbl 2; host=bar; user=foo" options: --reset duplicate check info. --sender=idx slot no. used for checking duplicates (0. . 63, default: 0) --help Apr 22 2009 Using Q 4 M 49
Limitations and the Future of Q 4 M Apr 22 2009 Using Q 4 M 50
Things that Need to be Fixed n. Table compactions is a blocking operation n runs when live data becomes <25% of log file n very bad, though not as bad as it seems nit's fast since it's a sequential write operation n. Relays are slow n since transfer is done row-by-row n. Binlog does not work n since MQ replication should be synchronous Apr 22 2009 Using Q 4 M 51
Future of Q 4 M n 2 -phase commit with other storage engines (maybe) n queue consumption and Inno. DB updates can become atomic operation Apr 22 2009 Using Q 4 M 52
Thank you http: //q 4 m. 31 tools. com/ Apr 22 2009 Using Q 4 M 53