Configuring Quill
Condor Week 2007
Greg Thain
Computer Sciences Department, University of Wisconsin-Madison
gthain@cs.wisc.edu
http://www.cs.wisc.edu/condor

Typical Condor Pool
[Diagram. Legend: Process Spawned; ClassAd Communication Pathway.]
› Central Manager: master, negotiator, collector
› Submit-Only: master, schedd
› Execute-Only: master, startd

What is Quill?
A technology that stores a read-only version of the job queue and historical job data in a relational database.

Why Quill?
› Offloads query overhead from the schedd
  • Performance boost!
› Makes it easier to build a web portal
  • RDBMS access is easier than SOAP/CLI

Job Queue Management
[Diagram comparing two setups. Without Quill: schedd, Job Queue. With Quill: schedd, Job Queue, quilld, Database.]

Quill downsides
› Additional latency
› More complicated setup
› A handful of attributes are not in the DBMS

Quill and Quill++
› Quill has been in Condor since 6.7.11
› Quill++ (quillpp) is coming soon
  • Support for all daemons
  • Multiple schedds in one database
  • Support for Oracle on some platforms
  • Replaces Quill
› We’ll talk about both

Typical Quill’d Condor Pool
[Diagram. Legend: Process Spawned; ClassAd Communication Pathway; Database query. Labels shown: Central Manager (master, negotiator, collector); Submit-Only (master, schedd, quill); Execute-Only (master, startd); condor_q; postgres.]

Typical Quillpp’d Condor Pool
[Diagram. Legend: Process Spawned; ClassAd Communication Pathway; Database query. Labels shown: Central Manager (master, negotiator, collector, quillpp); Submit-Only (master, schedd, quillpp, quill); Execute-Only (master, startd, quillpp); condor_q; postgres.]

How to use the schema?
› We’ll cover this in another talk
  • Quill Front End and Schema BoF
    • Thursday, 11 am

Quill (not Quill++) Deployment
› One Quill daemon per schedd
› Quill daemons must be uniquely named
› Each Quill daemon uses a unique DB name
› Currently uses PostgreSQL
  • Recommend PostgreSQL 8.2 or later
    • Better disk management
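
The naming rules on this slide can be checked mechanically before a deployment goes live. The following is a minimal Python sketch, not part of Condor: the function names and the (schedd, Quill name, DB name) tuple layout are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_duplicates(values):
    """Return the sorted list of values that occur more than once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


def check_quill_deployment(daemons):
    """Check a planned Quill deployment for naming clashes.

    `daemons` is a list of (schedd_name, quill_name, db_name) tuples,
    one per schedd. This layout is hypothetical, not a Condor format.
    """
    quill_names = [quill for _, quill, _ in daemons]
    db_names = [db for _, _, db in daemons]
    return {
        "duplicate_quill_names": find_duplicates(quill_names),
        "duplicate_db_names": find_duplicates(db_names),
    }


# Hypothetical plan: two schedds whose Quill daemons share one DB name.
plan = [
    ("schedd_a", "quill_a", "db_a"),
    ("schedd_b", "quill_b", "db_a"),
]
report = check_quill_deployment(plan)
```

An empty list under both keys would mean the plan satisfies the two uniqueness rules; here the shared DB name is reported.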

Quill++ deployment
› One condor_quillpp per machine
› One condor_dbmsd per database
› Manual installation of the schema
› One DB per pool
› Uses Postgres or Oracle

Condor’s Interface to Quill
› Two tools were modified to use the DB
  • condor_q
  • condor_history

A User Perspective: condor_q
› condor_q changes
  • When QUILL_ENABLED is set, queries go to the RDBMS
  • -name takes a ScheddName or QuillName
  • -avgqueuetime reports the average time in queue for all jobs

condor_q -direct
› -direct rdbms
  • Default when QUILL_ENABLED = TRUE
› -direct quilld
  • Useful for firewall traversal
› -direct schedd
  • 100% up-to-date view
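
For scripts that wrap condor_q, the three -direct modes can be validated before the command is run. This is a small Python sketch under stated assumptions: the helper name is hypothetical, and only the option and mode names that appear on the slide are used.

```python
# The three modes named on the slide for condor_q -direct.
DIRECT_MODES = ("rdbms", "quilld", "schedd")


def condor_q_command(direct=None):
    """Build the argument list for condor_q, optionally with -direct.

    Hypothetical helper: it only assembles arguments and does not
    run anything.
    """
    if direct is None:
        return ["condor_q"]
    if direct not in DIRECT_MODES:
        raise ValueError(f"unknown -direct mode: {direct!r}")
    return ["condor_q", "-direct", direct]


# Ask the schedd itself, for a fully up-to-date view of the queue.
cmd = condor_q_command("schedd")
```

On a machine with Condor installed, the resulting list could be passed to `subprocess.run(cmd)`.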

A User Perspective: condor_history
› condor_history changes
  • -name takes a Quill name, to retrieve job histories from a remote Quill’s database

condor_history -direct
› There isn’t one (yet)
› condor_history -f `condor_config_val HISTORY`
› No -direct quilld equivalent

PostgreSQL Configuration
› Add two special user accounts: quillreader and quillwriter
  • createuser quillreader --no-createdb --no-adduser --pwprompt
  • createuser quillwriter --createdb --no-adduser --pwprompt

PostgreSQL Configuration (cont)
› Allow TCP/IP connections
  • Edit the file postgresql.conf
    • Add listen_addresses = '*'
› Allow connections from specific hosts
  • Edit the file pg_hba.conf
    • host all quillreader 128.105.0.0 255.255.0.0 password
    • host all quillwriter 128.105.0.0 255.255.0.0 password
› Note: only use ‘password’ authentication at this time.
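
A pg_hba.conf host line pairs a network address with a netmask, and a client is matched only if its address falls inside that range. The Python sketch below shows the matching rule with the standard ipaddress module. It assumes the mask for 128.105.0.0 is 255.255.0.0, and the client addresses are made up for illustration.

```python
import ipaddress


def host_allowed(client_ip, network, netmask):
    """Return True if client_ip falls inside network/netmask."""
    net = ipaddress.ip_network(f"{network}/{netmask}")
    return ipaddress.ip_address(client_ip) in net


# Hypothetical clients checked against the range used on the slide.
inside = host_allowed("128.105.14.7", "128.105.0.0", "255.255.0.0")
outside = host_allowed("10.0.0.1", "128.105.0.0", "255.255.0.0")
```

Here `inside` is true and `outside` is false, so only the first client would be matched by the two host lines.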

Quill Configuration
› User quillwriter needs a password
› Store it in
  • $(SPOOL)/.quillwritepassword (Quill)
  • $(SPOOL)/.pgpass (Quill++)
    • .pgpass has the form host:port:db:user:pass
› Ensure only the condor uid can read it if Condor is running as root
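
The .pgpass format and the read-permission requirement can be illustrated together. The Python sketch below writes a one-line .pgpass file with owner-only permissions on a POSIX system. The helper name, the temporary directory standing in for $(SPOOL), and all field values are illustrative assumptions, and it does not change file ownership to the condor uid.

```python
import os
import stat
import tempfile


def write_pgpass(path, host, port, db, user, password):
    """Write a single-entry .pgpass file readable only by its owner.

    Fields are joined as host:port:db:user:pass. A field containing
    ':' or a backslash would need escaping, which this sketch omits.
    """
    line = f"{host}:{port}:{db}:{user}:{password}\n"
    # Create the file with mode 0600 so the password is never briefly
    # readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(line)
    # Tighten permissions in case the file already existed.
    os.chmod(path, 0o600)
    return line


# A temporary directory stands in for $(SPOOL); the values are placeholders.
spool = tempfile.mkdtemp()
pgpass_path = os.path.join(spool, ".pgpass")
entry = write_pgpass(pgpass_path, "db.example.org", 5432,
                     "quill_db", "quillwriter", "example-password")
mode = stat.S_IMODE(os.stat(pgpass_path).st_mode)
```

When Condor runs as root, the file would additionally need to be owned by the condor uid, for example via `os.chown`.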

Quill Configuration (cont)
› Condor system-specific attributes in the file condor_config.local
  • QUILL              = $(SBIN)/condor_quill
  • QUILL_LOG          = $(LOG)/QuillLog
  • QUILL_ADDRESS_FILE = $(LOG)/.quill_address
  • DAEMON_LIST        = …, QUILL
  • VALID_SPOOL_FILES  = …, .quillwritepassword
  • DC_DAEMON_LIST     = …, QUILL

Quill Configuration (cont)
› Quill-specific attributes
  • QUILL_ENABLED = TRUE
  • # The quill name must be unique across all
    # quill daemons AND schedds
    QUILL_NAME = psilord_quilld@merlin.cs
  • QUILL_DB_NAME = psilord_db
  • QUILL_DB_IP_ADDR = merlin.cs.wisc.edu:42999
  • QUILL_POLLING_PERIOD = 10 (seconds)
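
The QUILL_DB_IP_ADDR value on this slide combines a host name and a port in one string. A tool that needs the two parts separately could split it as in this Python sketch; the function name is hypothetical and this is not how Condor itself parses the setting.

```python
def split_db_address(value):
    """Split a 'host:port' string into (host, port)."""
    host, sep, port = value.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)


# The example value shown on the slide.
host, port = split_db_address("merlin.cs.wisc.edu:42999")
```

Splitting on the last colon keeps the whole host name intact.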

Quill Configuration (cont)
› QUILL_HISTORY_CLEANING_INTERVAL = 24 (hours)
› QUILL_HISTORY_DURATION = 30 (days)
› QUILL_MANAGE_VACUUM = FALSE
› QUILL_IS_REMOTELY_QUERYABLE = TRUE
› QUILL_DB_QUERY_PASSWD = xxx

Schema management
› Quill loads its schema automatically
  • Upgrades itself automatically
› Quill++ requires manual loading:
  • psql -U quillwriter < common_createddl.sql
  • psql -U quillwriter < pgsql_createddl.sql

Conversion to Quill++
› Conversion only matters for history
› Conversion is one-way only!
› Two steps:
  • Dump the Quill history tables to a file with
    • condor_dump_history
  • Load the Quill++ history tables from that file with
    • condor_load_history

Data Management
› Constrain database size
  • History truncation
    • Quill++ truncates other tables, too
  • Postgres index management
  • Oracle cleans itself
› Be careful with long queries, especially with Quill

Data Management: Quill
› HISTORY_CLEANING_INTERVAL
  • In hours (24 hours)
› HISTORY_DURATION
  • How long, in days (7 days)
› QUILL_SHOULD_REINDEX
  • Boolean (false)
› QUILL_MANAGE_VACUUM (false)
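
The two history settings interact: the cleaning interval says how often a purge runs, and the duration says how old a record must be to get purged. The Python sketch below works through the arithmetic with the values shown on the slide. How Quill applies these settings internally is an assumption here, and the function names and timestamps are illustrative.

```python
from datetime import datetime, timedelta


def history_cutoff(now, history_duration_days):
    """Return the time before which history records count as expired."""
    return now - timedelta(days=history_duration_days)


def next_cleaning(last_cleaning, cleaning_interval_hours):
    """Return when the next history cleaning pass would be due."""
    return last_cleaning + timedelta(hours=cleaning_interval_hours)


# Illustrative timestamp, with a 7-day duration and a 24-hour interval.
now = datetime(2007, 5, 1, 12, 0, 0)
cutoff = history_cutoff(now, 7)
due = next_cleaning(now, 24)
```

With these values, records older than noon on 24 April 2007 count as expired, and the next pass is due at noon on 2 May 2007.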

Data Management: Quill++
› condor_dbmsd does all the work
  • QUILL_DBSIZE_LIMIT (20 GB)
    • Emails a warning when 75% is reached
  • DATABASE_PURGE_INTERVAL (in seconds; 24 hours)
  • DATABASE_REINDEX_INTERVAL (in seconds; 24 hours)
  • QUILL_DB_TYPE (oracle, pgsql)
  • QUILL_RESOURCE_HISTORY_DURATION (7 days)
  • QUILL_JOB_HISTORY_DURATION (10 years!)
  • QUILL_RUN_HISTORY_DURATION (7 days)
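
The size limit and the interval units on this slide come down to simple arithmetic. The Python sketch below works it out. It assumes the warning fires once usage is at or above 75% of the limit, which the slide does not spell out, and the function names are hypothetical.

```python
def warning_threshold_gb(limit_gb, fraction=0.75):
    """Return the database size at which a warning would be sent."""
    return limit_gb * fraction


def should_warn(current_gb, limit_gb, fraction=0.75):
    """Return True once usage reaches the warning threshold.

    Assumption: "at or above" the threshold triggers the warning.
    """
    return current_gb >= warning_threshold_gb(limit_gb, fraction)


# With the 20 GB limit from the slide, the threshold is 15 GB.
threshold = warning_threshold_gb(20)

# The purge and reindex intervals are given in seconds; 24 hours is:
interval_seconds = 24 * 60 * 60
```

So with the values shown, the warning threshold is 15 GB and a 24-hour interval is written as 86400 seconds.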

Thank you!
› Want more information?
› BoF “Databases in Condor”