HTCondor Networking Concepts

  • Slides: 21
Disclaimers
› Not about configuration macros
› Not about host or daemon lookups
› Not about HTCondor internals

Asking the Right Questions
› There will be a quiz at the end
› Start by reviewing fairy-tale networking
    • … then add IPv6
    • … then add schedd firewalls
    • … then add startd firewalls
› End by passing the quiz (open-manual)

Fairy-tale Networking
› Single network protocol
› All addresses publicly routable
› No firewalls
› Fewer than ~25k simultaneous running jobs

Working in a Fairy Tale
[Diagram: negotiator, collector, schedd, shadow*, startd, starter*]
* One shadow, starter per running job

IPv6
[Diagram: negotiator, collector, schedd, shadow, startd, starter; legend: IPv4, IPv6]

IPv6 + IPv4
[Diagram: negotiator, collector, schedd, shadow; startd and starter (IPv4); startd and starter (IPv6)]

Shared Port
› Problem: Firewall
    • Admin willing to open only one port
› Problem: only ~60k TCP ports
    • Need one per shadow
› Shared Port Service
    • Listens on single port for incoming connections
    • Hands each connection to intended recipient
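
The "one listening port, many recipients" idea can be sketched as a toy model. This is only an illustration under stated assumptions: the names `SharedPortModel`, `register`, and `dispatch` are hypothetical, the port number is just an example value, no real sockets are involved, and nothing here describes how the shared port service is actually implemented.

```python
class SharedPortModel:
    """Toy model: one listening port fans out to many named recipients."""

    def __init__(self, port):
        self.port = port        # the single port the firewall admin opens
        self.recipients = {}    # recipient name -> handler callable

    def register(self, name, handler):
        """Record which handler should receive connections for `name`."""
        self.recipients[name] = handler

    def dispatch(self, port, name, payload):
        """Hand an incoming connection to its intended recipient."""
        if port != self.port:
            raise ConnectionRefusedError(f"port {port} is blocked by the firewall")
        if name not in self.recipients:
            raise LookupError(f"no recipient named {name!r}")
        return self.recipients[name](payload)


# Hypothetical recipients: one schedd plus one shadow per running job.
shared = SharedPortModel(port=9618)
shared.register("schedd", lambda msg: f"schedd got {msg}")
shared.register("shadow_1", lambda msg: f"shadow_1 got {msg}")
shared.register("shadow_2", lambda msg: f"shadow_2 got {msg}")
```

In this model, adding another shadow adds another entry in `recipients` but never another open port, which is the point of the second "Problem" bullet.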

Shared Port
[Diagram: schedd, Internet, firewall, shared_port, startd, starter]

Firewalled Submit Node
[Diagram: startd, starter, firewall, schedd, shared port, shadow, negotiator, collector]

TCP Forwarding Host
› Problem: Private network with NAT
› Traverse firewall via port forwarding
    • Allocate a public IP address
    • Connections to public address forwarded by NAT to machine on private network
› Common in the Cloud
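
Port forwarding amounts to a lookup from a public endpoint to a private one. The sketch below models only that lookup. The table, the function name `forward`, and the addresses (taken from ranges reserved for documentation and private use) are all hypothetical.

```python
# Hypothetical forwarding table: public (address, port) -> private (address, port).
FORWARDING_TABLE = {
    ("203.0.113.10", 9618): ("10.0.0.5", 9618),
}


def forward(table, public_addr, public_port):
    """Return the private endpoint a public endpoint forwards to, or None."""
    return table.get((public_addr, public_port))


# A connection to the public address lands on the machine in the private network.
private_endpoint = forward(FORWARDING_TABLE, "203.0.113.10", 9618)

# A port with no forwarding rule goes nowhere.
unforwarded = forward(FORWARDING_TABLE, "203.0.113.10", 22)
```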

Condor Connection Broker
› Problem: Private network with NAT
    • Or firewall with no opening for HTCondor
› Traverse firewall by reversing connection
    • Client sends connection request via broker
    • Server initiates TCP connection to client
› Only bypasses one firewall
    • Client and broker (CCB server) must have publicly routable addresses
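
The reversed-connection rule, and why it bypasses only one firewall, can be sketched as a small reachability model. This is an illustration, not HTCondor code: `Node`, `accepts_inbound`, and `connect` are hypothetical names, and the model assumes the server has already registered with the broker over an outbound connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    accepts_inbound: bool  # False when behind NAT or a firewall with no opening


def connect(client, server, broker=None):
    """Return how `client` can reach `server`, or None if it cannot."""
    if server.accepts_inbound:
        return "direct"  # ordinary connection: client -> server
    # Reversed connection: the client sends its request via the broker, then the
    # server initiates the TCP connection outward to the client. Both the broker
    # and the client must therefore accept inbound connections.
    if broker is not None and broker.accepts_inbound and client.accepts_inbound:
        return "reversed via broker"
    return None


schedd = Node("schedd", accepts_inbound=True)
startd = Node("startd", accepts_inbound=False)        # private network
ccb = Node("ccb", accepts_inbound=True)
hidden_schedd = Node("hidden_schedd", accepts_inbound=False)

public_to_private = connect(schedd, startd, ccb)         # one firewall: works
private_to_private = connect(hidden_schedd, startd, ccb)  # two firewalls: fails
```

When neither side accepts inbound connections, the model returns `None`, which matches the "Only bypasses one firewall" bullet.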

CCB: Condor Connection Broker
[Diagram: schedd, Internet, outbound firewall, CCB, startd, schedd]

NATd Execute Nodes
[Diagram: NAT, firewall, schedd, shared port, shadow, negotiator, collector/CCB, startd, starter]

Port Usage (Digression)
› Shadow for each running job
› In fairy-tale setup
    • Each shadow uses two ports
    • Limit of ~25k running jobs
› With shared port and CCB
    • Shadows use no ports
    • No network limit on number of running jobs
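
The arithmetic behind the job limit can be written out as a rough sketch. The constants are the slides' approximate figures, and `max_running_jobs` is a hypothetical helper. The computed bound is higher than the slides' ~25k; one plausible reading (an assumption, not stated in the slides) is that not every port is available to shadows.

```python
USABLE_TCP_PORTS = 60_000  # the slides' "~60k TCP ports" (approximate)


def max_running_jobs(usable_ports, ports_per_shadow):
    """Upper bound on running jobs from port exhaustion, or None if unbounded."""
    if ports_per_shadow == 0:
        return None  # shadows need no ports, so the network imposes no limit
    return usable_ports // ports_per_shadow


# Fairy-tale setup: each shadow uses two ports.
fairy_tale_bound = max_running_jobs(USABLE_TCP_PORTS, ports_per_shadow=2)

# Shared port plus CCB: shadows use no ports.
shared_port_ccb_bound = max_running_jobs(USABLE_TCP_PORTS, ports_per_shadow=0)
```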

Quiz
1. Why do schedds and central managers need to be mixed-mode in a pool split between IPv4 and IPv6 nodes?
2. Why use CCB on execute nodes?
3. Why use both CCB and shared port?
4. If both the schedd and the execute nodes are NATd, what do you do?

Question 1
› Why do schedds and central managers need to be mixed-mode in a pool split between IPv4 and IPv6 nodes?
    • They need to be able to talk to all execute nodes
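
The answer can be checked with a minimal model in which two daemons can talk only when they share an IP protocol. The names `can_talk`, `reachable`, and the node labels are hypothetical and exist only for this illustration.

```python
IPV4_ONLY = frozenset({"IPv4"})
IPV6_ONLY = frozenset({"IPv6"})
MIXED = frozenset({"IPv4", "IPv6"})


def can_talk(a, b):
    """Two daemons can talk only if they share at least one IP protocol."""
    return bool(a & b)


def reachable(daemon_protocols, nodes):
    """Names of the execute nodes a daemon with these protocols can reach."""
    return sorted(
        name for name, protocols in nodes.items()
        if can_talk(daemon_protocols, protocols)
    )


# A pool split between IPv4 and IPv6 execute nodes.
execute_nodes = {"startd_a": IPV4_ONLY, "startd_b": IPV6_ONLY}

mixed_schedd_reach = reachable(MIXED, execute_nodes)     # reaches every node
ipv4_schedd_reach = reachable(IPV4_ONLY, execute_nodes)  # misses the IPv6 node
```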

Question 2
› Why use CCB on execute nodes (and not submit nodes)?
    • Easier to make submit nodes publicly accessible (fewer of them)

Question 3
› Why use both CCB and shared port?
    • Can’t use CCB for both schedd and startd
    • No ports used for shadow, so no limit on number of running jobs

Question 4
› If both the schedd and the execute nodes are NATd, what do you do?
    • If same NAT, no problem
    • TCP Forwarding Host for schedd

Congratulations!
HTCondor Administrator Networking