What is a Web Server?
- A web server is a software program that accepts HTTP requests, typically from a web browser, and answers with HTTP responses that carry content, typically a web (HTML) page
- Alternatively, a web server is a computer that runs web server software (often simply called a server)
Web Servers 1
Web servers
- Unlike user applications, web servers run behind the scenes
- Web servers provide services to other applications, which are called clients
- On Linux/Unix machines the server is a daemon (an application that runs behind the scenes, responding to requests); hence the name of the Apache executable, httpd (HTTP daemon)
Web Servers
- The server monitors communication ports on its host machine (the default port is 80)
- Servers are designed to work over a network, but requests can also come from a client running on the same machine: http://localhost or http://127.0.0.1 (called the loopback IP address, since it refers to the machine running the server)
Features common to web servers
- Mapping of the path component of a URL to a file name (for static content) or to a program to be executed (for dynamic content)
- Acceptance of HTTP requests and responding with HTTP responses
- Logging of HTTP requests and responses
- Content compression to reduce response time
Features often provided
- Authentication: request for a username/password before allowing access to a resource
- Static content (files on the server's file system) and dynamic content, supported by interfaces such as PHP, CGI, ASP, JSP, etc.
- Secure Sockets Layer (SSL): establishment of an encrypted link between a client and a server (indicated by https)
Features often provided
- Virtual hosting: serving multiple websites from one IP address, or serving multiple websites, each with its own address, from the same server
- Large file support: ability to respond with large files
- Bandwidth throttling: limiting the speed of responses so as not to overwhelm the network
Web server organization
- Two main directories:
  - Document root (servable documents)
  - Server root (server software)
- Clients access the document root indirectly:
  - Its actual location is set by the server configuration file
  - Requests are mapped to the actual location
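The mapping from a request path to a file under the document root can be sketched as follows. This is only an illustration of the idea, not how Apache implements it; the helper name map_to_file and the directory /usr/local/apache2/htdocs are assumptions made for the example.

```python
import posixpath

def map_to_file(document_root, url_path):
    """Map a URL path to a file path under document_root (hypothetical helper)."""
    # Resolve "." and ".." segments first so a request cannot climb above the root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    # Join the cleaned-up path onto the document root.
    return posixpath.join(document_root, normalized.lstrip("/"))

# The document root below is an assumed location, set in the configuration file.
print(map_to_file("/usr/local/apache2/htdocs", "/path/file.html"))
```

Normalizing before joining is the design point: the client only ever names paths relative to the document root, and the server decides where that root actually lives.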
Web browsers
- Mosaic (1993)
  - First widely used browser with a GUI; caused a huge increase in use of the web
  - Initially for X Windows/Unix
- Browsers are clients: they always initiate the communication and servers react (although sometimes the server requires a response)
- A browser is an HTTP client because it sends requests to an HTTP server (web server)
- Some requests are for program execution, with a document returned as the response
HTTP Protocol
- HTTP messages are either requests (sent by a client, often a browser) or responses (sent by a server)
- HTTP is a stateless protocol: no information is saved between transactions
- In HTTP/1.0 the connection between server and client is closed after the server sends a response; HTTP/1.1 keeps the connection open by default, but each request is still handled independently
HTTP Protocol
- Request message form:
    HTTP-method request-URI HTTP-version
    Header fields
    Blank line
    Message body
HTTP Protocol
- Response message form:
    HTTP-version status-code status-msg
    Header fields
    Blank line
    Message body
HTTP Protocol
- Example of the first line of a request:
    GET /cs.appstate.edu/index.html HTTP/1.1
- Example of the first line of a response:
    HTTP/1.1 200 OK
HTTP Protocol: header lines
- Form: Header-name: value
- HTTP 1.1 defines 46 header types; one is required in requests (Host)
- A message usually contains headers that describe the body; for example, in an HTTP response:
    Content-type: text/html
Sample HTTP exchange
- To retrieve the file at http://www.cs.appstate.edu/path/file.html:
  - A TCP connection is made to the host www.cs.appstate.edu
  - A GET request is sent:
      GET /path/file.html HTTP/1.1
      Host: www.cs.appstate.edu
      <blank line>
  - Response from server:
      HTTP/1.1 200 OK
      Content-type: text/html
      <blank line>
      <html> <head> etc.
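The exchange above can be sketched in a few lines of Python that only build and parse the text of the messages (no network connection is opened). The function names build_get_request and parse_status_line are hypothetical helpers introduced for this illustration.

```python
def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",         # the one header HTTP/1.1 requires
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)

def parse_status_line(line):
    """Split a response status line into (version, code, message)."""
    version, code, message = line.split(" ", 2)
    return version, int(code), message

request = build_get_request("www.cs.appstate.edu", "/path/file.html")
print(request)
print(parse_status_line("HTTP/1.1 200 OK"))
```

Header lines end with a carriage return and line feed, and the empty line after the last header is what tells the receiver that the headers are complete.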
HTTP Methods
- GET
  - Request for a resource
  - Server responds with the resource, usually a web document
  - Can also be used to send input to the resource, with the input embedded in the URI (not recommended; use POST)
- HEAD
  - Just like a GET request, except that only the headers are returned, not the body
HTTP Methods
- POST
  - Sends a block of data in the message body to be processed by the server
  - Extra headers (Content-Type, Content-Length) describe the data
  - URI is the path to the program (script) that processes the data
  - Server response is normally the program output
- PUT
  - Stores a new document on the server
- DELETE
  - Removes a document from the server
HTTP Protocol
- TRACE
  - Echoes back the received request, so that a client can see what intermediate servers are adding or changing in the request
- OPTIONS
  - Returns the HTTP methods that the server supports; this can be used to check the functionality of a web server
- CONNECT
  - Used to support access to HTTPS sites (tunneling through a proxy)
Internet Protocol (IP) Addresses
- Every node has a unique numeric address: 32 bits in IPv4, 128 bits in IPv6
- Example: 152.10.40 (our CS server)
- Every organization has a group of IP addresses for its computers
Domain names
- A domain name is a name that corresponds to an IP address
- DNS (Domain Name System) servers convert a domain name to an IP address
- Note that a URL (uniform resource locator) is not a domain name, but the domain name is part of it: http://www.cs.appstate.edu/~can/classes/5530
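To make the distinction between a URL and a domain name concrete, here is a small sketch that uses Python's standard urllib.parse module to split the URL from the slide into its parts. Nothing is looked up over the network; the example only takes the string apart.

```python
from urllib.parse import urlsplit

# The URL from the slide; the domain name is only one component of it.
parts = urlsplit("http://www.cs.appstate.edu/~can/classes/5530")

print("scheme:     ", parts.scheme)
print("domain name:", parts.hostname)
print("path:       ", parts.path)
```

The hostname component is what a DNS lookup would translate into an IP address; the path is interpreted by the web server itself.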
MIME extensions
- Multipurpose Internet Mail Extensions (originally developed for email)
- Used to tell the browser the form of a file returned by the server (the server sends the type ahead of the document, in the Content-type header)
- Form: type/subtype (text/plain, text/html, image/jpeg)
- Server gets the type from the requested file's extension (.html implies text/html)
- Browser gets the type explicitly from the server
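The extension-to-type lookup that a server performs can be illustrated with Python's standard mimetypes module. This is a sketch of the idea only; Apache uses its own table (see the TypesConfig directive later), and the file names below are made up for the example.

```python
import mimetypes

# Hypothetical file names; only the extension matters for the lookup.
for name in ("index.html", "notes.txt", "photo.jpg"):
    content_type, _encoding = mimetypes.guess_type(name)
    print(name, "->", content_type)
```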
Client Side Scripting
- Script that runs on the client's machine (run by the browser)
- Reduces the number of requests that must be passed to the server
- Can be used to validate user input and to enhance a web page via Dynamic HTML (programs manipulate elements via the DOM, the Document Object Model)
- Source code is viewable via the browser
- Popular languages: JavaScript, Microsoft JScript
Server Side Scripting
- Script executed by the web server (Apache, for example)
- When a browser requests an HTML file, the server executes the script embedded within the HTML before returning the file
- Allows the HTML to be customized according to the browser; allows database access on the server
- The browser cannot be used to view a server-side script (because it has already been executed)
- Server-side scripting technologies: ASP, ASP.NET, ColdFusion, ESP, JSP, Lasso, PHP, server-side JavaScript, server-side includes, ...
Common Gateway Interface
- Standard interface through which users interact with applications on web servers
- CGI script
  - Executed by the operating system running on the server machine (not by the web server itself)
  - Can be written in any language; Perl and PHP are popular
[Figure: CGI request flow among the browser (client), the web server, and the CGI program]
1) Browser sends request
2) Web server receives request and invokes CGI program
3) CGI program runs outside of web server
4) Output of CGI program sent to web server
5) Web server responds to client with CGI program output
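Steps 3 and 4 can be sketched with a minimal CGI-style program: it writes a header line, a blank line, and then the document to standard output, and the web server relays that output to the client. The function name cgi_response and the page content are assumptions for illustration.

```python
import sys

def cgi_response(body_html):
    """Build the text a CGI program writes to standard output."""
    header = "Content-Type: text/html\r\n"  # tells the server what the body is
    return header + "\r\n" + body_html       # blank line separates header and body

if __name__ == "__main__":
    # When run by the server, standard output is captured and sent to the browser.
    sys.stdout.write(cgi_response("<html><body>Hello from CGI</body></html>\n"))
```

Because the program communicates only through standard output (and environment variables for input), it can be written in any language, which matches the point made on the previous slide.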
Apache
- Currently the most popular web server on the internet
- Open source, allowing users to add features to their server
- Arguably the most secure web server available, with the highest performance, particularly when serving dynamic pages
More on Apache
- Large set of modules available
- Some come as part of the Apache httpd distribution (http://httpd.apache.org/docs/2.2/mod/) and must be compiled into the server
- If the server is configured to support dynamic linking, other modules can be linked in dynamically, so the Apache server does not need to be recompiled
- Modules that are not part of the distribution: http://modules.apache.org/
How Apache got its name?
- It has its roots in the httpd server developed by Rob McCool at the National Center for Supercomputing Applications (NCSA)
- McCool stopped working on the server when he left NCSA to work at Netscape
- Eight people worked together to develop an open source web server by writing patches to McCool's code; thus the name was born ("a patchy server")
How Apache got its name?
- However, the Apache httpd FAQ indicates that the name is actually a tribute to the Native American Apache tribe (but the other story is better)
Apache installation
- Default installation is easy
  - Visit http://httpd.apache.org/ and click on Download
  - Choose the source for Unix
  - Follow the installation directions in the reference manual (depending upon which version you choose), here: http://httpd.apache.org/docs/2.2/install.html
- Configuring Apache is more challenging
Starting and stopping Apache server
- Starting: PREFIX/bin/apachectl start
  - PREFIX is the installation directory that contains bin, for example /usr/local/apache2
  - This starts one or more httpd daemons
- Stopping: PREFIX/bin/apachectl stop
- Each time httpd is started, the configuration files (httpd.conf, for example) are read; configuration changes therefore take effect on the next start, without reinstalling the server
Configuring Apache at compile-time
- The --prefix option can be used to change the default installation directory
- Other options specify the location of various parts of the server (for example, libraries, documentation, header files, executables, configuration files, etc.)
- --enable-layout=LAYOUT specifies that the layout of the server follow the LAYOUT defined in the file config.layout (for example, the GNU layout installs in /usr/local rather than /usr/local/apache)
Configuring Apache at compile-time
- --disable-FEATURE prevents the inclusion of a feature, where FEATURE is provided by a module
- --enable-FEATURE[=ARG] enables the inclusion of a feature; ARG defaults to yes, and =no has the same effect as --disable-FEATURE
Configuring Apache at compile-time
- --enable-MODULE=shared: the corresponding module will be built as a DSO (dynamic shared object) module
- --enable-MODULE=static: enabled modules are linked statically by default; this option forces static linking explicitly
- What is the difference between a MODULE and a FEATURE? A module supplies a feature. Example: mod_auth supplies the feature auth (user-based authentication)
Configuring Apache at compile-time
- Options are available for cross-compilation (compiling the server so that it will run on a different machine)
- Options are available for suexec (the feature of Apache that allows CGI scripts to be executed as a user other than the one running the web server)
Configuring Apache at compile-time
- Example:
    ./configure --prefix=/usr/local/apache --enable-rewrite=shared
- Note: configure can also be given options that specify the location of needed files and executables
- For more info, see the configure page: http://httpd.apache.org/docs/2.2/programs/configure.html
Executables included with the Apache server
- httpd: this is the server, of course
- apachectl: server control interface; used to start the server
- ab: server benchmarking tool
- apxs: used for installing modules to be dynamically linked in
- configure: tool for compilation configuration
- suexec: executes a script as a different user
Executables included with Apache server
- dbmmanage: creates and updates user authentication files in DBM format for basic authentication
- htcacheclean: cleans up the disk cache
- htdigest: creates and updates user authentication files for digest authentication
- htdbm: manipulates DBM password databases
Executables included with Apache server
- htpasswd: creates and updates user authentication files for basic authentication
- httxt2dbm: creates dbm files for use with RewriteMap
- logresolve: resolves hostnames for IP addresses in Apache log files
- rotatelogs: rotates Apache logs without having to kill the server
Configuring Apache after installation
- Modifying the configuration file
  - a restart of the Apache server causes the configuration files to be read
- Dynamically linking in modules
  - modules can be downloaded and dynamically linked into the server to support additional features
Apache configuration files
- httpd.conf
  - main configuration file
  - other configuration files can be included by httpd.conf using the Include directive
- .htaccess
  - can be stored within a directory
  - any directives in the file apply to files in that directory and its subdirectories
Configuration file syntax
- Comment lines begin with # (there are no end-of-line comments)
- Whitespace before directives is ignored but can be included for clarity
- One directive per line; a line can be continued by ending it with a backslash (\)
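These syntax rules are simple enough to sketch as a small reader. The function read_directives below is a hypothetical helper written for illustration (it is not part of Apache), and the sample settings are assumed values.

```python
def read_directives(text):
    """Return a list of directive strings from Apache-style config text."""
    directives = []
    pending = ""  # text carried over from a line that ended with a backslash
    for raw in text.splitlines():
        line = raw.strip()  # leading whitespace is not significant
        if not pending and (not line or line.startswith("#")):
            continue  # skip blank lines and whole-line comments
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "  # join with the next line
            continue
        directives.append(pending + line)
        pending = ""
    if pending:
        directives.append(pending.rstrip())
    return directives

sample = """
# main server settings (assumed values)
ServerRoot /usr/local/apache2
    Listen 80
DirectoryIndex index.html \\
    index.htm
"""
print(read_directives(sample))
```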
Configuration directives
- Directives in the main configuration file (httpd.conf) apply to the entire server
- The scope of directives can be limited by placing them within a container directive:
  - <Limit>, <LimitExcept>
  - <Directory>, <DirectoryMatch>
  - <Files>, <FilesMatch>
  - <Location>, <LocationMatch>
  - <VirtualHost>
  - <IfDefine>, <IfModule>, <IfVersion>
  - <Proxy>, <ProxyMatch>
Container directives
- <Limit> ... </Limit>
  - restricts the scope of the enclosed directives to the HTTP methods specified
  - Example:
      <Limit POST PUT DELETE>
          Order deny,allow
          Deny from all
          Allow from localhost
      </Limit>
- <LimitExcept> ... </LimitExcept>
  - directives apply to the HTTP methods not specified
Container directives
- <Directory directory-path> ... </Directory>
  - directives apply only to the specified directory and its subdirectories
  - directory-path can be a wild-card string
  - if multiple <Directory> sections match a path, the directives are applied in order of shortest match first
- <DirectoryMatch directory-path> ... </DirectoryMatch>
  - like <Directory>, but directory-path can be a regular expression
Container directives
- <Files filename> ... </Files>
  - limits the scope of the enclosed directives by file name
  - filename should be the name of a file or a wild-card string (*.jpg)
  - can be enclosed inside a <Directory> section
  - unlike <Directory> and <Location>, can be used in .htaccess files
- <FilesMatch filename> ... </FilesMatch>
  - like <Files>, but filename can be a regular expression
Container directives
- <Location url> ... </Location>
  - enclosed directives apply to url
  - url can be of the form scheme://servername/path (for proxy requests, with an Apache module acting as a proxy for the server) or of the form /path (for non-proxy requests)
- <LocationMatch url> ... </LocationMatch>
  - like <Location>, but url can be expressed using a regular expression
Container directives
- <IfDefine parameter> ... </IfDefine> and <IfDefine !parameter> ... </IfDefine>
  - process directives only if parameter is / is not defined
  - parameter is defined (or not) when the server is started, with the -D option on the httpd command line
- <IfModule module> ... </IfModule> and <IfModule !module> ... </IfModule>
  - process directives only if module is / is not included in Apache, either compiled in or dynamically loaded
Container directives
- <VirtualHost addr[:port] [addr[:port]] ...> ... </VirtualHost>
  - used to define additional hosts and websites (with their own names and IP addresses)
- <IfVersion [[!]operator] version> ... </IfVersion>
  - process directives only for specific version(s)
  - operator can be: =, ==, >=, <=, >, <
Container directives
- <Proxy url> ... </Proxy>
  - directives placed in <Proxy> sections apply only to matching proxied content
  - shell-style wildcards are allowed
- <ProxyMatch url> ... </ProxyMatch>
  - like <Proxy>, but url can be a regular expression
Nesting of containers
- No container directive can nest within another container of the same type (for example, no <Directory> within a <Directory>)
- Containers of different types can sometimes nest
  - <Limit> can go in any other container, but not vice versa
  - <Files> and <FilesMatch> are allowed inside a <Directory>
  - <VirtualHost> can contain all of the other container types and behaves like the server-level configuration
Ordering of Containers
- Containers with the same scope in the server configuration are merged in the order in which Apache encounters them, with later definitions overriding earlier ones
- If directives apply to multiple levels of directories, the directives are merged in order of increasing directory depth (no matter their order in the configuration file)
  - Example: / is applied before /home/www/public
Ordering of containers
- Directives in <VirtualHost> take effect after the main server configuration, so they can override anything defined at the server level if they need to
- This is true even if a <Directory> container at the server level and a <Directory> container within the <VirtualHost> container point to the same directory
Non-container directives
- There are a bunch of them!
- Full descriptions can be found at the Apache server web site: http://httpd.apache.org/docs/2.2/mod/directives.html
- We'll go through a few
Server-level directives
- Must be defined outside of a container directive and cannot be defined in a .htaccess file
- LoadModule module filename
  - causes the module located at filename to be linked into the server
- CoreDumpDirectory directory
  - server switches to the indicated directory before dumping core
Server-level directives
- Listen [IP-address:]portnumber [protocol]
  - tells the server which IP addresses and ports to listen to
  - if no IP-address is given, the server listens to that portnumber on all interfaces
  - default protocol is http (the default for port 443 is https)
- ServerRoot directory-path
  - sets the directory in which the server installation resides
Server-level directives
- User userid
  - sets the user ID under which the server will answer requests
  - typically, httpd is started as root, and other httpd processes, running as userid, are created to respond to requests
  - userid should not have log-in abilities nor be able to access files outside of the server documents
- Group unix-group
  - sets the group under which the server will answer requests
Server-level directives
- TypesConfig file-path
  - sets the location of the MIME types configuration file
  - file-path is relative to the ServerRoot
  - the configuration file contains a list of types and associated filename extensions (for example, an entry for application/pdf)
Server-level directives
- SSLRandomSeed context source
  - specifies the source for seeding the pseudo-random number generator of OpenSSL in the indicated context
  - OpenSSL: open source implementation of the SSL (Secure Sockets Layer) protocol (communication between server and client is encrypted)
  - source can be builtin (built-in pseudo-random number generator), a device (specified by a file name), or a program (specified by a path)
  - context can be either startup (at startup time) or connect (when an SSL connection is established)
Directives with server and virtual host scope
- Some directives can be set at the server level and then repeated in VirtualHost containers to override the global setting
- ServerName fully-qualified-domain-name[:port]
  - sets the hostname and port that the server uses to identify itself
  - inside a VirtualHost container, specifies what hostname must appear in the request's Host: header to match this virtual host
Directives with server and virtual host scope
- ServerAdmin email-address
  - sets the email address that the server includes in error messages returned to the client
- DocumentRoot directory-path
  - sets the document root from which httpd will serve files
Directives with server and virtual host scope
- AccessFileName filename [filename] ...
  - while processing a request, the server looks for the first existing configuration file from this list of names in every directory of the path to the document, if distributed configuration files are enabled for that directory
  - default filename is .htaccess
Directives with server and virtual host scope
- LogFormat format|nickname [nickname]
  - describes a format for use in a log file
  - if nickname is not included, the format applies to logs specified by subsequent TransferLog directives
  - if nickname is included, the format is explicitly associated with that nickname
  - Example (quotes inside the format string are escaped with a backslash):
      LogFormat "%v %h %l %u %t \"%r\" %>s %b" logname
Directives with server and virtual host scope
- KeepAlive On|Off
  - if On, allows multiple requests to be sent over the same TCP connection
- TimeOut seconds
  - amount of time the server will wait for a GET request before failure
  - amount of time between receipt of TCP packets on a PUT or POST before failure
  - amount of time between transmission of a packet and receipt of the ACK before failure
Directives with server and virtual host scope
- ErrorLog file-path|syslog[:facility]
  - sets the name of the file to which errors are logged, or indicates that error logging is to be handled by the Unix syslogd (where facility indicates where the message is coming from)
  - Example: ErrorLog /var/log/httpd/error_log
- LogLevel level
  - specifies the verbosity of the messages logged
  - default is warn
Directives with server and virtual host scope
- CustomLog file|pipe format|nickname [env=[!]environment-variable]
  - used to log requests to the server
  - provides the flexibility to log only certain requests (using env) and to use specific log formats
  - can also pipe log entries to the indicated program
  - Examples:
      LogFormat "%h %l %u %t \"%r\" %>s %b" common
      CustomLog logs/access_log common

      SetEnvIf Request_URI \.gif$ gif-image
      CustomLog gif-requests.log common env=gif-image
      CustomLog nongif-requests.log common env=!gif-image
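A log line written with the common format above has seven fields (host, identity, user, time, request line, status, bytes), which can be pulled apart with a regular expression. The sketch below is illustrative: the sample line is made up rather than taken from a real log, and parse_common is a hypothetical helper.

```python
import re

# One named group per field of: %h %l %u %t "%r" %>s %b
COMMON = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_common(line):
    """Return a dict of fields for one common-format line, or None if it does not match."""
    match = COMMON.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields

# Made-up sample entry for illustration.
sample = '127.0.0.1 - bob [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326'
entry = parse_common(sample)
print(entry["host"], entry["request"], entry["status"])
```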
Directives with server and virtual host scope
- TransferLog file|pipe
  - like the CustomLog directive, except that it does not allow the log format to be specified explicitly and does not support conditional logging of requests
  - the log format is determined by the most recently specified LogFormat directive that does not define a nickname
Directives with server and virtual host scope
- Alias URL-path file-path|directory-path
  - allows documents to be stored elsewhere in the local file system instead of under the document root
  - Example: Alias /image /ftp/pub/image
  - A request for http://myserver/image/foo.gif would cause the server to return the file /ftp/pub/image/foo.gif
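The effect of an Alias-style rule can be sketched as a prefix substitution. This is a simplified illustration, not Apache's matching logic; apply_alias is a hypothetical helper and the document root /usr/local/apache2/htdocs is an assumed value.

```python
def apply_alias(url_path, alias, target, document_root):
    """Map url_path to a file path, honoring one Alias-style rule."""
    prefix = alias.rstrip("/") + "/"
    if url_path == alias or url_path.startswith(prefix):
        # Replace the alias prefix with the target location.
        return target + url_path[len(alias):]
    # No alias applies: fall back to the document root.
    return document_root + url_path

docroot = "/usr/local/apache2/htdocs"  # assumed document root
print(apply_alias("/image/foo.gif", "/image", "/ftp/pub/image", docroot))
print(apply_alias("/index.html", "/image", "/ftp/pub/image", docroot))
```

Matching on the prefix followed by a slash keeps a path such as /imagery/x.gif from being caught by the /image alias.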
Directives with server and virtual host scope
- ScriptAlias URL-path file-path|directory-path
  - like Alias, except that it indicates that the target directory contains CGI scripts (or that the target file is a CGI script)
  - Example: ScriptAlias /cgi-bin/ /web/cgi-bin/
  - A request for http://myserver/cgi-bin/foo would cause the server to run the script /web/cgi-bin/foo
Directives with server and virtual host scope
- UserDir directory
  - specifies a directory path that is used to map URLs beginning with a tilde
  - can also be used to enable or disable specific username-to-directory translations
  - Suppose the request is for http://domain/~bob/one/two.html:
      UserDir public_html    path is: ~bob/public_html/one/two.html
      UserDir /usr/web       path is: /usr/web/bob/one/two.html
      UserDir /home/*/www    path is: /home/bob/www/one/two.html
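The three UserDir forms above can be sketched as one small function. This is an illustration of the translation rules only; userdir_path is a hypothetical helper, and it assumes that a user's home directory (written ~bob on the slide) is /home/bob.

```python
def userdir_path(userdir, user, rest, home_root="/home"):
    """Translate a request for /~user/rest using a UserDir-style setting."""
    if "*" in userdir:
        base = userdir.replace("*", user)        # pattern: * stands for the user name
    elif userdir.startswith("/"):
        base = f"{userdir}/{user}"               # absolute path: append the user name
    else:
        base = f"{home_root}/{user}/{userdir}"   # relative: inside the user's home directory
    return f"{base}/{rest}"

for setting in ("public_html", "/usr/web", "/home/*/www"):
    print(setting, "->", userdir_path(setting, "bob", "one/two.html"))
```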
Directives with virtual host scope
- Some directives are only meaningful within a VirtualHost container because they only apply to name-based virtual hosting
- ServerAlias hostname [hostname] ...
  - sets alternate name(s) for a host, for use in name-based virtual hosting
- ServerPath URL-path
  - sets the URL path for a host, for use in name-based virtual hosting
Directives with global and local scopes
- Some directives can be set at the server level for global effect and repeated within a virtual host, .htaccess file, and/or directory container
- Include file-path|directory-path
  - allows inclusion of other configuration files
  - if a directory is given, includes all files in that directory and its subdirectories
Directives with global and local scopes
- DirectoryIndex local-url [local-url] ...
  - specifies the resources to be returned if a directory is requested
  - Example: DirectoryIndex index.html
- DefaultType MIME-type|none
  - content type returned to the client if the server cannot determine the type of the requested resource (MIME-type mapping not available)
Directives with global and local scopes
- AddType MIME-type extension [extension] ...
  - used to add or override a MIME-type mapping
  - Example: AddType image/gif .gif
- AddHandler handler-name extension [extension] ...
  - files having the named extension will be served by the specified handler-name (overrides any mappings that already exist for the same extension)
  - a "handler" is an internal Apache representation of the action to be performed when a file is called (for example, handle as a cgi-script)
Directives with global and local scopes
- SetHandler handler-name|None
  - causes all matching files (matching as determined by the container) to be processed by the indicated handler, regardless of the extension
  - for example, could specify that all matching files be handled as CGI scripts
- BrowserMatch regex [!]env-var[=val] [[!]env-var[=val]] ...
  - sets environment variables conditional on the User-Agent HTTP request header
Directives with global and local scopes
- Redirect [status] URL-path URL
  - maps an old URL into a new one by asking the client to refetch the resource at the new location
  - status is a code returned to the client; the default is code 302 (resource temporarily moved)
  - Example: Redirect 303 /here http://example.com/there
- RedirectMatch [status] regex URL
  - like Redirect, except that a regex can be used for matching against the URL-path
Directives with global and local scopes
- XBitHack on|off|full
  - the X bit refers to the execute permission attached to HTML files (only applies to text/html files)
  - on: text/html files are parsed by the server (for example, to handle server-side includes) if the user execute bit is set
  - full: if the user and group execute bits are set, the server parses the file and indicates to the client the date the file was last modified (supports caching)
  - off: no parsing of the file (the default)
Directives with global and local scopes
- AddOutputFilter filter[;filter...] extension [extension] ...
  - maps the filename extension to the filters which will process responses from the server before they are sent to the client
  - Example: AddOutputFilter INCLUDES;DEFLATE shtml
  - processes all .shtml files for server-side includes and then compresses the output using mod_deflate
Directives with global and local scopes
- AddOutputFilterByType filter[;filter...] MIME-type [MIME-type] ...
  - assigns an output filter to a particular MIME type
  - Example: AddOutputFilterByType DEFLATE text/html
Okay, what's the difference between a module, a handler and a filter?
- module: code that gets linked with or loaded by Apache and uses the Apache API; handlers and filters are provided by modules
- handler: generates the response sent back to the client; a request is handled by only one handler
- filter: filters can inspect the response and optionally change it
Directives with global and local scopes
- RewriteRule Pattern Substitution [flags]
  - the first RewriteRule is applied to the URL-path of the request; subsequent patterns are applied to the output of the last matched RewriteRule (unless flags indicate otherwise)
  - flags: comma-separated list enclosed in brackets (many flags available)
  - Example: RewriteRule ^/localpath(.*) http://host/otherpath$1
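The pattern-and-substitution idea behind RewriteRule can be sketched with Python's re module. This is a simplified illustration of a single rule with no flags, not the behavior of mod_rewrite itself; the function rewrite is a hypothetical helper.

```python
import re

def rewrite(url_path, pattern, substitution):
    """Apply one RewriteRule-like rule; return the new URL, or the input if no match."""
    match = re.search(pattern, url_path)
    if match is None:
        return url_path
    # Apache writes back-references as $1, $2, ...; fill them in from the match.
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))) or "", substitution)

# The rule from the slide: ^/localpath(.*)  ->  http://host/otherpath$1
print(rewrite("/localpath/a/b.html", r"^/localpath(.*)", "http://host/otherpath$1"))
```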
Directives with global and local scopes
- RewriteEngine on|off
  - enables/disables the rewriting engine
  - not inherited by virtual hosts; there must be a RewriteEngine on directive for each virtual host in which you wish to use rewrite rules
  - RewriteEngine on: enables use of RewriteRule directives
  - RewriteEngine off: RewriteRule directives will be ignored
Directives with global and local scopes
- Options [+|-]option [[+|-]option] ...
  - controls which server features are available in a particular directory
  - All: all options except for MultiViews (default)
  - ExecCGI: execution of CGI scripts permitted
  - FollowSymLinks: server will follow symbolic links in this directory
  - Includes: server-side includes permitted
  - IncludesNOEXEC: server-side includes are permitted, but #exec cmd and #exec cgi (for executing scripts) are disabled
Directives with global and local scopes
- Options, continued
  - Indexes: if a URL which maps to a directory is requested and there is no DirectoryIndex (e.g., index.html) in that directory, a formatted listing of the directory is returned
  - MultiViews: content-negotiated "MultiViews" (for example, different languages) are allowed
  - SymLinksIfOwnerMatch: server will only follow symbolic links for which the target file or directory is owned by the same user ID as the link
Directives with local scope
- Many directives require a container (like a Directory) or a .htaccess file to give them meaning
- AllowOverride All|None|directive-type [directive-type] ...
  - tells the server which directives in a .htaccess file can override earlier configuration directives
  - the AllowOverride directive can only be placed in a Directory container
Directives with local scope
- Allow from all|host|env=[!]env-var [host|env=[!]env-var] ...
  - affects which hosts can access an area of the server
  - Examples:
      Allow from all
      Allow from apache.org crazy.net

      SetEnvIf Request_URI "\.gif$" is_image
      Allow from env=is_image
Directives with local scope
- Deny from all|host|env=[!]env-variable [host|env=[!]env-variable] ...
  - controls which hosts are denied access to the server
  - Examples:
      Deny from all
      Deny from apache.org crazy.net

      SetEnvIf Request_URI "\.gif$" is_image
      Deny from env=!is_image
Directives with local scope
- Order ordering
  - along with the Allow and Deny directives, controls a three-pass access control system
- Order Allow,Deny
  - First, all Allow directives are evaluated; at least one must match, or the request is rejected
  - Next, all Deny directives are evaluated; if any matches, the request is rejected
  - Last, any requests which do not match an Allow or a Deny directive are denied
- Order Deny,Allow
  - First, all Deny directives are evaluated; if any match, the request is denied unless it also matches an Allow directive
  - Any requests which do not match any Allow or Deny directives are permitted
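The two orderings differ only in what happens when a request matches both kinds of directive or neither. The sketch below reduces the rules above to two booleans (did any Allow match, did any Deny match); access_allowed is a hypothetical helper written for illustration.

```python
def access_allowed(order, allow_match, deny_match):
    """Decide access from the Order setting and whether Allow/Deny rules matched."""
    if order == "Allow,Deny":
        # Must match an Allow rule and no Deny rule; the default is deny.
        return allow_match and not deny_match
    if order == "Deny,Allow":
        # Denied only if a Deny rule matches and no Allow rule does; the default is allow.
        return allow_match or not deny_match
    raise ValueError(f"unknown order: {order}")

for order in ("Allow,Deny", "Deny,Allow"):
    for allow in (True, False):
        for deny in (True, False):
            print(order, "allow matched:", allow, "deny matched:", deny,
                  "->", access_allowed(order, allow, deny))
```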
Modules
- A module is an independent part of the Apache program
- static modules: compiled into the Apache httpd binary
- dynamic modules: stored separately and can optionally be loaded at run-time; also known as DSOs (Dynamic Shared Objects)
- base modules: modules that are included by default
- third-party modules: modules that are not distributed as part of the Apache HTTP Server tarball
Note: modules are what make directives available
- mod_actions: supports execution of CGI scripts of specific types
  - supports the directives Action and Script
- mod_alias: supports mapping of URLs to some location in the file system
  - supports the directives Alias, AliasMatch, RedirectPermanent, RedirectTemp, ScriptAliasMatch
Virtual Hosting
- Virtual hosting is the practice of running more than one web site (such as www.company1.com and www.company2.com) on a single machine
- IP-based: a different IP address for every web site
- name-based: multiple names running on each IP address
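Name-based virtual hosting comes down to choosing a site from the Host header of the request. The sketch below shows the idea with a plain dictionary; the directory paths and the fallback to a default site are assumptions made for this illustration, and select_document_root is a hypothetical helper.

```python
def select_document_root(host_header, vhosts, default_root):
    """Pick a document root from the request's Host header."""
    # Strip an optional ":port" suffix and compare case-insensitively.
    name = host_header.split(":", 1)[0].lower()
    return vhosts.get(name, default_root)

# Hypothetical document roots for the two sites named on the slide.
vhosts = {
    "www.company1.com": "/www/company1",
    "www.company2.com": "/www/company2",
}
print(select_document_root("www.company2.com:80", vhosts, "/www/default"))
```

With IP-based hosting the same choice would be made from the address the connection arrived on rather than from the Host header.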
Proxy Server
- Intermediate server that stands between a client and a remote server and receives requests from the client; the proxy server either fulfills, forwards or rejects each request
- Benefits
  - can improve performance by caching documents so that contacting the remote server is unnecessary
  - can add a layer of security between the client and the actual server
Types of proxy servers
- As borrowed from the site http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/rzaieproxytypes.htm, there are three common proxy server types:
  - forward proxy
  - reverse proxy
  - chaining
- I'll describe these separately, but note that a single proxy can actually be set up to do forward and reverse proxying
Forward proxy
  requests from a client on an intranet (private network) are sent first to the forward proxy before being sent to the internet
  the forward proxy can cache responses and respond to the client from its cache
  provides some network security and can also reduce network traffic
Forward proxy (diagram)
  Server A (the forward proxy) examines the request first
  all network traffic goes through the firewall
  Server B generates the response
  Server A forwards the response and caches it for next time
  the client receives the response
Reverse proxy
  used to pass requests from the internet to an (often) private, isolated network
  used to prevent internet clients from having direct, unmonitored access to sensitive data residing on content servers on an isolated network, or intranet
  can also lessen network traffic by serving cached information rather than passing all requests to the content servers
Reverse proxy (diagram)
  the client sends a request
  all network traffic goes through the firewall
  A checks: is the request valid? is the response cached?
  if the request is valid and the response is not cached, B generates the response
  the response is cached by A
  the client receives the response
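The decision flow in the reverse proxy diagram (validate, check the cache, otherwise ask the content server and remember the answer) can be sketched as follows. This is a toy model under stated assumptions, not how Apache is implemented: the class name, the is_valid and fetch_from_content_server callables, and the 403 status for invalid requests are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Tuple


class ReverseProxySketch:
    """Plays the role of server A; fetch_from_content_server plays server B."""

    def __init__(self, is_valid: Callable[[str], bool],
                 fetch_from_content_server: Callable[[str], str]) -> None:
        self.is_valid = is_valid
        self.fetch = fetch_from_content_server
        self.cache: Dict[str, str] = {}

    def handle(self, path: str) -> Tuple[int, str]:
        # Step 1: reject requests that fail validation (assumed to yield 403).
        if not self.is_valid(path):
            return 403, "Forbidden"
        # Step 2: only contact the content server on a cache miss.
        if path not in self.cache:
            self.cache[path] = self.fetch(path)
        # Step 3: answer from the (now populated) cache.
        return 200, self.cache[path]
```

A real cache would also expire entries; that is left out here to keep the flow visible.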
Proxy chaining
  uses two or more proxy servers to assist in server and protocol performance and security
  each server in the chain can fulfill, redirect, or reject a request
  a request can be changed along the way so that it can be handled by the content server (see example on next slide)
Proxy chaining (diagram)
  the client makes an HTTP request to C
  A doesn't have the response cached
  the request goes through the firewall
  B changes the HTTP request to an FTP request
  C (handles only FTP requests) generates the response
  the response is sent back using the HTTP protocol
  the response is cached and sent to the client
terms applied to proxy servers
  caching proxy - keeps copies of frequently requested resources
  web proxy - specifically for WWW content
  filtering proxy - uses a filter to control what content can be relayed through the proxy (example use: public schools)
  transparent proxy - does not modify the request or response
terms applied to proxy servers
  open proxy - a proxy that is accessible to any internet user (note that in the forward proxy example, A was set up to handle only intranet traffic; if A could accept any internet traffic, it would be considered an open proxy)
  hostile proxy - installed by an attacker to eavesdrop on network traffic
proxying with Apache
  Apache can be set up so that it acts solely as a proxy, or so that it combines proxying with normal content delivery
  it can also disguise the fact that it is a proxy and make it appear to the client that the proxy is the actual origin of the document
Configuring Apache as a Proxy
  the only required directive for forward proxying is:
    ProxyRequests On
  this enables Apache as a forward proxy (reverse proxying is configured with ProxyPass and does not need ProxyRequests On); intranet clients also need to be configured to use the forward proxy (options under Firefox Tools support this)
  requests for URLs that Apache is locally responsible for are served as normal; requests for other URLs cause Apache to retrieve the URL itself as a client and pass the response to the actual client
Simple example
  in order to test out the proxying capability, we can have Apache proxy requests to itself
    Listen 8080
    ServerName www.mymachine.com
    <VirtualHost *:8080>
      ServerName proxy.mymachine.com
      ProxyRequests On
    </VirtualHost>
  configure the client to use www.mymachine.com, port 8080 as a proxy; a request for www.mymachine.com will first be sent to Apache at port 8080 and then resent by Apache to Apache at port 80
Apache proxy modules
  mod_proxy - supports proxying capability
  mod_proxy_http - serves HTTP proxy requests
  mod_proxy_ftp - serves FTP proxy requests
  mod_proxy_connect - serves SSL proxy requests using the HTTP CONNECT method
Enabling proxy modules
  Compile Apache with this option:
    --enable-proxy
  with this option, you get all four proxy modules
  Or, load dynamically using
    LoadModule proxy_module modules/mod_proxy.so
A little info about IP addresses (IPv4)
  four bytes in length; often expressed in dotted decimal (dotted quad) notation, where each decimal number is between 0 and 255
  consist of two parts
    network address - common to all hosts within a single organization
    local host address
  example: appstate machines have the network address 152.10
A little info about IP addresses (IPv4)
  netmask - when ANDed with an IP address, gives the network address
  example: the netmask for all appstate machines is 255.255.0.0
  an IP address and a netmask can be used together to specify a subset of machines on a network
    192.168.100.0/255.255.255.0 - machines with the network address 192.168.100
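The AND operation can be tried directly with Python's standard ipaddress module. This is a small sketch, assuming that a two-byte network address such as 152.10 corresponds to the netmask 255.255.0.0 and a three-byte one such as 192.168.100 to 255.255.255.0; the function names and the sample host addresses used with them are hypothetical.

```python
import ipaddress


def network_address(ip: str, netmask: str) -> str:
    """AND the 32-bit address with the 32-bit netmask, bit by bit."""
    ip_bits = int(ipaddress.IPv4Address(ip))
    mask_bits = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(ip_bits & mask_bits))


def in_subnet(ip: str, subnet: str) -> bool:
    """True if ip belongs to a subnet written as address/netmask."""
    return ipaddress.IPv4Address(ip) in ipaddress.IPv4Network(subnet)
```

For instance, network_address("152.10.7.9", "255.255.0.0") zeroes the two host bytes and keeps 152.10, and in_subnet("192.168.100.42", "192.168.100.0/255.255.255.0") tests membership in the subnet used on the forward proxy slide.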
Apache forward proxy
    <IfModule mod_proxy.c>
      # enable Apache to function as a forward proxy server
      ProxyRequests On
      # match all proxied content with a *
      <Proxy *>
        # control access to the proxy server
        Order Deny,Allow
        Deny from all
        # allow only machines with network prefix 192.168.100
        Allow from 192.168.100.0/255.255.255.0
      </Proxy>
      # disallow access to www.playboy.com and any domain
      # names with xxx
      ProxyBlock www.playboy.com xxx
    </IfModule>
proxy directives used for forwarding
  ProxyRequests - when On, Apache is allowed to act as a forward proxy server
  Proxy - directives placed in <Proxy> sections apply only to matching proxied content
  ProxyBlock - HTTP, HTTPS, and FTP document requests to sites whose names contain matched words, hosts, or domains are blocked by the proxy server
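The "names contain matched words" rule for ProxyBlock amounts to a substring test on the requested host name. The sketch below models only that part and is not Apache's code; the function name and the example host names are hypothetical, and any additional matching Apache performs (for example on resolved addresses) is deliberately left out.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_blocked(url: str, blocked_words: Iterable[str]) -> bool:
    """True if the host part of url contains any blocked word, host, or domain."""
    host = (urlsplit(url).hostname or "").lower()
    return any(word.lower() in host for word in blocked_words)
```

Because the test is a substring match, a short word such as xxx blocks every host whose name contains it, which is broader than listing individual hosts.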
Important note about forwarding
  ProxyRequests On - sets up your server as a forward proxy
  Before you do that, you need to use the Proxy directive to limit access to your proxy server
  Otherwise, your server can be used by any client to access arbitrary hosts while hiding the client's true identity; this is dangerous both for your network and for the Internet at large.
Apache reverse proxy
    # disallow forward proxy
    ProxyRequests Off
    # specify who can use the proxy server (everyone)
    <Proxy *>
      Order Deny,Allow
      Allow from all
    </Proxy>
    # allows request for http://host/foo/boo to be converted
    # into proxy request: http://foo.example.com/bar/boo
    ProxyPass /foo http://foo.example.com/bar
    # change the Location header in the response to
    # http://host rather than http://foo.example.com
    ProxyPassReverse /foo http://foo.example.com/bar
proxy directives used for reverse proxy
  ProxyPass - maps remote servers into the local server URL-space; the local server is called a reverse proxy or gateway
  ProxyPassReverse - adjusts the URL in HTTP response headers sent from a reverse-proxied server; prevents the remote server from revealing itself to the client (which then could possibly bypass the reverse proxy altogether)
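The two mappings are inverses of each other: ProxyPass rewrites an incoming path into a backend URL, and ProxyPassReverse rewrites a backend URL found in a response header back into the public URL space. The following is a sketch of that string rewriting only, using the /foo and http://foo.example.com/bar values from the reverse proxy example; the function names and the public_base parameter are hypothetical, and matching the prefix on a path-segment boundary is an assumption of the sketch.

```python
from typing import Optional


def proxy_pass(path: str, local_prefix: str, remote_base: str) -> Optional[str]:
    """Map a local request path to the backend URL, or None if it is not proxied."""
    if path == local_prefix or path.startswith(local_prefix + "/"):
        return remote_base + path[len(local_prefix):]
    return None


def proxy_pass_reverse(location: str, local_prefix: str, remote_base: str,
                       public_base: str) -> str:
    """Rewrite a backend URL in a response header so it points at the proxy."""
    if location.startswith(remote_base):
        return public_base + local_prefix + location[len(remote_base):]
    return location  # headers that do not name the backend are left alone
```

A redirect issued by the backend therefore reaches the client with the proxy's host name in it, so the client keeps going through the proxy.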
caching
  forward proxies reduce the bandwidth demands of clients accessing servers elsewhere on the Internet by caching frequently accessed pages
  reverse proxies cache frequently accessed pages so that the remote server is not subject to constant requests for static pages when it has important dynamic queries to process
Apache caching modules
  mod_cache - provides HTTP-aware caching (meaning it pays attention to the HTTP headers Expires and Cache-Control)
  mod_file_cache - a simpler caching technique that may be useful for accessing local static files that do not change very often
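To make "HTTP-aware" concrete, here is a rough sketch of how a cache might derive a freshness lifetime from the two headers named above. It is not mod_cache's algorithm: the function name is hypothetical, only the no-store and max-age directives of Cache-Control are considered, max-age is given precedence over Expires as the HTTP caching rules specify, and the header lookup assumes a plain dictionary with these exact key spellings.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_seconds(headers: Mapping[str, str], now: datetime) -> Optional[float]:
    """Seconds a response may be reused; 0.0 means do not reuse; None means unknown."""
    directives = [d.strip().lower()
                  for d in headers.get("Cache-Control", "").split(",") if d.strip()]
    if "no-store" in directives:
        return 0.0
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return float(int(directive[len("max-age="):]))
            except ValueError:
                return 0.0  # malformed value: treat as not reusable
    expires = headers.get("Expires")
    if expires:
        try:
            remaining = (parsedate_to_datetime(expires) - now).total_seconds()
        except (TypeError, ValueError):
            return 0.0  # unparsable date: treat as already expired
        return max(0.0, remaining)
    return None
```

The now argument is expected to be a timezone-aware datetime, for example datetime.now(timezone.utc), so that it can be compared with the parsed Expires value.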
Memory based caching
    LoadModule cache_module modules/mod_cache.so
    <IfModule mod_cache.c>
      LoadModule mem_cache_module modules/mod_mem_cache.so
      <IfModule mod_mem_cache.c>
        # cache everything
        CacheEnable mem /
        MCacheSize 4096
        MCacheMaxObjectCount 100
        MCacheMinObjectSize 1
        MCacheMaxObjectSize 2048
      </IfModule>
      # When acting as a proxy, don't cache the list of security updates
      CacheDisable http://security.update.server/update-list/
    </IfModule>
Some caching directives
  CacheEnable cache_type url-string
    enables caching of the specified URLs using the specified storage manager
    cache_type:
      mem - memory based storage manager
      disk - disk based storage manager
    examples:
      CacheEnable mem /manual
      CacheEnable disk /
    when acting as a forward proxy server, url-string can also be used to specify remote sites for which caching should be enabled
Some caching directives
  MCacheSize KBytes
    the maximum amount of memory used by the cache, in KBytes
  MCacheMaxObjectCount value
    the maximum number of objects allowed to be placed in the cache
  MCacheMaxObjectSize bytes
    the maximum size (in bytes) of a document allowed in the cache
Some caching directives
  MCacheMinObjectSize bytes
    the minimum size (in bytes) of a document to be allowed in the cache
  CacheDisable url-string
    disables caching of items at or below url-string
    example:
      CacheDisable /security
Disk-based caching
    LoadModule cache_module modules/mod_cache.so
    <IfModule mod_cache.c>
      LoadModule disk_cache_module modules/mod_disk_cache.so
      <IfModule mod_disk_cache.c>
        CacheRoot c:/cacheroot
        CacheEnable disk /
        CacheDirLevels 5
        CacheDirLength 3
      </IfModule>
      # When acting as a proxy, don't cache the list of security updates
      CacheDisable http://security.update.server/update-list/
    </IfModule>
More caching directives
  CacheRoot directory
    defines the name of the directory on the disk to contain cache files
  CacheDirLevels levels
    sets the number of subdirectory levels in the cache; cached data will be saved this many directory levels below the CacheRoot directory
  CacheDirLength length
    sets the number of characters for each subdirectory name in the cache hierarchy
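How the levels and length settings shape the directory tree can be illustrated by splitting a hash of the URL into fixed-width pieces. This is a simplified sketch and not mod_disk_cache's actual file naming: the function name is hypothetical, SHA-256 in hexadecimal is used as a stand-in hash, and the cache root is treated as a POSIX-style path.

```python
import hashlib
from pathlib import PurePosixPath


def cache_path(cache_root: str, url: str, dir_levels: int, dir_length: int) -> PurePosixPath:
    """Spread cached entries over dir_levels nested directories of dir_length chars."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if dir_levels * dir_length >= len(digest):
        raise ValueError("dir_levels * dir_length must be shorter than the hash")
    # The first dir_levels chunks name the nested subdirectories.
    subdirs = [digest[i * dir_length:(i + 1) * dir_length] for i in range(dir_levels)]
    # Whatever is left of the hash names the cached entry itself.
    entry_name = digest[dir_levels * dir_length:]
    return PurePosixPath(cache_root, *subdirs, entry_name)
```

With 5 levels of 3 characters each, as in the disk-based caching example, every entry sits five directories below the cache root, which keeps any single directory from accumulating too many files.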
Apache Security Tips
  Protect your Apache code
    the Apache installation should be modifiable only by root
    the /, /usr/local, and /usr/local/apache2 directories and their subdirectories should have root as owner and group and have permissions drwxr-xr-x
    files in /usr/local/apache2 subdirectories should have root as owner and group, and should not be writable by anyone other than root
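The ownership and permission rule can be checked mechanically. The sketch below, which assumes a POSIX system, expresses the rule as a pure function over the values returned by os.stat plus a thin wrapper that reads them from disk; the function names are hypothetical, and the paths to check are whatever your installation uses.

```python
import os
import stat


def is_root_only_writable(st_mode: int, st_uid: int, st_gid: int) -> bool:
    """True if owner and group are root (id 0) and neither group nor others may write."""
    owned_by_root = st_uid == 0 and st_gid == 0
    writable_by_non_owner = bool(st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    return owned_by_root and not writable_by_non_owner


def check_path(path: str) -> bool:
    """Apply the rule to an existing file or directory."""
    info = os.stat(path)
    return is_root_only_writable(info.st_mode, info.st_uid, info.st_gid)
```

The standard library's stat.filemode renders a mode in the same notation used on this slide, so stat.filemode(stat.S_IFDIR | 0o755) produces the string drwxr-xr-x.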
Apache Security Tips
  keep aware of updates to the software (subscribe to the Apache HTTP Server Announcements List)
  Server-Side Includes (SSI)
    enabling server-side includes causes Apache to parse files looking for them even when the files do not contain any (increasing load); this can be limited by requiring files with server-side includes to have a different extension (.shtml, for example)
    SSI can also execute CGI scripts
Apache Security Tips
  CGI Scripts
    can essentially run arbitrary commands on the system (rm -R /)
    CGI scripts (by default) all run as the same user (the user of the server), so one CGI script can damage the output of another
    can use suEXEC to make CGI scripts run as a different user
    can limit the locations where CGI scripts may reside, to give some control over what can go in the scripts
Apache Security Tips
  embedded scripts
    embedded scripting options that run as part of the server itself, such as mod_php, mod_perl, mod_tcl, and mod_python, run under the identity of the server and can access anything the server user can
  control where .htaccess files can go
    <Directory />
      AllowOverride None
    </Directory>
    # applies to all directories; can specifically
    # enable some
Apache Security Tips
  Forbid access to all locations and then selectively enable access to certain locations
    <Directory />
      Order Deny,Allow
      Deny from all
    </Directory>
  Monitor log files
    they will show what kinds of attacks are being made on your system and whether the necessary security is present