URLs, InetAddresses, and URLConnections
High Level Network Programming
Elliotte Rusty Harold
elharo@metalab.unc.edu
http://metalab.unc.edu/javafaq/slides/
© 1999 Elliotte Rusty Harold 3/12/2021

We will learn how Java handles
• Internet Addresses
• URLs
• CGI
• URLConnection
• Content and Protocol handlers

I assume you
• Understand basic Java syntax and I/O
• Have a user’s view of the Internet
• Have no prior network programming experience

Applet Network Security Restrictions
• Applets may:
  – send data to the code base
  – receive data from the code base
• Applets may not:
  – send data to hosts other than the code base
  – receive data from hosts other than the code base

Some Background
• Hosts
• Internet Addresses
• Ports
• Protocols

Hosts
• Devices connected to the Internet are called hosts.
• Most hosts are computers, but hosts also include routers, printers, fax machines, soda machines, bat houses, etc.

Internet addresses
• Every host on the Internet is identified by a unique, four-byte Internet Protocol (IP) address.
• This is written in dotted quad format like 199.1.32.90, where each byte is an unsigned integer between 0 and 255.
• There are about four billion unique IP addresses, but they aren’t very efficiently allocated.
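The dotted quad notation above can be sketched in a few lines of Java. This is only an illustration, and the class name DottedQuad is made up for it. The one subtlety is that Java's byte type is signed, so each byte has to be masked with 0xFF to recover the unsigned value between 0 and 255.

```java
final class DottedQuad {

    // Formats four raw address bytes as a dotted quad string.
    static String format(byte[] address) {
        if (address.length != 4) {
            throw new IllegalArgumentException("expected exactly 4 bytes");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            if (i > 0) sb.append('.');
            sb.append(address[i] & 0xFF); // mask to get the unsigned value 0..255
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 199, 1, 32, 90 };
        System.out.println(raw[0]);      // the signed view of the first byte
        System.out.println(format(raw)); // the dotted quad form
    }
}
```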

Domain Name System (DNS)
• Numeric addresses are mapped to names like "www.blackstar.com" or "star.blackstar.com" by DNS.
• Each site runs domain name server software that translates names to IP addresses and vice versa.
• DNS is a distributed system.

The InetAddress Class
• The java.net.InetAddress class represents an IP address.
• It converts numeric addresses to host names and host names to numeric addresses.
• It is used by other network classes like Socket and ServerSocket to identify hosts.

Creating InetAddresses
• There are no public InetAddress() constructors. Arbitrary addresses may not be created.
• All addresses that are created must be checked with DNS.

The getByName() factory method
    public static InetAddress getByName(String host) throws UnknownHostException

    InetAddress utopia, duke;
    try {
      utopia = InetAddress.getByName("utopia.poly.edu");
      duke = InetAddress.getByName("128.238.2.92");
    }
    catch (UnknownHostException e) {
      System.err.println(e);
    }

Other ways to create InetAddress objects
    public static InetAddress[] getAllByName(String host) throws UnknownHostException
    public static InetAddress getLocalHost() throws UnknownHostException

Getter Methods
• public boolean isMulticastAddress()
• public String getHostName()
• public byte[] getAddress()
• public String getHostAddress()
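A small sketch of three of these getters follows. It assumes the address is given as a numeric literal, in which case getByName() only checks the format and does not need a name server, so the example can run offline. The class name AddressGetters and the helper literal() are hypothetical. getHostName() is left out on purpose because it may trigger a reverse lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class AddressGetters {

    // Wraps getByName() for numeric literals so callers need no try/catch.
    static InetAddress literal(String dottedQuad) {
        try {
            return InetAddress.getByName(dottedQuad);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress duke = literal("128.238.2.92");
        System.out.println(duke.getHostAddress());      // the dotted quad string
        System.out.println(duke.getAddress().length);   // number of raw bytes
        System.out.println(duke.isMulticastAddress());  // an ordinary unicast address
        System.out.println(literal("224.0.0.1").isMulticastAddress()); // 224.x.x.x is in the multicast range
    }
}
```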

Utility Methods
• public int hashCode()
• public boolean equals(Object o)
• public String toString()

Ports
• In general a host has only one Internet address.
• This address is subdivided into 65,536 ports.
• Ports are logical abstractions that allow one host to communicate simultaneously with many other hosts.
• Many services run on well-known ports. For example, http tends to run on port 80.

Protocols
• A protocol defines how two hosts talk to each other.
• The daytime protocol, RFC 867, specifies an ASCII representation for the time that's legible to humans.
• The time protocol, RFC 868, specifies a binary representation for the time that's legible to computers.
• There are thousands of protocols, standard and non-standard.

IETF RFCs
• Requests For Comment
• Document how much of the Internet works
• Various status levels from obsolete to required to informational
• TCP/IP, telnet, SMTP, MIME, HTTP, and more
• http://www.faqs.org/rfc/

W3C Standards
• IETF is based on “rough consensus and running code”
• W3C tries to run ahead of implementation
• IETF is an informal organization open to participation by anyone
• W3C is a vendor consortium open only to companies

W3C Standards
• HTTP
• HTML
• XML
• RDF
• MathML
• SMIL
• P3P

URLs
• A URL, short for "Uniform Resource Locator", is a way to unambiguously identify the location of a resource on the Internet.

Example URLs
    http://java.sun.com/
    file:///Macintosh%20HD/Java/Docs/JDK%201.1.1%20docs/api/java.net.InetAddress.html#_top_
    http://www.macintouch.com:80/newsrecent.shtml
    ftp://ftp.info.apple.com/pub/
    mailto:elharo@metalab.unc.edu
    telnet://utopia.poly.edu
    ftp://mp3:mp[email protected]247.121.61:21000/c%3a/stuff/mp3/
    http://[email protected]oreilly.com/
    http://metalab.unc.edu/nywc/comps.phtml?category=Choral+Works

The Pieces of a URL
• the protocol, aka scheme
• the authority
  – user info
      – user name
      – password
  – host name or address
  – port
• the path, aka file
• the ref, aka section or anchor
• the query string

The java.net.URL class
• A URL object represents a URL.
• The URL class contains methods to
  – create new URLs
  – parse the different parts of a URL
  – get an input stream from a URL so you can read data from a server
  – get content from the server as a Java object

Content and Protocol Handlers
• Content and protocol handlers separate the data being downloaded from the protocol used to download it.
• The protocol handler negotiates with the server and parses any headers. It gives the content handler only the actual data of the requested resource.
• The content handler translates those bytes into a Java object like an InputStream or ImageProducer.

Finding Protocol Handlers
• When the virtual machine creates a URL object, it looks for a protocol handler that understands the protocol part of the URL such as "http" or "mailto".
• If no such handler is found, the constructor throws a MalformedURLException.

Supported Protocols
• The exact protocols that Java supports vary from implementation to implementation, though http and file are supported pretty much everywhere. Sun's JDK 1.1 understands ten:
  – file
  – ftp
  – gopher
  – http
  – mailto
  – appletresource
  – doc
  – netdoc
  – systemresource
  – verbatim

URL Constructors
• There are four (six in 1.2) constructors in the java.net.URL class.
    public URL(String u) throws MalformedURLException
    public URL(String protocol, String host, String file) throws MalformedURLException
    public URL(String protocol, String host, int port, String file) throws MalformedURLException
    public URL(URL context, String url) throws MalformedURLException
    public URL(String protocol, String host, int port, String file, URLStreamHandler handler) throws MalformedURLException
    public URL(URL context, String url, URLStreamHandler handler) throws MalformedURLException

Constructing URL Objects
• An absolute URL like http://www.poly.edu/fall97/grad.html#cs

    try {
      URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
    }
    catch (MalformedURLException e) {}

Constructing URL Objects in Pieces
• You can also construct the URL by passing its pieces to the constructor, like this:

    URL u = null;
    try {
      u = new URL("http", "www.poly.edu", "/schedule/fall97/bgrad.html#cs");
    }
    catch (MalformedURLException e) {}

Including the Port
    URL u = null;
    try {
      u = new URL("http", "www.poly.edu", 8000, "/fall97/grad.html#cs");
    }
    catch (MalformedURLException e) {}

Relative URLs
• Many HTML files contain relative URLs.
• Consider the page http://metalab.unc.edu/javafaq/index.html
• On this page a link to "books.html" refers to http://metalab.unc.edu/javafaq/books.html.

Constructing Relative URLs
• The fourth constructor creates URLs relative to a given URL. For example,

    try {
      URL u1 = new URL("http://metalab.unc.edu/index.html");
      URL u2 = new URL(u1, "books.html");
    }
    catch (MalformedURLException e) {}

• This is particularly useful when parsing HTML.
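As a sketch of how the relative constructor might be used when walking the links of a page, the helper below resolves a link against the page it was found on. The class name RelativeLinks and the method resolve() are hypothetical. The base page is the one from the Relative URLs slide.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class RelativeLinks {

    // Resolves a link found on the page at 'base' into an absolute URL string.
    static String resolve(String base, String link) {
        try {
            URL context = new URL(base);
            return new URL(context, link).toExternalForm();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String page = "http://metalab.unc.edu/javafaq/index.html";
        System.out.println(resolve(page, "books.html"));        // sibling of index.html
        System.out.println(resolve(page, "/nywc/comps.phtml")); // relative to the server root
    }
}
```

No network connection is made here. Constructing a URL only parses the string.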

Parsing URLs
• The java.net.URL class has five methods to split a URL into its component parts. These are:
    public String getProtocol()
    public String getHost()
    public int getPort()
    public String getFile()
    public String getRef()

For example,
    try {
      URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
      System.out.println("The protocol is " + u.getProtocol());
      System.out.println("The host is " + u.getHost());
      System.out.println("The port is " + u.getPort());
      System.out.println("The file is " + u.getFile());
      System.out.println("The anchor is " + u.getRef());
    }
    catch (MalformedURLException e) { }

Parsing URLs
• JDK 1.3 adds three more:
    public String getAuthority()
    public String getUserInfo()
    public String getQuery()
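The sketch below runs all of these getters against one URL that has every piece filled in. The URL itself (user name, host www.example.com, port 8080, path, query, and ref) is invented for the illustration, and so is the class name UrlPieces. Note that getFile() includes the query string. getPath(), which is not listed on the slides, returns the path alone.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UrlPieces {

    // Parses a URL string, turning the checked exception into an unchecked one.
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL u = parse("http://user@www.example.com:8080/docs/index.html?lang=en#top");
        System.out.println("protocol:  " + u.getProtocol());
        System.out.println("authority: " + u.getAuthority());
        System.out.println("user info: " + u.getUserInfo());
        System.out.println("host:      " + u.getHost());
        System.out.println("port:      " + u.getPort());
        System.out.println("file:      " + u.getFile());  // path plus query string
        System.out.println("path:      " + u.getPath());  // path only
        System.out.println("query:     " + u.getQuery());
        System.out.println("ref:       " + u.getRef());
    }
}
```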

Missing Pieces
• If a port is not explicitly specified in the URL it's set to -1. This means the default port is to be used.
• If the ref doesn't exist, it's just null, so watch out for NullPointerExceptions. Better yet, test to see that it's non-null before using it.
• If the file is left off completely, e.g. http://java.sun.com, then it's set to "/".
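One way to guard against the missing port and the missing ref is sketched below. The class and helper names are hypothetical. The sketch also assumes a JDK recent enough to have URL.getDefaultPort(), which was added after these slides were written.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class MissingPieces {

    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Falls back to the protocol's default port when none was written in the URL.
    static int effectivePort(URL u) {
        int port = u.getPort();
        return port != -1 ? port : u.getDefaultPort();
    }

    // Returns the ref, or an empty string instead of null when there is none.
    static String refOrEmpty(URL u) {
        String ref = u.getRef();
        return ref == null ? "" : ref;
    }

    public static void main(String[] args) {
        URL u = parse("http://java.sun.com");
        System.out.println(u.getPort());                 // no explicit port in the URL
        System.out.println(effectivePort(u));            // the default for http
        System.out.println(refOrEmpty(u).length());      // safe even though getRef() is null
    }
}
```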

Reading Data from a URL
• The openStream() method connects to the server specified in the URL and returns an InputStream object fed by the data from that connection.
    public final InputStream openStream() throws IOException
• Any headers that precede the actual data are stripped off before the stream is opened.
• Network connections are less reliable and slower than files. Buffer with a BufferedReader or a BufferedInputStream.

Webcat
    import java.net.*;
    import java.io.*;

    public class Webcat {

      public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
          try {
            URL u = new URL(args[i]);
            InputStream in = u.openStream();
            InputStreamReader isr = new InputStreamReader(in);
            BufferedReader br = new BufferedReader(isr);
            String theLine;
            while ((theLine = br.readLine()) != null) {
              System.out.println(theLine);
            }
          }
          catch (IOException e) {
            System.err.println(e);
          }
        }
      }
    }

The Bug in readLine()
• What readLine() does:
  – Sees a carriage return, waits to see if the next character is a line feed before returning
• What readLine() should do:
  – Sees a carriage return, returns the line, then throws away the next character if it's a linefeed
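The "should do" behavior can be sketched as a small reader of its own. This is an illustration of the idea only, not the JDK's implementation, and the class name LazyLineReader is made up. The reader returns as soon as it sees a carriage return and remembers to discard a line feed if one turns up at the start of the next read.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class LazyLineReader {
    private final Reader in;
    private boolean skipLineFeed = false; // true right after a CR ended a line

    LazyLineReader(Reader in) {
        this.in = in;
    }

    // Returns the next line without its terminator, or null at end of stream.
    String readLine() {
        StringBuilder line = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (skipLineFeed) {
                    skipLineFeed = false;
                    if (c == '\n') continue; // second half of a CR LF pair: discard it
                }
                if (c == '\r') {
                    skipLineFeed = true;     // return now, deal with a possible LF later
                    return line.toString();
                }
                if (c == '\n') return line.toString();
                line.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return line.length() > 0 ? line.toString() : null;
    }

    // Convenience for trying the reader on an in-memory string.
    static List<String> lines(String text) {
        LazyLineReader reader = new LazyLineReader(new StringReader(text));
        List<String> result = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) result.add(line);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lines("first\r\nsecond\rthird\n"));
    }
}
```

Because the reader never waits for the character after a carriage return, it cannot block on a server that sends a lone CR and then pauses.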

Webcat
    import java.net.*;
    import java.io.*;

    public class Webcat {

      public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
          try {
            URL u = new URL(args[i]);
            InputStream in = u.openStream();
            InputStreamReader isr = new InputStreamReader(in);
            int c;  // read() returns an int so that -1 can signal end of stream
            while ((c = isr.read()) != -1) {
              System.out.print((char) c);
            }
          }
          catch (IOException e) {
            System.err.println(e);
          }
        }
      }
    }

CGI
• Common Gateway Interface
• A lot is written about writing server side CGI. I’m going to show you client side CGI.
• We’ll need to explore HTTP a little deeper to do this.

Normal web surfing uses these two steps:
  – The browser requests a page
  – The server sends the page
• Data flows primarily from the server to the client.

Forms
• There are times when the server needs to get data from the client rather than the other way around. The common way to do this is with a form like this one:

CGI
• The user types the requested data into the form and hits the submit button.
• The client browser then sends the data to the server using the Common Gateway Interface, CGI for short.
• CGI uses the HTTP protocol to transmit the data, either as part of the query string or as separate data following the MIME header.

GET and POST
• When the data is sent as a query string included with the file request, this is called CGI GET.
• When the data is sent as data attached to the request following the MIME header, this is called CGI POST.

HTTP
• Web browsers communicate with web servers through a standard protocol known as HTTP, an acronym for HyperText Transfer Protocol.
• This protocol defines
  – how a browser requests a file from a web server
  – how a browser sends additional data along with the request (e.g. the data formats it can accept)
  – how the server sends data back to the client
  – response codes

A Typical HTTP Connection
  – Client opens a socket to port 80 on the server.
  – Client sends a GET request including the name and path of the file it wants and the version of the HTTP protocol it supports.
  – The client sends a MIME header.
  – The client sends a blank line.
  – The server sends a MIME header.
  – The server sends the data in the file.
  – The server closes the connection.

What the client sends to the server
    GET /javafaq/images/cup.gif
    Connection: Keep-Alive
    User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
    Host: www.oreilly.com:80
    Accept: image/gif, image/x-xbitmap, image/jpeg, */*
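To make the client's half of the exchange concrete, here is a minimal sketch that assembles the text of such a request. The class name RequestText and the choice of HTTP/1.0 with only Host and Accept fields are assumptions made for the illustration. Each line ends with a carriage return and line feed, and the empty line at the end marks the end of the MIME header.

```java
final class RequestText {

    // Builds the request line, a small MIME header, and the terminating blank line.
    static String get(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Accept: */*\r\n"
             + "\r\n"; // blank line: end of the header
    }

    public static void main(String[] args) {
        System.out.print(get("www.oreilly.com", "/javafaq/images/cup.gif"));
    }
}
```

A client would write these bytes to the output stream of a socket opened to port 80 and then read the server's header and data from the socket's input stream.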

MIME
• MIME is an acronym for "Multipurpose Internet Mail Extensions".
• an Internet standard defined in RFCs 2045 through 2049
• originally intended for use with email messages, but has been adopted for use in HTTP.

Browser Request MIME Header
• When the browser sends a request to a web server, it also sends a MIME header.
• MIME headers contain name-value pairs, essentially a name followed by a colon and a space, followed by a value.
    Connection: Keep-Alive
    User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
    Host: www.digitalthink.com:80
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Server Response MIME Header
• When a web server responds to a web browser it sends a MIME header along with the response that looks something like this:
    Server: Netscape-Enterprise/2.01
    Date: Sat, 02 Aug 1997 07:52:46 GMT
    Accept-ranges: bytes
    Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
    Content-length: 2810
    Content-type: text/html
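The name-value structure of a MIME header is simple enough to parse by hand, as the sketch below suggests. It is a simplified illustration with a made-up class name, HeaderParser. It splits each line at the first colon, lower-cases the name so lookups are case-insensitive, stops at the blank line that ends the header, and ignores details such as continuation lines.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HeaderParser {

    // Parses "Name: value" lines up to the first blank line.
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r?\n")) {
            if (line.isEmpty()) break;          // blank line ends the header
            int colon = line.indexOf(':');
            if (colon < 0) continue;            // not a name: value line, skip it
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            fields.put(name, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        String header = "Server: Netscape-Enterprise/2.01\r\n"
                      + "Date: Sat, 02 Aug 1997 07:52:46 GMT\r\n"
                      + "Content-length: 2810\r\n"
                      + "Content-type: text/html\r\n"
                      + "\r\n";
        System.out.println(parse(header));
    }
}
```

Only the first colon separates the name from the value, which is why the colons inside the Date value survive.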

Query Strings
• CGI GET data is sent in URL encoded query strings
• a query string is a set of name=value pairs separated by ampersands
    Author=Sadie, Julie&Title=Women Composers
• separated from rest of URL by a question mark

URL Encoding
• Alphanumeric ASCII characters (a-z, A-Z, and 0-9) and the $-_.!*'(), punctuation symbols are left unchanged.
• The space character is converted into a plus sign (+).
• Other characters (e.g. &, =, ^, #, %, {, and so on) are translated into a percent sign followed by the two hexadecimal digits corresponding to their numeric value.

For example,
• The comma is ASCII character 44 (decimal) or 2C (hex). Therefore if the comma appears as part of a URL it is encoded as %2C.
• The query string "Author=Sadie, Julie&Title=Women Composers" is encoded as:
    Author=Sadie%2C+Julie&Title=Women+Composers

The URLEncoder class
• The java.net.URLEncoder class contains a single static method which encodes strings in x-www-form-urlencoded format
    URLEncoder.encode(String s)

For example,
    String qs = "Author=Sadie, Julie&Title=Women Composers";
    String eqs = URLEncoder.encode(qs);
    System.out.println(eqs);
• This prints:
    Author%3dSadie%2c+Julie%26Title%3dWomen+Composers

    String eqs = "Author=" + URLEncoder.encode("Sadie, Julie");
    eqs += "&";
    eqs += "Title=";
    eqs += URLEncoder.encode("Women Composers");
    System.out.println(eqs);
• This prints the properly encoded query string:
    Author=Sadie%2c+Julie&Title=Women+Composers
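The same idea generalizes to any number of fields: encode each name and each value separately, then join them with = and &. The sketch below is one way to do that. The class name QueryStringBuilder is hypothetical. It uses the two-argument encode(String, String) overload with UTF-8, which later JDKs added and which replaces the one-argument form shown on the slides. Current JDKs write the hexadecimal digits in upper case.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringBuilder {

    // Encodes one name or value in x-www-form-urlencoded format.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Joins the encoded pairs; the separators themselves are never encoded.
    static String build(Map<String, String> fields) {
        StringBuilder qs = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (qs.length() > 0) qs.append('&');
            qs.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return qs.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Author", "Sadie, Julie");
        fields.put("Title", "Women Composers");
        System.out.println(build(fields));
    }
}
```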

The URLDecoder class
• In Java 1.2 the java.net.URLDecoder class contains a single static method which decodes strings in x-www-form-urlencoded format
    URLDecoder.decode(String s)
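Decoding reverses the process, as this short sketch shows. The class name DecodeDemo is made up, and the two-argument decode(String, String) overload with UTF-8 is used here in place of the one-argument form. It was added in a later JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class DecodeDemo {

    // Turns plus signs back into spaces and %xx escapes back into characters.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("Sadie%2C+Julie"));
        System.out.println(decode("Women+Composers"));
    }
}
```

When decoding a whole query string, split it on & and = first and decode each piece afterwards. Otherwise an encoded ampersand inside a value would be mistaken for a separator.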

GET URLs
    String eqs = "Author=" + URLEncoder.encode("Sadie, Julie");
    eqs += "&";
    eqs += "Title=";
    eqs += URLEncoder.encode("Women Composers");
    try {
      URL u = new URL("http://www.superbooks.com/search.cgi?" + eqs);
      InputStream in = u.openStream();
      //...
    }
    catch (IOException e) {
      //...
    }

URLConnections
• The java.net.URLConnection class is an abstract class that handles communication with different kinds of servers like ftp servers and web servers.
• Protocol specific subclasses of URLConnection handle different kinds of servers.
• By default, connections to HTTP URLs use the GET method.

URLConnections vs. URLs
• Can send output as well as read input
• Can post data to CGIs
• Can read headers from a connection

URLConnection five steps:
1. The URL is constructed.
2. The URL’s openConnection() method creates the URLConnection object.
3. The parameters for the connection and the request properties that the client sends to the server are set up.
4. The connect() method makes the connection to the server. (optional)
5. The response header information is read using getHeaderField().

I/O Across a URLConnection
• Data may be read from the connection in one of two ways
  – raw by using the input stream returned by getInputStream()
  – through a content handler with getContent().
• Data can be sent to the server using the output stream provided by getOutputStream().

For example,
    try {
      URL u = new URL("http://www.sd99.com/");
      URLConnection uc = u.openConnection();
      uc.connect();
      InputStream in = uc.getInputStream();
      // read the data...
    }
    catch (IOException e) {
      //...
    }
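The same steps can be tried without a network by pointing the URL at a local file, since file is one of the protocols the JDK handles. The sketch below is only an illustration. The class name, the helper names, and the use of a temporary file are all assumptions made for it.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LocalConnectionDemo {

    // Writes the text to a temporary file and returns a file: URL for it.
    static URL tempFileUrl(String text) {
        try {
            Path p = Files.createTempFile("urlconnection-demo", ".txt");
            p.toFile().deleteOnExit();
            Files.write(p, text.getBytes(StandardCharsets.UTF_8));
            return p.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a connection, connects, and reads the raw data through a buffer.
    static String readAll(URL u) {
        try {
            URLConnection uc = u.openConnection();
            uc.connect();
            try (InputStream in = new BufferedInputStream(uc.getInputStream())) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int b;
                while ((b = in.read()) != -1) data.write(b);
                return new String(data.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll(tempFileUrl("hello from a file: URL")));
    }
}
```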

Reading Header Data
• The getHeaderField(String name) method returns the string value of a named header field.
• Names are case-insensitive.
• If the requested field is not present, null is returned.
    String lm = uc.getHeaderField("Last-modified");

getHeaderFieldKey()
• The keys of the header fields are returned by the getHeaderFieldKey(int n) method.
• The first field is 1.
• If a numbered key is not found, null is returned.
• You can use this in combination with getHeaderField() to loop through the complete header.

For example
    String key = null;
    for (int i = 1; (key = uc.getHeaderFieldKey(i)) != null; i++) {
      System.out.println(key + ": " + uc.getHeaderField(key));
    }

getHeaderFieldInt() and getHeaderFieldDate()
• These are utility methods that read a named header and convert its value into an int and a long respectively.
    public int getHeaderFieldInt(String name, int Default)
    public long getHeaderFieldDate(String name, long Default)

• The long returned by getHeaderFieldDate() can be converted into a Date object using a Date() constructor like this:
    long millis = uc.getHeaderFieldDate("Last-modified", 0);
    Date lm = new Date(millis);

Six Convenience Methods
• These return the values of six particularly common header fields:
    public int getContentLength()
    public String getContentType()
    public String getContentEncoding()
    public long getExpiration()
    public long getDate()
    public long getLastModified()

    try {
      URL u = new URL("http://www.sdexpo.com/");
      URLConnection uc = u.openConnection();
      uc.connect();
      String key = null;
      for (int n = 1; (key = uc.getHeaderFieldKey(n)) != null; n++) {
        System.out.println(key + ": " + uc.getHeaderField(key));
      }
    }
    catch (IOException e) {
      System.err.println(e);
    }

Writing data to a URLConnection
• Similar to reading data from a URLConnection.
• First inform the URLConnection that you plan to use it for output.
• Before getting the connection's input stream, get the connection's output stream and write to it.
• Commonly used to talk to CGIs that use the POST method.

Eight Steps:
1. Construct the URL.
2. Call the URL’s openConnection() method to create the URLConnection object.
3. Pass true to the URLConnection’s setDoOutput() method.
4. Create the data you want to send, preferably as a byte array.
5. Call getOutputStream() to get an output stream object.
6. Write the byte array calculated in step 4 onto the stream.
7. Close the output stream.
8. Call getInputStream() to get an input stream object. Read from it as usual.

POST CGIs
• A typical POST request to a CGI looks like this:
    POST /cgi-bin/booksearch.pl HTTP/1.0
    Referer: http://www.macfaq.com/sampleform.html
    User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
    Content-length: 48
    Content-type: application/x-www-form-urlencoded
    Host: utopia.poly.edu:56435

    username=Sadie%2C+Julie&realname=Women+Composers

A POST request includes
• the POST line
• a MIME header which must include
  – content type
  – content length
• a blank line that signals the end of the MIME header
• the actual data of the form, encoded in x-www-form-urlencoded format.

• A URLConnection for an http URL will set up the request line and the MIME header for you as long as you set its doOutput field to true by invoking setDoOutput(true).
• If you also want to read from the connection, you should set doInput to true with setDoInput(true) too.

For example,
    URLConnection uc = u.openConnection();
    uc.setDoOutput(true);
    uc.setDoInput(true);

• The request line and MIME header are sent as soon as the URLConnection connects. Then getOutputStream() returns an output stream on which you can write the x-www-form-urlencoded name-value pairs.
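Putting the eight steps together might look like the sketch below. It is an outline under stated assumptions rather than a finished client. The class name FormPost and its methods are hypothetical, the two-argument URLEncoder.encode() overload comes from a later JDK, and the explicit Content-Type request property is set only as a precaution. The post() method needs a real server, so main() only prints the form data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormPost {

    // Step 4: create the data to send as a byte array.
    static byte[] formBody(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> f : fields.entrySet()) {
                if (body.length() > 0) body.append('&');
                body.append(URLEncoder.encode(f.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(f.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
        return body.toString().getBytes(StandardCharsets.US_ASCII);
    }

    // Steps 2, 3 and 5 through 8: send the body and return the server's reply.
    static String post(URL target, Map<String, String> fields) throws IOException {
        byte[] body = formBody(fields);
        URLConnection uc = target.openConnection();
        uc.setDoOutput(true);
        uc.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = uc.getOutputStream()) {
            out.write(body);
        } // closing the stream is step 7
        try (InputStream in = uc.getInputStream()) {
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) reply.write(b);
            return new String(reply.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "Sadie, Julie");
        fields.put("realname", "Women Composers");
        System.out.println(new String(formBody(fields), StandardCharsets.US_ASCII));
    }
}
```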

HttpURLConnection
• java.net.HttpURLConnection is an abstract subclass of URLConnection that provides some additional methods specific to the HTTP protocol.
• URL connection objects that are returned by an http URL will be instances of java.net.HttpURLConnection.

Recall
• a typical HTTP response from a web server begins like this:
    HTTP/1.0 200 OK
    Server: Netscape-Enterprise/2.01
    Date: Sat, 02 Aug 1997 07:52:46 GMT
    Accept-ranges: bytes
    Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
    Content-length: 2810
    Content-type: text/html

Response Codes
• The getHeaderField() and getHeaderFieldKey() methods don't return the HTTP response code.
• After you've connected, you can retrieve the numeric response code--200 in the above example--with the getResponseCode() method and the message associated with it--OK in the above example--with the getResponseMessage() method.

HTTP Request Methods
• Java 1.0 only supports GET and POST requests to HTTP servers.
• Java 1.1/1.2 supports GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE.
• The request method is chosen with the setRequestMethod(String method) method.
• A java.net.ProtocolException, a subclass of IOException, is thrown if an unknown method is specified.

getRequestMethod()
• The getRequestMethod() method returns the string form of the request method currently set for the URLConnection. GET is the default method.
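Because setRequestMethod() only records the method and does not contact the server, its behavior can be explored offline. The sketch below is hypothetical. RequestMethods, open(), and accepts() are made-up names, and www.example.com is a placeholder host that is never actually contacted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

final class RequestMethods {

    // Creates the connection object; nothing is sent until connect() or I/O.
    static HttpURLConnection open(String spec) {
        try {
            return (HttpURLConnection) new URL(spec).openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reports whether the connection class accepts the given request method.
    static boolean accepts(String method) {
        HttpURLConnection huc = open("http://www.example.com/");
        try {
            huc.setRequestMethod(method);
            return true;
        } catch (ProtocolException e) {
            return false; // unknown method names are rejected here
        }
    }

    public static void main(String[] args) {
        System.out.println(open("http://www.example.com/").getRequestMethod());
        System.out.println(accepts("HEAD"));
        System.out.println(accepts("BOGUS"));
    }
}
```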

disconnect()
• The disconnect() method of the HttpURLConnection class closes the connection to the web server.
• Needed for HTTP/1.1 Keep-alive

For example,
    try {
      URL u = new URL("http://www.amnesty.org/");
      HttpURLConnection huc = (HttpURLConnection) u.openConnection();
      huc.setRequestMethod("PUT");
      huc.setDoOutput(true);  // must be set before asking for the output stream
      huc.connect();
      OutputStream os = huc.getOutputStream();
      // put the data...
      os.close();
      int code = huc.getResponseCode();
      if (code >= 200 && code < 300) {
        // the server accepted the data
      }
      huc.disconnect();
    }
    catch (IOException e) {
      //...
    }

usingProxy
• The boolean usingProxy() method returns true if web connections are being funneled through a proxy server, false if they're not.

Redirect Instructions
• Most web servers can be configured to automatically redirect browsers to the new location of a page that's moved.
• To redirect browsers, a server sends a 300 level response and a Location header that specifies the new location of the requested page.

    GET /~elharo/macfaq/index.html HTTP/1.0

    HTTP/1.1 302 Moved Temporarily
    Date: Mon, 04 Aug 1997 14:21:27 GMT
    Server: Apache/1.2b7
    Location: http://www.macfaq.com/macfaq/index.html
    Connection: close
    Content-type: text/html

    <HTML><HEAD>
    <TITLE>302 Moved Temporarily</TITLE>
    </HEAD><BODY>
    <H1>Moved Temporarily</H1>
    The document has moved <A HREF="http://www.macfaq.com/macfaq/index.html">here</A>.<P>
    </BODY></HTML>

• HTML is returned for browsers that don't understand redirects, but most modern browsers do not display this and jump straight to the page specified in the Location header instead.
• Because redirects can change the site to which a user is connecting without their knowledge, redirects are not arbitrarily followed by URLConnections.

Following Redirects
• The HttpURLConnection.setFollowRedirects(true) method says that connections will follow redirect instructions from the web server. Untrusted applets are not allowed to set this.
• HttpURLConnection.getFollowRedirects() returns true if redirect requests are honored, false if they're not.
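A program that wants to decide for itself whether to follow a redirect could be organized along the lines of this sketch. Everything named here is an assumption made for illustration. RedirectPolicy and its helpers are made up, www.example.com is a placeholder that is never contacted, and setInstanceFollowRedirects() is a per-connection setting that arrived in a JDK later than the one these slides describe.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

final class RedirectPolicy {

    // True for the 300 level responses that servers use for redirection.
    static boolean isRedirectLevel(int code) {
        return code >= 300 && code < 400;
    }

    // Opens a connection that will report redirects instead of following them.
    static HttpURLConnection openWithoutRedirects(String spec) {
        try {
            HttpURLConnection huc = (HttpURLConnection) new URL(spec).openConnection();
            huc.setInstanceFollowRedirects(false);
            return huc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Resolves the Location header value against the URL that was requested.
    static String target(String requested, String location) {
        try {
            return new URL(new URL(requested), location).toExternalForm();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isRedirectLevel(302));
        System.out.println(target("http://www.example.com/old/index.html",
                                  "http://www.macfaq.com/macfaq/index.html"));
    }
}
```

After connecting, a caller would check getResponseCode() with isRedirectLevel(), read getHeaderField("Location"), and open the resolved target only if it is willing to talk to that host.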

To Learn More
• Java Network Programming
  – O’Reilly & Associates, 1997
  – ISBN 1-56592-227-1
• Java I/O
  – O’Reilly & Associates, 1999
  – ISBN 1-56592-485-1
• Web Client Programming with Java
  – http://www.digitalthink.com/catalog/cs/cs308/index.html

Questions?