Characters and Strings

Characters
• In Java, a char is a primitive type that can hold one single character
• A character can be:
  – A letter or digit
  – A punctuation mark
  – A space, tab, newline, or other whitespace
  – A control character
• Control characters are holdovers from the days of teletypes
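
The kinds of character listed above can be told apart with the static methods of java.lang.Character. The following is a minimal sketch; the class name CharKindDemo and the method kindOf are made up for illustration. Tab and newline count as both whitespace and control characters, so the order of the checks is a choice made here, not a rule.

```java
class CharKindDemo {
    // Classifies a char into one of the rough categories from the slide.
    // Whitespace is tested before control characters because tab and
    // newline belong to both groups.
    static String kindOf(char c) {
        if (Character.isLetterOrDigit(c)) {
            return "letter or digit";
        }
        if (Character.isWhitespace(c)) {
            return "whitespace";
        }
        if (Character.isISOControl(c)) {
            return "control character";
        }
        return "punctuation or other symbol";
    }

    public static void main(String[] args) {
        System.out.println(kindOf('a'));   // letter or digit
        System.out.println(kindOf('5'));   // letter or digit
        System.out.println(kindOf(' '));   // whitespace
        System.out.println(kindOf('\b'));  // control character
        System.out.println(kindOf('?'));   // punctuation or other symbol
    }
}
```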

char literals
• A char literal is written between single quotes: 'a' 'A' '5' '?' ' '
• Some characters cannot be typed directly and must be written as an “escape sequence”:
  – Tab is '\t'
  – Newline is '\n'
• Some characters must be escaped to prevent ambiguity:
  – Single quote is '\'' (quote-backslash-quote-quote)
  – Backslash is '\\'

Additional character literals
  \n   newline
  \t   tab
  \b   backspace
  \r   return
  \f   form feed
  \\   backslash
  \'   single quote
  \"   double quote
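
A short sketch of these escape sequences in use. The class name EscapeDemo and the helper codeOf are invented for this example; the numeric codes in the comments are the standard ASCII values.

```java
class EscapeDemo {
    // Returns the numeric code of a char so an escape can be inspected.
    static int codeOf(char c) {
        return (int) c;
    }

    public static void main(String[] args) {
        char tab = '\t';
        char newline = '\n';
        char quote = '\'';       // a single quote must be escaped in a char literal
        char backslash = '\\';   // so must the backslash itself

        System.out.println("tab code: " + codeOf(tab));          // 9
        System.out.println("newline code: " + codeOf(newline));  // 10
        System.out.println("quote: " + quote);
        System.out.println("backslash: " + backslash);

        // The same escapes work inside string literals.
        System.out.println("col1\tcol2\nrow2 \"quoted\"");
    }
}
```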

Character encodings
• A character is represented as a pattern of bits
• The number of characters that can be represented depends on the number of bits used
• For a long time, ASCII (American Standard Code for Information Interchange) has been used
• ASCII is a seven-bit code (allows 128 characters)
• ASCII is barely enough for English
  – Omits many useful characters: ¢ ½ ç “ ”
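
Because a character is stored as a number, a Java char can be converted to and from an int. The sketch below assumes nothing beyond the language itself; the names AsciiDemo, codeOf and isAscii are made up for illustration. The ç is written as a hexadecimal escape so the example does not depend on the encoding of the source file.

```java
class AsciiDemo {
    // The numeric code that represents the character.
    static int codeOf(char c) {
        return c;
    }

    // Seven bits allow the codes 0 through 127.
    static boolean isAscii(char c) {
        return c < 128;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));        // 65
        System.out.println((char) 97);          // a
        System.out.println(isAscii('A'));       // true
        System.out.println(isAscii('\u00E7'));  // false: c with cedilla is outside ASCII
    }
}
```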

Unicode
• Unicode is a character standard designed to replace ASCII; it began as a 16-bit (two-byte) code and has since grown beyond 16 bits
• “Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”
• Java uses Unicode to represent characters; a char holds one 16-bit unit
  – You should know that Java uses Unicode, but
  – Except for having these extra characters available, it seldom makes any difference to how you program

Unicode character literals
• The rest of the ASCII characters can be written as octal escapes from '\0' to '\377'
• Any 16-bit Unicode character can be written as a hexadecimal escape between '\u0000' and '\uFFFF'
• Since this range allows over 64,000 possible characters, the list occupies an entire book
  – This makes it hard to look up characters
• Unicode letters in any alphabet can be used in identifiers
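
As a small illustration, the octal and hexadecimal forms below both denote the letter A. The class and method names are hypothetical. One caution that goes beyond the slide: the compiler translates hexadecimal Unicode escapes before it reads anything else, so the escape for a line break cannot be used inside a char literal; '\n' is the safe spelling.

```java
class UnicodeLiteralDemo {
    // Hexadecimal Unicode escape: 0041 in base 16 is 65, the code for A.
    static char fromHex() {
        return '\u0041';
    }

    // Octal escape: 101 in base 8 is also 65.
    static char fromOctal() {
        return '\101';
    }

    public static void main(String[] args) {
        System.out.println(fromHex() == 'A');    // true
        System.out.println(fromOctal() == 'A');  // true
        System.out.println((int) '\377');        // 255, the largest octal escape
        System.out.println((int) '\uFFFF');      // 65535, the largest char value
    }
}
```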

Glyphs and fonts
• A glyph is the printed representation of a character
• For example, the letter ‘A’ can be represented by many different glyphs, such as an A drawn in each of several typefaces
• A font is a collection of glyphs
• Unicode describes characters, not glyphs

Strings
• A String is a kind of object, and obeys all the rules for objects
• In addition, there is extra syntax for string literals and string concatenation
• A string is made up of zero or more characters
• The string containing zero characters is called the empty string

String literals
• A string literal consists of zero or more characters enclosed in double quotes: "" "Hello" "This is a String literal."
• To put a double quote character inside a string, it must be backslashed: "\"Wait,\" he said, \"Don't go!\""
• Inside a string, a single quote character does not need to be backslashed (but it can be)
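
The quoting rules above can be checked with a few lines of Java. This is only a sketch; the names StringLiteralDemo and quoted are invented for the example.

```java
class StringLiteralDemo {
    // Double quotes inside the literal are backslashed; the apostrophe is not.
    static String quoted() {
        return "\"Wait,\" he said, \"Don't go!\"";
    }

    public static void main(String[] args) {
        String empty = "";
        System.out.println(empty.length());            // 0
        System.out.println(quoted());                  // "Wait," he said, "Don't go!"
        System.out.println("Don\'t".equals("Don't"));  // true: escaping ' is optional
    }
}
```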

String concatenation
• Strings can be concatenated (put together) with the + operator: "Hello, " + name + "!"
• Anything “added” to a String is converted to a string and concatenated
• Concatenation is done left to right:
  – "abc" + 3 + 5 gives "abc35"
  – 3 + 5 + "abc" gives "8abc"
  – 3 + (5 + "abc") gives "35abc"
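
The three results above follow from evaluating + from left to right, which the sketch below reproduces. ConcatDemo and greet are hypothetical names chosen for the example.

```java
class ConcatDemo {
    // Builds a greeting by concatenating literals with a variable.
    static String greet(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet("World"));     // Hello, World!
        System.out.println("abc" + 3 + 5);      // abc35: "abc" + 3 is a String first
        System.out.println(3 + 5 + "abc");      // 8abc: 3 + 5 is integer addition first
        System.out.println(3 + (5 + "abc"));    // 35abc: the parentheses go first
    }
}
```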

Newlines
• The character '\n' represents a newline
• When “printing” to the screen, you can go to a new line by printing a newline character
• You can also go to a new line by using System.out.println with no argument or with one argument
• When writing to a file, you should avoid \n and use println instead
  – I’ll explain this when we talk about file I/O
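
A sketch comparing the two ways of ending a line. It writes to an in-memory StringWriter instead of a real file, which is an assumption made to keep the example self-contained; NewlineDemo and its methods are made-up names. The println method ends a line with the platform's line separator, which is not always the single character '\n'.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class NewlineDemo {
    // Embeds a newline character directly in the text.
    static String withEscape() {
        return "first\nsecond";
    }

    // Lets println end each line, so the platform's line separator is used.
    static String withPrintln() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        out.println("first");
        out.println("second");
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(withEscape());
        System.out.println();          // no argument: just ends the current line
        System.out.println("third");   // one argument: prints it, then ends the line
        System.out.print(withPrintln());
    }
}
```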

System.out.print and println
• System.out.println can be called with no arguments (parameters), or with one argument
• System.out.print is called with one argument
• The argument may be any of the 8 primitive types
• The argument may be any object
• Java can print any object, but it doesn’t always do a good job
  – Java does a good job printing Strings
  – Java typically does a poor job printing types you define
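
The difference between print and println can be seen by sending a few primitive values to a stream. In this sketch the output is also captured in memory so it can be inspected; PrintDemo, show and capture are hypothetical names, and capturing through a ByteArrayOutputStream is just one convenient way to do it.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class PrintDemo {
    // Writes a few values to the given stream using print and println.
    static void show(PrintStream out) {
        out.print(42);       // int, no line break afterwards
        out.print(' ');      // char
        out.print(3.5);      // double
        out.println();       // no argument: just ends the line
        out.println(true);   // boolean, followed by a line break
    }

    // Runs show against an in-memory stream and returns what was written.
    static String capture() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer);
        show(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        show(System.out);    // System.out is itself a PrintStream
    }
}
```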

Printing your objects
• In any class, you can define the following instance method: public String toString() { ... }
• This method can return any string you choose
• If you have an instance x, you can get its string representation by calling x.toString()
• If you define your toString() method exactly as above, it will be used by System.out.print and System.out.println
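
A minimal sketch of a class that defines toString(). The Point class and the describe helper are invented for this example; the format "(x, y)" is an arbitrary choice.

```java
class ToStringDemo {
    // Concatenating an object with a String also calls its toString().
    static String describe(Object o) {
        return "Value: " + o;
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println(p);              // (3, 4)
        System.out.println(p.toString());   // (3, 4)
        System.out.println(describe(p));    // Value: (3, 4)
    }
}

// A hypothetical user-defined type with its own string representation.
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}
```

Without the toString() method, println would fall back to the version inherited from Object, which prints the class name followed by a hash code.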

Constructing a String
• You can construct a string by writing it as a literal: "This is special syntax to construct a String."
• Since a string is an object, you could construct it with new: new String("This also constructs a String.")
• But using new for constructing a string is foolish, because you have to write the string as a literal to pass it in to the constructor
  – You’re doing the same work twice!
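
The two forms can be compared directly. In this sketch (ConstructDemo and its methods are hypothetical names) both strings hold the same characters, but new produces a second, separate object, which is the duplicated work.

```java
class ConstructDemo {
    static String fromLiteral() {
        return "hello";
    }

    static String fromConstructor() {
        return new String("hello");   // the literal is created, then copied
    }

    public static void main(String[] args) {
        String a = fromLiteral();
        String b = fromConstructor();
        System.out.println(a.equals(b));   // true: same characters
        System.out.println(a == b);        // false: b is a separate object
    }
}
```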

String methods
• This is only a sampling of string methods
• All are called as: myString.method(params)
  – length() -- the number of characters in the String
  – charAt(index) -- the character at (integer) position index, where index is between 0 and length-1
  – equals(anotherString) -- equality test (because == doesn’t do quite what you expect)
• Don’t learn all of the dozens of String methods unless you use them a lot--instead, learn to use the API!
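
The three methods above are enough to write a small utility. The sketch below reverses a string with length() and charAt(), then checks the result with equals(); StringMethodsDemo and reverse are made-up names, and StringBuilder is used only as a convenient way to collect the characters.

```java
class StringMethodsDemo {
    // Walks the string from the last index (length - 1) down to 0.
    static String reverse(String s) {
        StringBuilder result = new StringBuilder();
        for (int i = s.length() - 1; i >= 0; i--) {
            result.append(s.charAt(i));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String word = "Hello";
        System.out.println(word.length());                        // 5
        System.out.println(word.charAt(0));                       // H
        System.out.println(word.charAt(word.length() - 1));       // o
        System.out.println(reverse(word));                        // olleH
        System.out.println(reverse(reverse(word)).equals(word));  // true
    }
}
```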

Vocabulary
• escape sequence -- a code sequence for a character, beginning with a backslash
• ASCII -- a 7-bit standard for encoding characters
• Unicode -- a standard for encoding characters, designed to replace ASCII (originally a 16-bit code)
• glyph -- the printed representation of a character
• font -- a collection of glyphs
• empty string -- a string containing no characters
• concatenate -- to join strings together

The End