String
C and Data Structures
Baojian Hua
bjhua@ustc.edu.cn

What's a String?
- A string is a sequence of characters:
  - Every character c is taken from some character table (say, ASCII or Unicode)
  - Ex: "hello, world", "string1\tstring2\n"
- Essentially, a string is a linear list
  - But with different operations

Isn't String a char*?
- C's convention for string representation:
  - C has no built-in string type
  - Every string is a char array (char *) terminated with the char '\0'
- Operations (see the library):
    char *strcpy (char *s, const char *ct);
    char *strcat (char *s, const char *ct);
    …
  - Such operations are array-based and thus efficient

"string" ADT
- Weakness of C's "char *" string:
  - Some strings cannot even be represented
    - Ex: "aaa\0bbb\0c" (everything after the first '\0' is invisible to the library)
  - Some operations are dangerous
    - It is the programmer's duty to guarantee safety
    - Ex: strcpy ("ab", "1234")
    - Some viruses take advantage of this…
- We want an ADT "string" that:
  - hides the concrete representation of strings
  - offers safe operations

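The two weaknesses above can be made concrete with a short sketch. The names raw, visible_length and copy_checked are hypothetical, introduced here only for illustration; they are not part of the C library or of the ADT defined in these slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Ten bytes: a a a \0 b b b \0 c, plus the '\0' the compiler appends. */
const char raw[] = "aaa\0bbb\0c";

/* Length as the C library sees it: counting stops at the first '\0'. */
size_t visible_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Hypothetical checked copy: refuses to write past dst_size bytes.
   Returns 0 on success and -1 if src (with its '\0') does not fit. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t need = strlen(src) + 1;   /* characters plus terminator */
    if (need > dst_size)
        return -1;
    memcpy(dst, src, need);
    return 0;
}
```

Here sizeof raw is 10 while visible_length(raw) is 3, so the characters after the first '\0' cannot be reached through the library. Unlike strcpy, copy_checked reports a destination that is too small instead of writing past its end.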
Abstract Data Types in C: Interface
    // in file "str.h"
    #ifndef STR_H
    #define STR_H

    #include <stdbool.h>   // for bool

    typedef struct str *str;

    str new (char *s);
    str new2 (char *s, int size);
    int size (str s);
    bool isEmpty (str s);
    char nth (str s, int n);
    str concat (str s1, str s2);
    str sub (str s, int i, int j);
    str clone (str s);

    #endif

Array-based Implementation
    // in file "str.c"
    #include "str.h"

    struct str
    {
      char *s;
      int size;
    };
(Figure: a str points to a struct with fields s and size; s points to a char array indexed 0 to size-1.)

Operations: "new"
    str new (char *s)
    {
      int len = strlen (s);
      str p = checkedMalloc (sizeof (*p));
      p->s = checkedMalloc (len * sizeof(char));
      // copy the characters only; the length is kept in p->size,
      // so no terminating '\0' is stored
      for (int i=0; s[i]; i++)
        (p->s)[i] = s[i];
      p->size = len;
      return p;
    }

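The slides use checkedMalloc without showing it. The sketch below assumes it is a thin wrapper around malloc that terminates the program on failure; that behavior is an assumption, not something the slides state. The constructor takes a const char * here, a small deviation from the interface, and uses memcpy in place of the explicit loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct str *str;
struct str { char *s; int size; };

/* Assumed behavior: never returns NULL; exits if memory runs out. */
void *checkedMalloc(size_t n)
{
    /* Ask for at least one byte: malloc(0) may legally return NULL,
       which would otherwise look like a failure for the empty string. */
    void *p = malloc(n > 0 ? n : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Build a str from a C string; the terminating '\0' is not stored. */
str new(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    str p = checkedMalloc(sizeof *p);
    p->s = checkedMalloc(len);
    memcpy(p->s, s, len);
    p->size = (int)len;
    return p;
}
```

Because the length lives in the size field, the empty string is simply a str whose size is 0.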
Operations: "size"
    int size (str s)
    {
      return s->size;
    }

Operations: "nth"
    char nth (str s, int n)
    {
      if (n<0 || n>=size(s))
        error ("invalid index");
      return (s->s)[n];
    }

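The error function called by nth is not defined in the slides. A minimal sketch, assuming it prints the message and terminates the program, shows why the bounds check makes nth safe where raw array indexing is not:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct str *str;
struct str { char *s; int size; };

/* Assumed behavior: report the problem and stop the program. */
void error(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    exit(EXIT_FAILURE);
}

int size(str s)
{
    assert(s != NULL);
    return s->size;
}

/* Valid indexes are 0 .. size(s)-1; anything else never reaches the array. */
char nth(str s, int n)
{
    if (n < 0 || n >= size(s))
        error("invalid index");
    return s->s[n];
}
```

With this definition an out-of-range index ends the program with a message instead of reading memory outside the array.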
Operations: "concat"
    str concat (str s1, str s2)
    {
      int n1 = size (s1);
      int n2 = size (s2);
      str p = checkedMalloc (sizeof (*p));
      p->s = checkedMalloc ((n1+n2) * sizeof(char));
      for (int i=0; i<n1; i++)
        (p->s)[i] = nth (s1, i);
      for (int j=0; j<n2; j++)
        (p->s)[n1+j] = nth (s2, j);
      p->size = n1+n2;
      return p;
    }
(Figure: s1, s2 and the result p, each a struct with its own fields s and size.)

Operations: "clone"
    str clone (str s)
    {
      int n = size (s);
      str p = checkedMalloc (sizeof (*p));
      p->s = checkedMalloc (n * sizeof(char));
      for (int i=0; i<n; i++)
        (p->s)[i] = nth (s, i);
      p->size = n;
      return p;
    }
(Figure: s and its copy q, each a struct with its own fields s and size.)

Operations: "sub"
    str sub (str s, int from, int to)
    {
      int n = size (s);
      if (from>to || from<0 || to>=n)
        error ("invalid index");
      str p = checkedMalloc (sizeof (*p));
      p->s = checkedMalloc ((to-from+1)*sizeof(char));
      // both bounds are inclusive, so copy positions from..to
      for (int i=from; i<=to; i++)
        (p->s)[i-from] = nth (s, i);
      p->size = to-from+1;
      return p;
    }
(Figure: s with positions from and to marked, and the result q, each a struct with its own fields s and size.)

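Since the result of sub has to-from+1 characters, both bounds are inclusive. The self-contained sketch below works through one case; checkedMalloc and the error handling are assumed as elsewhere in these notes, and memcpy stands in for the nth-based loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct str *str;
struct str { char *s; int size; };

/* Assumed helper: malloc that exits instead of returning NULL. */
void *checkedMalloc(size_t n)
{
    void *p = malloc(n > 0 ? n : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

str new(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    str p = checkedMalloc(sizeof *p);
    p->s = checkedMalloc(len);
    memcpy(p->s, s, len);
    p->size = (int)len;
    return p;
}

/* Characters at positions from..to, both inclusive, as a new str. */
str sub(str s, int from, int to)
{
    assert(s != NULL);
    if (from > to || from < 0 || to >= s->size) {
        fprintf(stderr, "invalid index\n");
        exit(EXIT_FAILURE);
    }
    int n = to - from + 1;
    str p = checkedMalloc(sizeof *p);
    p->s = checkedMalloc((size_t)n);
    memcpy(p->s, s->s + from, (size_t)n);
    p->size = n;
    return p;
}
```

For "hello, world", positions 7 through 11 hold w, o, r, l, d, so sub(h, 7, 11) has size 5, and sub(h, 0, 0) is the one-character string "h".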
Summary
- The string representation discussed in this class is functional
  - functional: data never change; instead, we always make new data from older ones
  - Java and ML also have functional strings
- In general, functional data structures are easy to implement, maintain and reason about
  - and thus have much to be recommended

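The functional property can be checked directly: after concat, both operands still hold exactly what they held before, and the result owns a separate array. This is a sketch under the same assumptions as before (checkedMalloc is an assumed helper); nothing is freed because the interface in these slides declares no destructor.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct str *str;
struct str { char *s; int size; };

/* Assumed helper: malloc that exits instead of returning NULL. */
void *checkedMalloc(size_t n)
{
    void *p = malloc(n > 0 ? n : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

str new(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    str p = checkedMalloc(sizeof *p);
    p->s = checkedMalloc(len);
    memcpy(p->s, s, len);
    p->size = (int)len;
    return p;
}

/* Builds a fresh str; s1 and s2 are only read, never written. */
str concat(str s1, str s2)
{
    assert(s1 != NULL && s2 != NULL);
    str p = checkedMalloc(sizeof *p);
    p->size = s1->size + s2->size;
    p->s = checkedMalloc((size_t)p->size);
    memcpy(p->s, s1->s, (size_t)s1->size);
    memcpy(p->s + s1->size, s2->s, (size_t)s2->size);
    return p;
}
```

Contrast this with strcat, which overwrites the end of its first argument in place.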