Arrays (Lists) in Python: one thing after another

Problem
- Given 5 numbers, read them in and calculate their average
- THEN print out the ones that were above average

Data Structure Needed
- Need some way to hold onto all the individual data items after processing them
- Making individual identifiers x1, x2, x3, ... is not practical or flexible
- The answer is to use an ARRAY: a data structure bigger than an individual variable or constant

An Array (a List)
- You need a way to have many variables, all with the same name but distinguishable!
- In math they do it with subscripts or indexes: x1, x2, x3 and so on
- In programming languages it is hard to use smaller fonts, so a different syntax is used: x[1], x[0], table[3], point[i]

Semantics
- Elements are numbered from 0 to n-1, where n is the number of elements
- For example, a list of 6 elements has the indexes 0 1 2 3 4 5

Properties of an array (list)
- Heterogeneous (any data type!)
- Contiguous
- Have random access to any element
- Ordered (numbered from 0 to n-1)
- Number of elements can change very easily (use the method .append)
- Python lists are mutable sequences of arbitrary objects
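The properties listed above can be seen in a few lines. This is only a small sketch; the list contents and the names first and last are made up for illustration.

```python
# A small list holding values of different types (heterogeneous).
items = [3, "nine", 2.5, True]

# Random access: any element can be reached directly by its index.
first = items[0]
last = items[len(items) - 1]

# Lists are mutable: an element can be replaced in place...
items[1] = 9
# ...and the number of elements can grow with append.
items.append("new")

print(items)  # [3, 9, 2.5, True, 'new']
```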

Syntax
- Use [] to give an initial value, like x = [1, 3, 5]
- To refer to individual elements, use [ ] with the index in the brackets
- Most of the time you don't refer to the whole array as one thing, or just by the array name (one time you can is when passing a whole array to a function as an argument)

List Operations you know
    Operator               Meaning
    <seq> + <seq>          Concatenation
    <seq> * <int-expr>     Repetition
    <seq>[]                Indexing
    len(<seq>)             Length
    <seq>[:]               Slicing
    for <var> in <seq>:    Iteration
    <expr> in <seq>        Membership (Boolean)
Source: Python Programming, 2/e
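Each operator in the table can be tried on two short lists. The lists a and b and the result names below are illustrative assumptions, not part of the original slides.

```python
a = [1, 2, 3]
b = [4, 5]

joined = a + b           # concatenation: [1, 2, 3, 4, 5]
repeated = b * 2         # repetition: [4, 5, 4, 5]
second = a[1]            # indexing: 2
length = len(joined)     # length: 5
middle = joined[1:4]     # slicing: [2, 3, 4]
has_four = 4 in joined   # membership: True

total = 0
for value in joined:     # iteration visits every element in order
    total = total + value
print(total)  # 15
```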

Indexing an Array
- The index is also called the subscript
- In Python, the first array element always has subscript 0, the second array element has subscript 1, etc.
- Subscripts can be variables; they have to have integer values
    k = 4
    items = [3, 9, 'a', True, 3.92]
    # items[k] is 3.92
    # items[k-2] is items[2], which is 'a'

List Operations
- Lists are often built up one piece at a time using append.
    nums = []
    x = float(input('Enter a number: '))
    while x >= 0:
        nums.append(x)
        x = float(input('Enter a number: '))
- Here, nums is being used as an accumulator, starting out empty, and each time through the loop a new value is tacked on.

List Operations
    Method                Meaning
    <list>.append(x)      Add element x to end of list.
    <list>.sort()         Sort (order) the list. A key function may be passed as the key parameter.
    <list>.reverse()      Reverse the list.
    <list>.index(x)       Returns index of first occurrence of x.
    <list>.insert(i, x)   Insert x into list at index i.
    <list>.count(x)       Returns the number of occurrences of x in list.
    <list>.remove(x)      Deletes the first occurrence of x in list.
    <list>.pop(i)         Deletes the ith element of the list and returns its value.
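The methods in the table can be exercised one after another on a small list. The sample values are assumptions chosen so that each step is easy to follow; the comments show the list after each call.

```python
nums = [4, 1, 3]
nums.append(1)          # [4, 1, 3, 1]
ones = nums.count(1)    # 2 occurrences of 1
where = nums.index(3)   # 3 is first found at index 2
nums.insert(0, 7)       # [7, 4, 1, 3, 1]
nums.remove(1)          # first 1 is deleted: [7, 4, 3, 1]
popped = nums.pop(1)    # removes and returns 4: [7, 3, 1]
nums.reverse()          # [1, 3, 7]
nums.sort()             # already in order: [1, 3, 7]

# In Python 3, sort takes an optional key function.
words = ["pear", "fig", "banana"]
words.sort(key=len)     # shortest first: ['fig', 'pear', 'banana']
```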

Using a variable for the size
- It is very common to use a variable to store the size of an array
    SIZE = 15
    arr = []
    for i in range(SIZE):
        arr.append(i)
- Makes it easy to change if the size of the array needs to be changed

Solution to starting problem
    SIZE = 5
    n = [0] * SIZE
    total = 0
    for ct in range(SIZE):
        n[ct] = float(input("enter a number "))
        total = total + n[ct]
(cont'd on next slide)

Solution to problem - cont'd
    average = total / SIZE
    for ct in range(SIZE):   # SIZE rather than a literal 5, so the size can be changed in one place
        if n[ct] > average:
            print(n[ct])
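The two slides of the solution can be combined into one function. The sketch below takes the numbers as a list instead of reading them with input, so it can be run without typing anything; the function name above_average is a hypothetical choice.

```python
def above_average(numbers):
    """Return the average of numbers and the values strictly above it."""
    total = 0
    for value in numbers:
        total = total + value
    average = total / len(numbers)

    # A second pass is needed because the average is only known
    # after every number has been seen. That is why the numbers
    # have to be kept in a list.
    above = []
    for value in numbers:
        if value > average:
            above.append(value)
    return average, above


average, above = above_average([10, 20, 30, 40, 50])
print(average)  # 30.0
print(above)    # [40, 50]
```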

Scope of counter in a for loop
- The counter variable has the usual scope (the body of the function it's in)
- Example loop header: for i in range(5):
- The counter does exist after the for loop finishes
- What's its value after the loop?
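One way to check the question above is to return the counter after the loop has ended. The function name count_up is made up for this sketch.

```python
def count_up():
    for i in range(5):
        pass          # the loop body does not matter here
    # i still exists here and holds the last value the loop gave it
    return i


last_value = count_up()
print(last_value)  # 4
```

The counter keeps the last value produced by range(5), which is 4, not 5. If the range were empty, the counter would never be assigned at all.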

Initialization of arrays
    a = [1, 2, 9, 10]   # has 4 elements
    a = [0] * 5         # all are zero

Watch out: index out of range!
- Subscripts range from 0 to n-1
- The interpreter WILL tell you if an index goes out of that range (it raises an IndexError)
- BUT negative subscripts work as they do with strings (which are, after all, sequences of characters)
    x = [5] * 5
    x[-1] = 4   # x is [5, 5, 5, 5, 4]
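Both behaviours can be observed directly. In this sketch the out-of-range assignment is wrapped in try/except only so that the program keeps running; the variable message is an illustrative name.

```python
x = [5] * 5
x[-1] = 4              # -1 means the last element
print(x)               # [5, 5, 5, 5, 4]

try:
    x[5] = 0           # valid subscripts are 0 to 4
    message = "no error"
except IndexError as err:
    message = "IndexError: " + str(err)
print(message)
```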

Assigning Values to Individual Array Elements
    temps = [0.0] * 5
    m = 4
    temps[2] = 98.6
    temps[3] = 101.2
    temps[0] = 99.4
    temps[m] = temps[3] / 2.0
    temps[1] = temps[3] - 1.2   # What value is assigned?
    Address   Element    Value
    7000      temps[0]   99.4
    7004      temps[1]   ?
    7008      temps[2]   98.6
    7012      temps[3]   101.2
    7016      temps[4]   50.6

What values are assigned?
    SIZE = 5
    temps = [0.0] * SIZE
    for m in range(SIZE):
        temps[m] = 100.0 + m * 0.2
    for m in range(SIZE-1, -1, -1):   # a step of -1 is needed to count down from SIZE-1 to 0
        print(temps[m])
    Address   Element    Value
    7000      temps[0]   ?
    7004      temps[1]   ?
    7008      temps[2]   ?
    7012      temps[3]   ?
    7016      temps[4]   ?

Indexes
- Subscripts can be constants or variables or expressions
- If i is 5, a[i-1] refers to a[4] and a[i*2] refers to a[10]
- You can use i as a subscript at one point in the program and j as a subscript for the same array later; only the value of the variable matters

Variable Subscripts
    temps = [0.0] * 5
    m = 3
    ...
- What is temps[m + 1] ?
- What is temps[m] + 1 ?
    Address   Element    Value
    7000      temps[0]   100.0
    7004      temps[1]   100.2
    7008      temps[2]   100.4
    7012      temps[3]   100.6
    7016      temps[4]   100.8

Random access of elements
- Problem: read in numbers from a file, only single digits, count them, and report how many of each there were
- Use an array as a set of counters
- ctr[0] is how many zeros, ctr[1] is how many ones, etc.
- ctr[num] += 1 is the crucial statement
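The counter idea can be sketched as a short function. To keep the example self-contained it takes the digits as a list rather than reading a file; the function name count_digits and the sample digits are assumptions.

```python
def count_digits(digits):
    """Return a list ctr where ctr[d] is how many times digit d appeared."""
    ctr = [0] * 10            # one counter for each digit 0 to 9
    for num in digits:
        ctr[num] += 1         # the digit itself selects its counter
    return ctr


counts = count_digits([3, 0, 3, 9, 1, 3, 0])
print(counts)  # [2, 1, 0, 3, 0, 0, 0, 0, 0, 1]
```

No if statements or searching are needed: the value being counted is used directly as the subscript.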

Parallel arrays
- Sometimes you have data of different types that are associated with each other, like name (string) and GPA (float)
- You CAN store them in the same array
    ar = ["John", 3.24, "Mary", 3.9, "Bob", 2.7]
- You can also use two different arrays "side by side"

Parallel arrays, cont'd
    for i in range(SIZE):
        name[i] = input("Enter")
        gpa[i] = float(input("Enter"))
- Logically the name in position i corresponds to the gpa in position i
- Nothing in the syntax forces this to be true; you just have to program it to be so.
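A small sketch of why the shared index matters: once the position of the highest GPA is known, the same position in the other list gives the matching name. The list names names and gpas are illustrative, and the data is the John/Mary/Bob sample from the previous slide.

```python
names = ["John", "Mary", "Bob"]
gpas = [3.24, 3.9, 2.7]

# Find the position of the highest GPA.
best = 0
for i in range(1, len(gpas)):
    if gpas[best] < gpas[i]:
        best = i

# The same position in the parallel list holds the related name.
best_name = names[best]
print(best_name, gpas[best])  # Mary 3.9
```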

Parallel Arrays
Parallel arrays are two or more arrays that have the same index range and whose elements contain related information, possibly of different data types.
EXAMPLE
    SIZE = 50
    idNumber = [" "] * SIZE      # parallel arrays
    hourlyWage = [0.0] * SIZE

    SIZE = 50
    idNumber = [" "] * SIZE      # Parallel arrays hold
    hourlyWage = [0.0] * SIZE    # related information
    Index   idNumber   hourlyWage
    [0]     4562       9.68
    [1]     1235       45.75
    [2]     6278       12.71
    ...
    [48]    8754       67.96
    [49]    2460       8.97

Selection sort - 1-d array
Algorithm for the sort
1. find the maximum in the list
2. put it in the highest numbered element by swapping it with the data that was at that location
3. repeat 1 and 2 for the shorter unsorted list, not including the highest numbered location
4. repeat 1-3 until the list goes down to one element

Find the maximum in the list
    # n is number of elements
    max = a[0]              # value of largest element seen so far
    for i in range(1, n):   # note start at 1, not 0
        if max < a[i]:
            max = a[i]
    # now max is value of largest element in list

Find the location of the max
    max = 0                 # max is now location of the largest seen so far
    for i in range(1, n):
        if a[max] < a[i]:
            max = i
    # now max is location of the largest in array

Swap with highest numbered
Remember the element at the right end of the list is numbered n-1
    temp = a[max]
    a[max] = a[n-1]
    a[n-1] = temp
    # there is a shorter way in Python!

The Python way!
- The previous code for finding the max and its location will work in ANY high-level language.
- Python has some nice functions and methods to make it easier!
- Let's try that.

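The Python Way
- To find the max of the whole list
    mx = max(a)
    loc = a.index(mx)
- Is using index SAFE here? If it doesn't find mx in a, it will crash (it raises a ValueError)! But you just got mx from the list using the max function, so it IS in the list a.
- Note that this needs the built-in max function, so the same program must not also use max as a variable name.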

The Python Way
- The swap then becomes
    a[loc], a[n-1] = a[n-1], a[loc]
- Python "hides" the temporary third variable

Find next largest element and swap (generic way)
    max = 0
    for i in range(1, n-1):   # note n-1, not n
        if a[max] < a[i]:
            max = i
    temp = a[max]
    a[max] = a[n-2]
    a[n-2] = temp

Put a loop around the general code to repeat for n-1 passes
    for pss in range(n, 1, -1):
        max = 0
        for i in range(1, pss):
            if a[max] <= a[i]:
                max = i
        temp = a[max]
        a[max] = a[pss-1]
        a[pss-1] = temp

The whole thing the Python way
    for pss in range(n, 1, -1):   # n-1 passes
        mx = max(a[0:pss])
        loc = a.index(mx)
        a[loc], a[pss-1] = a[pss-1], a[loc]
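The pieces of the selection sort can be wrapped in a function so that it can be tried on any list. This is one possible sketch: the function name selection_sort is made up, and the variable loc is used instead of max so that the built-in max function is not hidden.

```python
def selection_sort(a):
    """Sort the list a in place into ascending order and return it."""
    n = len(a)
    for pss in range(n, 1, -1):      # n-1 passes; unsorted part is a[0:pss]
        loc = 0                      # location of the largest seen so far
        for i in range(1, pss):
            if a[loc] < a[i]:
                loc = i
        # Swap the largest into the last slot of the unsorted part.
        a[loc], a[pss - 1] = a[pss - 1], a[loc]
    return a


data = [5, 2, 9, 1, 5, 6]
selection_sort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```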

2-dimensional arrays
- Data sometimes has more structure to it than just "a list"
- It has rows and columns
- You use two subscripts to locate an item
- The first subscript is called the "row", the second the "column"

2-dimensional arrays
- Syntax
    a = [[0]*5 for i in range(4)]   # 5 columns, 4 rows
- Twenty elements, numbered from [0][0] to [3][4]
    a = [[0]*COLS for i in range(ROWS)]
- This has ROWS rows and COLS columns in each row (using variables makes it easy to change the size of the array without having to edit every line of the program)
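There is a reason the syntax above builds each row inside a loop rather than repeating one row with *. The sketch below compares the two; the names good and bad are illustrative.

```python
ROWS, COLS = 4, 5

# Each pass of the comprehension builds a separate row list.
good = [[0] * COLS for i in range(ROWS)]
good[0][0] = 1
print(good[1][0])  # 0, the other rows are untouched

# Repeating the outer list copies references to ONE row list.
bad = [[0] * COLS] * ROWS
bad[0][0] = 1
print(bad[3][0])   # 1, every "row" is the same list
```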

EXAMPLE -- Array for monthly high temperatures for all 50 states
    NUM_STATES = 50
    NUM_MONTHS = 12
    stateHighs = [[0]*NUM_MONTHS for i in range(NUM_STATES)]
- Rows are numbered [0] to [49] (one per state); columns are numbered [0] to [11] (one per month)
- Row 2, col 7 might be Arizona's high for August: stateHighs[2][7]
- Row [2] in the picture holds: 66 64 72 78 85 90 99 105 98 90 88 80

Processing a 2-d array by rows
- finding the total for the first row
    total = 0
    for i in range(NUM_MONTHS):
        total = total + a[0][i]
- finding the total for the second row
    total = 0
    for i in range(NUM_MONTHS):
        total = total + a[1][i]

Processing a 2-d array by rows
- total for ALL elements by adding first row, then second row, etc.
    total = 0
    for i in range(NUM_STATES):
        for j in range(NUM_MONTHS):
            total = total + a[i][j]

Processing a 2-d array by columns
- total for ALL elements by adding first column, second column, etc.
    total = 0
    for j in range(NUM_MONTHS):
        for i in range(NUM_STATES):
            total = total + a[i][j]

Finding the average high temperature for Arizona
    total = 0
    for month in range(NUM_MONTHS):
        total = total + stateHighs[2][month]
    average = round(total / NUM_MONTHS)
- Result: average is 85
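The same pattern works for any row or any column if it is written as a function. The sketch below uses a tiny made-up table (2 rows, 3 columns) and hypothetical function names row_average and column_average.

```python
def row_average(table, row):
    """Average of one row: the row subscript stays fixed."""
    total = 0
    for col in range(len(table[row])):
        total = total + table[row][col]
    return total / len(table[row])


def column_average(table, col):
    """Average of one column: the column subscript stays fixed."""
    total = 0
    for row in range(len(table)):
        total = total + table[row][col]
    return total / len(table)


highs = [[60, 70, 80],
         [50, 70, 90]]
print(row_average(highs, 1))     # 70.0
print(column_average(highs, 0))  # 55.0
```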

Passing an array as an argument
- Arrays (lists) are passed by reference, so they CAN be changed permanently by the function
- Definition
    def fun1(arr):
- Call the function as
    x = fun1(myarr)
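A short sketch of what "changed permanently" means in practice. The function names double_all and rebind are made up; the second function shows a case that does not affect the caller's list.

```python
def double_all(arr):
    """Change the elements of the caller's list in place."""
    for i in range(len(arr)):
        arr[i] = arr[i] * 2


def rebind(arr):
    """Assigning to the parameter name only changes the local name."""
    arr = [0, 0, 0]


myarr = [1, 2, 3]
double_all(myarr)
print(myarr)  # [2, 4, 6], the function changed the original list

rebind(myarr)
print(myarr)  # [2, 4, 6], still unchanged by rebind
```

Assigning to individual elements reaches the caller's list, while assigning a whole new list to the parameter does not.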

Arrays versus Files
- Arrays are usually smaller than files
- Arrays are faster than files
- Arrays are temporary, in RAM; files are permanent on secondary storage
- Arrays can do random or sequential access; the files we have seen are only sequential

Using Multidimensional Arrays: example of a three-dimensional array

    NUM_DEPTS = 5     # mens, womens, childrens, electronics, furniture
    NUM_MONTHS = 12
    NUM_STORES = 3    # White Marsh, Owings Mills, Towson
    monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)] for j in range(NUM_STORES)]
- 3 STORES (sheets), 5 DEPTS (rows), 12 MONTHS (columns)
- monthlySales[0][3][7] is the sales for electronics in August at White Marsh (with this declaration the subscripts are in the order [store][dept][month])

Example of filling a 3-d array
    def main():
        NUM_DEPTS = 5     # mens, womens, childrens, electronics, furniture
        NUM_MONTHS = 12
        NUM_STORES = 3    # White Marsh, Owings Mills, Towson
        monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)] for j in range(NUM_STORES)]
        storeNames = ["White Marsh", "Owings Mills", "Towson"]
        deptNames = ["mens", "womens", "childrens", "electronics", "furniture"]
        for store in range(NUM_STORES):
            print(storeNames[store], end=" ")
            for dept in range(NUM_DEPTS):
                print(deptNames[dept], end=" ")
                for month in range(NUM_MONTHS):
                    print("for month number ", month+1)
                    monthlySales[store][dept][month] = float(input("Enter the sales "))
            print()
        print(monthlySales)

Find the average of monthlySales
    total = 0
    for m in range(NUM_MONTHS):
        for d in range(NUM_DEPTS):
            for s in range(NUM_STORES):
                total += monthlySales[s][d][m]
    average = total / (NUM_MONTHS * NUM_DEPTS * NUM_STORES)
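The averaging loops can be checked on a much smaller array whose answer is easy to work out by hand. The sizes and sales figures below are made-up assumptions, not the store data from the slides.

```python
NUM_STORES = 2
NUM_DEPTS = 2
NUM_MONTHS = 3

# Every entry starts at 10.0; subscripts are [store][dept][month].
monthlySales = [[[10.0] * NUM_MONTHS for i in range(NUM_DEPTS)]
                for j in range(NUM_STORES)]
monthlySales[1][0][2] = 70.0   # one larger value: store 1, dept 0, month 2

total = 0
for m in range(NUM_MONTHS):
    for d in range(NUM_DEPTS):
        for s in range(NUM_STORES):
            total += monthlySales[s][d][m]

# 11 entries of 10.0 plus one of 70.0 is 180.0, spread over 12 entries.
average = total / (NUM_MONTHS * NUM_DEPTS * NUM_STORES)
print(average)  # 15.0
```

The order of the three loops does not change the total; what matters is that each subscript is used in the position it was declared in.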

Problem: student data in a file
- The data is laid out as: Name, section, gpa
    John Smith, 15, 3.2
    Ralph Johnson, 12, 3.9
    Bob Brown, 9, 2.5
    Etc.

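Read in the data
    inf = open("students", "r")
    studs = []
    for line in inf:
        data = line.strip().split(", ")   # strip removes the newline at the end of the line
        data[1] = int(data[1])            # section as an integer
        data[2] = float(data[2])          # gpa as a float, so comparisons are numeric
        studs.append(data)
    inf.close()
    # studs looks like [["John Smith", 15, 3.2],
    # ["Ralph Johnson", 12, 3.9], ["Bob Brown", ...]]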

Find the student with highest GPA
    max = 0
    for j in range(1, len(studs)):
        if studs[max][2] < studs[j][2]:
            max = j
    # max is now location of highest gpa
- studs[max][0] is the name of the student
- studs[max][1] is the student's section
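The reading and searching steps can be tried without a file by putting the sample lines in a list. This is a sketch under that assumption; the names lines and best are illustrative, and the section and gpa are converted to numbers so that the comparison is numeric rather than alphabetical.

```python
# Stand-in for the lines of the "students" file.
lines = ["John Smith, 15, 3.2\n",
         "Ralph Johnson, 12, 3.9\n",
         "Bob Brown, 9, 2.5\n"]

studs = []
for line in lines:
    name, section, gpa = line.strip().split(", ")
    studs.append([name, int(section), float(gpa)])

# Each student is a row: [name, section, gpa].
best = 0
for j in range(1, len(studs)):
    if studs[best][2] < studs[j][2]:
        best = j

print(studs[best][0])  # Ralph Johnson
print(studs[best][1])  # 12
```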