Microprocessors and microcontrollers Digital technics review and programming last modified: 2021. 3. 30.

II. 2. Instruction set (processor)
- Instruction set: the set of instructions a processor implements in hardware.
- Machine code: a program consisting of instructions from the instruction set, in binary or hexadecimal format. The processor can generally run it natively, without the need for other software.

Instruction set, programming
- Assembly: the lowest level programming language; processor dependent.
- It is made by assigning easy-to-remember words (mnemonics) to machine code instructions. It also makes data and number formatting easier, makes labels and constants available, and provides some simple functions.

Instruction set, programming
- Software written in other programming languages has to be converted to machine code using a compiler (in older times, by hand). For very high level languages this can involve several intermediate steps.
- Question: in what language, and on what machine, are compilers written?

Programming languages

Low level languages
- Nearest to hardware,
- thus easiest to compile to machine code,
- best for optimizing hardware resource usage.
- Generally, assembly languages are in this category.

Assembly
- Practically a readable version of machine code.
- Takes a lot of time and effort to implement complex functions (although code parts can be reused).
- Hard to get an overview of and to debug; needs lots of comments.
- Can access hardware directly and can use all special functions of the hardware, including special CPU/GPU instructions.
- Code can be made small in size, small in memory footprint, fast to run, and optimized for the hardware.

Assembly
- More suited to small programs, low-end hardware, microcontrollers and embedded systems with special needs.
- E.g. military or space uses: small memory, low speed, very high reliability requirements (but high level languages are also used there, see e.g. Ada, made for military use).
- Compilers were often written in assembly, especially in older times.
- Some special functions of operating systems are written in assembly.
- It is nevertheless possible to write a whole modern graphical operating system in assembly (see e.g. MenuetOS).

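Assembly example (8086)
clrscr proc near
    mov ax, 0b800h          ; segment of the colour text-mode video memory
    mov es, ax
    mov di, 0               ; offset of the first character cell
    mov al, ' '             ; character: space
    mov ah, 07d             ; attribute: light grey on black
loop_clear_12:
    mov word ptr es:[di], ax
    add di, 2               ; each cell is one word (character + attribute)
    cmp di, 4000            ; 80 x 25 cells x 2 bytes
    jl  loop_clear_12
    ret
clrscr endp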

Addressing modes of instructions
- Immediate addressing: the instruction includes a constant
  - e.g. mvi register, data (mvi A, 9)
- Register addressing: from one register (inside the CPU) to another
  - e.g. mov register1, register2 (register2 -> register1) (mov ax, bx)
- Direct addressing: the instruction includes the memory address to load data from
  - e.g. lda address (lda 0xF000)
- Indirect addressing: a register (pointer) stores the memory address to load from
  - e.g. mov register, M (mov A, M; the HL register pair is the pointer)
- Relative addressing: the instruction contains a relative address (offset) from a pre-defined base address (which is also stored in a register)

Instruction set example: Intel 8085A
- 8 bit data bus, 16 bit address bus
- clock frequency of a few MHz
- its instruction set was the basis for the 8086 and then the x86 series of processors
- it is also similar to many current 8 bit microcontrollers

8085
- Instruction types
  - arithmetic: add, subtract, increment
  - logic: and, or, xor, complement (negate)
  - bitwise: rotate (as in a shift register)
  - data moving: move, exchange, push, pop
  - branch: jump, call, return
  - conditional: jump on condition (jz, jnz, jc, etc.)
  - I/O: in, out

8085 addressing modes example
- mov r1, r2
  - 1 byte, 1 machine cycle, r2 -> r1
- mov r, M
  - 1 byte, 2 machine cycles, (HL) -> r
  - the HL register pair contains the memory address
- mov M, r
  - 1 byte, 2 machine cycles, r -> (HL)
  - the HL register pair contains the memory address
- mvi r, data
  - 2 bytes (code + data), 2 machine cycles, data -> r
- mvi M, data
  - 2 bytes (code + data), 3 machine cycles, data -> (HL)
- lxi rp, data
  - 3 bytes (code + 2 bytes of data), 3 machine cycles, data -> rp (register pair)

8085 addressing modes example
- lda addr
  - 3 bytes (code + 2 byte address), 4 machine cycles, (addr) -> A
  - the memory cell's content goes into the accumulator (A)
- sta addr
  - 3 bytes, 4 machine cycles, A -> (addr)
- lhld addr
  - 3 bytes, 5 machine cycles, (addr) -> HL
  - the memory word (2 bytes) at address addr goes into the HL register pair
- shld addr
  - 3 bytes, 5 machine cycles, HL -> (addr)
- ldax rp
  - 1 byte, 2 machine cycles, (rp) -> A
  - the memory cell addressed by BC or DE goes into A
- stax rp
  - 1 byte, 2 machine cycles, A -> (rp)
- xchg
  - 1 byte, 1 machine cycle, HL <=> DE
  - exchanges the contents of the two register pairs

Medium level languages
- e.g. C (though often put in other categories)
  - developed in the early 1970s, before microprocessors became widespread, for programming the Unix operating system
  - an integral part of Unix and Linux systems
  - often used to write operating systems and compilers
  - "mother" of many modern programming languages
  - fewer specialised data types; pointers are used for strings and arrays; pointers to pointers, chained (linked) lists etc. can be used
  - can reinterpret the data type of an existing variable (without conversion)
  - allows assignment inside a condition (e.g. if(a=b) and if(a==b) are both allowed!)

C
- examples
  - int array[10];
    - what is the value of array[11]?
    - what is the value of array[-1]?
  - if (a=b) ... must we really allow it?
    - instead of a=b; if (a==true) ...
  - a+=2; is it really necessary (instead of a=a+2;)?
    - (by the way, there is a difference in compilation)

Higher level languages
- In high level languages simple instructions can carry out complex jobs.
- Closer to human languages and logic.
- Easier to use, with more specialised data types.
- Easier and faster to develop and debug, and easier to get an overview of,
- though sometimes finding a bug requires looking at the assembly or machine code, if the bug is hardware dependent.

Higher level languages
- High level languages: Fortran, Algol, Cobol, Basic, Pascal, Python, Perl, PHP, Ruby, C#, Java, ...
- Modern versions often help with creating a graphical user interface (GUI) (Visual Basic, Visual C, LabVIEW etc.).
- Compiled code is potentially larger and/or slower than when using C or assembly.
- They often need a "runtime engine" and have larger hardware requirements (e.g. .NET, Java).
- Less chance of syntax or algorithm errors, but errors related to the compiler, operating system or hardware are harder to find.

Special high level languages
- Windowed system development
  - Visual Basic / C / whatever
  - the program's visual elements (windows, buttons etc.) are placed in a graphical UI; traditional program code is written for each element (the IDE helps with this too)

Visual C#

Special high level languages
- Dataflow programming
  - LabVIEW, VEE, Simulink
  - the program's visual elements are created in a GUI, and the corresponding code is also created in a GUI, in the form of a flowchart
  - LabWindows/CVI: a form of LabVIEW in which you can write traditional "typed" code

Labview

Special high level languages
- Mathematical, logical
  - MATLAB, Maple, Mathematica
  - R, S (statistics)
  - Coq (theorem proving), SML (?), Haskell (?)

Coq

Special high level languages
- Scripting languages
  - for small, quick tasks; limited abilities
  - often built into the operating system (especially Unix/Linux)
  - or built into, or supporting, a markup language (JavaScript, Flash, partly PHP, Java)
  - often start as simple script languages with limited abilities, but end up as full programming languages

Markup languages
- Markup: formatted data, usually text documents
- E.g. XML, HTML
- Aided by style languages such as CSS
- XML etc. are also used for human-readable data storage
- and even for image formats (SVG is XML based)

Database query languages
- Contain commands for creating and managing databases
  - e.g. dBase, SQL

Network systems
- Server side languages
  - run by web server programs
  - e.g. PHP, Java, Python, Go, ASP
- Client side languages
  - run by browsers and runtime engines
  - e.g. HTML+CSS
  - e.g. JavaScript, Java, Flash (once)

Portability of source code
- Portable code: source that can be compiled (without modification or with little modification) on other hardware or another operating system
- this requires an existing compiler for that hardware (the code need not be compiled on the target hardware itself: cross compilation)

Portability of source code
- Usually needed when porting to systems which are similar in hardware, software environment and general use. E.g. when porting between home computers using x86, 68000 series or PowerPC processors (MS or Apple operating systems).
- Not necessarily a good idea when porting from a PC to a tablet (see Win 8), or to industrial or embedded systems...
- If a programming language exists on different systems, it doesn't mean the code is portable!
  - C is used on PCs and on microcontrollers as well, but what a difference...

Readability
- "Read-only programming language" (easy to read, hard to write)
- "Write-only programming language" (easy to write, hard to read)
- And then this...

Programming languages
- interpreted
- compiled
  - to machine code
  - to bytecode
- mixed (intermediate), runtime compiled

Operating system
- Technically not necessary: programs can be run without it, but it is not convenient to do so.
- An operating system:
  - provides a user interface (UI) to load stored programs easily (it need not be graphical (GUI); even a command line is a huge help compared with the original computers)
  - provides an application programming interface (API), which makes programming easier and programs more portable

Operating system
- provides a layer between hardware and user software
  - when the hardware changes, only the driver software changes; end-user software sees the same API
- The API can be a library of a programming language or a function call the operating system provides.
- The BIOS is the "first part" of the operating system (but is independent of it); it also provides some simple function calls.
- The operating system can allow multitasking: it has to keep track of running processes and allocate their CPU time; it also helps with distributing jobs across multiple processors.

Data formats

Number systems
- Most computers store and process numbers in binary form.
- In telecommunication, bases 2, 8, 64 etc. are also used (mostly in the form of QAM or QPSK modulation).
- Users mostly see the decimal form; programmers use decimal, hexadecimal (base 16) or, less frequently, octal and binary forms.

Number systems
Examples of how numbers are written in programming environments or books:
- Decimal: D'223', 223D, 223₁₀
- Hexadecimal: H'2F', h2F, 2Fh, 0x2F, 2F₁₆
- Binary: B'10101100', b10101100, 10101100₂

Number systems
- Decimal to binary conversion (natural notation):
  - Divide by two, repeat with int(result), and read the remainders from the bottom up (note: the last remainder, i.e. the first binary digit, is always 1)
- Binary to decimal conversion:
  - Sum the powers of two at the positions where the binary digit is 1
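The two conversions can be sketched in a few lines of Python. This is only an illustration of the steps listed on the slide; the function names `to_binary` and `from_binary` are made up for this example.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer by repeated division by two."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # int(result) and the remainder
        remainders.append(str(r))
    # The remainders are read "from the bottom up", i.e. in reverse order.
    return "".join(reversed(remainders))


def from_binary(bits: str) -> int:
    """Sum the powers of two at the positions where the digit is 1."""
    return sum(2 ** i for i, b in enumerate(reversed(bits)) if b == "1")


print(to_binary(83))            # 1010011
print(from_binary("10101100"))  # 172
```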

Number systems
- Hexadecimal (base 16) digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
- One digit translates to 4 bits, so an 8 bit number (byte) is always two hexadecimal digits: 3Ah = 0011 1010b

Number formats
- Binary can mean:
  - "normal" binary integers
  - signed integers
  - fixed point and floating point fractions
  - traditional fractions
  - binary coded decimal (BCD) integers
  - fixed point decimal (BCD) fractions or BCD traditional fractions
  - etc.

Integer numbers
- Unsigned integer (uint)
- Signed integer (int)
  - possible storage formats:
    - sign bit
    - two's complement (the most popular)
    - offset binary, excess-K
  - the range in absolute value is halved

Sign bit format
- the first bit is the sign
- 01010011b = 83d
- 11010011b = -83d
- range: -127 ... +127
  - 11111111 ... 01111111
- there are two zeros (00000000, 10000000)
  - a problem when counting
- harder to add numbers
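A small Python sketch of the sign bit format, assuming 8 bit storage. The helper names are hypothetical; the example reproduces the two bit patterns above and shows the two zeros.

```python
def sign_magnitude_encode(value: int, bits: int = 8) -> int:
    """Return the bit pattern: top bit is the sign, the rest is |value|."""
    limit = (1 << (bits - 1)) - 1            # 127 for 8 bits
    if not -limit <= value <= limit:
        raise ValueError("value out of range")
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def sign_magnitude_decode(pattern: int, bits: int = 8) -> int:
    """Inverse of sign_magnitude_encode."""
    sign_mask = 1 << (bits - 1)
    magnitude = pattern & (sign_mask - 1)
    return -magnitude if pattern & sign_mask else magnitude


print(format(sign_magnitude_encode(83), "08b"))    # 01010011
print(format(sign_magnitude_encode(-83), "08b"))   # 11010011
# Both 00000000 and 10000000 decode to zero:
print(sign_magnitude_decode(0b00000000), sign_magnitude_decode(0b10000000))
```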

Two's complement
- To negate a binary number, invert its bits and add 1.
- An adder circuit can now do subtraction
  - simply add as with unsigned integers
  - additional types of carry bit can appear (signed carry, i.e. overflow, and unsigned carry)
- comparison problem: when compared as unsigned values, negative numbers appear larger than positive ones
- range in 8 bits: -128 ... +127
- first bit 1: negative; first bit 0: zero or positive
- only one zero
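The "invert and add 1" rule, and subtraction by plain addition, can be tried out with the following sketch. It assumes 8 bit words; the function names are invented for the example.

```python
def twos_complement(pattern: int, bits: int = 8) -> int:
    """Negate a bit pattern: invert all bits, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return ((pattern ^ mask) + 1) & mask


def to_signed(pattern: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a two's complement signed number."""
    if pattern & (1 << (bits - 1)):          # first bit 1: negative
        return pattern - (1 << bits)
    return pattern


minus_83 = twos_complement(83)
print(format(minus_83, "08b"))               # 10101101
print(to_signed(minus_83))                   # -83

# Subtraction 100 - 83 done by an unsigned add; the carry out is dropped.
print((100 + minus_83) & 0xFF)               # 17

# Compared as unsigned values, -83 (173) looks larger than +83.
print(minus_83 > 83)                         # True
```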

Offset binary
- Add 2^(n-1) to the signed number to make it unsigned for storage (n is the number of bits)
- The lower part of the range is negative, the middle is zero, the upper part is positive
- Practically the same as two's complement, but with the first bit inverted
- Easier comparison
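A minimal sketch of offset binary for n = 8, with hypothetical helper names. It also checks the relation to two's complement (first bit inverted) and the easier comparison.

```python
def offset_encode(value: int, bits: int = 8) -> int:
    """Store a signed value as unsigned by adding 2^(bits-1)."""
    return value + (1 << (bits - 1))


def offset_decode(pattern: int, bits: int = 8) -> int:
    """Recover the signed value from the stored pattern."""
    return pattern - (1 << (bits - 1))


print(format(offset_encode(-83), "08b"))     # 00101101
print(format(offset_encode(0), "08b"))       # 10000000
print(format(offset_encode(83), "08b"))      # 11010011

# Same as the 8 bit two's complement pattern with the first bit inverted:
print(offset_encode(-83) == ((-83 & 0xFF) ^ 0x80))    # True

# Stored patterns keep the order of the signed values:
print(offset_encode(-1) < offset_encode(1))           # True
```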

Numbers larger than data bus width
- E.g. an 8 bit system on which I want to use 16 bit numbers
- Of course it is possible, it just takes more time
- E.g. addition:
  - first add the least significant bytes (LSB); the result is the LSB of the end result, and the addition sets the Carry flag bit
  - add the next two bytes and add the carry bit to the sum (special add-with-carry instructions may be available)
  - and so on
  - The case is similar to adding the individual bits. See also the carry-lookahead adder.
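The byte-by-byte addition can be mimicked in Python as below. This is a sketch that models an 8 bit adder with ordinary integers; `add16_on_8bit` is a made-up name, not an instruction of any particular processor.

```python
def add16_on_8bit(a: int, b: int):
    """Add two 16 bit numbers using only 8 bit additions and a carry bit."""
    a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF
    b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF

    lo = a_lo + b_lo                 # first add the least significant bytes
    carry = lo >> 8                  # carry flag produced by that addition
    hi = a_hi + b_hi + carry         # "add with carry" on the next bytes
    carry_out = hi >> 8

    result = ((hi & 0xFF) << 8) | (lo & 0xFF)
    return result, carry_out


print(add16_on_8bit(0x12FF, 0x0001))   # (4864, 0), i.e. 0x1300 with no carry
print(add16_on_8bit(0xFFFF, 0x0001))   # (0, 1), result wraps and carry is set
```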

Fractional numbers
- Fixed point
- Floating point
- Traditional fraction

Fixed point
- practically stored as an integer (e.g. when there are no floating point ALU capabilities)
- If m is the original number (the stored integer)
- and f is the number of fractional (binary) digits,
- the fractional number is m / 2^f

Floating point
- s: sign
- b: base (2 or 10)
- c: significant digits (as in "0.c")
- q: exponent (two's complement or offset)

IEEE 754 (floating point standard)
- the first significant digit is always 1 -> no need to store it; thus e.g. out of 24 bits we need to store only 23 (the 24th bit will be the sign bit)
- special values: +0, -0, +inf, -inf, NaN, subnormal
- NaN (not a number): exponent all 1s, rest not 0
- inf (infinity): exponent all 1s, rest 0

IEEE 754 (floating point)
- Single precision: 32 bits (24+8)
  - approx. 6 to 7 decimal digits of precision
  - exponent format: add 127, thus -126 ... +127 (offset binary)
  - max: (2 - 2^-23) × 2^127 ≈ 3.402823 × 10^38
- Double precision: 64 bits (53+11)
  - approx. 16 decimal digits of precision
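To see the single precision layout in practice, the sketch below splits a 32 bit float into its sign, stored exponent and stored 23 bit fraction using Python's standard `struct` module. The function names are invented for the example, and `float32_value` handles normal numbers only (no zeros, subnormals, inf or NaN).

```python
import struct


def decode_float32(x: float):
    """Return (sign, stored exponent, stored 23 bit fraction) of x as float32."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF      # offset binary: real exponent + 127
    fraction = raw & 0x7FFFFF          # leading 1 is implied, not stored
    return sign, exponent, fraction


def float32_value(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normal number from its three fields."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


print(decode_float32(1.0))                   # (0, 127, 0)
print(decode_float32(-2.5))                  # (1, 128, 2097152)
print(decode_float32(float("inf")))          # (0, 255, 0)
print(float32_value(0, 254, 0x7FFFFF))       # largest finite single precision value
```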

BCD - Binary Coded Decimal
- 4 bits store a decimal digit
  - e.g. 15 (D) = 0001 0101 (BCD) (natural or 8421 coding)
  - there are other encodings as well
- "unpacked": one digit per byte (upper four bits are zeros); "packed": two digits per byte
- Can be fixed point or floating point. Used especially in financial systems.
- For displaying numbers to humans
  - in some simpler systems it is easier to store and operate in BCD; in others the binary number is converted before display
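Packed and unpacked BCD (8421 coding) can be illustrated as follows. The helper names are hypothetical, and the sketch handles non-negative integers only.

```python
def to_unpacked_bcd(n: int) -> bytes:
    """One decimal digit per byte, upper four bits zero."""
    return bytes(int(d) for d in str(n))


def to_packed_bcd(n: int) -> bytes:
    """Two decimal digits per byte (8421 coding)."""
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits                  # pad to an even number of digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


def from_packed_bcd(data: bytes) -> int:
    """Convert packed BCD back to a binary integer."""
    value = 0
    for byte in data:
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return value


print(format(to_packed_bcd(15)[0], "08b"))     # 00010101
print(to_packed_bcd(1234).hex())               # 1234
print(to_unpacked_bcd(15).hex())               # 0105
print(from_packed_bcd(bytes([0x12, 0x34])))    # 1234
```

Note how the hexadecimal dump of packed BCD reads like the decimal number, which is why BCD is convenient for display.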

Storing multi-byte numbers
- word: a number (or piece of data) made up of multiple bytes
- generally two types of storage/transmission method:
- little endian
  - LSB (least significant byte) first: it is stored at the lowest address
  - easier hardware for addition (which starts with the LSB)
- big endian
  - MSB (most significant byte) stored at the lowest address

Storing multi-byte numbers
- Little endian:
  - Intel/AMD x86, x86-64 series
- Big endian:
  - Motorola 68000 descendants
  - AVR32
  - IBM System/360, z/Architecture
  - Internet Protocol (IP, TCP, UDP)
- Bi-endian (configurable):
  - ARM (from ARM 3 onwards), PowerPC, Alpha, MIPS, Itanium etc.
- You may meet this problem in communications (e.g. when connecting a PC to a data acquisition device)
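The two byte orders, and what goes wrong when sender and receiver disagree, can be shown with Python's built-in `int.to_bytes` and `int.from_bytes`. The value 0x12345678 is just an arbitrary example.

```python
value = 0x12345678

little = value.to_bytes(4, "little")   # LSB at the lowest address
big = value.to_bytes(4, "big")         # MSB at the lowest address

print(little.hex())                    # 78563412
print(big.hex())                       # 12345678

# A device sends little endian data, the receiver assumes big endian:
misread = int.from_bytes(little, "big")
print(hex(misread))                    # 0x78563412
```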

Boolean type
- Boolean is a data type storing True or False information. Used for conditional statements, comparisons and function return values.
- The smallest stored unit in computers is usually a byte, although only one bit is needed here.
- Variations:
  - 0 is false, 1 is true
  - 0 is false, non-0 is true (just checks 1st bit!)
  - 0 is false, -1 is true (works with the above, as -1 is 1111... in two's complement)
  - etc.

Arrays, strings
- Array: realization of a matrix
- Support varies in hardware and software
- String: a variable containing several characters; a character chain; a 1D array (vector) of characters

Arrays
- Hardware support:
  - scalar processors: access one element at a time; access by pointers
    - hardware support for pointers: certain machine code instructions can use certain CPU registers to access RAM (the register contains the memory address, so the register is the pointer); a pointer can also be stored in RAM
  - vector processors: SIMD (single instruction, multiple data); one instruction processes an array

Arrays
- Using pointers
  - a pointer gives the base address, i.e. the first element of the array; we can add an offset to this
  - e.g. int *array is a pointer (at memory address x) that contains memory address y. Address y contains the first element of the array.
  - in array[i], i is the offset, i.e. the i-th element of the array
  - array[0] (same as *array) thus refers to base address + 0, i.e. the first element; array[5] thus refers to the sixth element
  - therefore in many languages (e.g. C and similar), array indices start at 0
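The base address plus offset idea can be simulated in Python with a byte array standing in for RAM. Everything here is hypothetical: the memory size, the base address 16 and the assumption of 4 byte little endian integers. Note that the byte address is base + index × element size, which is the scaling a C compiler does for array[i].

```python
import struct

ELEMENT_SIZE = 4                  # assumed size of one int in bytes
memory = bytearray(64)            # hypothetical RAM
base = 16                         # hypothetical address held by the pointer


def store(base_address: int, index: int, value: int) -> None:
    """Write array[index] = value."""
    struct.pack_into("<i", memory, base_address + index * ELEMENT_SIZE, value)


def load(base_address: int, index: int) -> int:
    """Read array[index]."""
    return struct.unpack_from("<i", memory, base_address + index * ELEMENT_SIZE)[0]


for i in range(10):
    store(base, i, i * i)

print(load(base, 0))              # 0, first element at base + 0
print(load(base, 5))              # 25, sixth element at base + 5 * 4
```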

Arrays
- multi-dimensional arrays:
  - if the language supports only 1D arrays: create a vector (1D array) whose elements are pointers, each pointing to a vector itself; thus a 2D array is made
  - addressing is then e.g. array[i][j]

Characters
- Alphanumerical characters are stored by using a table to turn them into numbers;
- e.g. BCDIC, ASCII, Unicode

ASCII
- American Standard Code for Information Interchange
  - originally a 7 bit code; the 8th bit could be parity if needed
  - thus 128 characters are available
  - 95 printable characters (lowercase and uppercase alphabet, digits, punctuation, space)
  - 33 control characters (for teletypes, printers), including new line, carriage return, end of file, tab, DEL etc.
  - lowercase and uppercase letters differ in one bit only (easier conversion)
  - digit characters contain the digit's value in binary (last four bits)
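The last two properties can be checked directly. The sketch below uses hypothetical helper names and is meant for ASCII letters and digits only.

```python
def toggle_case(letter: str) -> str:
    """Flip bit 5 (value 0x20), the only bit that differs between cases."""
    return chr(ord(letter) ^ 0x20)


def digit_value(digit: str) -> int:
    """The low four bits of an ASCII digit hold its numeric value."""
    return ord(digit) & 0x0F


print(format(ord("A"), "07b"), format(ord("a"), "07b"))   # 1000001 1100001
print(toggle_case("a"), toggle_case("Q"))                 # A q
print(format(ord("7"), "07b"), digit_value("7"))          # 0110111 7
```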

ASCII

ASCII
- Extensions
  - the 8th bit is used, giving another 128 characters
  - (too) many versions exist ("code pages")
  - they include accented characters, non-Latin alphabets, graphical characters
  - e.g. Code page 437 (IBM PC); ISO 8859-x
  - software can switch between the code pages
  - try not to use these characters in file names or as serial communication control characters

Unicode
- Made to be able to express most of the world's writing systems.
- As of Unicode 13.0 it contains 143 859 characters and 154 scripts (alphabets) (even emojis)
- Includes modifier characters (e.g. n~ => ñ),
- and also joining characters for ligatures etc.
- Codes are often written in the form U+20AC (four hexadecimal digits)

Unicode encodings
- UTF-8
  - variable width encoding: 1 to 4 bytes
  - today the dominant encoding on the WWW
  - the first 128 characters are the same as ASCII
  - the 2 byte form covers Latin-based, Greek, Cyrillic, Arabic, Hebrew, Armenian etc.
  - with the 4 byte form all characters can be encoded (only 21 bits are used)
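The variable width of UTF-8 is easy to observe with Python's built-in `str.encode`. The four sample characters are arbitrary choices for this illustration, and `utf8_length` is a made-up helper.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "Latin capital A": "A",            # U+0041, plain ASCII
    "n with tilde": "\u00f1",          # U+00F1
    "euro sign": "\u20ac",             # U+20AC
    "emoji": "\U0001F600",             # U+1F600
}

for name, ch in samples.items():
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {name}: {len(encoded)} byte(s), {encoded.hex()}")
```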

Unicode encodings
- UTF-16
  - variable length: 2 bytes or 2 x 2 bytes
  - used by e.g. Windows 2000 and above
- UTF-32
  - fixed length: 4 bytes (21 bits are used for code points)
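The "2 bytes or 2 x 2 bytes" rule of UTF-16 can be sketched as below. `surrogate_pair` is a hypothetical helper that computes the two 16 bit units for a code point above U+FFFF; the result is compared with what Python's own encoder produces. The little endian codec variants are used so that no byte order mark is added.

```python
import struct


def surrogate_pair(code_point: int):
    """Split a code point above U+FFFF into two 16 bit UTF-16 units."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


euro = "\u20ac"           # U+20AC, fits in one 16 bit unit
emoji = "\U0001F600"      # U+1F600, needs two 16 bit units

print(len(euro.encode("utf-16-le")))     # 2
print(len(emoji.encode("utf-16-le")))    # 4

units = struct.unpack("<2H", emoji.encode("utf-16-le"))
print([hex(u) for u in units])           # ['0xd83d', '0xde00']
print(units == surrogate_pair(ord(emoji)))   # True

# UTF-32 always uses 4 bytes per code point.
print(len(euro.encode("utf-32-le")), len(emoji.encode("utf-32-le")))   # 4 4
```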