DMA Programming in Linux Embedded System Lab II

DMA Programming in Linux
Kyung Hee University, Department of Computer Engineering, Jinsung Cho (조진성)
Embedded System Lab. II


Direct memory access
- Buffering
  - Temporarily storing data in memory before processing
  - Data accumulated in peripherals is commonly buffered
- The microprocessor could handle this with an ISR
  - Storing and restoring the microprocessor state is inefficient
  - The regular program must wait
- A DMA controller is more efficient
  - Separate single-purpose processor
  - The microprocessor relinquishes control of the system bus to the DMA controller
  - The microprocessor can meanwhile execute its regular program
    - No inefficient storing and restoring of state due to an ISR call
    - The regular program need not wait unless it requires the system bus
      - Harvard architecture: the processor can fetch and execute instructions as long as they do not access data memory; if they do, the processor stalls

Peripheral to memory transfer (using a vectored interrupt, without DMA)
1(a): μP is executing its main program.
1(b): P1 receives input data in a register with address 0x8000.
2: P1 asserts Int to request servicing by the microprocessor.
3: After completing the instruction at 100, μP sees Int asserted, saves the PC's value of 100, and asserts Inta.
4: P1 detects Inta and puts interrupt address vector 16 on the data bus.
5(a): μP jumps to the address on the bus (16). The ISR there reads data from 0x8000 and then writes it to 0x0001, which is in memory.
5(b): After being read, P1 deasserts Int.
6: The ISR returns, thus restoring PC to 100+1=101, where μP resumes executing.

Peripheral to memory transfer (using a vectored interrupt, without DMA): figure sequence
[Figures: one frame per step, 1 to 6, showing μP (PC, Int, Inta), program memory, data memory (0x0000, 0x0001), the system bus, and peripheral P1 (vector 16, register 0x8000).]
Program memory shown in every frame:
    ISR
    16: MOV R0, 0x8000
    17: # modifies R0
    18: MOV 0x0001, R0
    19: RETI # ISR return
    ...
    Main program
    ...
    100: instruction
    101: instruction

Peripheral to memory transfer using DMA (1)
1(a): μP is executing its main program. It has already configured the DMA ctrl registers.
1(b): P1 receives input data in a register with address 0x8000.
2: P1 asserts req to request servicing by DMA ctrl.
3: DMA ctrl asserts Dreq to request control of the system bus.
4: After executing instruction 100, μP sees Dreq asserted, releases the system bus, asserts Dack, and resumes execution. μP stalls only if it needs the system bus to continue executing.
5: DMA ctrl (a) asserts ack, (b) reads data from 0x8000, and (c) writes that data to 0x0001.
6: DMA ctrl de-asserts Dreq and ack, completing the handshake with P1.
7(a): μP de-asserts Dack and resumes control of the bus.
7(b): P1 de-asserts req.

Peripheral to memory transfer using DMA (2) to (6): figure sequence
[Figures: one frame per step, 1 to 6, showing μP (PC = 100, Dreq, Dack), program memory (main program at 100 and 101), data memory (0x0000, 0x0001), the system bus, DMA ctrl (0x0001, 0x8000, ack, req), and peripheral P1 (register 0x8000).]
- No ISR needed!
- During step 5, the processor is still executing if it is not stalled.
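The handshake in steps 1(b) to 7 can be sketched as a small software model. This is only an illustration under assumptions: the `dma_model` struct, its field names, and `dma_transfer_once` are hypothetical, and real hardware performs these steps with signals over several clock cycles rather than with sequential assignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the signals and storage in the slides' example. */
typedef struct {
    bool req, ack;      /* P1 <-> DMA ctrl handshake */
    bool dreq, dack;    /* DMA ctrl <-> uP bus handshake */
    uint8_t p1_reg;     /* P1 data register (address 0x8000 in the slides) */
    uint8_t mem_0001;   /* data memory location 0x0001 */
    unsigned isr_calls; /* never incremented: no ISR is involved with DMA */
} dma_model;

/* Walks through steps 1(b) to 7 once for a single input byte. */
void dma_transfer_once(dma_model *m, uint8_t input)
{
    m->p1_reg = input;          /* 1(b): P1 receives input data */
    m->req = true;              /* 2: P1 asserts req */
    m->dreq = true;             /* 3: DMA ctrl asserts Dreq */
    m->dack = true;             /* 4: uP releases the bus and asserts Dack */
    assert(m->dreq && m->dack); /* DMA ctrl may use the bus only after Dack */
    m->ack = true;              /* 5(a): DMA ctrl asserts ack */
    m->mem_0001 = m->p1_reg;    /* 5(b), 5(c): read 0x8000, write 0x0001 */
    m->dreq = false;            /* 6: DMA ctrl de-asserts Dreq and ack */
    m->ack = false;
    m->dack = false;            /* 7(a): uP resumes control of the bus */
    m->req = false;             /* 7(b): P1 de-asserts req */
}
```

Calling `dma_transfer_once(&m, 0x5A)` on a zero-initialized model leaves `m.mem_0001 == 0x5A` with all four handshake signals back at 0 and `isr_calls` still 0.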

Arbitration: Priority arbiter
- Consider the situation where multiple peripherals request service from a single resource (e.g., microprocessor, DMA controller) simultaneously: which gets serviced first?
- Priority arbiter
  - Single-purpose processor
  - Peripherals make requests to the arbiter, and the arbiter makes requests to the resource
  - The arbiter is connected to the system bus for configuration only
[Figure: microprocessor (Int, Inta) on the system bus; priority arbiter with Ireq1/Iack1 to Peripheral 1 and Ireq2/Iack2 to Peripheral 2; arrows numbered 2, 3, 5, 6, 7.]

Arbitration using a priority arbiter
1. The microprocessor is executing its program.
2. Peripheral 1 needs servicing, so it asserts Ireq1. Peripheral 2 also needs servicing, so it asserts Ireq2.
3. The priority arbiter sees at least one Ireq input asserted, so it asserts Int.
4. The microprocessor stops executing its program and stores its state.
5. The microprocessor asserts Inta.
6. The priority arbiter asserts Iack1 to acknowledge Peripheral 1.
7. Peripheral 1 puts its interrupt address vector on the system bus.
8. The microprocessor jumps to the address of the ISR read from the data bus; the ISR executes and returns (and completes the handshake with the arbiter).
9. The microprocessor resumes executing its program.

Arbitration: Priority arbiter
- Types of priority
  - Fixed priority
    - Each peripheral has a unique rank
    - The highest rank is chosen first when requests are simultaneous
    - Preferred when there is a clear difference in rank between peripherals
  - Rotating priority (round-robin)
    - Priority changes based on the history of servicing
    - Better distribution of servicing, especially among peripherals with similar priority demands
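The difference between the two priority types can be sketched in a few lines of C. This is a minimal illustration, not an arbiter implementation from the slides: the request bitmask encoding, `NUM_PERIPHERALS`, and both function names are assumptions, and "lowest index wins" is just one possible fixed ranking.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PERIPHERALS 4 /* illustrative size, not taken from the slides */

/* Fixed priority: the lowest-numbered requester always wins.
 * Bit i of requests is 1 when peripheral i requests service.
 * Returns the winner's index, or -1 if nobody is requesting. */
int arbitrate_fixed(uint8_t requests)
{
    for (int i = 0; i < NUM_PERIPHERALS; i++)
        if (requests & (1u << i))
            return i;
    return -1;
}

/* Rotating priority: the search starts just after the last winner, so a
 * peripheral that was just served becomes the lowest priority.
 * *last_served is -1 before the first grant. */
int arbitrate_rotating(uint8_t requests, int *last_served)
{
    assert(*last_served >= -1 && *last_served < NUM_PERIPHERALS);
    for (int k = 1; k <= NUM_PERIPHERALS; k++) {
        int i = (*last_served + k) % NUM_PERIPHERALS;
        if (requests & (1u << i)) {
            *last_served = i;
            return i;
        }
    }
    return -1;
}
```

With peripherals 1 and 2 requesting continuously (`requests = 0x06`), `arbitrate_fixed` grants peripheral 1 every time, while `arbitrate_rotating` alternates 1, 2, 1.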

Arbitration: Daisy-chain arbitration
- Arbitration done by peripherals
  - Built into the peripheral, or external logic added
    - req input and ack output added to each peripheral
- Peripherals connected to each other in daisy-chain manner
  - One peripheral is connected to the resource, all others are connected "upstream"
  - A peripheral's req flows "downstream" to the resource; the resource's ack flows "upstream" to the requesting peripheral
  - The closest peripheral has the highest priority
[Figure: μP (Int, Inta) on the system bus, connected to daisy-chain aware Peripheral 1 and Peripheral 2 through Req_in/Req_out and Ack_in/Ack_out.]

Arbitration: Daisy-chain arbitration
- Pros/cons
  - Easy to add or remove a peripheral: no system redesign needed
  - Does not support rotating priority
  - One broken peripheral can cause loss of access to other peripherals
[Figure: side-by-side comparison of a priority arbiter (Ireq1/Iack1, Ireq2/Iack2) and daisy-chain aware peripherals (Req_in/Req_out, Ack_in/Ack_out), each attached to the microprocessor's Int and Inta.]
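A rough software model shows why the closest peripheral wins and why one broken peripheral cuts off those upstream of it. Everything here is an assumption made for illustration: the `chain_node` struct, `CHAIN_LEN`, and the rule that a broken node forwards neither req nor ack.

```c
#include <assert.h>
#include <stdbool.h>

#define CHAIN_LEN 3 /* illustrative; index 0 is closest to the resource */

typedef struct {
    bool wants_service; /* this peripheral's own request */
    bool broken;        /* assumed: a broken peripheral forwards nothing */
} chain_node;

/* Req as seen by the resource: requests propagate downstream (toward
 * index 0) and are lost at any broken peripheral on the way. */
bool chain_request(const chain_node chain[CHAIN_LEN])
{
    assert(chain != 0);
    bool req = false;
    for (int i = CHAIN_LEN - 1; i >= 0; i--) {
        if (chain[i].broken)
            req = false; /* drops upstream requests and issues none */
        else
            req = req || chain[i].wants_service;
    }
    return req;
}

/* Ack travels upstream from the resource; the first requesting peripheral
 * keeps it. Returns that peripheral's index, or -1 if no request arrived. */
int chain_acknowledge(const chain_node chain[CHAIN_LEN])
{
    if (!chain_request(chain))
        return -1;
    for (int i = 0; i < CHAIN_LEN; i++)
        if (chain[i].wants_service)
            return i;
    return -1;
}
```

If peripherals 1 and 2 both request, peripheral 1 is acknowledged because it is closer to the resource. If peripheral 0 is then marked broken, neither request reaches the resource and `chain_acknowledge` returns -1.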

DMA Controller (DMAC) - PXA255
- The DMAC transfers data to or from main memory in response to requests generated by internal and external peripherals
- The DMAC supports 16 DMA channels
- The DMAC supports only flow-through transfers
- Flow-through data passes through the DMAC before it is stored at the destination
- The DMAC can perform memory-to-memory moves using flow-through transfers
- DMAC channels
  - Each channel is controlled by four 32-bit registers
  - Each channel is configured to service one internal or external device
  - Each channel is serviced in increments of the peripheral device's burst size
  - Each channel's data is delivered in the granularity appropriate to that device's port width

DMA Channel
- The burst size and port width for each device are programmed in the channel registers and are based on the device's FIFO depth and bandwidth needs.
- When multiple channels are actively executing, the DMAC services each channel with a burst of data. After the data burst is sent, the DMAC may perform a context switch to another active channel. The DMAC performs context switches based on a channel's activity, whether its target device is currently requesting service, and where that channel lies in the priority scheme.
- Channel information must be maintained on a per-channel basis and is contained in the DMAC registers; see Table 5-13, "DMA Controller Registers", on page 5-28. The DMAC supports two methods of loading the DMAC registers: No-Descriptor Fetch Mode and Descriptor Fetch Mode. Software must ensure cache coherency when it configures the DMA channels.

DMAC Block Diagram
[Figure: the DMA controller between the internal system bus (to the memory controller) and the internal peripheral bus. Labels: control registers; 16 DMA channels (Channel 0 to Channel 15) with DCSR0, DDADR0, DSADR0, DTADR0, DCMD0; DRCMR0; DINT; inputs DREQ[1:0] (external) and PREQ[37:0] (internal); output DMA_IRQ (internal).]

DMA Signal Descriptions
- The DREQ[1:0], PREQ[37:0], and DMA_IRQ signals are controlled by the DMAC.
- The DREQ[1:0] signal must remain asserted for four MEMCLKs to allow the DMA to recognize the 0-to-1 transition.
- When the DREQ[1:0] signals are deasserted, they must remain deasserted for at least four MEMCLKs.
- The DMAC registers the transition from 0 to 1 to identify a new request.
- The external companion chip must not assert another DREQ until the previous DMA data transfer starts.

DMA_IRQ Signal
- The application processor has 16 IRQ signals, one for each DMA channel.
- Each DMA IRQ can be read in the DINT register.
- The user can mask some bits that cause interrupts on a channel, such as ENDIRQEN, STARTIRQEN, and STOPIRQEN.
- When a DMA interrupt occurs, it is visible in the corresponding bit of the interrupt pending register.
- When a pending interrupt becomes active, it is sent to the CPU if its corresponding ICMR mask bit is set.

DMA Channel Priority Scheme
- The DMA channel priority scheme allows peripherals that require high bandwidth to be serviced more often than those requiring less bandwidth.
- The DMA channels are internally divided into four sets.
  - Each set contains four channels.
  - The channels get a round-robin priority in each set.
  - Set zero has the highest priority.
  - Set one has higher priority than sets two and three.
  - Sets two and three are low-priority sets.
  - High-bandwidth peripherals must be programmed in set zero.
  - Memory-to-memory moves and low-bandwidth peripherals must be programmed in set two or three.
- When all channels are running concurrently, set zero is serviced four times out of any eight consecutive channel servicing instances. Set one is serviced twice, and sets two and three are each serviced once.

DMA Channel Priority Scheme (2)
- If all channels request data transfers, the sets are prioritized in the following order:
  - Set zero -> Set one -> Set zero -> Set two -> Set zero -> Set one -> Set zero -> Set three
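The eight-slot set order combined with round-robin inside each set can be sketched as a small scheduler. This is a hedged model only: the mapping of channels 0-3, 4-7, 8-11, and 12-15 to sets zero to three is an assumption not stated on the slides, as is the rule that a slot whose set has no active channel is simply skipped. The names `dmac_sched` and `next_channel` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Set order from the slide: 0 -> 1 -> 0 -> 2 -> 0 -> 1 -> 0 -> 3. */
static const int set_order[8] = {0, 1, 0, 2, 0, 1, 0, 3};

typedef struct {
    int slot;           /* current position in set_order */
    int next_in_set[4]; /* round-robin pointer inside each set */
} dmac_sched;

/* Picks the next channel to service. Bit n of active is 1 when channel n
 * is requesting. Assumes set s holds channels 4*s to 4*s+3.
 * Returns the channel number, or -1 if no channel is active. */
int next_channel(dmac_sched *s, uint16_t active)
{
    assert(s->slot >= 0 && s->slot < 8);
    for (int tries = 0; tries < 8; tries++) {
        int set = set_order[s->slot];
        s->slot = (s->slot + 1) % 8;
        for (int k = 0; k < 4; k++) {
            int idx = (s->next_in_set[set] + k) % 4;
            int ch = set * 4 + idx;
            if (active & (1u << ch)) {
                s->next_in_set[set] = (idx + 1) % 4;
                return ch;
            }
        }
    }
    return -1;
}
```

Starting from a zeroed `dmac_sched` with all 16 channels active, eight consecutive calls return channels 0, 4, 1, 8, 2, 5, 3, 12: four grants to set zero, two to set one, and one each to sets two and three, which matches the ratio on the previous slide.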

DMA Descriptors
- DMAC operating modes
  - Descriptor Fetch Mode and No-Descriptor Fetch Mode
  - The mode is selected by the DCSRx[NODESCFETCH] bit
  - The modes can be used concurrently, independently for each channel
  - A channel must be stopped before its operating mode is changed

No-Descriptor Fetch Mode
- DDADRx is reserved.
- Software must not write to DDADRx and must load the DSADRx, DTADRx, and DCMDx registers.
- When the RUN bit is set, the DMAC immediately begins to transfer data.
- The channel stops when it finishes the transfer.

No-Descriptor Fetch Mode
1. The channel is in an uninitialized state after reset.
2. The DCSR[RUN] bit is set to a 0 and the DCSR[NODESCFETCH] bit is set to a 1.
3. The software writes a source address to the DSADR register, a target address to the DTADR register, and a command to the DCMD register. The DDADR register is reserved in No-Descriptor Fetch Mode and must not be written.
4. The software writes a 1 to the DCSR[RUN] bit and the No-Descriptor fetches are performed.
5. The channel waits for the request or starts the data transfer, as determined by the DCMD[FLOW] source and target bits.
6. The channel transmits a number of bytes equal to the smaller of DCMD[SIZE] and DCMD[LENGTH].
7. The channel waits for the next request or continues with the data transfer until DCMD[LENGTH] reaches zero.
8. The DDADR[STOP] is set to a 1 and the channel stops.
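Steps 6 and 7 amount to a simple loop: each burst moves min(SIZE, LENGTH) bytes and LENGTH shrinks until it reaches zero. The sketch below only illustrates that arithmetic. It treats DCMD[SIZE] as a plain byte count, as the slides do, and the function name `count_bursts` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the bursts needed to move length bytes when each burst moves
 * min(burst_size, remaining length) bytes. The size of the final burst is
 * stored in *last_burst (0 if length is 0). */
unsigned count_bursts(uint32_t length, uint32_t burst_size, uint32_t *last_burst)
{
    assert(burst_size > 0);
    unsigned bursts = 0;
    *last_burst = 0;
    while (length > 0) {
        /* step 6: transmit the smaller of SIZE and LENGTH */
        uint32_t n = burst_size < length ? burst_size : length;
        length -= n; /* step 7: continue until LENGTH reaches zero */
        *last_burst = n;
        bursts++;
    }
    return bursts;
}
```

For a 2000-byte transfer with 32-byte bursts, this gives 63 bursts: 62 full bursts (1984 bytes) followed by a final burst of 16 bytes.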

Descriptor Fetch Mode
- DMAC registers are loaded from DMA descriptors in main memory.
- Multiple DMA descriptors can be chained together in a list.
- The descriptor protocol design allows descriptors to be added efficiently to the descriptor list of a running DMA stream.

Descriptor Fetch Mode
1. The channel is in an uninitialized state after reset.
2. The software writes a descriptor address (aligned to a 16-byte boundary) to the DDADR register.
3. The software writes a 1 to the DCSR[RUN] bit.
4. The DMAC fetches the four-word descriptor from the memory indicated by DDADR.
5. The four-word DMA descriptor, aligned on a 16-byte boundary in main memory, loads the following registers:
   a. Word [0] -> DDADRx register and a single flag bit. Points to the next four-word descriptor.
   b. Word [1] -> DSADRx register for the current transfer.
   c. Word [2] -> DTADRx register for the current transfer.
   d. Word [3] -> DCMDx register for the current transfer.

Descriptor Fetch Mode (continued)
6. The channel waits for the request or starts the data transfer, as determined by the DCMD[FLOW] source and target bits.
7. The channel transmits a number of bytes equal to the smaller of DCMD[SIZE] and DCMD[LENGTH].
8. The channel waits for the next request or continues with the data transfer until DCMD[LENGTH] reaches zero.
9. The channel stops or continues with a new descriptor fetch from memory, as determined by the DDADR[STOP] bit.
- Software must set the DCSR[RUN] bit to 1 after it loads the DDADR. The channel descriptor fetch does not take place unless the DDADR register is loaded and the DCSR[RUN] bit is set to a 1.
- The DMAC priority scheme does not affect DMA descriptor fetches. The next descriptor is fetched immediately after the previous descriptor is serviced.
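The descriptor steps can be illustrated with a host-side model of a descriptor chain. This is a sketch under assumptions: `hw_desc` mirrors the four 32-bit words named on the slides, while `dma_desc` and `run_chain` are hypothetical and use ordinary C pointers instead of physical addresses so that the example can run on any machine. `memcpy` stands in for the DMAC's bursts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of one descriptor as four 32-bit words (16 bytes), in the order
 * the slides list them. */
typedef struct {
    uint32_t ddadr; /* word 0: next descriptor address plus a flag bit */
    uint32_t dsadr; /* word 1: source address */
    uint32_t dtadr; /* word 2: target address */
    uint32_t dcmd;  /* word 3: command, including LENGTH */
} hw_desc;

/* Host-side stand-in for a descriptor (hypothetical, pointer based). */
typedef struct dma_desc {
    struct dma_desc *next; /* plays the role of DDADR */
    int stop;              /* plays the role of DDADR[STOP] */
    const uint8_t *src;    /* plays the role of DSADR */
    uint8_t *dst;          /* plays the role of DTADR */
    uint32_t length;       /* plays the role of DCMD[LENGTH] */
} dma_desc;

/* Services a chain: each descriptor is copied in bursts of at most burst
 * bytes, then the next descriptor is taken unless stop is set.
 * Returns the total number of bytes moved. */
size_t run_chain(const dma_desc *d, uint32_t burst)
{
    assert(burst > 0);
    size_t total = 0;
    while (d != NULL) {
        const uint8_t *src = d->src;
        uint8_t *dst = d->dst;
        uint32_t left = d->length;
        while (left > 0) { /* steps 7 and 8: burst until LENGTH is zero */
            uint32_t n = burst < left ? burst : left;
            memcpy(dst, src, n);
            src += n;
            dst += n;
            left -= n;
            total += n;
        }
        if (d->stop) /* step 9: stop, or fetch the next descriptor */
            break;
        d = d->next;
    }
    return total;
}
```

Chaining two descriptors that copy "hello" and "world" into adjacent halves of one buffer yields "helloworld" and a return value of 10, regardless of the burst size chosen.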
Byte Transfer Order
- The DCMD[ENDIAN] bit indicates the byte ordering in a word when data is read from or written to memory.
  - The DCMD[ENDIAN] bit must be set to 0, which selects little-endian transfers.
  - If data is being transferred from an internal device to memory, DCMD[ENDIAN] is set to a 0, and DCMD[SIZE] is set to a 1, the memory receives the data in the following order:
    1. Byte[0]
    2. Byte[1]
    3. Byte[2]
    4. Byte[3]

Servicing Internal Peripherals
- The DMAC provides the DMA Request to Channel Map Registers (DRCMRx), which contain four bits used to assign a channel number to each possible DMA request. An internal peripheral can be mapped to any of the 16 available channels.

Using Flow-Through DMA Read Cycles to Service Internal Peripherals
- A flow-through DMA read for an internal peripheral begins when the internal peripheral sends a request, via the PREQ bus, to a DMAC channel that is running and configured for a flow-through read. The number of bytes to be transferred is specified with DCMDx[SIZE]. When the request is the highest-priority request, the following process begins:
  1. The DMAC sends the memory controller a request to read the number of bytes addressed by DSADRx[31:0] into a 32-byte staging buffer in the DMAC.
  2. The DMAC transfers the data to the I/O device addressed in DTADRx[31:0]. DCMD[WIDTH] specifies the width of the internal peripheral to which the data is transferred.
  3. At the end of the transfer, DSADRx is increased by the smaller value of DCMDx[LENGTH] and DCMD[SIZE]. DCMDx[LENGTH] is decreased by the same value.

Using Flow-Through DMA Read Cycles to Service Internal Peripherals
- For a flow-through DMA read to an internal peripheral, use the following settings for the DMAC register bits:
  - DSADR[SRCADDR] = external memory address
  - DTADR[TRGADDR] = internal peripheral's address
  - DCMD[INCSRCADDR] = 1
  - DCMD[FLOWSRC] = 0
  - DCMD[FLOWTRG] = 1

Using Flow-Through DMA Write Cycles to Service Internal Peripherals
- A flow-through DMA write for an internal peripheral begins when the internal peripheral sends a request, via the PREQ bus, to a DMAC channel that is running and configured for a flow-through write. The number of bytes to be transferred is specified with DCMDx[SIZE]. When the request is the highest-priority request, the following process begins:
  1. The DMAC transfers the required number of bytes from the I/O device addressed by DSADRx[31:0] to the DMAC write buffer.
  2. The DMAC transfers the data to the memory controller via the internal bus. DCMD[WIDTH] specifies the width of the internal peripheral to which the transfer is being made.
  3. At the end of the transfer, DTADRx is increased by the smaller value of DCMDx[LENGTH] and DCMD[SIZE]. DCMDx[LENGTH] is decreased by the same number.

Using Flow-Through DMA Write Cycles to Service Internal Peripherals
- For a flow-through DMA write to an internal peripheral, use the following settings for the DMAC register bits:
  - DSADR[SRCADDR] = internal peripheral address
  - DTADR[TRGADDR] = external memory address
  - DCMD[INCTRGADDR] = 1
  - DCMD[FLOWSRC] = 1
  - DCMD[FLOWTRG] = 0

Using Flow-Through DMA Read Cycles to Service External Peripherals
- A flow-through DMA read for an external peripheral begins when the external peripheral sends a request, via the DREQ[1:0] bus, to a DMAC channel that is running and configured for a flow-through read. DCMDx[SIZE] specifies the number of bytes to be transferred. When the request is the highest-priority request, the following process begins:
  1. The DMAC sends a request to the memory controller to read the number of bytes addressed by DSADRx[31:0] into a 32-byte staging buffer in the DMAC.
  2. The DMAC transfers the data in the buffer to the external device addressed in DTADRx[31:0].
  3. At the end of the transfer, DSADRx is increased by the smaller value of DCMDx[LENGTH] and DCMD[SIZE]. DCMDx[LENGTH] is decreased by the same value.
- Note: The process shown for a flow-through DMA read to an external peripheral indicates that the external address increases. Some external peripherals, such as FIFOs, do not require an increment in the external address.

Using Flow-Through DMA Read Cycles to Service External Peripherals
- For a flow-through DMA read to an external peripheral, use the following settings for the DMAC register bits:
  - DSADR[SRCADDR] = external memory address
  - DTADR[TRGADDR] = companion chip's address
  - DCMD[INCSRCADDR] = 1
  - DCMD[INCTRGADDR] = 0
  - DCMD[FLOWSRC] = 0
  - DCMD[FLOWTRG] = 1

Using Flow-Through DMA Write Cycles to Service External Peripherals
- A flow-through DMA write to an external peripheral begins when the external peripheral sends a request, via the DREQ bus, to a DMAC channel that is running and configured for a flow-through write. DCMDx[SIZE] specifies the number of bytes to be transferred. When the request is the highest-priority request, the following process begins:
  1. The DMAC transfers the required number of bytes from the I/O device addressed by DSADRx[31:0] to the DMAC write buffer.
  2. The DMAC transfers the data to the memory controller via the internal bus.
  3. At the end of the transfer, DTADRx is increased by the smaller value of DCMDx[LENGTH] and DCMD[SIZE]. DCMDx[LENGTH] is decreased by the same number.
- Note: The process shown for a flow-through DMA write to an external peripheral indicates that the external address increases. Some external peripherals, such as FIFOs, do not require an increment in the external address.

Using Flow-Through DMA Write Cycles to Service External Peripherals
- For a flow-through DMA write to an external peripheral, use the following settings for the DMAC register bits:
  - DSADR[SRCADDR] = companion chip address
  - DTADR[TRGADDR] = external memory address
  - DCMD[INCSRCADDR] = 0
  - DCMD[INCTRGADDR] = 1
  - DCMD[FLOWSRC] = 1
  - DCMD[FLOWTRG] = 0

Memory-to-Memory Moves
- Memory-to-memory moves do not involve the DREQ and PREQ request signals.
- The processor writes to the DCSR[RUN] bit and a channel is configured for a memory-to-memory move.
- The DCMDx[FLOWSRC] and DCMDx[FLOWTRG] bits must be set to 0.
- If DCMD[IRQEN] is set to a 1, a DMA interrupt is requested at the end of the last cycle associated with the byte that caused DCMDx[LENGTH] to decrease from 1 to 0.

Memory-to-Memory Moves
- A flow-through DMA memory-to-memory read or write goes through the following steps:
  1. The processor writes to the DCSR[RUN] register bit and starts the memory-to-memory move.
  2. If the application processor is in Descriptor Fetch Mode, the channel configured for the move fetches the four-word descriptor. The channel transfers data without waiting for PREQ or DREQ to be asserted. The smaller value of DCMDx[SIZE] and DCMDx[LENGTH] specifies the number of bytes to be transferred.
  3. The DMAC sends a request to the memory controller to read the number of bytes addressed by DSADRx[31:0] into a 32-byte staging buffer in the DMAC.
  4. The DMAC generates a write cycle to the location addressed in DTADRx[31:0].
  5. At the end of the transfer, DSADRx and DTADRx are increased by the smaller value of DCMD[SIZE] and DCMDx[LENGTH]. If DCMD[SIZE] is smaller than DCMDx[LENGTH], DCMDx[LENGTH] is decreased by DCMD[SIZE]. If DCMD[SIZE] is equal to or larger than DCMDx[LENGTH], DCMDx[LENGTH] becomes zero.

Memory-to-Memory Moves
- For a memory-to-memory read or write, use the following settings for the DMAC registers:
  - DSADR[SRCADDR] = external memory address
  - DTADR[TRGADDR] = external memory address
  - DCMD[INCSRCADDR] = 1
  - DCMD[INCTRGADDR] = 1
  - DCMD[FLOWSRC] = 0
  - DCMD[FLOWTRG] = 0
  - DCSR[RUN] = 1
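The address-increment and flow-control settings listed for the transfer types can be collected in one helper. This sketch rests on assumptions: the bit positions (31 to 28) are taken from memory of the Linux PXA register header and are not given on the slides, the `xfer_kind` enum and `dcmd_flags` are hypothetical names, and bits the slides leave unspecified (for example INCTRGADDR for an internal-peripheral read) are assumed to be 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; verify against the processor manual before use. */
#define DCMD_INCSRCADDR (1u << 31) /* source address increments */
#define DCMD_INCTRGADDR (1u << 30) /* target address increments */
#define DCMD_FLOWSRC    (1u << 29) /* source side issues the requests */
#define DCMD_FLOWTRG    (1u << 28) /* target side issues the requests */

typedef enum {
    MEM_TO_PERIPHERAL, /* flow-through read: memory is the source */
    PERIPHERAL_TO_MEM, /* flow-through write: memory is the target */
    MEM_TO_MEM         /* no request signals involved */
} xfer_kind;

/* Returns the increment and flow-control bits for one transfer kind. */
uint32_t dcmd_flags(xfer_kind kind)
{
    switch (kind) {
    case MEM_TO_PERIPHERAL:
        return DCMD_INCSRCADDR | DCMD_FLOWTRG;
    case PERIPHERAL_TO_MEM:
        return DCMD_INCTRGADDR | DCMD_FLOWSRC;
    case MEM_TO_MEM:
        return DCMD_INCSRCADDR | DCMD_INCTRGADDR;
    }
    assert(0 && "unknown transfer kind");
    return 0;
}
```

The pattern is that only the memory side increments its address, and the flow-control bit is set on whichever side is the peripheral; a memory-to-memory move increments both addresses and sets neither flow bit.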

DMA example program
- Write a program to understand basic DMA operation
- Write sample code for a memory-to-memory data transfer

DMA sample code: dma_driver.c

    /* dma_driver.c - driver module */
    /*
     * DMA Example : DMA Memory-To-Memory Data Transfer
     * Programmed by Yoo, Sangshin (Sungkonghoe University).
     */
    #include <linux/fs.h>
    #include <linux/kdev_t.h>
    #include <asm/uaccess.h>
    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/sched.h>
    #include <linux/errno.h>
    #include <linux/string.h>
    #include <asm/system.h>
    #include <asm/irq.h>
    #include <asm/hardware.h>
    #include <asm/dma.h>
    #include <linux/slab.h>
    #include <linux/devfs_fs_kernel.h>
    #include <asm-arm/io.h>   /* consistent_alloc, consistent_free */

    #define DEVICE_NAME "DMA"
    #define DMA_LENGTH 2000
    #define DCMD_MEMTOMEM (DCMD_INCSRCADDR | DCMD_INCTRGADDR | DCMD_BURST32 | DCMD_WIDTH4)

    /* Global variables */
    static int s_nMajor = 0;
    MODULE_LICENSE("GPL");

    /* Device operations */
    static void dma_irq(int channel, void *dev_id, struct pt_regs *regs);
    static ssize_t dma_read(struct file *filp, char *buf, size_t count, loff_t *l);
    static int dma_open(struct inode *inode, struct file *filp);
    int dma_release(struct inode *inode, struct file *pfile);

    static struct file_operations device_fops = {
        read: dma_read,
        open: dma_open,
        release: dma_release,
    };

    dma_addr_t dma_A_phys;
    dma_addr_t dma_B_phys;
    char *dma_A;
    char *dma_B;
    static int ch = -1;   /* signed: pxa_request_dma() returns a negative value on failure */
    static DECLARE_WAIT_QUEUE_HEAD(wait_queue);

    /* Module startup/cleanup */
    int init_module(void)
    {
        printk(DEVICE_NAME " : Loading DMA Memory-To-Memory Copy Module\n");
        if ((s_nMajor = register_chrdev(0, DEVICE_NAME, &device_fops)) < 0) {
            printk(DEVICE_NAME " : Device registration failed (%d)\n", s_nMajor);
            return s_nMajor;
        }
        printk(DEVICE_NAME " : Device registered with Major Number = %d\n", s_nMajor);
        return 0;
    }

    void cleanup_module(void)
    {
        int nRetCode;

        printk(DEVICE_NAME " : Unloading DMA Memory-To-Memory Copy Module\n");
        if ((nRetCode = unregister_chrdev(s_nMajor, DEVICE_NAME)) < 0)
            printk(DEVICE_NAME " : Device unregistration failed (%d)\n", nRetCode);
    }

    /* This interrupt is raised when the DMA operation completes normally. */
    static void dma_irq(int channel, void *dev_id, struct pt_regs *regs)
    {
        u_int dcsr;

        dcsr = DCSR(channel);
        DCSR(channel) = dcsr & ~DCSR_STOPIRQEN;

        if (dcsr & DCSR_BUSERR)
            printk(DEVICE_NAME " : M-to-M DMA: bus error on channel %d\n", channel);
        if (dcsr & DCSR_ENDINTR)
            printk(DEVICE_NAME " : M-to-M DMA: ENDINTR set on channel %d\n", channel);
        if (dcsr & DCSR_STOPIRQEN)
            printk(DEVICE_NAME " : M-to-M DMA: STOPIRQEN set on channel %d\n", channel);
        if (dcsr & DCSR_STOPSTATE)
            printk(DEVICE_NAME " : M-to-M DMA: STOPSTATE set on channel %d\n", channel);

        wake_up_interruptible(&wait_queue);
    }

    static ssize_t dma_read(struct file *filp, char *buf, size_t count, loff_t *l)
    {
        const char *msg;
        size_t len;

        /* Register setup that starts the DMA; for the order and values,
         * see the DMAC description earlier in these slides. */
        DCSR(ch) = DCSR_NODESC;
        DSADR(ch) = dma_A_phys;   /* physical address required */
        DTADR(ch) = dma_B_phys;
        DCMD(ch) = DCMD_MEMTOMEM | DCMD_ENDIRQEN | DMA_LENGTH;
        DCSR(ch) |= DCSR_RUN;

        interruptible_sleep_on(&wait_queue);

        /* The buffers are not NUL-terminated, so compare them with memcmp.
         * dma_test compares against the exact string "sucess". */
        msg = memcmp(dma_A, dma_B, DMA_LENGTH) ? "fail" : "sucess";
        len = strlen(msg) + 1;
        if (len > count)
            len = count;
        if (copy_to_user(buf, msg, len))
            return -EFAULT;
        return len;
    }

    static int dma_open(struct inode *inode, struct file *filp)
    {
        ch = pxa_request_dma(DEVICE_NAME, DMA_PRIO_HIGH, dma_irq, 0);
        if (ch < 0) {
            printk(DEVICE_NAME " : dma request failed (%d)\n", ch);
            return -1;
        }
        printk(DEVICE_NAME " : dma request success (%d)\n", ch);

        /* Allocate non-cached memory */
        dma_A = (char *)consistent_alloc(GFP_KERNEL, DMA_LENGTH, &dma_A_phys);
        if (!dma_A) {
            printk(DEVICE_NAME " : dma_A Memory allocation error\n");
            pxa_free_dma(ch);
            return -1;
        }
        dma_B = (char *)consistent_alloc(GFP_KERNEL, DMA_LENGTH, &dma_B_phys);
        if (!dma_B) {
            printk(DEVICE_NAME " : dma_B Memory allocation error\n");
            pxa_free_dma(ch);
            consistent_free(dma_A, DMA_LENGTH, dma_A_phys);
            return -1;
        }

        printk(DEVICE_NAME " : Memory allocation success\n");
        printk(DEVICE_NAME " : Source Address - 0x%x\n", dma_A_phys);
        printk(DEVICE_NAME " : Target Address - 0x%x\n", dma_B_phys);

        memset(dma_A, 'A', DMA_LENGTH);
        memset(dma_B, 'B', DMA_LENGTH);

        MOD_INC_USE_COUNT;
        return 0;
    }

    int dma_release(struct inode *inode, struct file *pfile)
    {
        MOD_DEC_USE_COUNT;
        if (ch >= 0)   /* channel 0 is a valid channel */
            pxa_free_dma(ch);
        consistent_free(dma_A, DMA_LENGTH, dma_A_phys);
        consistent_free(dma_B, DMA_LENGTH, dma_B_phys);
        return 0;
    }

DMA test application: dma_test.c

    /* dma_test.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <string.h>

    static int dev;

    int main(void)
    {
        char buff[12] = "";

        dev = open("/dev/dma", O_RDWR);
        if (dev < 0) {
            printf("Device Open ERROR!\n");
            exit(1);
        }

        if (read(dev, buff, sizeof(buff)) < 0) {
            printf("Device Read ERROR!\n");
            close(dev);
            exit(1);
        }

        printf("\nDMA Memory-To-Memory Copy Result\n");
        /* "sucess" must match the string the driver returns */
        if (!strcmp(buff, "sucess"))
            printf(" : Success!\n");
        else
            printf(" : Fail!\n");

        close(dev);
        return 0;
    }

Writing the Makefile

    # DMA module Makefile
    CC = arm-linux-gcc

    # PXA255-Pro
    KERNELDIR = /root/PXA255-Pro/Kernel/linux-2.4.19-pxa255
    INCLUDEDIR = $(KERNELDIR)/include
    CFLAGS = -D__KERNEL__ -DMODULE -O -Wall -I$(INCLUDEDIR)

    OBJS = dma_driver.o dma_test.o

    all : $(OBJS) dma_test

    dma_test : dma_test.o
    	$(CC) -o $@ $^

    clean :
    	rm -f dma_test *.o *~

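Running the DMA sample (1)
- Compiling the DMA module
    # make
- Running the DMA module
  - Transfer the generated dma_driver.o and dma_test files to the target board
  - Set execute permission with chmod 755 dma_test
  - Load the dma_driver module into the kernel with insmod
  - Run mknod /dev/dma c 253 0 (253 is the major number; use the value the driver prints when it loads)
  - Run ./dma_test
  - Check the result
  - Unload the dma_driver module from the kernel with rmmod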

Running the DMA sample (2)
- Compiling the DMA module [screenshot not reproduced]

Running the DMA sample (3)
- Running the DMA sample [screenshot not reproduced]

DINT (DMA Interrupt Register)
    Bits    Field      Description
    31:16   RESERVED   Read as unknown and must be written as zero
    15:0    CHLINTRx   Channel 'x' interrupt (read-only). 0 = no interrupt, 1 = interrupt

DCSRx (DMA Channel Control/Status Register) (1)

Bit layout, MSB to LSB: RUN<31>, NODESCFETCH<30>, STOPIRQEN<29>, RESERVED<28:9>, REQPEND<8>, RESERVED<7:4>, STOPSTATE<3>, ENDINTR<2>, STARTINTR<1>, BUSERRINTR<0>

RUN<31> Run bit (read/write)
    0 : stops the channel
    1 : starts the channel
NODESCFETCH<30> No-Descriptor Fetch (read/write)
    0 : Descriptor Fetch Mode
    1 : No-Descriptor Fetch Mode
STOPIRQEN<29> Stop Interrupt Enable (read/write)
    0 : no interrupt
    1 : enable an interrupt
RESERVED<28:9>
    Read as unknown and must be written as zero
REQPEND<8> Request Pending (read-only)
    0 : no pending request
    1 : the channel has a pending request

DCSRx (DMA Channel Control/Status Register) (2)

RESERVED<7:4>
    Read as unknown and must be written as zero
STOPSTATE<3> Stop State (read-only)
    0 : channel is running
    1 : channel is in uninitialized or stopped state
ENDINTR<2> End Interrupt (read/write)
    0 : no interrupt
    1 : interrupt caused because the current transaction was successfully completed and DCMD[LENGTH] = 0
STARTINTR<1> Start Interrupt (read/write)
    0 : no interrupt
    1 : interrupt caused by a successful descriptor fetch
BUSERRINTR<0> Bus Error Interrupt (read/write)
    0 : no interrupt
    1 : interrupt caused by a bus error
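The DCSRx bit positions translate directly into mask constants. The sketch below assumes nothing beyond the bit numbers listed on these two slides; the macro names, the enum, and both helper functions are hypothetical and operate on plain integers rather than on the memory-mapped register.

```c
#include <stdint.h>

/* Bit masks taken from the DCSRx field list. */
#define DCSR_RUN          (1u << 31)
#define DCSR_NODESCFETCH  (1u << 30)
#define DCSR_STOPIRQEN    (1u << 29)
#define DCSR_REQPEND      (1u << 8)
#define DCSR_STOPSTATE    (1u << 3)
#define DCSR_ENDINTR      (1u << 2)
#define DCSR_STARTINTR    (1u << 1)
#define DCSR_BUSERRINTR   (1u << 0)

/* Outcome of looking at a DCSR value inside an interrupt handler. */
enum dcsr_event { DCSR_EV_NONE, DCSR_EV_DONE, DCSR_EV_BUS_ERROR };

/* Control word that starts a channel in No-Descriptor Fetch Mode.
   All reserved bits are left at zero, as the register requires. */
static uint32_t dcsr_start_nodesc(void)
{
    return DCSR_RUN | DCSR_NODESCFETCH;
}

/* Classify a DCSR value; a bus error takes precedence over completion. */
static enum dcsr_event dcsr_classify(uint32_t dcsr)
{
    if (dcsr & DCSR_BUSERRINTR)
        return DCSR_EV_BUS_ERROR;
    if (dcsr & DCSR_ENDINTR)
        return DCSR_EV_DONE;
    return DCSR_EV_NONE;
}
```

Checking the error bit first is a design choice of this sketch: it keeps a failed transfer from being reported as finished if both bits happen to be set.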

DRCMRx (DMA Request to Channel Map Registers)

Bit layout, MSB to LSB: RESERVED<31:8>, MAPVLD<7>, RESERVED<6:4>, CHLNUM<3:0>

RESERVED<31:8, 6:4>
    Read as unknown and must be written as zero
MAPVLD<7> Map Valid (read/write)
    0 : request is unmapped
    1 : request is mapped to the channel indicated by DRCMRx[3:0]
    Determines whether the request is mapped to a channel. This bit can also be used to mask the request.
CHLNUM<3:0> Channel Number (read/write)
    Indicates the channel number if DRCMR[MAPVLD] is set to 1.
    Do not map two active requests to the same channel; doing so produces unpredictable results.
    Refer to "DMA Channel Priority Scheme" to review the channel priority scheme.
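A DRCMRx value is just the valid bit combined with a 4-bit channel number, which the following sketch makes explicit. The helper names are hypothetical, and the functions only build or decode the 32-bit value; writing it to the register for a particular request line is left out.

```c
#include <stdint.h>

#define DRCMR_MAPVLD       (1u << 7)   /* MAPVLD<7> */
#define DRCMR_CHLNUM_MASK  0x0Fu       /* CHLNUM<3:0> */

/* Build the value that maps a request to channel `ch` (0..15).
   Returns 0 on success, -1 if the channel number does not fit. */
static int drcmr_map(unsigned ch, uint32_t *out)
{
    if (ch > DRCMR_CHLNUM_MASK)
        return -1;
    *out = DRCMR_MAPVLD | ch;
    return 0;
}

/* Value that unmaps (masks) a request: MAPVLD cleared. */
static uint32_t drcmr_unmap(void)
{
    return 0;
}

/* Decode a DRCMR value: mapped channel number, or -1 if unmapped. */
static int drcmr_channel(uint32_t drcmr)
{
    if (!(drcmr & DRCMR_MAPVLD))
        return -1;
    return (int)(drcmr & DRCMR_CHLNUM_MASK);
}
```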

DDADRx (DMA Descriptor Address Register)

DESCRIPTOR ADDRESS<31:4> (read/write)
    Address of the next descriptor
RESERVED<3:1>
    Read as unknown and must be written as zero
STOP<0> Stop (read/write)
    0 : run channel
    1 : stop channel
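Since only bits 31:4 of DDADRx hold the address, a descriptor has to start on a 16-byte boundary. The sketch below assumes the usual four-word descriptor layout (next-descriptor address, source, target, command); the struct and helper names are hypothetical, and addresses are treated as plain 32-bit physical addresses.

```c
#include <stdint.h>

#define DDADR_STOP       (1u << 0)      /* STOP<0> */
#define DDADR_ADDR_MASK  0xFFFFFFF0u    /* DESCRIPTOR ADDRESS<31:4> */

/* One DMA descriptor: four 32-bit words, 16 bytes in total. */
struct dma_desc {
    uint32_t ddadr;     /* address of the next descriptor, or STOP */
    uint32_t dsadr;     /* source address */
    uint32_t dtadr;     /* target address */
    uint32_t dcmd;      /* command word */
};

/* Build a DDADR value that links to the descriptor at `next_phys`.
   Returns -1 if the address is not 16-byte aligned. */
static int ddadr_link(uint32_t next_phys, uint32_t *out)
{
    if (next_phys & ~DDADR_ADDR_MASK)
        return -1;
    *out = next_phys;   /* STOP = 0: the channel keeps running */
    return 0;
}

/* DDADR value for the final descriptor of a chain: STOP = 1. */
static uint32_t ddadr_last(void)
{
    return DDADR_STOP;
}
```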

DSADRx (DMA Source Address Register)

SRCADDR<31:3> Source Address (read/write)
    Address of the internal peripheral or address of a memory location
    Address of a memory location for companion-chip transfer
SRCADDR<2> Source Address Bit 2
    Reserved if DSADR.SrcAddr is an external memory location
    Not reserved (read/write) if DSADR.SrcAddr is an internal peripheral
RESERVED<1:0>
    Read as unknown and must be written as zero

DTADRx (DMA Target Address Register)

TRGADDR<31:3> Target Address (read/write)
    Address of the on-chip peripheral or address of a memory location
    Address of a memory location for companion-chip transfer
TRGADDR<2> Target Address Bit 2
    Reserved if DTADR.TrgAddr is an external memory location
    Not reserved (read/write) if DTADR.TrgAddr is an internal peripheral
RESERVED<1:0>
    Read as unknown and must be written as zero
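The reserved low bits of DSADRx and DTADRx amount to an alignment rule: bits 1:0 are always zero, and bit 2 is also zero when the address is an external memory location. A small check along those lines might look like the sketch below; the enum and function names are hypothetical, and the example addresses used with it are arbitrary.

```c
#include <stdint.h>

/* What kind of endpoint a source or target address refers to. */
enum dma_endpoint {
    DMA_EP_EXTERNAL_MEMORY,
    DMA_EP_INTERNAL_PERIPHERAL
};

/* Return 1 if `addr` leaves every reserved bit of DSADR/DTADR at zero.
   External memory: bits 2:0 must be zero (8-byte alignment).
   Internal peripheral: only bits 1:0 must be zero (4-byte alignment). */
static int dma_addr_ok(uint32_t addr, enum dma_endpoint ep)
{
    uint32_t reserved = (ep == DMA_EP_EXTERNAL_MEMORY) ? 0x7u : 0x3u;

    return (addr & reserved) == 0;
}
```

Running this check on both buffers before programming a channel turns a silent misalignment into an explicit error.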

DCMDx (DMA Command Register) (1)

INCSRCADDR<31> Source Address Increment Setting (read/write)
    0 : do not increment the source address
    1 : increment the source address at the end of each internal bus transaction initiated by DCMD[SIZE]
    If the source address is an internal peripheral's FIFO address or an external I/O address, the address is not incremented on each successive access. In this case the bit must be 0.
INCTRGADDR<30> Target Address Increment Setting (read/write)
    0 : do not increment the target address
    1 : increment the target address at the end of each internal bus transaction initiated by DCMD[SIZE]
    If the target address is an internal peripheral's FIFO address or an external I/O address, the address is not incremented on each successive access. In this case the bit must be 0.
FLOWSRC<29> Flow Control by the Source (read/write)
    0 : start the data transfer immediately
    1 : wait for a request signal before initiating the data transfer
    Indicates the flow control of the source. This bit must be 1 if the source is an on-chip or external peripheral.
    If either DCMD[FLOWSRC] or DCMD[FLOWTRG] is set, the current DMA does not initiate a transfer until it receives a request.
    Do not set both DCMD[FLOWTRG] and DCMD[FLOWSRC] to 1.

DCMDx (DMA Command Register) (2)

FLOWTRG<28> Flow Control by the Target (read/write)
    0 : start the data transfer immediately
    1 : wait for a request signal before initiating the data transfer
    Indicates the flow control of the target. This bit must be 1 if the target is an on-chip or external peripheral.
    If either DCMD[FLOWSRC] or DCMD[FLOWTRG] is set, the current DMA does not initiate a transfer until it receives a request.
    Do not set both DCMD[FLOWTRG] and DCMD[FLOWSRC] to 1.
RESERVED<27:23, 20:19, 13>
    Read as unknown and must be written as zero
STARTIRQEN<22> Start Interrupt Enable (read/write); reserved in No-Descriptor Fetch Mode
    0 : no interrupt is generated
    1 : allow an interrupt to pass when the descriptor (i.e., 4 words) for the channel is loaded
    Sets the DCSR[StartIntr] interrupt for the channel when this descriptor is loaded.
ENDIRQEN<21> End Interrupt Enable (read/write)
    0 : no interrupt is generated
    1 : set the DCSR[EndIntr] interrupt for the channel when DCMD[LENGTH] is decreased to zero
    When enabled, the interrupt is raised as soon as the data transfer is completed.

DCMDx (DMA Command Register) (3)

ENDIAN<18> Device Endianness (read/write)
    0 : byte ordering is little endian
    1 : reserved
SIZE<17:16> Maximum Burst Size of each data transfer (read/write)
    00 : reserved
    01 : 8 bytes
    10 : 16 bytes
    11 : 32 bytes
    If DCMDx[LENGTH] is less than DCMDx[SIZE], the data transfer size equals DCMDx[LENGTH].
WIDTH<15:14> Width of the on-chip peripheral (read/write)
    00 : reserved for on-chip peripheral transfers
    01 : 1 byte
    10 : half word (2 bytes)
    11 : word (4 bytes)
    Must be programmed to 00 for memory-to-memory moves or companion-chip related operations.
LENGTH<12:0> Length of transfer in bytes (read/write)
    DCMD[LENGTH] = 0 means zero bytes, for Descriptor Fetch Mode only.
    DCMD[LENGTH] = 0 is an invalid setting for No-Descriptor Fetch Mode.
    The maximum transfer length is (8K - 1) bytes, i.e. 8191 bytes.
    If the transfer involves any of the internal peripherals, the length of the transfer must be an integer multiple of the width of that peripheral.
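Putting the DCMDx fields together, a command word for a memory-to-memory copy in No-Descriptor Fetch Mode can be composed as in the sketch below. The macro and function names are hypothetical. ENDIRQEN is placed at bit 21, the position that the reserved ranges <27:23> and <20:19> leave free next to STARTIRQEN<22>. The choice of 32-byte bursts with an end interrupt is an assumption for illustration, not a setting taken from the sample driver.

```c
#include <stdint.h>

/* Bit masks taken from the DCMDx field list. */
#define DCMD_INCSRCADDR   (1u << 31)
#define DCMD_INCTRGADDR   (1u << 30)
#define DCMD_FLOWSRC      (1u << 29)
#define DCMD_FLOWTRG      (1u << 28)
#define DCMD_STARTIRQEN   (1u << 22)
#define DCMD_ENDIRQEN     (1u << 21)
#define DCMD_SIZE_8       (1u << 16)    /* SIZE<17:16> = 01 */
#define DCMD_SIZE_16      (2u << 16)    /* SIZE<17:16> = 10 */
#define DCMD_SIZE_32      (3u << 16)    /* SIZE<17:16> = 11 */
#define DCMD_LENGTH_MASK  0x1FFFu       /* LENGTH<12:0>, at most 8191 */

/* Build a DCMD value for copying `len` bytes from memory to memory.
   Both addresses increment, no flow control is used, and WIDTH stays 00
   as required for memory-to-memory moves.
   Returns -1 if `len` is 0 (invalid in No-Descriptor Fetch Mode)
   or does not fit in the 13-bit LENGTH field. */
static int dcmd_mem2mem(uint32_t len, uint32_t *out)
{
    if (len == 0 || len > DCMD_LENGTH_MASK)
        return -1;
    *out = DCMD_INCSRCADDR | DCMD_INCTRGADDR |
           DCMD_ENDIRQEN | DCMD_SIZE_32 | len;
    return 0;
}
```

A copy larger than 8191 bytes does not fit in one command word, so a caller would have to split it into several transfers or chain descriptors.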