Introduction SYSC 5603 ELG 6163 Digital Signal Processing

Introduction SYSC 5603 (ELG 6163) Digital Signal Processing Microprocessors, Software and Applications Miodrag Bolic

Outline
  • Introduction to the course
  • Computer architectures for signal processing
  • Design cycle

Course Outline
Hardware
  • DSP systems, A/D and D/A converters
  • Architectural analysis of a DSP device: TMS320C6x, TigerSHARC, Blackfin
  • FPGAs for signal processing (Altera, Xilinx)
  • Application-domain-specific instruction set processors
  • SoC, DSP multiprocessors
  • Signal processing arithmetic units
Algorithm design and transformations
  • Scheduling, resource allocation, synthesis
  • Finite word-length effects
  • Algorithmic transformations
  • FIR filter design
  • FFT design
  • IIR filter design
  • Adaptive filter design

Course Conduct
  • Course notes will be posted on the course web page
  • Assignments with solutions will be provided and will not be graded
  • There is no textbook
  • The exam will be prepared based on lecture slides, references and assignments

Paper Analysis and Presentation
  • Topics are related to the studied material
  • Each student will present for 15 minutes
  • Discussion will follow the presentation
  • Each student has to choose one topic before January 16th at 7 pm
  • Each student has to send a document (8-10 pages, 12-point font, single spaced) three days before the presentation
  • The document has to be revised after my comments
  • 15 presentation slides max (10 minutes, 15 min max)
  • The mark is 50% document, 50% presentation
  • A preliminary time schedule is given on the course web page. This schedule will be updated on January 16th
  • Your reports will be posted on the course web page. Please see the paper on plagiarism: How to Handle Plagiarism: New Guidelines

Presentation topics - Computer architectures
  • Configurable processors for DSP applications
      – Analysis of processors with configurable instruction sets. Analysis of the tools. Include Tensilica, Altera and CoWare (LISATek) solutions. An example of existing designs using configurable processors.
  • Multiprocessors for DSP
      – Analysis of papers including [Kumar 05] and [Wiangtong 05]. Analysis of current hardware solutions. Analysis of tools including CMPWARE. An example of existing designs using multiprocessors.
  • IP core design
      – Current standards related to IP core design. Standard buses used for IP cores. Advantages and disadvantages of hard and soft IP cores. DSP processor cores. DSP hardware cores.

Presentation topics - Tools
  • Design space exploration tools
      – Analysis of the tools for design space exploration. Simulink-based tools (AccelChip) vs. C-based tools (CoWare). Performance and differences.
  • Direct mapping from algorithms to hardware
      – Analysis of different tools (Simulink, Synopsys System Studio, CoWare's SPW 5-XP) and design processes used for automated implementation of signal processing algorithms on FPGAs. Analysis of the quality and speed of these automated implementations.
  • Comparison between Handel-C, SpecC and SystemC
      – What are the main differences between these languages? Which language should be chosen for which application? Which of these languages have full support from algorithm design to implementation (for example, the Synopsys SystemC solution)?
  • Tools for the analysis of the optimal word length
      – Analyze the tools for floating-point to fixed-point conversion. Compare solutions from Mathworks, Synopsys and AccelChip.
  • TI standard for writing algorithms: eXpressDSP Algorithm

Presentation topics - Applications
  • Software-defined radio
      – Analysis of signal processing algorithms used for software-defined radios. Computer architectures for software-defined radios. List of commercial platforms and development tools.
  • Signal processing for wireless sensor networks
      – Analysis of signal processing algorithms used for wireless sensor networks: positioning, tracking, data fusion, sensor processing. Analysis of DSP architectures used in sensor networks. Specifics of algorithm design for wireless sensor networks.
  • Tracking applications
      – Detailed analysis of different tracking and navigation applications, including aircraft positioning, target tracking for radar and sonar applications, car collision detection, and positioning and tracking in homeland security applications. Define the requirements for each application, such as sampling rate, accuracy, latency and range. Discuss the algorithms and the hardware platforms used for each application.

Project
  • Project proposals are expected by February 6th
  • Deadline for project demonstration: March 31
  • Deadline for project report: March 27
  • Grade: 20% project proposal, 20% project report, 20% project presentation, 40% demonstration
  • You propose the algorithm and the application
  • Two defined projects
      – Float-to-fixed-point analysis and implementation of particle filters (Simulink or Synopsys System Studio) using FPGA
      – Comparison of different implementations of the atan function using PDSP and FPGA platforms (VHDL)
  • Project platforms and tools:
      1. Implementing signal processing algorithms using configurable processors with DSP blocks (Tensilica and NIOS II ¹)
      2. The analysis of VLIW architectures and simulators for signal processing (hardware design)
      3. System-level design using Simulink & Altera's DSP Builder ¹
      4. System-level design using SystemC under Synopsys System Studio
      5. Multiprocessing using CMPWARE (Java, NIOS II)
  ¹ There might be a license problem.

Project topics
  • Implementations of different algorithms on the same platform for the purpose of comparing the algorithms. Examples:
      – Implementation of a multimedia signal processing algorithm on programmable DSP chips (TI TMS32060) using algorithm transformation techniques, and comparison with existing implementations. It is required to discuss the VLIW instruction architecture and demonstrate how algorithm transformation/mapping techniques are used to generate the code.
      – Comparison of different implementations of the atan function using PDSP and FPGA platforms (VHDL).
  • Implementation of a DSP algorithm on new platforms. Examples:
      – Comparison of the performance of Kalman filter implementations on configurable processors
      – Development of a parallel Kalman filtering algorithm suitable for multiprocessor implementation
  • Implementation of complex algorithms on FPGAs
      – This requires the full implementation cycle, from the implementation of these algorithms in Matlab/Simulink to their implementation on the FPGA. Mapping between the algorithms and the hardware has to be performed. Floating-to-fixed-point analysis has to be performed.

Project report
Proposal: The purposes of writing a project proposal are: (i) to determine the topic, (ii) to show that a preliminary study of the subject material has been done, (iii) to assess the likelihood of success of the project, (iv) to give the plan to carry out the project. You should submit a three- to five-page proposal to the instructor for approval of the project. A face-to-face discussion lasting 5-10 minutes between the instructor and the student is required. This discussion should take place during one of the instructor's office hours. At the end of this discussion, the instructor will either approve the proposal and assign a grade, or reject the proposal and let the team know the reason. In the latter case, the team must come up with a revised proposal or an alternate new proposal before a deadline specified in the course outline. Preliminary discussions with the instructor can also be held in advance during office hours. However, the opinions expressed by the teaching staff during these preliminary discussions are only suggestions. The team members are responsible for using their best judgement to prepare the proposal for approval. The format of the proposal is as follows:
  • Title of the project
  • Project highlight: explain what you want to do in this project
  • Motivation: explain the significance of the proposed project and the relevance of the project to this course
  • Prior art: list at least three previous works (papers, books, etc.) that reported work most closely related to the current project. Briefly review their approaches, advantages and shortcomings.
  • Approach: outline the proposed approaches, including preliminary analytical results or an implementation prototype as appropriate, a schedule of tasks to be performed, etc.
  • Expected results: what can be promised in the final project report that is not part of the proposal
  • Task planning: specify when you will do what
Report: A typewritten, hardcopy project report, as well as an electronic version (including source code and design files developed), are to be submitted at the end of the semester. The length of the report is not restricted. However, the report must include the following sections:
  • Introduction: motivation and background
  • Main body of the report. Depending on the type of project, this part may include the method used, approaches taken, problem description, etc.
  • Conclusion and discussion: highlight your achievements in this project and things that may be done in the future
More details about the project will follow.
Copied from http://homepages.cae.wisc.edu/~ece734/project/index.html

Course Objectives … To
  • Understand tradeoffs in implementing DSP algorithms
  • Know basic DSP architectures
  • Know some reduced-complexity strategies for algorithms, mainly on FPGA
  • Know about commercial DSP solutions
  • Know and understand system-level design tools
  • Understand research topics related to algorithmic modifications and algorithm-architecture matching

Why this course?
There is a demand to derive more information per signal. “More” means
  • Faster: derive more information per unit time
      – Faster hardware
      – Newer algorithms with fewer operations
  • Cheaper: derive information at a reduced cost in processor size, weight, power consumption, or dollars
  • Better: derive higher quality information (higher precision, finer resolution, higher signal-to-noise ratio)
[Richards 04]

Hardware and software elements
Progress in signal processing capability is the product of progress in IC devices, architectures, algorithms and mathematics. [Richards 04]

Moore’s Law
Predicts doubling of circuit density every 1.5 to 2 years.
http://www.icknowledge.com/trends/uproc.html

What is Signal Processing?
  • Ways to manipulate a signal in its original medium or in an abstract representation.
  • Signals can be abstracted as functions of time or spatial coordinates.
  • Types of processing:
      – Transformation
      – Filtering
      – Detection
      – Estimation
      – Recognition and classification
      – Coding (compression)
      – Synthesis and reproduction
      – Recording, archiving
      – Analyzing, modeling
Copied from [Hu 04-Slides], Design and Implementation of Signal Processing Systems: An Introduction

Digital Signal Processing
  • Signals generated via physical phenomena are analog in that
      – Their amplitudes are defined over the range of real/complex numbers
      – Their domains are continuous in time or space.
  • Digital signal processing concerns processing signals using digital computers.
      – A continuous time/space signal must be sampled to yield countable signal samples.
      – The real- (complex-) valued samples must be quantized to fit into the internal word length.
Copied from [Hu 04-Slides]
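The two steps named on this slide, sampling and quantization, can be sketched in a few lines of Python. This is only an illustration under assumed values: the function name `sample_and_quantize`, the 1 kHz test tone, the 8 kHz sampling rate and the 8-bit word length are hypothetical choices, not taken from the course material.

```python
import math

def sample_and_quantize(freq_hz, fs_hz, n_samples, word_bits):
    """Sample a unit-amplitude sine and round each sample to a signed integer code."""
    max_code = 2 ** (word_bits - 1) - 1        # e.g. 127 for an 8-bit word
    codes = []
    for n in range(n_samples):
        t = n / fs_hz                          # sampling: continuous time -> sample index n
        x = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(x * max_code))      # quantization: real value -> integer code
    return codes

# Assumed example: 1 kHz tone, 8 kHz sampling rate, 8-bit words
codes = sample_and_quantize(freq_hz=1000.0, fs_hz=8000.0, n_samples=8, word_bits=8)
```

Each code differs from the ideal scaled sample by at most half a quantization step, which is the error that a longer word length reduces.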

Signal Processing Systems
A/D -> Digital Signal Processing -> D/A
The task of digital signal processing (DSP) is to process sampled signals (from the A/D, the analog-to-digital converter) and to provide its output to the D/A (digital-to-analog converter) to be transformed back to physical signals.
Copied from [Hu 04-Slides]

Stratix DSP Development Board
(Figure: annotated photo of the board [AlteraDSP]. Labeled parts:)
  • Nios Expansion Prototype Connector
  • MAX 7000 Device
  • Prototyping Area
  • D/A Converters
  • Mictor-Type Connectors for HP Logic Analyzers
  • A/D Converters
  • Analog SMA Connectors
  • Texas Instruments Connectors on Underside of Board
  • 40-Pin Connectors for Analog Devices

Example DSP Applications
  • COMMUNICATIONS
      – Echo Cancellation
      – Digital PBXs
      – Line Repeaters
      – Modems
      – Global Positioning
      – Sound/Modem/Fax Cards
      – Cellular Phones
      – Speaker Phones
      – Video Conferencing
      – ATMs
  • VOICE/SPEECH
      – Speech Recognition
      – Speech Processing/Vocoding
      – Speech Enhancement
      – Text-to-Speech
      – Voice Mail
  • PRO-AUDIO
      – AV Editing
      – Digital Mixers
      – Home Theater
      – Pro Audio
  • CONSUMER
      – Radar Detectors
      – Power Tools
      – Digital Audio / TV
      – Music Synthesizers
      – Toys / Games
      – Answering Machines
      – Digital Speakers
  • INSTRUMENTATION
      – Spectrum Analyzers
      – Seismic Processors
      – Digital Oscilloscopes
      – Mass Spectrometers
  • INDUSTRIAL/CONTROL
      – Robotics
      – Numeric Control
      – Power Line Monitors
      – Motor/Servo Control
  • MEDICAL
      – Patient Monitoring
      – Ultrasound Equipment
      – Diagnostic Tools
      – Fetal Monitors
      – Life Support Systems
      – Image Enhancement
  • MILITARY
      – Secure Communications
      – Sonar Processing
      – Image Processing
      – Radar Processing
      – Navigation, Guidance
www.analog.com/dsp

Implementation of DSP Systems
  • Platforms:
      – Native signal processing (NSP) with general purpose processors (GPP)
          • Multimedia extension (MMX) instructions
      – Programmable digital signal processors (PDSP)
      – Application-specific integrated circuits (ASIC)
      – Field-programmable gate arrays (FPGA)
  • Requirements:
      – Real time
          • Processing must be done before a prespecified deadline.
      – Streamed numerical data
          • Sequential processing
          • Fast arithmetic processing
      – High throughput
          • Fast data input/output
          • Fast manipulation of data
Copied from [Hu 04-Slides]

How Fast is Enough for DSP?
  • Real-time requirements:
      – Example: data capture speed must match the sampling rate. Otherwise, data will be lost.
      – Processing must be done by a specific deadline.
  • Different throughput rates for processing different signals
      – Throughput scales with the sampling rate.
      – CD music: 44.1 kHz
      – Speech: 8-22 kHz
      – Video (depends on frame rate, frame size, etc.): ranges from 100s of kHz to MHz.
Copied from [Hu 04-Slides]
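One way to make the real-time requirement concrete is to count the processor cycles available between two consecutive samples. The sketch below assumes a hypothetical 300 MHz processor and uses the sampling rates listed on the slide; the function names are illustrative only.

```python
def cycles_per_sample(clock_hz, sample_rate_hz):
    """Processor cycles available between two consecutive input samples."""
    return clock_hz / sample_rate_hz

def meets_deadline(cycles_needed, clock_hz, sample_rate_hz):
    """True if the per-sample processing fits into one sampling period."""
    return cycles_needed <= cycles_per_sample(clock_hz, sample_rate_hz)

CLOCK_HZ = 300e6                                      # assumed clock rate, not from the slides
speech_budget = cycles_per_sample(CLOCK_HZ, 8_000)    # speech at 8 kHz
cd_budget = cycles_per_sample(CLOCK_HZ, 44_100)       # CD music at 44.1 kHz
```

Under these assumptions an algorithm that needs 10,000 cycles per sample would keep up with 8 kHz speech but not with 44.1 kHz audio on the same processor.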

ASIC: Application Specific ICs
  • Custom or semi-custom IC chips or chip sets developed for specific functions.
  • Suitable for high-volume, low-cost production.
  • Examples: MPEG codec, 3D graphics chip, etc.
  • ASICs became popular due to the availability of IC foundry services. Fabless design houses turn innovative designs into profitable chip sets using CAD tools.
  • Design automation is a key enabling technology for a fast design cycle and a shorter time to market.
Copied from [Hu 04-Slides]

Programmable Digital Signal Processors (PDSPs)
  • Microprocessors designed for signal processing applications.
  • Special hardware support for:
      – Multiply-and-accumulate (MAC) ops
      – Saturation arithmetic ops
      – Zero-overhead loop ops
      – Dedicated data I/O ports
      – Complex address calculation and memory access
      – Real-time clock and other embedded processing support
  • PDSPs were developed to fill a market segment between GPP and ASIC:
      – GPP: flexible, but slow
      – ASIC: fast, but inflexible
  • As VLSI technology improves, the role of PDSPs changes over time.
      – Cost: design, sales, maintenance/upgrade
      – Performance
Copied from [Hu 04-Slides]
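To illustrate what the MAC and saturation arithmetic support is for, the following Python sketch contrasts two's-complement wrap-around with saturation on a 16-bit word. It models the behaviour only; the 16-bit width and the helper names `wrap16`, `sat16` and `mac` are assumptions made for the example and do not describe any particular PDSP.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def wrap16(value):
    """Two's-complement wrap-around of an integer to 16 bits (no saturation)."""
    return (value + 32768) % 65536 - 32768

def sat16(value):
    """Saturation: clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))

def mac(acc, a, x):
    """One multiply-and-accumulate step; Python integers act as a wide accumulator."""
    return acc + a * x

total = 30000 + 10000          # exceeds the 16-bit range
wrapped = wrap16(total)        # sign flips: a large error
saturated = sat16(total)       # stays at the largest representable value
```

Saturation keeps the result at the nearest representable value, whereas wrap-around turns a large positive sum into a negative number.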

[Seshan 98]

PDSP Market – By Company
Ref: Forward Concepts, http://www.fwdconcepts.com/Pages/press42.htm

DSP Market – By Application
Ref: Forward Concepts, http://www.fwdconcepts.com/Pages/press42.htm

Computing using FPGA
  • An FPGA (field-programmable gate array) is a derivative of PLDs (programmable logic devices).
  • The hardware is configurable and behaves differently under different configurations.
  • Slower than ASIC, but faster than PDSP.
  • Once configured, it behaves like an ASIC module.
  • Uses of FPGAs
      – Rapid prototyping: runs at a fraction of ASIC speed without fabrication delay.
      – Hardware accelerator: the same hardware realizes different function modules, which saves hardware.
      – Low-quantity system deployment
Copied from [Hu 04-Slides]

Stratix EP1S10
Altera Corp., Stratix Module 2: Logic Structure & MultiTrack Interconnect, 2004.

IP Cores
  • Processor cores: StarCore
      – 16-bit fixed-point VLIW DSP core from Lucent/Motorola (Lucent established a company called “Agere” for its DSP section)
      – First VLIW machine to target low-power applications
      – Pipeline relatively simple
      – Targeting 198 mW @ 300 MHz, 1.5 V
  • Hardware cores: Altera DSP cores
      – FIR Compiler
      – IIR Compiler
      – FFT/IFFT Compiler
      – NCO Compiler
      – Reed-Solomon Compiler
      – Constellation Mapper/Demapper
      – Viterbi Compiler

SoC (System-on-Chip)
  • With the continuing scaling of modern IC devices, it is now possible to incorporate
      – Microprocessor cores + ASIC function blocks
      – Analog + digital components
      – Computation + communication functions
      – I/O, memory + processor
    into the same chip to form a comprehensive “system”. Thus, the notion of System-on-Chip (SoC).
  • An SoC uses intellectual property (IP) blocks, which are pre-designed modules.
  • Designing an SoC thus becomes a task of system integration.
  • Challenging issues in SoC design:
      – Interfaces among IPs from different vendors
      – Verification of function
      – Physical design challenges
Copied from [Hu 04-Slides]

Design Issues
  • Given a DSP application, which implementation option should be chosen?
  • For a particular implementation option, how to achieve an optimal design? Optimal in terms of what criteria?
  • Software design:
      – NSP, PDSP
      – Algorithms are implemented as programs.
  • Hardware design:
      – ASIC, FPGA
      – Algorithms are directly implemented in hardware modules.
  • S/H co-design: system-level design methodology.
Copied from [Hu 04-Slides]

Design Process Model
  • Design is the process that links algorithm to implementation
  • Algorithm
      – Operations
      – Dependency between operations determines a partial ordering of execution
      – Can be specified as a dependence graph
  • Implementation
      – Assignment: each operation can be realized with
          • One or more instructions (software)
          • One or more function modules (hardware)
      – Scheduling: dependence relations and resource constraints lead to a schedule.
Copied from [Hu 04-Slides]

A Design Example
Consider the algorithm: y = a(1)*x(1) + a(2)*x(2) + ... + a(n)*x(n)
  • Operations:
      – Multiplication
      – Addition
  • Dependency
      – y(k) depends on y(k-1)
  • Program:
        y(0) = 0
        For k = 1 to n Do
            y(k) = y(k-1) + a(k)*x(k)
        End
        y = y(n)
  • Dependence graph: (figure) each product a(k)*x(k) feeds one adder in a chain of n adders that runs from y(0) to y(n).
Copied from [Hu 04-Slides]
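The program on this slide translates almost line by line into Python. The sketch below is only an illustration of the recurrence; the function name `inner_product` is an assumption, and the running variable `y` plays the role of y(k).

```python
def inner_product(a, x):
    """Accumulate y(k) = y(k-1) + a(k)*x(k) for k = 1..n and return y(n)."""
    y = 0                          # y(0) = 0
    for a_k, x_k in zip(a, x):     # k = 1 .. n
        y = y + a_k * x_k          # one multiplication and one addition per step
    return y

result = inner_product([1, 2, 3], [4, 5, 6])
```

Because each step needs the previous value of `y`, the additions form a chain, while the n multiplications are independent of one another.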

Design Example cont’d
  • Software implementation:
      – Map each * op. to a MUL instruction, and each + op. to an ADD instruction.
      – Allocate memory space for {a(k)}, {x(k)}, and {y(k)}
      – Schedule the operations by sequentially executing y(1) = a(1)*x(1), y(2) = y(1) + a(2)*x(2), etc.
      – Note that each instruction is still to be implemented in hardware.
  • Hardware implementation:
      – Map each * op. to a multiplier, and each + op. to an adder.
      – Interconnect them according to the dependence graph: (figure) n multipliers compute a(k)*x(k) and feed a chain of n adders from y(0) to y(n).
Copied from [Hu 04-Slides]

Observations
  • Eventually, an implementation is realized with hardware.
  • However, by using the same hardware to realize different operations at different times (scheduling), we have a software program!
  • Bottom line: hardware/software co-design. There is a continuum between hardware and software implementation.
  • A design must explore both simultaneously to achieve the best performance/cost trade-off.
Copied from [Hu 04-Slides]

A Theme
  • Matching hardware to algorithm
      – The hardware architecture must match the characteristics of the algorithm.
      – Example: an ASIC architecture is designed to implement a specific algorithm, and hence can achieve superior performance.
  • Formulate algorithm to match hardware
      – Algorithms must be formulated so that they can best exploit the potential of the architecture.
      – Example: GPP and PDSP architectures are fixed. One must formulate the algorithm properly to achieve the best performance, e.g., to minimize the number of operations.
Copied from [Hu 04-Slides]

Algorithm Reformulation
  • Algorithmic-level equivalence
      – Different filter structures implementing the same specification
  • Exploiting parallelism
      – Regular iterative algorithms and loop reformulation
          • Well studied in parallel compiler technology
      – Signal flow/data flow representation
          • Suitable for specification of pipelining
Copied from [Hu 04-Slides]

Mapping Algorithm to Architecture
  • Scheduling and assignment problem
      – Resources: hardware modules and time slots
      – Demands: operations (algorithm) and throughput
  • Constrained optimization problem
      – Minimize resources (objective function) to meet demands (constraints)
  • For regular iterative algorithms and regular processor arrays -> algebraic mapping.
Copied from [Hu 04-Slides]
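The scheduling and assignment problem can be illustrated with a small greedy list scheduler. This is a sketch under simplifying assumptions that are not from the slides: every operation takes exactly one time slot, the dependence graph is acyclic, and each resource type has at least one module. The operation names (`m1`, `a1`, ...) are hypothetical labels for the multiplications and additions of the three-term inner product from the design example.

```python
def list_schedule(ops, deps, units):
    """Greedy list scheduling with unit-latency operations.

    ops:   dict op_name -> resource type, listed in priority order
    deps:  dict op_name -> list of ops that must finish in an earlier slot
    units: dict resource type -> number of modules available per time slot
    Returns dict op_name -> time slot.
    """
    slot_of = {}
    slot = 0
    while len(slot_of) < len(ops):
        free = dict(units)                     # modules still free in this slot
        for op, kind in ops.items():
            if op in slot_of:
                continue
            # ready only if every predecessor was placed in an earlier slot
            ready = all(slot_of.get(d, slot) < slot for d in deps.get(op, []))
            if ready and free[kind] > 0:
                slot_of[op] = slot
                free[kind] -= 1
        slot += 1
    return slot_of

ops = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add", "a3": "add"}
deps = {"a1": ["m1"], "a2": ["a1", "m2"], "a3": ["a2", "m3"]}

one_multiplier = list_schedule(ops, deps, {"mul": 1, "add": 1})
three_multipliers = list_schedule(ops, deps, {"mul": 3, "add": 1})
```

In this example both schedules finish in four time slots: the chain of additions limits the schedule length, so the two extra multipliers add cost without shortening it. This is the kind of trade-off the constrained optimization is meant to expose.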

Implementation process for PDSP [Wiangtong 05]

Direct Mapping Techniques [Wiangtong 05]

FIR Filters [DSPPrimer-Slides]

Transposed FIR Filter
  • Algorithm transform techniques:
      – Pipelining and parallelism
      – Retiming
      – Unfolding (loop unrolling)
[DSPPrimer-Slides]
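As a behavioural illustration of the direct and transposed FIR structures, the Python sketch below computes the same output with both. It assumes integer coefficients and at least one tap; the function names and the test signal are hypothetical and are not taken from [DSPPrimer-Slides].

```python
def fir_direct(h, x):
    """Direct form: a delay line of past inputs feeds all multipliers."""
    delay = [0] * len(h)
    out = []
    for sample in x:
        delay = [sample] + delay[:-1]              # shift the new input into the delay line
        out.append(sum(c * d for c, d in zip(h, delay)))
    return out

def fir_transposed(h, x):
    """Transposed form: the input is multiplied by every tap at once and
    partial sums move through the registers towards the output."""
    n = len(h)
    state = [0] * n                                # state[n-1] stays 0 (end of the chain)
    out = []
    for sample in x:
        products = [c * sample for c in h]
        out.append(products[0] + state[0])
        # each register takes its own product plus the partial sum behind it
        state = [products[i + 1] + state[i + 1] for i in range(n - 1)] + [0]
    return out

h = [1, 2, 3]
x = [1, 0, 0, 4, 5]
y_direct = fir_direct(h, x)
y_transposed = fir_transposed(h, x)
```

In the transposed form each output needs only one addition after the multiplications, because the registers sit between the adders. That is why this structure is often described as already pipelined.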

Example: One-to-one mapping and pipelining
(Figure: operations A, B, C and D are taken through three steps: allocation, assignment, and pipelining. In the pipelined version, clocked flip-flops (ff) are placed between the operations.)
Analyse timing
  • if OK then stop
  • else pipelining
[Meerbergen-Slides]

CoWare SPW Design Flow
www.coware.com

System-level design flow: Simulink-Altera [AlteraDSP]

Arithmetic
  • CORDIC
      – Computes elementary functions
  • Distributed arithmetic
      – ROM-based implementation
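As an example of how CORDIC computes an elementary function, the sketch below approximates atan(y/x) in vectoring mode. It is a floating-point model for illustration only and assumes x > 0. In hardware the multiplications by 2^-i would be shifts and the `math.atan` values would come from a small lookup table; the function name and the iteration count are assumptions.

```python
import math

def cordic_atan(y, x, iterations=16):
    """Approximate atan(y/x) for x > 0 with CORDIC in vectoring mode."""
    angle = 0.0
    for i in range(iterations):
        shift = 2.0 ** -i                  # a right shift by i bits in hardware
        step = math.atan(shift)            # table entry atan(2^-i)
        if y >= 0:
            # rotate clockwise, driving y towards zero
            x, y, angle = x + y * shift, y - x * shift, angle + step
        else:
            # rotate counter-clockwise
            x, y, angle = x - y * shift, y + x * shift, angle - step
    return angle

approx = cordic_atan(1.0, 1.0)             # should be close to pi/4
```

Each iteration uses only additions, shifts and one table lookup, and the remaining angle error is bounded by the last table entry, here atan(2^-15).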

Floating to fixed point analysis
  • Overflow of the number range
      – Large errors in the output signal occur when the available number range is exceeded (overflow).
  • Round-off errors
      – Rounding or truncation of products must be done in recursive loops so that the word length does not increase with each iteration.
  • Coefficient errors
      – Coefficients can only be represented with finite precision.
  • Design for fixed-point arithmetic:
      – Peak value estimation
      – Word-length optimization
      – Saturation arithmetic
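The three error sources above can be illustrated with a small fixed-point model in Python. The sketch assumes a 16-bit word with 15 fractional bits (often called Q15), which is a common choice but is not specified in the slides; the helper names are hypothetical.

```python
def to_fixed(value, frac_bits=15, word_bits=16):
    """Round a real value to a signed fixed-point code, saturating on overflow."""
    scaled = round(value * (1 << frac_bits))
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))

def to_float(code, frac_bits=15):
    """Convert a fixed-point code back to a real value."""
    return code / (1 << frac_bits)

def fixed_mul(a, b, frac_bits=15):
    """Multiply two codes and round the product back to frac_bits fractional bits.
    The result is not saturated here, to keep the sketch short."""
    product = a * b                                     # carries 2*frac_bits fractional bits
    return (product + (1 << (frac_bits - 1))) >> frac_bits

coeff = to_fixed(0.1)                                   # coefficient error: 0.1 is not exact
coeff_error = abs(to_float(coeff) - 0.1)
quarter = fixed_mul(to_fixed(0.5), to_fixed(0.5))       # rounded product keeps the word length
```

Here `to_fixed(1.0)` saturates to the largest code because +1.0 is just outside the representable range, and `coeff_error` stays below half a quantization step (2^-16).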

References
In preparing these slides, the following material was used:
  • Slides from [Hu 04-Slides], “Design and Implementation of Signal Processing Systems: An Introduction”, are copied with permission.
  • Slides from [DSPPrimer-Slides] and [Meerbergen-Slides]
  • [Richards 04], [AlteraDSP], [Seshan 98]
  • Details about these references can be found at: http://www.site.uottawa.ca/~mbolic/elg6163/References.htm