Adaptive Image Filtering Using RunTime Reconfiguration Nitin Srivastava

  • Slides: 30
Download presentation
Adaptive Image Filtering Using Run-Time Reconfiguration Nitin Srivastava Jerry L. Trahan Ramachandran Vaidyanathan Suresh

Adaptive Image Filtering Using Run-Time Reconfiguration Nitin Srivastava Jerry L. Trahan Ramachandran Vaidyanathan Suresh Rai Department of Electrical and Computer Engineering Louisiana State University

The Problem: Adaptive Image Filtering • A filtering window moves over the image pixel

The Problem: Adaptive Image Filtering • A filtering window moves over the image pixel by pixel. Window size is usually 3 3, 5 5, or 7 7. • The filter multiplies the intensity values of pixels that the window overlaps with its coefficients and sums the products to produce the new value of the pixel at the center of the window.

Working of a 3 3 size filter

Working of a 3 3 size filter

 • Spatially invariant filter — does not change values of its coefficients with

• Spatially invariant filter — does not change values of its coefficients with the position of the filtering window over the image. • Adaptive filter — adjusts values of its coefficients according to the nature of the image. For instance, handles uniform regions differently than edges.

Use of Run-Time Reconfiguration • For fixed filter coefficients, could use constant coefficient multipliers

Use of Run-Time Reconfiguration • For fixed filter coefficients, could use constant coefficient multipliers (KCMs) configured for coefficients. • For adaptive coefficients, we use KCMs configured for pixel values. Flow of data regular, but more involved. • Inspired by 1 D adaptive filtering technique of Wojko and El. Gindy (RAW’ 99).

Use of FPGAs • Circuit design tailored to the problem — Filtering exhibits regular,

Use of FPGAs • Circuit design tailored to the problem — Filtering exhibits regular, repeated operations, taking an inner product among the same number of elements at each pixel position. • Problem size-specific components and datapaths. • Advantages for this problem even without reconfiguration.

Image Filtering Details

Image Filtering Details

Solution Approach • Gray scale image of size 256, using a filtering window of

Solution Approach • Gray scale image of size 256, using a filtering window of size 3 3. • Can tailor to different image size — changes some register sizes and memory requirements. • Can tailor to different window size — changes memory requirements. • Can extend to video — window is 3 D across frames.

Solution Approach, cont. • Basic component is a module. • Sixteen pipelined modules act

Solution Approach, cont. • Basic component is a module. • Sixteen pipelined modules act on 16 contiguous pixels at a time from the same row. • Three sets of three steps each, corresponding to the three rows in a 3 3 window and the three positions in each row. • For each of these nine steps, a module contributes to one of the nine window computations in which its pixel participates.

Module Algorithm Procedure Step 0: Step 1: Step 2: THREEPIX(r, in, out) Adder(r) KCM(r)

Module Algorithm Procedure Step 0: Step 1: Step 2: THREEPIX(r, in, out) Adder(r) KCM(r) + in Adder(r) KCM(r) + Adder(r 1) out Adder(r) • Contributes to three pixel values on the same row.

Overall Algorithm for i 0 to 255 for k 0 to 255 in steps

Overall Algorithm for i 0 to 255 for k 0 to 255 in steps of 16 for all j, where k j k+15 r = j mod 16 /* module r has v(i, j) */ THREEPIX(r, 0, memory) THREEPIX(r, memory, I/O pins)

Module filtering window pixel coefficient value KCM zero register 0 To output mux KCM

Module filtering window pixel coefficient value KCM zero register 0 To output mux KCM To block memory write register module mux previous module memory read register From block memory KCM output mux pipeline register module adder step counter next module

First vantage point: one module

First vantage point: one module

5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28

5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 25 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 9, 25

block memory block memories block memory 5, 22 5, 23 5, 24 5, 25

block memory block memories block memory 5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 8, 25 9, 25

v[7, 23] 0 0 pd[8, 24, -1] v[7, 25] v[7, 24] 0 0 pd[8,

v[7, 23] 0 0 pd[8, 24, -1] v[7, 25] v[7, 24] 0 0 pd[8, 25, -1] v[7, 26] pd[8, 26, -1] pd[8, 27, -1] pd[8, 26, -1, 0] v[7, 27] 0 pd[8, 28, -1]

Nitin: block memory Slide added block memories block memory 5, 22 5, 23 5,

Nitin: block memory Slide added block memories block memory 5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 8, 25 9, 25

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] v[7, 25] v[7,

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] v[7, 25] v[7, 24] 0 0 pd[8, 25, -1] pd[8, 24, -1, 0] v[7, 26] pd[8, 26, -1] pd[8, 25, -1, 0] pd[8, 27, -1] pd[8, 26, -1, 0] pd[8, 25, -1, 1] v[7, 27] 0 pd[8, 28, -1] pd[8, 27, -1, 0]

block memory block memories block memory 9 rs(-1)[8, 24] 5, 22 5, 23 5,

block memory block memories block memory 9 rs(-1)[8, 24] 5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 8, 25 9, 25

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] pd[8, 23, -1,

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] pd[8, 23, -1, 1] v[7, 25] v[7, 24] 0 0 pd[8, 25, -1] v[7, 26] pd[8, 26, -1] pd[8, 27, -1] v[7, 27] 0 pd[8, 28, -1] pd[8, 24, -1, 0] pd[8, 25, -1, 0] pd[8, 26, -1, 0] pd[8, 27, -1, 0] pd[8, 22, -1, 1] pd[8, 24, -1, 1] pd[8, 25, -1, 1] rs(-1)[8, 24] pd[8, 26, -1, 1]

block memory 11 block memories block memory 9 newv[6, 24] rs(-1)[8, 24] rs(-1)[7. 24]+

block memory 11 block memories block memory 9 newv[6, 24] rs(-1)[8, 24] rs(-1)[7. 24]+ rs(0)[7, 24] rs(-1)[7, 26] rs(-1)[6, 26]+ rs(0)[6. 26] 5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 8, 25 9, 25

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] pd[8, 23, -1,

v[7, 23] 0 0 pd[8, 24, -1] pd[8, 23, -1, 0] pd[8, 23, -1, 1] pd[7, 24, 0, -1] v[7, 25] v[7, 24] 0 0 pd[8, 25, -1] pd[8, 24, -1, 0] pd[8, 22, -1, 1] rs(-1)[7, 26] pd[7, 25, 0, -1] v[7, 26] pd[8, 26, -1] pd[8, 27, -1] v[7, 27] 0 pd[8, 28, -1] pd[8, 25, -1, 0] pd[8, 26, -1, 0] pd[8, 27, -1, 0] pd[8, 24, -1, 1] pd[8, 25, -1, 1] rs(-1)[8, 24] pd[7, 27, 0, -1] pd[8, 26, -1, 1] pd[7, 25, 0, 0] pd[7, 26, 0, 0] pd[7, 27, 0, 0] pd[7, 24, 0, 1] pd[7, 25, 0, 1] pd[7, 26, 0, 1] rs(-1)[7, 24]+rs(0)[7, 24] pd[6, 27, 1, -1] pd[6, 28, 1, -1] pd[7, 26, 0, -1] pd[7, 28, 0, -1] pd[6, 24, 1, -1] pd[7, 24, 0, 0] pd[7, 23, 0, 1] rs(-1)[6, 26]+ rs(0)[6, 26] pd[6, 25, 1, -1] pd[6, 26, 1, -1] pd[6, 23, 1, 0] pd[6, 24, 1, 0] pd[6, 25, 1, 0] pd[6, 26, 1, 0] pd[6, 27, 1, 0] pd[6, 24, 1, 1] pd[6, 25, 1, 1] pd[6, 26, 1, 1] pd[7, 23, 0, 0] pd[7, 22, 0, 1] pd[6, 22, 1, 1] pd[6, 23, 1, 1] newv[6, 24]

Second vantage point: one pixel

Second vantage point: one pixel

5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28

5, 22 5, 23 5, 24 5, 25 5, 26 5, 27 5, 28 5, 29 6, 22 6, 23 6, 24 6, 25 6, 26 6, 27 6, 28 6, 29 7, 22 7, 23 7, 24 7, 25 7, 26 7, 27 7, 28 7, 29 8, 22 8, 23 8, 24 8, 25 8, 26 8, 27 8, 28 8, 29 9, 22 9, 23 9, 24 9, 26 9, 27 9, 28 9, 29 9, 25

v[6, 23] 0 0 pd[7, 24, -1] v[6, 25] v[6, 24] 0 pd[7, 25,

v[6, 23] 0 0 pd[7, 24, -1] v[6, 25] v[6, 24] 0 pd[7, 25, -1] v[6, 26] 0 pd[7, 26, -1] v[6. 27] 0 pd[7, 27, -1] pd[7, 28 – 1, -1]

v[6, 23] 0 0 pd[7, 24, -1] pd[7, 23, -1, 0] v[6, 25] v[6,

v[6, 23] 0 0 pd[7, 24, -1] pd[7, 23, -1, 0] v[6, 25] v[6, 24] 0 pd[7, 25, -1] pd[7, 24, -1, 0] v[6, 26] 0 pd[7, 26, -1] pd[7, 25, -1, 0] pd[7, 27, -1] pd[7, 26, -1, 0] v[6. 27] 0 pd[7, 28, -1] pd[7, 27, -1, 0]

v[6, 23] 0 0 pd[7, 24, -1] v[6, 25] v[6, 24] 0 pd[7, 25,

v[6, 23] 0 0 pd[7, 24, -1] v[6, 25] v[6, 24] 0 pd[7, 25, -1] v[6, 26] v[6. 27] 0 pd[7, 26, -1] pd[7, 27, -1] 0 pd[7, 28, -1] pd[7, 23, -1, 0] pd[7, 24, -1, 0] pd[7, 25, -1, 0] pd[7, 26, -1, 0] pd[7, 27, -1, 0] pd[7, 22, -1, 1] pd[7, 23, -1, 1] pd[7, 24, -1, 1] pd[7, 25, -1, 1] pd[7, 26, -1, 1] rs(-1)[7, 25]

v[8, 23] 0 0 pd[9, 24, -1] v[8, 25] v[8, 24] 0 0 pd[9,

v[8, 23] 0 0 pd[9, 24, -1] v[8, 25] v[8, 24] 0 0 pd[9, 25, -1] v[8, 26] pd[9, 26, -1] pd[9, 27, -1] v[8, 27] 0 pd[9, 28, -1] pd[9, 23, -1, 0] pd[9, 24, -1, 0] pd[9, 25, -1, 0] pd[9, 26, -1, 0] pd[9, 27, -1, 0] pd[9, 22, -1, 1] pd[9, 23, -1, 1] pd[9, 24, -1, 1] pd[9, 25, -1, 1] pd[9, 26, -1, 1] pd[8, 25, 0, -1] pd[8, 26, 0, -1] pd[8, 27, 0, -1] pd[8, 28, 0, -1] pd[8, 24, 0, 0] pd[8, 25, 0, 0] pd[8, 26, 0, 0] pd[8, 27, 0, 0] pd[8, 22, 0, 1] rs(-1)[7, 25]+ rs(0)[7, 25] pd[8, 23, 0, 1] pd[8, 24, 0, 1] pd[8, 25, 0, 1] pd[8, 26, 0, 1] pd[7, 24, 1, -1] pd[7, 25, 1, -1] pd[7, 26, 1, -1] pd[7, 23, 1, 0] pd[7, 24, 1, 0] pd[8, 24, 0, -1] pd[8, 23, 0, 0] pd[7, 22, 1, 1] pd[7, 23, 1, 1] pd[7, 25, 1, 0] pd[7, 24, 1, 1] pd[7, 27, 1, -1] pd[7, 28, 1, -1] pd[7, 26, 1, 0] pd[7, 27, 1, 0] pd[7, 26, 1, 1] pd[7, 25, 1, 1] newv[7, 25]

Evaluation • • Xilinx Virtex-E FPGA XCV 200 E Number of CLB slices required

Evaluation • • Xilinx Virtex-E FPGA XCV 200 E Number of CLB slices required = 492 Clock frequency = 101. 9 MHz Time spent in filtering a 256 image = 642 s

Comparison System description Running time Speedup 866 Pentium system MHz III 20 ms 31

Comparison System description Running time Speedup 866 Pentium system MHz III 20 ms 31 400 MHz Sun Ultra 5 system 53 ms 84