2020-11-22
Digital Image Processing: Histograms and Frequency Domain Filtering
Starry, starry night
Paint your palette blue and grey
Look out on a summer’s day
With eyes that know the darkness in my soul
Shadows on the hills
Sketch the trees and the daffodils
Catch the breeze and the winter chills
In colors on the snowy linen land
Click here to download the code
and sample images
1. Code Structure and Usage Instructions
1.1 Code Structure
The program written for this experiment is a command-line executable in C++ for the Linux environment, rather than the long-outdated Windows MFC program mentioned in the course PPT.
- ps.cpp: Main program code (our own PhotoShop)
- bmp.cpp and bmp.hpp: Encapsulate BMP image processing operations
- color.cpp and color.hpp: Encapsulate color space operations
- fft.cpp and fft.hpp: Fast Fourier Transform (FFT) and frequency domain filtering operations for images
- utils.cpp and utils.hpp: General utility functions
1.2 Compilation
Compile using the pre-written Makefile,
which generates the ps binary file after
compilation. Input the paths of input and output files,
as well as parameters for processing the file, via the
command line.
Since this experiment uses the fftw3
Fast Fourier Transform library, ensure the library is
installed before compilation:
sudo apt install libfftw3-dev
When compiling afterward, use the following
parameters for linking:
-lfftw3 -lm
This has already been implemented in the
Makefile.
1.3 Usage Instructions
This program reads BMP files by specifying paths via
the command line, and can perform corresponding image
processing with specified parameters. Parameters can be
stacked, in which case the processing operations will be
executed sequentially. The usage instructions are as
follows:
Usage: ./ps input_img [-d|-l|-h|-e|-n|-b|-o]
Parameters:
-d Show DFT Transform spectrum
-l filter_size Gaussian low pass filter
-h filter_size Gaussian high pass filter
-e Equalization histogram of input image
-n ref_img Normalization histogram of input image with reference image
-b threshold Binarize image into 2-color with given threshold
-o output_img Output image
For example, to perform low-pass filtering on the
image test.bmp, then equalize it, and
output to test_out.bmp, use:
./ps test.bmp -l 20 -e -o test_out.bmp
The BMP class is defined in bmp.hpp:
// BMP Read & Write & Manipulate Class
class BMP
{
private:
Header header; // BMP Header
Color *px; // pixels
Color *colors; // color palette
int colors_num; // size of color palette
int padding; // padding bytes of each line
int *his; // V channel histogram
Color get_color(int index);
int find_color(Color &px);
void read1(FILE *fp);
void read4(FILE *fp);
void read8(FILE *fp);
void read24(FILE *fp);
void write1(FILE *fp);
void write4(FILE *fp);
void write8(FILE *fp);
void write24(FILE *fp);
void to1();
void to8();
void to24();
void hsv();
void rgb();
public:
BMP(const char *filename);
~BMP();
void info();
void dft();
void low_pass_filter(double filter_size);
void high_pass_filter(double filter_size);
void equalize();
void normalize(int *his);
void binarize(int threshold);
int *histogram();
void write(const char *filename);
};
Subsequent BMP file processing is carried out around
this class, where:
- header: Stores the header information
- px: Stores pixel data
- colors: Stores the color palette
- colors_num: Number of color palette entries
- padding: Number of padding bytes for 4-byte alignment per row (explained later)
- his: Stores the V-channel histogram
Subsequent functions and variables will be explained
or reflected in comments, and will not be repeated
here.
The Header structure is defined in
bmp.hpp, with 1-byte
alignment. The cstdint library is
used to ensure consistent integer bit widths:
#pragma once
#pragma pack(push, 1)
// BMP File Header
struct Header
{
std::uint16_t bmp; // Magic number for file
std::uint32_t file_size; // Size of file
std::uint16_t reserved_1; // Reserved
std::uint16_t reserved_2; // Reserved
std::uint32_t offset; // Offset to bitmap data
std::uint32_t header_size; // Size of info header
std::uint32_t width; // Width of image
std::uint32_t height; // Height of image
std::uint16_t planes; // Number of color planes
std::uint16_t depth; // Number of bits per pixel
std::uint32_t compression; // Type of compression to use
std::uint32_t image_size; // Size of image data
std::uint32_t ppm_x; // X pixels per meter
std::uint32_t ppm_y; // Y pixels per meter
std::uint32_t colors; // Number of colors used
std::uint32_t colors_important; // Number of important colors
};
#pragma pack(pop)
Key points to note are:
- offset: Offset from the start of the file to the bitmap data area
- width: Image width
- height: Image height
- depth: Bit depth (number of bits per pixel), which determines the type of image: 1 for 2-color images, 4 for 16-color images, 8 for 256-color images, and 24 for true-color images
- colors: Number of colors in the palette. In most cases this value is 0, and the actual number of palette entries is then determined by the bit depth:
  - 2 palette entries by default for a bit depth of 1
  - 16 palette entries by default for a bit depth of 4
  - 256 palette entries by default for a bit depth of 8
  - No palette for a bit depth of 24
Each palette entry consists of 4 bytes, in the order
of R, G, B, and Reserved. Adjacent palette entries are
contiguous. The Color structure can be
defined in color.hpp:
// Color Palette
struct Color
{
bool type = false; // false for HSV, true for RGB
std::uint8_t r; // Red
std::uint8_t g; // Green
std::uint8_t b; // Blue
std::uint8_t a; // Reserved
void hsv();
void rgb();
};
Considering subsequent color space conversion, two
conversion functions and a type identifier
are provided here. Therefore, creating a
Color array can represent the color
palette. Additionally, two functions
(get_color and find_color)
that map indices in the palette array to the
corresponding Color structures are written
for subsequent file reading and writing:
// Get a color from color table
// index: color index
Color BMP::get_color(int index)
{
Color color;
color.r = colors[index].r;
color.g = colors[index].g;
color.b = colors[index].b;
color.a = colors[index].a;
return color;
}
// Find a color index in color table
// color: color, which index is to be found
int BMP::find_color(Color &color)
{
for (int i = 0; i < colors_num; i++)
{
if (color.r == colors[i].r && color.g == colors[i].g &&
color.b == colors[i].b && color.a == colors[i].a)
return i;
}
return -1;
}
The data area is stored left to right and bottom to top. Each row runs from left to right with pixels stored contiguously, but zero-valued padding bytes are appended to each row so that its length is a multiple of 4 bytes (known as 4-byte alignment). Therefore, direct reading/writing with plain nested loops is not feasible; the per-row padding must be calculated first:
padding = (4 - (((header.width * header.depth + 7) / 8) % 4)) % 4;
Since the minimum unit for each read/write operation
is 1 byte, corresponding calculations must be performed
for bit depths less than 8:
- For 2-color images (1-bit depth): the 8 bits of one byte represent 8 pixels, with the higher bits first (the leftmost pixel in the most significant bit, matching the shift in write1 below). The value (0 or 1) of each bit is its color index in the palette.
- For 16-color images (4-bit depth): 8 bits in 1 byte
are divided into two groups of 4 bits, representing two
pixels. The lower 4 bits represent the first
pixel, and the higher 4 bits represent the second
pixel. The value corresponding to the 4 bits is
also the color index in the palette.
- For 256-color images (8-bit depth): 1 byte
represents 1 pixel, and the value here also indicates
the color index in the palette, not the
grayscale value from 0 to 255.
- For true-color images (24-bit depth): 3 bytes
represent 1 pixel, arranged in B, G, R channel order.
The content of 1 byte for each channel represents the
pixel value (0 to 255) of that channel (true-color
images have no palette).
2.4 Reading and Writing BMP Images
Writing functions are similar to reading functions;
reading is used as an example here. In the BMP class,
the constructor is defined as the read function, which
takes a string parameter pointing to the filename:
// Constructor
// filename: BMP image file
BMP::BMP(const char *filename)
{
uint8_t byte;
FILE *fp;
// File header
if ((fp = fopen(filename, "rb")) == NULL)
throw_ex("fopen");
if (fread(&header, sizeof(Header), 1, fp) < 1) // fread returns a size_t, never negative
throw_ex("fread");
// Color palette
colors_num = (header.colors == 0 && header.depth != 24) ? (1 << header.depth) : header.colors;
colors = new Color[colors_num];
for (int i = 0; i < colors_num; i++)
{
fread(&byte, 1, 1, fp);
colors[i].r = byte;
fread(&byte, 1, 1, fp);
colors[i].g = byte;
fread(&byte, 1, 1, fp);
colors[i].b = byte;
fread(&byte, 1, 1, fp);
colors[i].a = byte;
}
// Pixels
padding = (4 - (((header.width * header.depth + 7) / 8) % 4)) % 4;
px = new Color[header.width * header.height];
switch (header.depth)
{
case 1:
read1(fp);
break;
case 4:
read4(fp);
break;
case 8:
read8(fp);
break;
case 24:
read24(fp);
break;
}
fclose(fp);
his = new int[256];
}
The file header, color palette, and pixels are read
in sequence. After reading the file header, the number
of palette entries (colors_num) and the
number of alignment padding bytes per row of pixels
(padding) are calculated using information
from the file header. Then, different reading functions
are called based on different bit depth scenarios.
read4 is used as an example:
// Read 4-bit BMP file
// fp: file pointer
void BMP::read4(FILE *fp)
{
uint8_t byte;
fseek(fp, header.offset, SEEK_SET);
for (int i = header.height - 1; i >= 0; i--)
{
for (int j = 0; j < header.width; j += 2)
{
if (fread(&byte, 1, 1, fp) < 1)
throw_ex("fread");
px[header.width * i + j] = get_color(byte & 0x0F);
if (j + 1 < header.width) // guard against odd widths
px[header.width * i + j + 1] = get_color(byte >> 4);
}
fseek(fp, padding, SEEK_CUR);
}
}
First, fseek is used to jump to the data
area. Since the BMP format is stored from bottom to top,
but in digital image processing, the x-axis is generally from top
to bottom, rows are read in reverse order. When reading
pixels, since it is a 4-bit image (16-color), one byte
contains two pixels, so these two pixels are split using
bitwise operations and written to the px
array in sequence. Additionally, the
padding area is skipped after reading each
row.
Writing operations are similar. For 1-bit (2-color)
BMP images, the write operation is as follows:
// Write 1-bit BMP file
// fp: file pointer
void BMP::write1(FILE *fp)
{
uint8_t byte, zero = 0;
fseek(fp, header.offset, SEEK_SET);
for (int i = header.height - 1; i >= 0; i--)
{
for (int j = 0; j < header.width; j += 8)
{
byte = 0;
for (int k = 0; k < 8 && j + k < header.width; k++) // guard widths not divisible by 8
byte += find_color(px[i * header.width + j + k]) << (7 - k);
fwrite(&byte, 1, 1, fp);
}
for (int k = 0; k < padding; k++) // write each padding byte explicitly
fwrite(&zero, 1, 1, fp);
}
}
For the convenience of subsequent operations, images
can be uniformly converted to 24-bit color storage, so
the to24 function is written, which simply
adjusts the image size, offset, number of palette
entries, and bit depth information in the header
structure:
// Convert current BMP image into 24-bit
void BMP::to24()
{
header.offset = 54;
header.depth = 24;
padding = (4 - (((header.width * header.depth + 7) / 8) % 4)) % 4;
header.image_size = (header.width * 3 + padding) * header.height;
header.file_size = header.offset + header.image_size;
colors_num = 0;
}
Meanwhile, for the convenience of binarization and
grayscale processing, the to1 function
(converts the image to 1-bit storage) and
to8 function (converts to 256-color
grayscale image storage) are also written:
// Convert current BMP image into 1-bit
void BMP::to1()
{
header.offset = 62;
header.depth = 1;
header.colors = 0;
padding = (4 - (((header.width * header.depth + 7) / 8) % 4)) % 4;
header.image_size = ((header.width + 7) / 8 + padding) * header.height;
header.file_size = header.offset + header.image_size;
delete[] colors;
colors_num = 2;
colors = new Color[colors_num];
colors[0].r = 0;
colors[0].g = 0;
colors[0].b = 0;
colors[0].a = 0;
colors[1].r = 255;
colors[1].g = 255;
colors[1].b = 255;
colors[1].a = 0;
}
// Convert current BMP image into 8-bit
void BMP::to8()
{
header.offset = 1078;
header.depth = 8;
header.colors = 0;
padding = (4 - (((header.width * header.depth + 7) / 8) % 4)) % 4;
header.image_size = (header.width + padding) * header.height;
header.file_size = header.offset + header.image_size;
delete[] colors;
colors_num = 256;
colors = new Color[colors_num];
for (int i = 0; i < colors_num; i++)
{
colors[i].r = i;
colors[i].g = i;
colors[i].b = i;
colors[i].a = 0;
}
}
3. Histogram Equalization and Normalization
Since all pixel data is stored in the px
array, subsequent image processing operations can be
carried out around the px array, as well as
the width (header.width) and height
(header.height) in the header
information.
Equalization and normalization here only consider
cases where grayscale values range from 0 to 255.
3.1 Obtaining the Histogram Array
A histogram is the statistical distribution of
grayscale values, with the horizontal axis representing
grayscale values and the vertical axis representing the
number of pixels with that grayscale value. Therefore,
the histogram function is written to obtain
the histogram array:
// Calculate image histogram
int *BMP::histogram()
{
hsv();
for (int i = 0; i < 256; i++)
his[i] = 0;
for (int i = 0; i < header.width * header.height; i++)
his[px[i].b]++;
rgb();
return his;
}
The result is stored in the his array.
3.2 Grayscale Image Equalization
According to the formula in the “Digital Image Processing” textbook, let map[i] be the grayscale value to which a pixel with original grayscale value i should be mapped after equalization. Then:
map[i]=\mathrm{round}\left(\frac{255}{width\times height}\sum_{k=0}^{i}histogram[k]\right)
That is, each grayscale value is mapped through the scaled cumulative distribution of the histogram. Iterating over all pixels and setting i\leftarrow map[i] completes the equalization. Partial code can be written as follows (the V channel is stored in the b field):
// Histogram equalization
void BMP::equalize()
{
...
int *map = new int[256];
int sum;
double size = header.width * header.height;
// Value Channel
sum = 0;
for (int i = 0; i < 256; i++)
{
sum += his[i];
map[i] = (int)round(255 * sum / size);
}
for (int i = 0; i < size; i++)
px[i].b = map[px[i].b];
for (int i = 0; i < colors_num; i++)
colors[i].b = map[colors[i].b];
...
histogram();
delete[] map;
}
Note that the colors in the palette must also be
equalized to ensure correct correspondence during
saving. The effect is shown in the upper right corner of
the figure below.
3.3 Color Image Equalization
However, the above only applies to grayscale images
where the RGB channel values are the same. For color
images, the first thought is to process the three
channels separately and perform equalization on each.
However, this leads to color inconsistency issues, as
shown in the lower right corner:

Because the purpose of equalization is to
balance brightness, and none of the RGB
channels can independently reflect brightness,
color space conversion is required
first. Here, conversion from RGB to HSV is
chosen, where the V channel represents brightness.
Two mutual conversion functions for these channels
are written in color.cpp:
// RGB to HSV in-place
void Color::hsv()
{
if (type)
return;
double R, G, B, H, S, V;
R = r / 255.0;
G = g / 255.0;
B = b / 255.0;
double X_max = max(R, max(G, B));
double X_min = min(R, min(G, B));
double C = X_max - X_min;
V = X_max;
if (C == 0)
H = 0;
else if (V == R)
H = 60 * (0 + (G - B) / C);
else if (V == G)
H = 60 * (2 + (B - R) / C);
else
H = 60 * (4 + (R - G) / C);
if (H < 0)
H += 360; // wrap negative hues into 0~360
if (V == 0)
S = 0;
else
S = C / V;
r = (uint8_t)round(H / 2); // H in 0~180
g = (uint8_t)round(S * 255); // S in 0~255
b = (uint8_t)round(V * 255); // V in 0~255
type = true;
}
// HSV to RGB in-place
void Color::rgb()
{
if (!type)
return;
double R, G, B, H, S, V;
H = (r * 2) / 60.0;
S = g / 255.0;
V = b / 255.0;
double C = V * S;
double X = C * (1 - fabs(fmod(H, 2) - 1));
double M = V - C;
if (0 <= H && H <= 1)
R = C, G = X, B = 0;
else if (1 < H && H <= 2)
R = X, G = C, B = 0;
else if (2 < H && H <= 3)
R = 0, G = C, B = X;
else if (3 < H && H <= 4)
R = 0, G = X, B = C;
else if (4 < H && H <= 5)
R = X, G = 0, B = C;
else if (5 < H && H <= 6)
R = C, G = 0, B = X;
else
R = 0, G = 0, B = 0;
r = (uint8_t)round((R + M) * 255);
g = (uint8_t)round((G + M) * 255);
b = (uint8_t)round((B + M) * 255);
type = false;
}
An in-place mode is adopted, where the original r, g, and b fields hold H, S, and V respectively. Since H typically ranges from 0 to 360 while S and V range from 0 to 1, scaling is applied so that each component fits in one byte: H is halved into 0~180, and S and V are scaled into 0~255.
With channel conversion, for color image
equalization, it is only necessary to convert to the HSV
space before calculation, perform equalization on the V
channel, and then convert back to the RGB space after
calculation. The effect is shown in the figure
below:

3.4 Histogram Matching
Similarly, the V channel is used for explanation here
(the V channel of a grayscale image corresponds to its
grayscale value). According to the textbook formula, let map[i] be the grayscale value to which a pixel with original grayscale value i should be converted after matching; histogram_2 is the target histogram to be approximated. Then, after first equalizing the image:
g[i]=\mathrm{round}\left(\frac{255}{width\times height}\sum_{k=0}^{i}histogram_2[k]\right)
map[i]=g^{-1}(i)
Since g[i] is monotonically non-decreasing, the value of g^{-1}(i) can be obtained with a simple binary search. The lower_bound function from the C++ <algorithm> library is used to implement this. The program is written as follows:
// Histogram normalization
// his: new distribution of histogram
void BMP::normalize(int *his)
{
equalize();
hsv();
int *g = new int[256];
int *map = new int[256];
int sum;
double size = header.width * header.height;
// We process only value channel
sum = 0;
for (int i = 0; i < 256; i++)
{
sum += his[i];
g[i] = (int)round(255 * sum / size);
}
for (int i = 0; i < 256; i++)
map[i] = lower_bound(g, g + 256, i) - g;
for (int i = 0; i < size; i++)
px[i].b = map[px[i].b];
rgb();
histogram();
delete[] g;
delete[] map;
}
Using the histogram distribution of a 256-color
grayscale image to normalize the histogram of a color
image, the effect is shown in
star24_out.bmp on the right:

3.5 Image Binarization
We can binarize the image based on the histogram
distribution. In fact, it is only necessary to determine
whether the pixel is 0 or 1 based on whether the value
of the V channel exceeds a set threshold. At the same
time, to save space, the binarized image can be
converted to 2-color storage, so the previously written
to1 function is used:
// Binarize an image into black and white
// threshold: Value channel threshold
void BMP::binarize(int threshold)
{
hsv();
for (int i = 0; i < header.height; i++)
for (int j = 0; j < header.width; j++)
if (px[i * header.width + j].b > threshold)
{
px[i * header.width + j].b = 255;
px[i * header.width + j].g = 255;
px[i * header.width + j].r = 255;
}
else
{
px[i * header.width + j].b = 0;
px[i * header.width + j].g = 0;
px[i * header.width + j].r = 0;
}
to1();
}
Binarization with a threshold of 100 is performed on
both color and 16-color images, with the effect shown in
the figure below:

4. FFT and Frequency Domain Filtering
The FFT algorithm is implemented here using the
fftw3 library, which can be installed on
Linux with the following command:
sudo apt install libfftw3-dev
4.1 Image Conventions
For image coordinates, the same convention as in the
textbook is used here: the origin is at the upper left
corner, the x-axis
represents height (denoted by i in the
program), and the y-axis represents width
(denoted by j in the program).

For spectrum display: For grayscale
images, DFT can be directly performed using grayscale
values, but for color images, conversion to the HSV
space is also performed here, and the V channel is used
to calculate the DFT.
However, for frequency domain
filtering: For color images, DFT, filtering,
and IDFT calculations are performed separately for each
of the three channels.
4.2 2D DFT Spectrum Display
4.2.1 Code Framework
According to the FFTW documentation, the code
framework for the usage steps is as follows:
#include <fftw3.h>
...
{
fftw_complex *in, *out;
fftw_plan p;
...
in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
...
fftw_execute(p); /* repeat as needed */
...
fftw_destroy_plan(p);
fftw_free(in); fftw_free(out);
}
- First, allocate storage for the input and output complex arrays with fftw_malloc.
- Then, create a computation plan with one of the fftw_plan_* functions.
- Next, execute the computation with fftw_execute.
- Finally, release the plan and the complex arrays with fftw_destroy_plan and fftw_free.
fftw_complex represents a complex
number; in fact, each fftw_complex is a
double [2], where index 0 represents the
real part and index 1 represents the imaginary part.
To implement spectrum display, the px
array first needs to be converted to an
fftw_complex array. The real part can be
simply assigned here, but this results in the spectrum
origin being at the upper left corner (same as the
image). To center the spectrum origin in the image, the
conclusion from the textbook can be used:
If the Fourier transform of an image f(x,\,y) is F(u,\,v), then:
f(x,\,y)(-1)^{x+y}\Leftrightarrow F\left(u-\frac{height}{2},\,v-\frac{width}{2}\right)
Therefore, (-1)^{i+j} can be multiplied
when assigning the real part:
in = (fftw_complex *)fftw_malloc(sizeof(fftw_complex) * width * height);
out = (fftw_complex *)fftw_malloc(sizeof(fftw_complex) * width * height);
if (in == NULL || out == NULL)
throw_ex("fftw_malloc");
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
in[i * width + j][0] = px[i * width + j].b * ((i + j) % 2 ? -1 : 1); // Make sure (0, 0) is at the center of the spectrum
in[i * width + j][1] = 0;
}
4.2.3 Calculating DFT
According to the documentation, a
fftw_plan_dft_2d plan needs to be created
here:
p = fftw_plan_dft_2d(height, width, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
fftw_execute(p);
After calculation, the results are stored in
out.
4.2.4 Spectrum Visualization
Since the calculation results are complex numbers,
the modulus of the spectrum is used to determine the
grayscale value (consistent with the textbook).
Additionally, as mentioned on page 268 (page 155 of the
Chinese version) of the textbook, since the spectral
intensity varies significantly across different
frequencies, the dynamic range can be improved by taking
the logarithm of the spectral intensity.
Finally, since the actual grayscale values range from
0 to 255, the grayscale values need to be uniformly
scaled. Let the spectrum image be f'(x,\,y), and the
previous Fourier transform result be F(u,\,v). The final formula
is as follows:
f'(x,\,y)=\mathrm{round}\left(\frac{\log(|F(x,\,y)|+1)}{\max\limits_{0\leq u<height,\,0\leq v<width}\log(|F(u,\,v)|+1)}\times 255\right)
The code can be written as follows:
double f_max = 0;
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
double re = out[i * width + j][0];
double im = out[i * width + j][1];
f_max = max(f_max, log(sqrt(re * re + im * im) + 1)); // use log to enhance low-power frequencies
}
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
double re = out[i * width + j][0];
double im = out[i * width + j][1];
uint8_t value = (uint8_t)round(log(sqrt(re * re + im * im) + 1) / f_max * 255); // use log to enhance low-power frequencies
px[i * width + j].b = value;
px[i * width + j].g = value;
px[i * width + j].r = value;
}
4.2.5 Function Prototype and Testing
The above calculation steps are written as the
fft_dft function with the following
prototype:
// Calculate DFT of an image, and replace its px in-place
// width: image width
// height: image height
// px: image pixel data, and after this, its data will be replaced by its Fourier spectrum in grayscale
void fft_dft(int width, int height, Color *px)
This function is then called in bmp.cpp.
Since the spectrum is a grayscale image, the image is
converted to an 8-bit 256-color image to reduce storage
space:
// Convert image to Fourier spectrum of its V channel
void BMP::dft()
{
hsv();
fft_dft(header.width, header.height, px);
to8();
}
Testing on images similar to those in the textbook
yields the following results, with the spectra of the
two images shown on the right:

Testing on Lena’s grayscale and color images yields
the following results:

It can be seen that there is no significant
difference between the spectra of the grayscale and
color images, because the spectrum of the color image is
calculated using the V channel.
4.3 Frequency Domain Filtering
4.3.1 Filter Design
Gaussian low-pass and high-pass filters from the
textbook are used here. The filter can be expressed as
H(u,\,v), and
multiplying H(u,\,v) by
the frequency-domain F(u,\,v) yields the new
frequency domain. Let size be the filter size,
then:
The Gaussian low-pass filter is:
H(u,\,v)=\exp\left(-\frac{(u-u_0)^2+(v-v_0)^2}{2\times size^2}\right)
The Gaussian high-pass filter is:
H(u,\,v)=1-\exp\left(-\frac{(u-u_0)^2+(v-v_0)^2}{2\times size^2}\right)
where:
u_0=\frac{height}{2}\quad v_0=\frac{width}{2}
For a low-pass filter, a larger size means more frequencies
are allowed to pass, resulting in less impact on the
original image; for a high-pass filter, a larger size means more frequencies
are removed, resulting in a greater impact on the
original image. These two filters can be written as the
following functions:
// Gaussian low-pass filter
// width: image width
// height: image height
// out: frequency domain data
// filter_size: size of filter in frequency domain
void gauss_lpf(int width, int height, fftw_complex *out, double filter_size)
{
double x_center = height / 2.0;
double y_center = width / 2.0;
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
double d2 = (i - x_center) * (i - x_center) + (j - y_center) * (j - y_center);
double h = exp(-d2 / (2 * filter_size * filter_size));
out[i * width + j][0] *= h;
out[i * width + j][1] *= h;
}
}
// Gaussian high-pass filter
// width: image width
// height: image height
// out: frequency domain data
// filter_size: size of filter in frequency domain
void gauss_hpf(int width, int height, fftw_complex *out, double filter_size)
{
double x_center = height / 2.0;
double y_center = width / 2.0;
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
double d2 = (i - x_center) * (i - x_center) + (j - y_center) * (j - y_center);
double h = 1 - exp(-d2 / (2 * filter_size * filter_size));
out[i * width + j][0] *= h;
out[i * width + j][1] *= h;
}
}
4.3.2 Inverse Transform
After obtaining the new out complex array through the filter, an inverse transform needs to be performed on it. According to the textbook formula, the IDFT can be calculated using the forward FFT:
If:
F(u,\,v)=\mathrm{DFT}[f(x,\,y)]
then:
f(x,\,y)=\frac{1}{width\times height}\mathrm{DFT}\left[\overline{F(u,\,v)}\right]
Only the complex conjugate needs to be taken, and the
result divided by the image size. Additionally, since
the image was multiplied by (-1)^{x+y} before DFT, it
needs to be multiplied by (-1)^{x+y} again here to
restore the original image.
Furthermore, only the real part of the inverse
transform result, i.e., \mathrm{Re}\,f(x,y), needs to
be taken. The code can be written as follows:
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
out[i * width + j][1] *= -1;
p = fftw_plan_dft_2d(height, width, out, out, FFTW_FORWARD, FFTW_ESTIMATE);
fftw_execute(p);
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
{
double re = out[i * width + j][0] / (width * height) * ((i + j) % 2 ? -1 : 1);
if (re > 255)
re = 255;
if (re < 0)
re = 0;
uint8_t value = (uint8_t)round(re);
if (c == 0) // c: color channel being processed (0 = B, 1 = G, 2 = R)
px[i * width + j].b = value;
else if (c == 1)
px[i * width + j].g = value;
else
px[i * width + j].r = value;
}
4.3.3 Function Prototype and Testing
The above steps are written as the
fft_dft_idft function with the following
prototype:
// Calculate DFT of an image, perform some processing in the frequency domain, and then IDFT
// width: image width
// height: image height
// px: image pixel data, and after this, its data will be replaced by the processed image
// proc: processing function for the frequency domain
// filter_size: size of filter in frequency domain
void fft_dft_idft(int width, int height, Color *px, void (*proc)(int, int, fftw_complex *, double), double filter_size)
where proc is one of the two filter
functions.
This function is called in bmp.cpp.
Since filtering 2-color and 16-color images generates
new colors, they need to be converted to 256-color;
color images remain unchanged:
// Gaussian low-pass filter
// filter_size: image is blurrier if smaller
void BMP::low_pass_filter(double filter_size)
{
if (header.depth <= 8)
to8();
fft_dft_idft(header.width, header.height, px, gauss_lpf, filter_size);
}
// Gaussian high-pass filter
// filter_size: image is sharper if larger
void BMP::high_pass_filter(double filter_size)
{
if (header.depth <= 8)
to8();
fft_dft_idft(header.width, header.height, px, gauss_hpf, filter_size);
}
Low-pass filtering is performed on Lena’s color image
with filter sizes of 2, 4, 8, 16, and 32:

As expected, a smaller filter size passes fewer high-frequency components, producing a smoother image. High-pass filtering is then performed with the same sizes (2, 4, 8, 16, 32):

It can be seen that a larger filter size removes more low-frequency components, making edges more prominent. Moreover, since the low-frequency components near DC are removed, smooth areas tend toward black, with non-black colors appearing only at edges. To increase the contrast, histogram equalization is performed on these five images:

This further shows that a larger filter size results
in more obvious extraction of edge features.
The following are tests on the test images: low-pass
filtering is applied to the first row, and high-pass
filtering to the second row, with a filter size of 10
for both:

Significant changes in the spectra can also be
observed.
For the second row, since the displayed spectrum is rescaled, the center (where the frequency intensity is highest) keeps the same color, but the originally dark surrounding areas become visibly lighter.
The algorithm is also effective for non-square
patterns. Below is the result of low-pass filtering with
a size of 10:
