How To Use Fourier Transform On Images – C# Guide
Fourier transform is an equation that turns normal pixel values into complex numbers. But to know what these complex numbers mean, we should give a little more context to them by...
Filter by Category
Fourier transform is an equation that turns normal pixel values into complex numbers. But to know what these complex numbers mean, we should give a little more context to them by...
C# tutorial on how to achieve histogram equalization on an image.
In this tutorial we will be talking about intensity level slicing in image processing. What is Intensity Level Slicing It’s a process that highlights pixels in an arbitrary...
This post is a short revision of Contrast Stretch post we already worked on in the past. Where we talked about histogram equalization, which is a little more complex method than...
What is Log Transformation Anyway? Log transformation in image processing is a part of gray level transformations. It works by transforming each pixel individually. It is usually...
Where Do We Start? There are numerous ways you can flip an image horizontally. It’s pretty basic functionality in image processing. I wanted to challenge myself and not just...
Let’s Get Started A very basic problem that we face in image processing is how to scale images without making them look deformed when we want to make it of an arbitrary...
List of contents: Short intro First Method – Google Opinion Rewards Second Method – Taking & Fortifying Gyms Third Method – Answer a Few Questions List of...
Zero padding an image is useful when we’re convolving it with a filter of certain size. How much padding should we use depends on how big our filter is. What is the purpose...
Gamma correction is a process, which allows you to set the brightness of screen. You can often run into this setting in video games, where you set the brightness of your screen...
Fourier transform is an equation that turns normal pixel values into complex numbers. But to know what these complex numbers mean, we should give a little more context to them by explaining what Fourier transform actually does to data.
When we transform some data with Fourier transform, we convert it into frequency domain. And as it happens to be, we can divide complex frequency waves into a number of simple sinusoids.
A Fourier transform of an array of data, will look like one big wave in the middle with smaller and smaller waves extending into both directions.
Something like the image above. The structure and the frequency values are important because we can use inverse Fourier transform to turn it back into original form. We call an ordinary image being in spatial domain.
The purpose of transforming our data into frequency domain is so we can apply changes to a set of frequencies to process our data in the spatial domain.
If we want to use the transform on images, we need to use extended equation that allows us to calculate data in 2 dimensions.
Let’s take a look at the formulas that allows us to calculate only in 1 dimension first though.
The one above turns our data into frequency domain and below is it’s inverse that brings it back to spatial domain.
Now let’s take a look how the extended versions look like.
Let’s talk about what everything in the formulas represent.
As you can see M and N values are the same for both formulas, which means that the frequency map will be the same size as the image.
Using the formulas above will work in theory just fine. But we’ll run into a problem when we’ll try to implement it in practice.
The bigger the image you’re trying to transform, the longer and beefier your formula will get for your computer to process it. So it is next to impossible to do it this way, because it could literally take years or more likely melt your processor into the motherboard.
For practical use, we need Fast Fourier Transform or FFT for short. FFT splits our set of data into even and odd index positions and calculates them individually and then puts them together. Not only that, we can create a recursive structure in our code to reuse already calculated values shorten our processing time significantly.
What this formulation also does is calculates each dimension separately. As you’ll be able to see in our code, we will calculate our coulumns first, transpose the data and then calculate our rows.
First step we need to do is to convert pixel values from our image into a complex map of numbers. By taking pixel value in a channel and putting it as a Real component with imaginary component resulting to 0.
public static Complex[][] ToComplex(Bitmap image)
{
int w = image.Width;
int h = image.Height;
BitmapData input_data = image.LockBits(
new Rectangle(0, 0, w, h),
ImageLockMode.ReadOnly,
PixelFormat.Format32bppArgb);
int bytes = input_data.Stride * input_data.Height;
byte[] buffer = new byte[bytes];
Complex[][] result = new Complex[w][];
Marshal.Copy(input_data.Scan0, buffer, 0, bytes);
image.UnlockBits(input_data);
int pixel_position;
for (int x = 0; x < w; x++)
{
result[x] = new Complex[h];
for (int y = 0; y < h; y++)
{
pixel_position = y * input_data.Stride + x * 4;
result[x][y] = new Complex(buffer[pixel_position], 0);
}
}
return result;
}
Once we get our map of complex number ready to be put into our Fourier transform funtions we can take a look at how the function that computes one array of values first.
public static Complex[] Forward(Complex[] input, bool phaseShift = true)
{
var result = new Complex[input.Length];
var omega = (float)(-2.0 * Math.PI / input.Length);
if (input.Length == 1)
{
result[0] = input[0];
if (Complex.IsNaN(result[0]))
{
return new [] {new Complex(0, 0)};
}
return result;
}
var evenInput = new Complex[input.Length / 2];
var oddInput = new Complex[input.Length / 2];
for (int i = 0; i < input.Length / 2; i++)
{
evenInput[i] = input[2 * i];
oddInput[i] = input[2 * i + 1];
}
var even = Forward(evenInput, phaseShift);
var odd = Forward(oddInput, phaseShift);
for (int k = 0; k < input.Length / 2; k++)
{
int phase;
if (phaseShift)
{
phase = k - Size / 2;
}
else
{
phase = k;
}
odd[k] *= Complex.Polar(1, omega * phase);
}
for (int k = 0; k < input.Length / 2; k++)
{
result[k] = even[k] + odd[k];
result[k + input.Length / 2] = even[k] - odd[k];
}
return result;
}
The recursive steps the function takes, calculates the values as shown on the image below.
Now that we know how to compute one array of numbers, let’s go on to the whole image.
public static Complex[][] Forward(Bitmap image)
{
var p = new Complex[Size][];
var f = new Complex[Size][];
var t = new Complex[Size][];
var complexImage = ToComplex(image);
for (int l = 0; l < Size; l++)
{
p[l] = Forward(complexImage[l]);
}
for (int l = 0; l < Size; l++)
{
t[l] = new Complex[Size];
for (int k = 0; k < Size; k++)
{
t[l][k] = p[k][l];
}
f[l] = Forward(t[l]);
}
return f;
}
While the computation is considerably faster than DFT (Discrete Fourier Transform) version, it does have some constraints. The image that we’re trying to process need to be square. As you might have noticed in the functions above, we calculate array only of size “Size” as denoted above.
Not only that, each dimension needs to be of size 2n , with n being a positive integer.
But we can work around this constraint by using zero padding around the image to make it into a square. I’ve written a function that does exactly that.
public static Bitmap Padding(this Bitmap image)
{
int w = image.Width;
int h = image.Height;
int n = 0;
while (FFT.Size <= Math.Max(w,h))
{
FFT.Size = (int)Math.Pow(2, n);
if (FFT.Size == Math.Max(w,h))
{
break;
}
n++;
}
double horizontal_padding = FFT.Size - w;
double vertical_padding = FFT.Size - h;
int left_padding = (int)Math.Floor(horizontal_padding / 2);
int top_padding = (int)Math.Floor(vertical_padding / 2);
BitmapData image_data = image.LockBits(
new Rectangle(0, 0, w, h),
ImageLockMode.ReadOnly,
PixelFormat.Format32bppArgb);
int bytes = image_data.Stride * image_data.Height;
byte[] buffer = new byte[bytes];
Marshal.Copy(image_data.Scan0, buffer, 0, bytes);
image.UnlockBits(image_data);
Bitmap padded_image = new Bitmap(FFT.Size, FFT.Size);
BitmapData padded_data = padded_image.LockBits(
new Rectangle(0, 0, FFT.Size, FFT.Size),
ImageLockMode.WriteOnly,
PixelFormat.Format32bppArgb);
int padded_bytes = padded_data.Stride * padded_data.Height;
byte[] result = new byte[padded_bytes];
for (int i = 3; i < padded_bytes; i+=4)
{
result[i] = 255;
}
for (int y = 0; y < h; y++)
{
for (int x = 0; x < w; x++)
{
int image_position = y * image_data.Stride + x * 4;
int padding_position = y * padded_data.Stride + x * 4;
for (int i = 0; i < 3; i++)
{
result[padded_data.Stride * top_padding + 4 * left_padding + padding_position + i] = buffer[image_position + i];
}
}
}
Marshal.Copy(result, 0, padded_data.Scan0, padded_bytes);
padded_image.UnlockBits(padded_data);
return padded_image;
}
I’ve modified the code from zero padding guide on this blog if you want to check it out.
Let’s go through the steps the function above takes
For simplicity I’ve used grayscale filter on images, so the program calculates values from only one channel.
In the image below each pixel represents a separate frequency. But we can’t use the image itself to convert it back to spatial domain. To visualize the transform we need to get the magnitude or modulus of complex numbers in the map. We also need to use logarithmic transformation on it to make it visible.
Above we can see that formulas for forward and inverse transform are quite similar. So there’s a trick we can use to reuse forward transform function. If we use conjugate of complex numbers with forward transformation function and it’s result, we can use it to get the pixel values by getting their magnitude.
public static Complex[] Inverse(Complex[] input)
{
for (int i = 0; i < input.Length; i++)
{
input[i] = Complex.Conjugate(input[i]);
}
var transform = Forward(input, false);
for (int i = 0; i < input.Length; i++)
{
transform[i] = Complex.Conjugate(transform[i]);
}
return transform;
}
public static Bitmap Inverse(Complex[][] frequencies)
{
var p = new Complex[Size][];
var f = new Complex[Size][];
var t = new Complex[Size][];
Bitmap image = new Bitmap(Size, Size);
BitmapData image_data = image.LockBits(
new Rectangle(0, 0, Size, Size),
ImageLockMode.WriteOnly,
PixelFormat.Format32bppArgb);
int bytes = image_data.Stride * image_data.Height;
byte[] result = new byte[bytes];
for (int i = 0; i < Size; i++)
{
p[i] = Inverse(frequencies[i]);
}
for (int i = 0; i < Size; i++)
{
t[i] = new Complex[Size];
for (int j = 0; j < Size; j++)
{
t[i][j] = p[j][i] / (Size * Size);
}
f[i] = Inverse(t[i]);
}
for (int y = 0; y < Size; y++)
{
for (int x = 0; x < Size; x++)
{
int pixel_position = y * image_data.Stride + x * 4;
for (int i = 0; i < 3; i++)
{
result[pixel_position + i] = (byte)Complex.Modulus(f[x][y]);
}
result[pixel_position + 3] = 255;
}
}
Marshal.Copy(result, 0, image_data.Scan0, bytes);
image.UnlockBits(image_data);
return image;
}
I’ve tried to keep it short and , hopefully, understandable for you guys. Feel free to download the entire project and try it out yourself.
If you wish to support me and my work you could download it as an incentive. You will need to complete a survey to get the file.