The greatest Mathematicians, as Archimedes, Newton, and Gauss, always united theory and applications in equal measure. - Felix Klein

I remember, while taking a class in linear algebra, the professor, showing applications of linear algebra, introduced SVD Decomposition. It has applications in Image Compression, Search Engines, Pattern Extraction, Modal analysis, Numerical Weather Prediction, etc..  

What is SVD Decomposition?

Singular-value Decomposition (SVD) is the decomposition of a matrix $A$ into three matrices, $U$ $\Sigma$, and $V$. Where $V$ and $U$ is a square matrix and $\Sigma$ is a diagonal matrix whose values at the diagonal are called the singular values. These singular values are stored from greatest to least starting from the top left and ending at the bottom right. $U$ and $V$ are unitary, meaning that their columns are sets of orthonormal vectors and the product with their transpose is an identity matrix, a matrix whose diagonal values are 1 with all others being 0. And, their product results in $A$. So,

$$A=U\Sigma V^*$$
while $A\in R^{m\times n}$, $U\in R^{m\times m}$, $\Sigma\in R^{m\times n}$, and $V\in R^{n\times n}$. $U$ and $V$ can also be represented by rectagular matrices called thin matrices. These matrices can be interpreted as the rotation and the scaling of the matrix $A$.

How to calculate the SVD Decomposition

If we multiply $A$ with its conjugate transpose
\begin{align*}
A^* A&=(V \Sigma U^* )(U \Sigma V^* )=V( \Sigma^2) V^* \\
AA^* &=(U \Sigma V^* )(V \Sigma U^* )=V( \Sigma^2) U^*
\end{align*}
Meaning that, when compared to Eigenvalue decomposition, the singular values are the square roots of it's eigenvalue. So, we can find the SVD Decomposition by solving for the eigenvalues and vectors? Well, yes and no, while we can find the singular values and the $U$ and $V$ matrices afterward, the operation is not numerically stable since it involves squaring the matrix $A$.
A well known method is the Jacobi Algorithm which is implemented in the linear algebra library LAPACK, but, this algorithm, while robust, is comparatively slow. Another widely used algorithm is well the use of two seperate algorithms, one algorithm bidiagonalizes the matrix and the other does the final step to find the singular values. To bidiagonalize, Householder reflections are used to find the unitary matrices applied to the left and right resulting in the bidiagonal matrix, also called Golub-Kahan bidiagonalization. The, the singular values can be found using a variation of the QR decomposition (link). One of the fastest algorithms implemented by LAPACK, and the algorithm we will use, is a divide-and-conqure method of computing the SVD of the bidiagonal matrix (link). The outmost fastest algorithm out there is an algorithm with a complexity of $O(x^2)$ called, relatively robust representations (RRR) algorithm (link). In the program we create we will use the Eigen3 C++ library with the divide and conquer technique.
The Eigen3 Library uses householder transormations to bidiagonalize the matrix, splits the matrix into blocks, and uses the two-sided Jacobi R-SVD algorithm to decompose. Class Reference

Denoise

In the paper, by Rowayda A. Sadek, called "SVD Image Processing Applications: State of The Art, Contributions and Research Challenges", the author describes several properties of the SVD decomposition that we can use to filter noise. The property that we are going to use is the SVD architecture which according to the author, "The largest object components in an image found using the SVD generally correspond to eigenimages associated with the largest singular values (SV), while image noise corresponds to eigenimages associated with the SVs". So, we theoretically can reduce the noise of images by zeroing some of the smallest singular values in the singular value matrix $\Sigma$.


Code

I've written a quick C++ program to do this noise reduction (GitHub). I used the hunter package manager so, CMake will make be able to aquire the dependencies for the program and prepare the project for compilation.
The most important function in the program is contained in the source file denoise.cpp within the src folder whose name is Compute_Channel_Denoise

Eigen::MatrixXd Compute_Channel_Denoise(Eigen::MatrixXd a, int perc)
{
  // Compute SVD and store in svd a BDCSVD object
  Eigen::BDCSVD<Eigen::MatrixXd> svd(a, Eigen::ComputeThinU | Eigen::ComputeThinV);

  // Grab singular values
  auto singular = svd.singularValues();

  // Zero out perc% of the smallest singualar values
  auto size = singular.size();
  for (int i = static_cast<int>(size * (1.0 - static_cast<double>(perc)/100.0)); i < size; i++)
    singular(i) = 0.0e0;

  // recombine the decomposition
  return svd.matrixU() * singular.asDiagonal() * svd.matrixV().transpose();
}

The above is simple enough, in line 4, we create an Eigen::BDCSVD whose constructor paramters are the matrix to decompose and an unsigned integer denoting whether to compute the thin or full $U$ and $V$ matricies. I chose to construct the thin matrices to save memory. Line 7 saves the singular values as a vector where on line 10-12 we set the input percentage perc of the smallest singular values as 0. Finally, in line 15, we multipy the matrices back to together and return.
Depending on the number of channels the image has, whether it is RGB, RGBA, Gray scale, or Gray scale w/ alpha channel, we will run this multiple times. For example, if the image is of type RGB, or has a Red, Green, and Blue channel, we will run it 3 times, once for each channel. In C++11 the standard thread support library added the future class which, together with the async class allows for tasks to be executed asynchronously. So, while the main thread is taking care of the asynchronous threads are computing dealing with the other channels. The Eigen Library is pretty thread-safe within the new versions of C++ and, even, the algorithm to multiply matrices has been fully multi-threaded using OpenMP. For matrix multiplication of the Eigen Library to be multi-threaded all we have to do is tick the -fopenmp compiler argument and include it. here is a snippet of the Denoise function which uses the Compute_Channel_Denoise

int Denoise(pngreq& img, int perc)
{
  // Grab some info on images
  png_byte color_type = img.color_type;
  int w = img.width, h = img.height;
  // Check color_type
  if (color_type == PNG_COLOR_TYPE_RGB)
  {
    // Allocate Arrays that will store RGB data
    Eigen::MatrixXd R(w,h);
    Eigen::MatrixXd G(w,h);
    Eigen::MatrixXd B(w,h);

    // Get Image Values
    img.get_rgb_channels(R, G, B);

    // Pass each channel through Compute_Channel_Denoise
    std::future<Eigen::MatrixXd> newR_fut = std::async(Compute_Channel_Denoise, R, perc);
    std::future<Eigen::MatrixXd> newG_fut = std::async(Compute_Channel_Denoise, G, perc);
    Eigen::MatrixXd newB = Compute_Channel_Denoise(B, perc);

    Eigen::MatrixXd newR = newR_fut.get();
    Eigen::MatrixXd newG = newG_fut.get();

    // Set new Image Values
    img.set_rgb_channels(newR, newG, newB);
  }
  else if (color_type == PNG_COLOR_TYPE_RGBA)
  {
/*    \/\/ Code Continues \/\/    */

The input argument for the function is a class which imports the png file using the libpng library. It is stored in memory for manipulation as a pointer of type png_bytep which points to our image data. Anyway, in the above function we get some pertinent information on the image, including the color type, width, and height of the image. With that information we open an if elseif statement denoting what we are going to do if the image is of type RGB, RGBA, Gray Scale, or Gray Scale /w Alpha channel. In each of the if statements, we first create a couple matrices of size (width x height), each will store a channel of the image. By the way, by channel I mean, for example, for the green channel I mean all the green elements of the pixel. We use the function get_rgb_channels which, by reference, will add the image data to each of the matrices. On line 18 is where we take advantage of the new C++ feature by creating std::future objects which will run the R and G matrices through the function asynchronously. Use the function get() to make the main thread wait for the computations to complete and to return the new matrices. Finally, set the new values into the pngreq object using the set_rgb_channels function, by reference.
For each color type, RGB, RGBA, Grey Scale, and Grey Scale /w Alpha channel, we add an else if to the code to account for it. While I added the 4 types of color_types I assume each channel of the image is 8 bits, meaning that each sample is 8bits in size, while the png image type supports up to 16 bits per channel. So keep that in mind. I won't go over the pngreq class since it is mearly a class that serves as an interface with the libpng library, but thank you to Guillaume Cottenceau link.
For the main function, we declare a pngreq type, run it's member function read_png_file whose input parameter is the path to the png file as a char array, run the Denoise function, and, finally, write the image to the new png file path with write_png_file. Here is the code.

#include "pngreq.h"
#include "denoise.h"
#include <iostream>

int main(int argc, char* argv[])
{
  pngreq img;
  // Read PNG Image
  img.read_png_file(argv[1]);
  // DeNoise Image
  Denoise(img, std::stoi(argv[3]));
  // Write Image
  img.write_png_file(argv[2]);
  return 0;
}

Testing

Time to try out the program, I'll be using an image I captured of the open star cluster called Pleiades. I added some artificial noise using GIMP and turned the image into gray scale for faster computations. Here is a google drive link with the image [link]. I ran it twice, once with the 80% of the singular values zeroed and another with 90%. Here are the results.

Here is the terminal call

./SVD_Denoiser pleiades.png pleiades_80.png 80
./SVD_Denoiser pleidaes.png pleiades_90.png 90

SOURCE CODE