Above image: The MAGIC Telescope at night, by Robert Wagner [Attribution], via Wikimedia Commons

As I was going through the catalog of the UCI Machine Learning Repository, a certain data set caught my eye: the MAGIC Gamma Telescope Data Set. On closer inspection, it turned out to be a data set created via Monte Carlo simulation of what a gamma telescope would record.

# Introduction

After some research, I found that these gamma telescopes are called Imaging Atmospheric Cherenkov Telescopes (IACTs). In all my time in university and after, I had not heard of these ground-based telescopes that use Cherenkov radiation to detect massively powerful gamma rays from outer space. To understand these telescopes, it is important to understand Cherenkov radiation.

Back then, I asked myself, "Isn't light, and light-like things, the fastest thing in the universe?" That is still true in a vacuum, but as light travels through matter it interacts with the atoms along the way. A simplified picture is that it is absorbed and re-emitted with a small time delay each time. Between the atoms it still travels at $c$, but the delays lower the speed at which the wave advances through the material; this reduced speed is the phase velocity $v$. We can use the refractive index $n$ of the material to calculate it. $$n=\frac{c}{v}\rightarrow v=\frac{c}{n}$$ In the case of water, the refractive index is $1.333$, resulting in a phase velocity of $\frac{c}{1.333}\approx 0.75c$, or about $2.25\times 10^8\frac{m}{s}$. In the case of air, the refractive index is $1.00027717$, resulting in a phase velocity of about $2.99709\times 10^8\frac{m}{s}$, roughly $83070\frac{m}{s}$ slower than the speed of light in a vacuum.

Cherenkov radiation is the glow emitted when charged particles pass through a dielectric medium faster than the phase velocity of light in that medium. One way it was explained to me in an optics class is that Cherenkov radiation results from a process similar to a sonic boom, where an object traveling faster than the speed of sound creates a shock wave. (link)

# Detecting High-energy Gamma Rays w/ IACT

A gamma ray with an energy of around $30\,GeV$ interacts with the atmosphere and generates a large number of secondary particles that travel close to the speed of light. They are so fast that they exceed the phase velocity of light in air, about $2.99709\times 10^8\frac{m}{s}$, and therefore generate ultraviolet Cherenkov radiation. This light is collected by the telescopes and traced back to the object the gamma ray originated from.
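
To get a feel for how fast "faster than light in air" is, here is a rough sketch that turns the condition $v > \frac{c}{n}$ into a minimum energy using the standard relativistic relation $E=\gamma m c^2$ with $\gamma = 1/\sqrt{1-\beta^2}$. The function names are my own for this illustration, the electron rest energy of about $0.511\,MeV$ is a textbook value, and the refractive index is the sea-level figure used earlier (it is lower at the altitudes where the showers develop, so treat the result as a ballpark).

```cpp
#include <cmath>

// Minimum speed, as a fraction of c, needed for Cherenkov emission in a
// medium with refractive index n (the particle must beat v = c / n).
double threshold_beta(double n) {
    return 1.0 / n;
}

// Lorentz factor of a particle moving at exactly the threshold speed.
double threshold_gamma(double n) {
    const double beta = threshold_beta(n);
    return 1.0 / std::sqrt(1.0 - beta * beta);
}

// Minimum total energy (in MeV) for a particle with the given rest energy
// (in MeV) to emit Cherenkov light in the medium.
double threshold_energy_mev(double n, double rest_energy_mev) {
    return threshold_gamma(n) * rest_energy_mev;
}
```

For an electron, `threshold_energy_mev(1.00027717, 0.511)` comes out at about $21.7\,MeV$, which is tiny compared to the energy of the gamma ray that starts the shower.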

# The Data

The data was generated by Monte Carlo simulation using software called CORSIKA. The generated data includes the lengths of the major and minor axes in millimeters, the $\log_{10}$ of the sum of all the pixels, the ratio of the sum of the two highest pixels to the $\log_{10}$ of the sum of all the pixels, etc. You can see the 10 features recorded at this link. Each sample is classified as either background or the signal of a gamma ray. I have modified the data so that it can be used with the C++ Neural Network Library that I am working on. (Link to Modified Data)

In the modified data file, the first 10 columns contain the input data and the last two contain the classification data. For a gamma-ray sample, the first classification column is 1 and the second is 0. For the background samples the opposite is true.
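
To make the column layout concrete, here is a small sketch that splits one row into its 10 inputs and 2 classification values. It assumes the file is plain comma-separated numbers with no header; `Sample`, `parse_row`, and `is_gamma` are hypothetical helpers written for this post and are not how CppNNet's CSV_Importer works internally.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Sample {
    std::vector<double> inputs;   // the image features (first n_in columns)
    std::vector<double> targets;  // {1, 0} for gamma, {0, 1} for background
};

// Splits one comma-separated row into n_in inputs followed by n_out targets.
// Returns false if the row does not hold exactly n_in + n_out numeric fields.
bool parse_row(const std::string& line, std::size_t n_in, std::size_t n_out,
               Sample& out) {
    std::vector<double> values;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        try {
            values.push_back(std::stod(field));
        } catch (const std::logic_error&) {  // invalid_argument or out_of_range
            return false;
        }
    }
    if (values.size() != n_in + n_out) return false;
    const auto split = values.begin() + static_cast<std::ptrdiff_t>(n_in);
    out.inputs.assign(values.begin(), split);
    out.targets.assign(split, values.end());
    return true;
}

// A sample is a gamma ray when the first classification column is the larger.
bool is_gamma(const Sample& s) {
    return s.targets.size() == 2 && s.targets[0] > s.targets[1];
}
```

With `n_in = 10` and `n_out = 2`, a row ending in `1,0` is reported as a gamma ray and a row ending in `0,1` as background.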

# The Code

As I said earlier, I used a C++ Neural Network Library that I am currently working on called CppNNet. It makes creating and training a neural network simpler, and I am always working to make it simpler, easier to use, and better optimized. You can clone and build the library from its GitHub repository page (link).

### Ubuntu build instructions

sudo apt install build-essential git cmake
git clone https://github.com/anhydrous99/CppNNet
cd CppNNet && mkdir build && cd build
cmake ..
make


I have included in the repository a version of the network with fewer layers for faster training; you can modify it to match what I describe here. (link)

### Magic Network

Let's start by including the following headers in the main source file.

#include "Normalizer.h"
#include "Neural_Layer.h"
#include "Neural_Trainer.h"
#include "CSV_Importer.h"
#include <iostream>
#include <chrono>
#include <memory>  // std::shared_ptr, std::make_shared
#include <vector>  // std::vector


Normalizer.h contains a class that stretches or shrinks the range of the samples to accelerate training. Neural_Layer.h defines the class that holds the weights and biases of a layer. Neural_Trainer.h, as the name implies, trains the network. CSV_Importer.h imports the samples from CSV files.

In the main function, let's declare the number of inputs and outputs of the network.

using namespace CppNNet;
int main(int argc, char* argv[]) {
    // Number of inputs
    int inp = 10;
    // Number of outputs
    int out = 2;

    // Next code
}


We then create the layers. I used 4 layers: the first has 40 neurons, the next two each halve the neuron count of the previous layer (20 and 10), and the last is the output layer with 2 neurons. I used the ReLU activation function to avoid the vanishing gradient problem. The library makes extensive use of managed pointers called shared pointers, so each Neural_Layer object is held in a std::shared_ptr, and every layer after the first takes the previous layer as a constructor argument.

// Create Layers
std::shared_ptr<Neural_Layer> layer1 = std::make_shared<Neural_Layer>(40, inp, activation_function::ReLU);
std::shared_ptr<Neural_Layer> layer2 = std::make_shared<Neural_Layer>(20, 40, layer1, activation_function::ReLU);
std::shared_ptr<Neural_Layer> layer3 = std::make_shared<Neural_Layer>(10, 20, layer2, activation_function::ReLU);
std::shared_ptr<Neural_Layer> layer4 = std::make_shared<Neural_Layer>(out, 10, layer3);
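
For readers who want to see what a single layer computes, here is a minimal sketch of a fully connected layer's forward pass with an optional ReLU. It uses plain std::vector instead of the library's types, and `dense_forward` and `relu` are illustrative names of my own; CppNNet's actual implementation may look quite different.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // weights[neuron][input]

// ReLU passes positive values through and clamps negative values to zero.
double relu(double x) {
    return std::max(0.0, x);
}

// Computes y[j] = f(b[j] + sum_i W[j][i] * x[i]) for every neuron j,
// where f is ReLU when apply_relu is true and the identity otherwise.
// Assumes every row of weights has x.size() entries.
Vec dense_forward(const Mat& weights, const Vec& biases, const Vec& x,
                  bool apply_relu) {
    Vec y(weights.size());
    for (std::size_t j = 0; j < weights.size(); ++j) {
        double sum = biases[j];
        for (std::size_t i = 0; i < x.size(); ++i) {
            sum += weights[j][i] * x[i];
        }
        y[j] = apply_relu ? relu(sum) : sum;
    }
    return y;
}
```

Chaining four such calls with weight matrices of shape 40×10, 20×40, 10×20, and 2×10 would mirror the shape of the network above.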


As I mentioned, I am hosting the modified data in an S3 bucket. The library uses libcurl to download the CSV files automatically, and a CSV_Importer object imports them.

// Import Data
// Note: path must already hold the location of the modified data file
CSV_Importer imp(path, inp, out);
std::vector<Evector> samples = imp.GetSamples();
std::vector<Evector> targets = imp.GetTargets();


We then normalize the data so that all of it fits within the range $[-1, 1]$.

// Normalize Data
Normalizer samplen(samples, -1, 1);
std::vector<Evector> normed_samples = samplen.get_batch_norm(samples);


Create a trainer object with a learning rate of 0.0001.

// Create Trainer
Neural_Trainer trainer(layer4, 0.0001);


Train for 250 epochs, that is, 250 passes over the entire dataset, while recording the start and end times to measure the training time.

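// Train
std::cout << "Starting to Train" << std::endl;
// Record the start time (steady_clock is suited to measuring intervals)
auto start1 = std::chrono::steady_clock::now();
for (int i = 0, sizei = 250; i < sizei; i++) {
trainer.train_minibatch(normed_samples, targets, 250);
}
// Record the end time
auto end1 = std::chrono::steady_clock::now();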
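
In mini-batch training in general, each pass over the data is cut into small chunks and the weights are updated once per chunk. The sketch below shows only that chunking step. It assumes a fixed batch size with a shorter final batch, and `batch_ranges` is a hypothetical helper; I am not describing what `train_minibatch` does internally or what its third argument means.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Splits the indices [0, n) into consecutive half-open ranges [begin, end)
// of at most batch_size elements each. The last range may be shorter.
// Returns an empty list when batch_size is zero.
std::vector<std::pair<std::size_t, std::size_t>> batch_ranges(
    std::size_t n, std::size_t batch_size) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (batch_size == 0) return ranges;
    for (std::size_t begin = 0; begin < n; begin += batch_size) {
        ranges.emplace_back(begin, std::min(begin + batch_size, n));
    }
    return ranges;
}
```

For example, 1010 samples with a batch size of 250 give five batches, the last one holding only 10 samples.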


Finally, compute the performance metrics.

// Calculate error
float mse = layer4->mse(normed_samples, targets);
float rmse = layer4->rmse(normed_samples, targets);


Display them along with the training time, and return.

std::cout << "MSE:    " << mse << std::endl;
std::cout << "RMSE:   " << rmse << std::endl;
std::cout << "It took " << std::chrono::duration_cast<std::chrono::seconds>(end1 - start1).count()
<< " seconds"
<< std::endl;
return 0;


Here is the output in the terminal:

/mnt/c/Users/arman/CLionProjects/CppNNet/cmake-build-debug/tests/MAGIC_Net_Test