NeuralNetworkCPP — Neural Network from scratch in C++

May 2023
Time spent: ~30 h
View on GitHub
Tags:
Deep Learning, Neural Networks
Skills:
C++

NeuralNetworkCPP

This project is a small neural network library written in C++ “from scratch”, made mostly for learning. At the time, I was learning deep learning fundamentals and I also wanted a hands-on C++ project that forced me to implement the core math and training loop myself.

Why I built it

Most of the time, you can train a neural network without thinking too much about what happens under the hood.
Here, the goal was the opposite: implement the full pipeline (forward pass → loss → backprop → parameter update), and see how small choices (activation functions, learning rate, batch size, etc.) affect training.

This is intentionally a “learning library”, not a production-grade framework. It’s focused on clarity and experimentation.

How it works (tech + algo)

The library implements a basic feedforward network made of fully-connected layers, with weights and biases initialized randomly. Common activation functions (ReLU, Sigmoid, TanH) are supported and can be chosen per layer.
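
To make this concrete, here is a minimal sketch of a fully-connected layer's forward pass with a per-layer activation. The names are hypothetical, not the library's actual classes:

    #include <cmath>
    #include <cstddef>
    #include <functional>
    #include <vector>

    // Hypothetical names, just to show the idea:
    // one fully-connected layer computes out = activation(W * in + b).
    struct DenseLayer {
        std::vector<std::vector<double>> W;        // [outSize][inSize], randomly initialized
        std::vector<double> b;                     // [outSize], randomly initialized
        std::function<double(double)> activation;  // ReLU, Sigmoid, TanH... chosen per layer

        std::vector<double> forward(const std::vector<double>& in) const {
            std::vector<double> out(b.size());
            for (std::size_t i = 0; i < b.size(); ++i) {
                double z = b[i];
                for (std::size_t j = 0; j < in.size(); ++j)
                    z += W[i][j] * in[j];
                out[i] = activation(z);            // the real code also caches Z for backprop
            }
            return out;
        }
    };

    inline double relu(double x)    { return x > 0.0 ? x : 0.0; }
    inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }
    // std::tanh from <cmath> covers TanH.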

Training is classic gradient descent:

  • Forward propagation caches the intermediate values (the pre-activations Z and the activations A) so they can be reused during backprop.
  • Backpropagation computes gradients layer by layer and then applies the update step W = W − α·dW and b = b − α·db, where α is the learning rate (see the sketch after this list).
  • The loss used in the training loop is a log loss computed from outputs/targets (with a small epsilon for numerical stability).
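
As a sketch of those last two points (the function names are mine, not the library's), the update step and an epsilon-stabilized log loss can look like this:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Update step: W = W - alpha * dW and b = b - alpha * db,
    // applied once backprop has produced the gradients dW and db for a layer.
    void sgdStep(std::vector<std::vector<double>>& W, std::vector<double>& b,
                 const std::vector<std::vector<double>>& dW,
                 const std::vector<double>& db, double alpha) {
        for (std::size_t i = 0; i < W.size(); ++i)
            for (std::size_t j = 0; j < W[i].size(); ++j)
                W[i][j] -= alpha * dW[i][j];
        for (std::size_t i = 0; i < b.size(); ++i)
            b[i] -= alpha * db[i];
    }

    // One plausible form of the log loss for a single sample, with a small
    // epsilon so log() never sees exactly 0 or 1.
    double logLoss(const std::vector<double>& target, const std::vector<double>& output,
                   double eps = 1e-8) {
        double loss = 0.0;
        for (std::size_t i = 0; i < target.size(); ++i)
            loss -= target[i] * std::log(output[i] + eps)
                  + (1.0 - target[i]) * std::log(1.0 - output[i] + eps);
        return loss / target.size();
    }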

There’s also an optional dropout mode (with a keep probability) during training to help reduce overfitting. And for fun / performance testing, the project includes an optional CUDA path to accelerate matrix computations on NVIDIA GPUs.
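
The dropout part is conceptually tiny. Here is a sketch of the "inverted dropout" variant, which rescales the kept activations by 1/keepProb; whether the library rescales exactly this way is an assumption on my part:

    #include <random>
    #include <vector>

    // Keep each activation with probability keepProb, zero it otherwise.
    // The division by keepProb keeps the expected activation unchanged
    // (inverted dropout; the exact implementation may differ).
    void applyDropout(std::vector<double>& activations, double keepProb, std::mt19937& rng) {
        std::bernoulli_distribution keep(keepProb);
        for (double& a : activations)
            a = keep(rng) ? a / keepProb : 0.0;
    }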

Digit recognition demo (MNIST)

To validate that the library actually learns something, I built a digit recognition example using MNIST images.
The demo loads 55,000 training images and 5,000 test images, normalizes them, and builds one-hot targets for digits 0–9.
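
The preprocessing boils down to two tiny transformations; here is a rough sketch (types and names are illustrative, not the demo's actual code):

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // One raw MNIST image is 28x28 = 784 bytes with values in [0, 255];
    // normalization just scales it into [0, 1].
    std::vector<double> normalizePixels(const std::array<std::uint8_t, 784>& raw) {
        std::vector<double> x(raw.size());
        for (std::size_t i = 0; i < raw.size(); ++i)
            x[i] = raw[i] / 255.0;
        return x;
    }

    // A label becomes a one-hot vector: 1 at the true digit, 0 elsewhere.
    std::vector<double> oneHot(int digit) {
        std::vector<double> target(10, 0.0);  // digits 0-9
        target[digit] = 1.0;
        return target;
    }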

For this run, the network shape is a simple MLP: 784 → 128 → 128 → 128 → 10, trained with learning rate 0.1 for 10 epochs and batch size 512. Hidden layers use ReLU, and the output layer uses Sigmoid in this implementation. During training I enabled dropout with keepProb = 0.7.
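
Written out as plain constants (this is not the library's configuration API, just the numbers of that run made explicit):

    #include <vector>

    struct LayerSpec { int units; const char* activation; };

    const int inputSize = 784;                      // 28x28 pixels
    const std::vector<LayerSpec> hiddenAndOutput = {
        {128, "relu"}, {128, "relu"}, {128, "relu"}, {10, "sigmoid"}
    };
    const double learningRate = 0.1;
    const int    epochs       = 10;
    const int    batchSize    = 512;
    const double keepProb     = 0.7;                // dropout keep probability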

Training UI (terminal)

I also spent time on the terminal UX because it makes the project more “alive”:

  • A custom progress bar is used for dataset loading and for each epoch.
  • Loss values are displayed regularly during training iterations.

Training progress bar
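
For illustration, an in-place terminal progress bar only needs a carriage return to redraw itself; this is a bare-bones version of the idea, not the project's actual code:

    #include <cstddef>
    #include <cstdio>

    // Redraw a fixed-width bar in place by returning the cursor
    // to the start of the line with '\r'.
    void drawProgressBar(std::size_t current, std::size_t total, int width = 40) {
        double ratio = total ? static_cast<double>(current) / total : 1.0;
        int filled = static_cast<int>(ratio * width);
        std::printf("\r[");
        for (int i = 0; i < width; ++i)
            std::putchar(i < filled ? '#' : ' ');
        std::printf("] %3d%%", static_cast<int>(ratio * 100.0));
        std::fflush(stdout);
    }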

Accuracy graphs (terminal)

At the end of training, the demo prints:

  • Global accuracy on the test set and on the training set.
  • A per-digit bar graph, where each class has a green bar for correct predictions and a red bar for incorrect ones, plus counts next to the bars.

Test set per-digit accuracy graph
Training set per-digit accuracy graph
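
A row of such a graph can be drawn with ANSI color escapes; this is only a rough approximation of the project's output, with made-up names:

    #include <cstdio>
    #include <string>

    // One row: a green run of '#' for correct predictions, a red run for
    // incorrect ones, then the counts. `perChar` samples map to one character.
    void printClassBar(int digit, int correct, int wrong, int perChar = 20) {
        std::string green(correct / perChar, '#');
        std::string red(wrong / perChar, '#');
        std::printf("%d | \033[32m%s\033[31m%s\033[0m  %d/%d correct\n",
                    digit, green.c_str(), red.c_str(), correct, correct + wrong);
    }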

Results (10 epochs)

In this experiment, I got 94.32% accuracy on the test set and 95.65% on the training set.
A train/test gap of about 1.3 points suggests only mild overfitting: the model learns useful features and still generalizes reasonably well.
Also, considering this is a simple fully-connected network (no convolutions) and a “from scratch” training loop, this was a really satisfying result for only 10 epochs.

What I learned

This project forced me to understand backpropagation in a very concrete way (dimensions, broadcasting mistakes, gradient flow, and how easy it is to silently break training). It also made the performance side of ML feel real: memory layout, matrix multiplication cost, batching, and why libraries invest so much into optimized kernels (and why CUDA is a world on its own).

On the tooling side, building a CLI with visual feedback (progress bars + graphs) taught me how much usability matters, even for “just a terminal program”.

What I'd improve next

If I came back to it today, a few upgrades would be high on the list:

  • Switch the output to a more standard Softmax + cross-entropy setup for multi-class classification (a rough sketch follows this list).
  • Add better optimizers (Adam / RMSProp) and maybe a learning-rate schedule.
  • Add regularization options and better experiment tracking (metrics history export, confusion matrix, etc.).
  • Extend the library beyond dense layers (but staying simple), or compare against a small CNN to show the gap clearly.
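
For the first point, the Softmax + cross-entropy combination would look roughly like this (a sketch, not code from the repo), using the numerically stable form that subtracts the max logit before exponentiating:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    std::vector<double> softmax(const std::vector<double>& z) {
        const double zMax = *std::max_element(z.begin(), z.end());
        std::vector<double> p(z.size());
        double sum = 0.0;
        for (std::size_t i = 0; i < z.size(); ++i) {
            p[i] = std::exp(z[i] - zMax);
            sum += p[i];
        }
        for (double& pi : p) pi /= sum;
        return p;
    }

    // With a one-hot target, cross-entropy reduces to -log p[trueClass].
    double crossEntropy(const std::vector<double>& p, int trueClass, double eps = 1e-8) {
        return -std::log(p[trueClass] + eps);
    }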

If you want to explore the code, the repo is here: NeuralNetworkCPP on GitHub.