In this first tutorial of the Arduino Machine learning series, we're going to implement the "Hello world" of machine learning projects: classifying the Iris dataset on an Arduino board. The Iris dataset is well known in the machine learning world and is often used in introductory tutorials about classification.
In this tutorial we're going to run the classification directly on an Arduino Nano board (old generation), equipped with 32 KB of flash and only 2 KB of RAM: that's the only thing you will need!

1. Features definition

There are 4 features in this dataset: sepal length, sepal width, petal length, petal width; and 3 classes: Setosa, Versicolor, Virginica. You can see in the picture below how they relate to the actual flower.

Iris features illustrated @ credits to https://gallery.azure.ai/Experiment/Classify-Iris-Dataset-using-Decision-Forest-1

2. Sample data

You may download the dataset here.
An excerpt of the dataset is reported in the following table.

sepal.length  sepal.width  petal.length  petal.width  variety
5.1           3.5          1.4           0.2          Setosa
4.9           3            1.4           0.2          Setosa
4.6           3.1          1.5           0.2          Setosa
5.8           2.6          4             1.2          Versicolor
5.6           2.7          4.2           1.3          Versicolor
6.3           2.5          5             1.9          Virginica
6.5           3            5.2           2            Virginica
5.9           3            5.1           1.8          Virginica

A contour plot of this dataset is depicted in the image below.

Decision boundaries of 2 PCA components of Iris features

3. Train and export the classifier

For a detailed guide refer to the tutorial

from sklearn.ensemble import RandomForestClassifier
from micromlgen import port

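# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
# load_features is a helper you have to define yourself (it is not part of
# scikit-learn or micromlgen); it is assumed to return a NumPy array with the
# class index in the last column, plus a map from class index to class name
features, classmap = load_features('dataset/')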
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)

At this point, copy the printed code into a new file in your Arduino project. The sketch below includes that file as iris.h, so save it under that name.
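
To see why a tree-based model can fit in a few kilobytes, it helps to remember that a decision tree is just a chain of comparisons. The block below is a hand-written, single-tree toy in plain C++, not the output of micromlgen: the names toyPredict, toyClassName and toySelfCheck are made up for this illustration, and the two thresholds are illustrative values that happen to separate the rows in the excerpt above. A forest combines the answers of many such trees.

```cpp
#include <cassert>
#include <cstring>

// Feature order assumed here: sepal length, sepal width, petal length, petal width.
// Hypothetical toy tree, NOT micromlgen output; the thresholds are illustrative.
int toyPredict(const double *x) {
    if (x[2] < 2.45) {   // petal length: short petals
        return 0;        // Setosa
    }
    if (x[3] < 1.75) {   // petal width: narrow petals
        return 1;        // Versicolor
    }
    return 2;            // Virginica
}

// Hypothetical helper: map a class index back to a readable name.
const char *toyClassName(int idx) {
    static const char *const names[] = {"Setosa", "Versicolor", "Virginica"};
    return (idx >= 0 && idx < 3) ? names[idx] : "Unknown";
}

// Self-check against three rows taken from the dataset excerpt.
void toySelfCheck() {
    const double setosa[4]     = {5.1, 3.5, 1.4, 0.2};
    const double versicolor[4] = {5.8, 2.6, 4.0, 1.2};
    const double virginica[4]  = {6.3, 2.5, 5.0, 1.9};
    assert(std::strcmp(toyClassName(toyPredict(setosa)), "Setosa") == 0);
    assert(std::strcmp(toyClassName(toyPredict(versicolor)), "Versicolor") == 0);
    assert(std::strcmp(toyClassName(toyPredict(virginica)), "Virginica") == 0);
}
```

The real exported classifier is larger, but it is built from the same kind of branches, which is why it needs so little RAM: there are no weights to keep in memory, only code in flash.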

4. Run the inference

We will be running the inferences from the features entered via Serial monitor: you type 4 float values representing the 4 features and get back the predicted Iris species.

#include "iris.h"

void setup() {
    Serial.begin(115200);
}

void loop() {
    if (Serial.available()) {
        double features[4];

        for (int i = 0; i < 4; i++) {
            // features are separated by commas (,); the last value has no
            // trailing comma, so it is returned once the read times out
            String feature = Serial.readStringUntil(',');

            features[i] = atof(feature.c_str());
        }

        Serial.print("Detected species: ");
        Serial.println(classIdxToName(predict(features)));
    }

    delay(10);
}
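
The comma-splitting step can be tried off-board too. The helper below is a sketch in standard C++ (no Arduino String), assuming one sample arrives as a single line such as 5.1,3.5,1.4,0.2; parseFeatures is a hypothetical name, not part of the project. Unlike the loop above, it reports how many values it actually found, so a short or malformed line can be rejected instead of classified.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: parse up to maxCount comma-separated numbers from line
// into out. Returns how many values were successfully parsed.
int parseFeatures(const char *line, double *out, int maxCount) {
    assert(line != nullptr && out != nullptr);
    int count = 0;
    const char *cursor = line;
    while (count < maxCount) {
        char *end = nullptr;
        // strtod skips leading whitespace and stops at the first
        // character that cannot be part of a number
        double value = std::strtod(cursor, &end);
        if (end == cursor) {
            break;  // nothing numeric at this position
        }
        out[count++] = value;
        while (*end == ' ') {
            end++;  // tolerate spaces before the separator
        }
        if (*end != ',') {
            break;  // end of line (or an unexpected character)
        }
        cursor = end + 1;  // continue after the comma
    }
    return count;
}
```

A caller would run the classifier only when the function returns 4, and print an error otherwise.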

If you open the Serial monitor you should see something like the next picture as you type in the features from different species.

Iris classification serial output

That’s it: you deployed machine learning in 2 KB!

Project figures

On my machine, the sketch targeted at the Arduino Nano (old generation) requires 7446 bytes (24%) of program space and 302 bytes (14%) of RAM. This means you could actually run machine learning in even less space than what the Arduino Nano provides. So, the answer to the question "Can I run machine learning on Arduino?" is definitely YES.
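
The percentages can be reproduced with a little arithmetic. This sketch assumes the limits the Arduino IDE uses for a Nano with an ATmega328P: 30720 bytes of program space available to sketches (the 32 KB of flash minus the bootloader) and 2048 bytes of RAM. Those two constants are assumptions that are not stated in this post, and percentUsed is a hypothetical helper; integer division is used because it matches the rounded-down figures quoted above.

```cpp
// Assumed limits for the Arduino Nano (ATmega328P); not taken from this post.
constexpr long kMaxProgramBytes = 30720;  // flash available to sketches
constexpr long kMaxRamBytes     = 2048;   // SRAM

// Percentage of total taken by used, rounded down (integer division).
constexpr long percentUsed(long used, long total) {
    return used * 100 / total;
}

// Figures reported in this post, checked at compile time.
static_assert(percentUsed(7446, kMaxProgramBytes) == 24, "program space");
static_assert(percentUsed(302, kMaxRamBytes) == 14, "RAM");

// What is left for the rest of the sketch under the same assumptions.
constexpr long kFreeProgramBytes = kMaxProgramBytes - 7446;  // 23274 bytes
constexpr long kFreeRamBytes     = kMaxRamBytes - 302;       // 1746 bytes
```

Keep in mind that the RAM figure only counts statically allocated memory; local variables and the stack use some more while the sketch runs.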

Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.



Check the full project code on GitHub