In this first tutorial of the Arduino Machine Learning series, we're going to implement the "Hello world" of machine learning projects: classifying the Iris dataset on an Arduino board. The Iris dataset is well known in the machine learning world and is often used in introductory tutorials on classification.
In this tutorial we're going to run the classification directly on an Arduino Nano board (old generation), equipped with 32 KB of flash and only 2 KB of RAM: that's the only thing you will need!
1. Features definition
There are 4 features in this dataset: sepal length, sepal width, petal length, petal width; and 3 classes: Setosa, Versicolor, Virginica. You can see in the picture below how they relate to the actual flower.
2. Sample data
You may download the dataset here.
An excerpt of the dataset is reported in the following table.
sepal.length | sepal.width | petal.length | petal.width | variety |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | Setosa |
4.9 | 3 | 1.4 | 0.2 | Setosa |
4.6 | 3.1 | 1.5 | 0.2 | Setosa |
5.8 | 2.6 | 4 | 1.2 | Versicolor |
5.6 | 2.7 | 4.2 | 1.3 | Versicolor |
6.3 | 2.5 | 5 | 1.9 | Virginica |
6.5 | 3 | 5.2 | 2 | Virginica |
5.9 | 3 | 5.1 | 1.8 | Virginica |
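The downloaded dataset is a single table with a variety column, while the training script in the next section expects one file per class with one feature vector per line. A minimal sketch of that conversion, assuming the download is a CSV file whose header matches the table above; the function name `split_by_class` and the file names are hypothetical, not part of the original project:

```python
import csv
import os
from collections import defaultdict


def split_by_class(csv_path, out_dir, label_column="variety"):
    """Write one headerless CSV per class, one feature vector per line.

    Assumes csv_path has a header row and that label_column holds the class
    name (as in the excerpt above). Returns the number of rows per class.
    """
    rows_by_class = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            label = row.pop(label_column)
            # remaining values keep the column order of the source file
            rows_by_class[label].append(list(row.values()))

    os.makedirs(out_dir, exist_ok=True)
    for label, rows in rows_by_class.items():
        out_path = os.path.join(out_dir, label + ".csv")
        with open(out_path, "w", newline="") as f:
            csv.writer(f).writerows(rows)

    return {label: len(rows) for label, rows in rows_by_class.items()}


# Hypothetical usage, with iris.csv saved next to the script:
# split_by_class("iris.csv", "dataset")
```

Each output file is named after its class, so the class name can later be recovered from the file name.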
A contour plot of this dataset is depicted in the image below.
3. Train and export the classifier
For a detailed guide refer to the tutorial
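from sklearn.ensemble import RandomForestClassifier
from micromlgen import port

# load_features is not part of scikit-learn or micromlgen: it is assumed to be
# the helper from the tutorial referenced above and must be defined first
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
# the last column holds the class index, the other columns hold the features
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
# convert the trained classifier to plain C code
c_code = port(classifier, classmap=classmap)
print(c_code)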
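The training script calls `load_features` without defining it. A minimal sketch of such a helper, assuming NumPy is installed and that the folder contains one headerless CSV file per class; this is an illustration of what the comments describe, not necessarily the author's exact implementation:

```python
import os
from glob import glob

import numpy as np


def load_features(folder):
    """Load every CSV file in folder.

    Returns (dataset, classmap): dataset has one row per sample, with the
    class index appended as the last column; classmap maps each class index
    to the file name without its extension.
    """
    blocks = []
    classmap = {}
    filenames = sorted(glob(os.path.join(folder, "*.csv")))
    for class_idx, filename in enumerate(filenames):
        class_name = os.path.splitext(os.path.basename(filename))[0]
        classmap[class_idx] = class_name
        # ndmin=2 keeps a 2D shape even when a file holds a single sample
        samples = np.loadtxt(filename, dtype=float, delimiter=",", ndmin=2)
        labels = np.full((len(samples), 1), class_idx, dtype=float)
        blocks.append(np.hstack((samples, labels)))
    # np.vstack raises ValueError if the folder contains no CSV files
    return np.vstack(blocks), classmap
```

Sorting the file names makes the class indices reproducible across runs, which matters because the same `classmap` is passed to `port`.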
At this point you have to copy the printed code and import it in your Arduino project, in a file called iris.h (the name the sketch in the next section includes).
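Instead of copying the printed code by hand, the generated string can be written straight into the sketch folder. A small sketch of that step; the helper name `save_header` and the default file name are assumptions, and the file name has to match the `#include` line of the Arduino sketch:

```python
from pathlib import Path


def save_header(c_code, sketch_dir, filename="iris.h"):
    """Write the generated C code into the sketch folder and return its path.

    filename is an assumption: use whatever name the sketch includes.
    """
    path = Path(sketch_dir) / filename
    path.write_text(c_code)
    return path


# Hypothetical usage, after c_code = port(classifier, classmap=classmap):
# save_header(c_code, "path/to/your/sketch")
```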
4. Run the inference
We will run the inference on features entered via the Serial monitor: you type 4 comma-separated float values representing the 4 features and get back the predicted Iris species.
#include "iris.h"

void setup() {
    Serial.begin(115200);
}

void loop() {
    if (Serial.available()) {
        double features[4];

        for (int i = 0; i < 4; i++) {
            // split features on comma (,)
            String feature = Serial.readStringUntil(',');
            features[i] = atof(feature.c_str());
        }

        Serial.print("Detected species: ");
        Serial.println(classIdxToName(predict(features)));
    }

    delay(10);
}
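The sketch above reads each feature up to the next comma. `readStringUntil` returns when it finds the terminator or when the stream timeout expires (1 second by default), so a line without a comma after the fourth value makes the last read wait for that timeout. A small host-side helper, under the assumption that you want to build the line to type or send; the name `format_features` is hypothetical:

```python
def format_features(values, expected_count=4):
    """Build the comma-terminated line the sketch expects.

    Every value, including the last one, is followed by a comma so that
    each read on the board can return as soon as its terminator arrives.
    """
    if len(values) != expected_count:
        raise ValueError(f"expected {expected_count} features, got {len(values)}")
    return "".join(f"{value:g}," for value in values)


print(format_features([5.1, 3.5, 1.4, 0.2]))  # 5.1,3.5,1.4,0.2,
```

If the Serial monitor appends a line ending after the trailing comma, those extra characters stay in the buffer and may be parsed as a new, empty set of features, so sending with no line ending is the safer choice in that case.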
If you open the Serial monitor you should see something like the next picture as you type in the features from different species.
That’s it: you deployed machine learning in 2 KB!
Project figures
On my machine, the sketch targeted at the Arduino Nano (old generation) requires 7446 bytes (24%) of program space and 302 bytes (14%) of RAM. This means you could actually run machine learning in even less space than what the Arduino Nano provides. So, the answer to the question "Can I run machine learning on Arduino?" is definitely YES.
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on GitHub
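The percentages can be checked with a quick calculation. The limits used here are assumptions that are not stated above: 30720 bytes of usable program space (the 32 KB of flash minus the bootloader) and 2048 bytes of RAM, with the result truncated to a whole percent as the reported figures suggest:

```python
# Assumed limits for the Arduino Nano (ATmega328P); not stated in the article.
FLASH_LIMIT_BYTES = 30720  # 32 KB of flash minus the bootloader (assumption)
RAM_LIMIT_BYTES = 2048     # 2 KB of RAM


def usage_percent(used_bytes, limit_bytes):
    """Whole-number percentage, truncated rather than rounded (assumption)."""
    return used_bytes * 100 // limit_bytes


print(usage_percent(7446, FLASH_LIMIT_BYTES))  # 24
print(usage_percent(302, RAM_LIMIT_BYTES))     # 14
```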