In this demo project we're going to take a known dataset (iris flowers) and interactively train an SVM classifier on it, adjusting the number of samples to see the effects on training time, inference time and accuracy.
#ifdef ESP32
#define min(a, b) (a) < (b) ? (a) : (b)
#define max(a, b) (a) > (b) ? (a) : (b)
#define abs(x) ((x) > 0 ? (x) : -(x))
#endif
#include <EloquentSVMSMO.h>
#include "iris.h"
#define TOTAL_SAMPLES (POSITIVE_SAMPLES + NEGATIVE_SAMPLES)
using namespace Eloquent::ML;
float X_train[TOTAL_SAMPLES][FEATURES_DIM];
float X_test[TOTAL_SAMPLES][FEATURES_DIM];
int y_train[TOTAL_SAMPLES];
int y_test[TOTAL_SAMPLES];
SVMSMO<FEATURES_DIM> classifier(linearKernel);
First of all we need to include a couple of files: EloquentSVMSMO.h for the SVM classifier and iris.h for the dataset.
iris.h defines a couple of constants:

- FEATURES_DIM: the number of features each sample has (4 in this case)
- POSITIVE_SAMPLES: the number of samples that belong to the positive class (50)
- NEGATIVE_SAMPLES: the number of samples that belong to the negative class (50)

Then we declare the arrays that hold the data: X_train and y_train for the training process, X_test and y_test for the inference process.
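For reference, this is roughly how iris.h is expected to be laid out: the constants listed above plus one 2D array per class, which the loadDataset() function shown later copies from. The sample values below are the first rows of the actual iris dataset, but which species plays the role of the positive class here is an assumption on my part; check the file in the Github repo for the real thing.
#pragma once

#define FEATURES_DIM 4
#define POSITIVE_SAMPLES 50
#define NEGATIVE_SAMPLES 50

// each row is: sepal length, sepal width, petal length, petal width
float X_positive[POSITIVE_SAMPLES][FEATURES_DIM] = {
    {5.1, 3.5, 1.4, 0.2},
    {4.9, 3.0, 1.4, 0.2},
    // ... 48 more samples of the positive class
};

float X_negative[NEGATIVE_SAMPLES][FEATURES_DIM] = {
    {7.0, 3.2, 4.7, 1.4},
    {6.4, 3.2, 4.5, 1.5},
    // ... 48 more samples of the negative class
};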
void setup() {
Serial.begin(115200);
delay(5000);
// configure classifier
classifier.setC(5);
classifier.setTol(1e-5);
classifier.setMaxIter(10000);
}
Here we just set a few parameters for the classifier. You could actually skip this step in this demo, since the defaults will work well. Those lines are there so you know you can tweak them, if needed.
Please refer to the demo for color classification for an explanation of each parameter.
void loop() {
int positiveSamples = readSerialNumber("How many positive samples will you use for training? ", POSITIVE_SAMPLES);
if (positiveSamples > POSITIVE_SAMPLES - 1) {
Serial.println("Too many positive samples entered. All but one will be used instead");
positiveSamples = POSITIVE_SAMPLES - 1;
}
int negativeSamples = readSerialNumber("How many negative samples will you use for training? ", NEGATIVE_SAMPLES);
if (negativeSamples > NEGATIVE_SAMPLES - 1) {
Serial.println("Too many negative samples entered. All but one will be used instead");
negativeSamples = NEGATIVE_SAMPLES - 1;
}
loadDataset(positiveSamples, negativeSamples);
// ...
}
/**
* Ask the user to enter a numeric value
*/
int readSerialNumber(String prompt, int maxAllowed) {
Serial.print(prompt);
Serial.print(" (");
Serial.print(maxAllowed);
Serial.print(" max) ");
while (!Serial.available()) delay(1);
int n = Serial.readStringUntil('\n').toInt();
Serial.println(n);
return n;
}
/**
* Divide training and test data
*/
void loadDataset(int positiveSamples, int negativeSamples) {
int positiveTestSamples = POSITIVE_SAMPLES - positiveSamples;
for (int i = 0; i < positiveSamples; i++) {
memcpy(X_train[i], X_positive[i], FEATURES_DIM * sizeof(float));
y_train[i] = 1;
}
for (int i = 0; i < negativeSamples; i++) {
memcpy(X_train[i + positiveSamples], X_negative[i], FEATURES_DIM * sizeof(float));
y_train[i + positiveSamples] = -1;
}
for (int i = 0; i < positiveTestSamples; i++) {
memcpy(X_test[i], X_positive[i + positiveSamples], FEATURES_DIM * sizeof(float));
y_test[i] = 1;
}
for (int i = 0; i < NEGATIVE_SAMPLES - negativeSamples; i++) {
memcpy(X_test[i + positiveTestSamples], X_negative[i + negativeSamples], FEATURES_DIM * sizeof(float));
y_test[i + positiveTestSamples] = -1;
}
}
The code above is a preliminary step where you're asked to enter how many samples of each class (positive and negative) you will use for training.
This way you can run the benchmark multiple times without the need to re-compile and re-upload the sketch.
It also shows that the training process can be "dynamic", in the sense that you can tweak it at runtime as per your needs.
time_t start = millis();
classifier.fit(X_train, y_train, positiveSamples + negativeSamples);
Serial.print("It took ");
Serial.print(millis() - start);
Serial.print("ms to train on ");
Serial.print(positiveSamples + negativeSamples);
Serial.println(" samples");
Training is actually a one line operation. Here we also log how much time it takes to train.
void loop() {
// ...
int tp = 0;
int tn = 0;
int fp = 0;
int fn = 0;
start = millis();
for (int i = 0; i < TOTAL_SAMPLES - positiveSamples - negativeSamples; i++) {
int y_pred = classifier.predict(X_train, X_test[i]);
int y_true = y_test[i];
if (y_pred == y_true && y_pred == 1) tp += 1;
if (y_pred == y_true && y_pred == -1) tn += 1;
if (y_pred != y_true && y_pred == 1) fp += 1;
if (y_pred != y_true && y_pred == -1) fn += 1;
}
Serial.print("It took ");
Serial.print(millis() - start);
Serial.print("ms to test on ");
Serial.print(TOTAL_SAMPLES - positiveSamples - negativeSamples);
Serial.println(" samples");
printConfusionMatrix(tp, tn, fp, fn);
}
/**
* Dump confusion matrix to Serial monitor
*/
void printConfusionMatrix(int tp, int tn, int fp, int fn) {
Serial.print("Overall accuracy ");
Serial.print(100.0 * (tp + tn) / (tp + tn + fp + fn));
Serial.println("%");
Serial.println("Confusion matrix");
Serial.print(" | Predicted 1 | Predicted -1 |\n");
Serial.print("----------------------------------------\n");
Serial.print("Actual 1 | ");
Serial.print(tp);
Serial.print(" | ");
Serial.print(fn);
Serial.print(" |\n");
Serial.print("----------------------------------------\n");
Serial.print("Actual -1 | ");
Serial.print(fp);
Serial.print(" | ");
Serial.print(tn);
Serial.print(" |\n");
Serial.print("----------------------------------------\n\n\n");
}
Finally we can run the classification on our test set and get the overall accuracy.
We also print the confusion matrix to double-check the per-class accuracy.
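If you want per-class figures in numeric form too, you can derive precision and recall for the positive class from the same four counters; a small optional addition (not part of the original sketch):

void printPrecisionRecall(int tp, int tn, int fp, int fn) {
    // precision: how many of the predicted positives were actually positive
    float precision = (tp + fp) > 0 ? 100.0 * tp / (tp + fp) : 0;
    // recall: how many of the actual positives were correctly detected
    float recall = (tp + fn) > 0 ? 100.0 * tp / (tp + fn) : 0;

    Serial.print("Positive class precision ");
    Serial.print(precision);
    Serial.print("%, recall ");
    Serial.print(recall);
    Serial.println("%");
}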
Check the full project code on Github, where you'll also find another dataset to test, which has a much higher number of features (30 instead of 4).
If this is the first time you're reading my blog, you may have missed that I'm on a journey to push the limits of Machine learning on embedded devices like the Arduino boards and ESP32.
I started with accelerometer data classification, then did Wifi indoor positioning as a proof of concept.
In the last weeks, though, I undertook a more difficult path: image classification.
Image classification is where Convolutional Neural Networks really shine, but I'm here to question this assumption and demonstrate that it is possible to come up with much lighter alternatives.
In this post we continue with the examples, replicating a "benchmark" dataset in Machine learning: the handwritten digits classification.
The objective of this example is to be able to tell what a handwritten digit is, taking as input a photo from the ESP32 camera.
In particular, we have 3 handwritten numbers and the task of our model will be to distinguish which image is what number.
I only have a single image per digit, but you're free to draw as many samples as you like: it should help improve the performance of your classifier.
When dealing with images, if you use a CNN this step is often overlooked: CNNs are made on purpose to handle raw pixel values, so you just throw the image in and it is handled properly.
When using other types of classifiers, it can help to add a bit of feature engineering to help the classifier do its job and achieve high accuracy.
But not this time.
I wanted to be as "light" as possible in this demo, so I only took a couple of steps during the feature acquisition (you can see them in the code below):

- capture the frame in grayscale
- downsample it by averaging blocks of pixels
- binarize each block with a fixed threshold
I would hardly call this feature engineering.
This is an example of the result of this pipeline.
The code for this pipeline is really simple and is almost the same from the example on motion detection.
#include "esp_camera.h"
#define PWDN_GPIO_NUM -1
#define RESET_GPIO_NUM 15
#define XCLK_GPIO_NUM 27
#define SIOD_GPIO_NUM 22
#define SIOC_GPIO_NUM 23
#define Y9_GPIO_NUM 19
#define Y8_GPIO_NUM 36
#define Y7_GPIO_NUM 18
#define Y6_GPIO_NUM 39
#define Y5_GPIO_NUM 5
#define Y4_GPIO_NUM 34
#define Y3_GPIO_NUM 35
#define Y2_GPIO_NUM 32
#define VSYNC_GPIO_NUM 25
#define HREF_GPIO_NUM 26
#define PCLK_GPIO_NUM 21
#define FRAME_SIZE FRAMESIZE_QQVGA
#define WIDTH 160
#define HEIGHT 120
#define BLOCK_SIZE 5
#define W (WIDTH / BLOCK_SIZE)
#define H (HEIGHT / BLOCK_SIZE)
#define THRESHOLD 127
double features[H*W] = { 0 };
void setup() {
Serial.begin(115200);
Serial.println(setup_camera(FRAME_SIZE) ? "OK" : "ERR INIT");
delay(3000);
}
void loop() {
if (!capture_still()) {
Serial.println("Failed capture");
delay(2000);
return;
}
print_features();
delay(3000);
}
bool setup_camera(framesize_t frameSize) {
camera_config_t config;
config.ledc_channel = LEDC_CHANNEL_0;
config.ledc_timer = LEDC_TIMER_0;
config.pin_d0 = Y2_GPIO_NUM;
config.pin_d1 = Y3_GPIO_NUM;
config.pin_d2 = Y4_GPIO_NUM;
config.pin_d3 = Y5_GPIO_NUM;
config.pin_d4 = Y6_GPIO_NUM;
config.pin_d5 = Y7_GPIO_NUM;
config.pin_d6 = Y8_GPIO_NUM;
config.pin_d7 = Y9_GPIO_NUM;
config.pin_xclk = XCLK_GPIO_NUM;
config.pin_pclk = PCLK_GPIO_NUM;
config.pin_vsync = VSYNC_GPIO_NUM;
config.pin_href = HREF_GPIO_NUM;
config.pin_sscb_sda = SIOD_GPIO_NUM;
config.pin_sscb_scl = SIOC_GPIO_NUM;
config.pin_pwdn = PWDN_GPIO_NUM;
config.pin_reset = RESET_GPIO_NUM;
config.xclk_freq_hz = 20000000;
config.pixel_format = PIXFORMAT_GRAYSCALE;
config.frame_size = frameSize;
config.jpeg_quality = 12;
config.fb_count = 1;
bool ok = esp_camera_init(&config) == ESP_OK;
sensor_t *sensor = esp_camera_sensor_get();
sensor->set_framesize(sensor, frameSize);
return ok;
}
bool capture_still() {
camera_fb_t *frame = esp_camera_fb_get();
if (!frame)
return false;
// reset all the features
for (size_t i = 0; i < H * W; i++)
features[i] = 0;
// for each pixel, compute the position in the downsampled image
for (size_t i = 0; i < frame->len; i++) {
const uint16_t x = i % WIDTH;
const uint16_t y = floor(i / WIDTH);
const uint8_t block_x = floor(x / BLOCK_SIZE);
const uint8_t block_y = floor(y / BLOCK_SIZE);
const uint16_t j = block_y * W + block_x;
features[j] += frame->buf[i];
}
// apply threshold
for (size_t i = 0; i < H * W; i++) {
features[i] = (features[i] / (BLOCK_SIZE * BLOCK_SIZE) > THRESHOLD) ? 1 : 0;
}
return true;
}
void print_features() {
for (size_t i = 0; i < H * W; i++) {
Serial.print(features[i]);
if (i != H * W - 1)
Serial.print(',');
}
Serial.println();
}
To create your own dataset, you need a collection of handwritten digits.
You can do this part as you like, by using pieces of paper or a monitor. I used a tablet because it was well illuminated and I could open a bunch of tabs to keep a record of my samples.
As in the apple vs orange example, keep in mind that you should be consistent during both the training phase and the inference phase.
This is why I used tape to fix my ESP32 camera to the desk and kept the tablet in the exact same position.
If you desire, you could experiment varying slightly the capturing setup during the training and see if your classifier still achieves good accuracy: this is a test I didn't make.
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h.
Okay, at this point you should have all the working pieces to do handwritten digit image classification on your ESP32 camera. Include your model in the sketch and run the classification.
#include "model.h"
void loop() {
if (!capture_still()) {
Serial.println("Failed capture");
delay(2000);
return;
}
Serial.print("Number: ");
Serial.println(classIdxToName(predict(features)));
delay(3000);
}
Done.
You can see a demo of my results in the video below.
My dataset is composed of 25 training samples in total and the SVM with linear kernel produced 17 support vectors.
On my M5Stick camera board, the overhead for the model is 6.8 Kb of flash and the inference takes 7ms: not that bad!
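If you want to reproduce the inference time figure on your own board, you can wrap the predict() call from the model with micros(); a quick sketch of one way to do it (this helper is not part of the original code):

void benchmarkInference() {
    // average over a few runs to smooth out the jitter of a single call
    const uint8_t runs = 10;
    unsigned long start = micros();

    for (uint8_t i = 0; i < runs; i++)
        predict(features);

    Serial.print("Inference time (us): ");
    Serial.println((micros() - start) / runs);
}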
Check the full project code on Github
I chose SVM as my main focus of interest for the MicroML framework because I knew the support vector encoding could be very memory efficient once ported to plain C. And it really is.
I was able to port many real-world models (gesture identification, wake word detection) to tiny microcontrollers like the old Arduino Nano (32 kb flash, 2 kb RAM).
The tradeoff of my implementation was to sacrifice the flash space (which is usually quite big) to save as much RAM as possible, which is usually the most limiting factor.
Due to this implementation, if your model grows in size (highly dimensional data or not well separable data), the generated code will still fit in the RAM, but "overflow" the available flash.
In a couple of my previous posts I warned that model selection might be a required step before being able to deploy a model to an MCU, since you should first check if it fits. If not, you must train another model hoping to get fewer support vectors, since each of them adds to the code size.
It was by chance that I came across a new algorithm I had never heard of, called Relevance Vector Machine. It was patented by Microsoft until last year (so maybe this is the reason you don't see it in the wild), but now it is free to use as far as I can tell.
Here is the link to the paper if you want to read it, it gives some insights into the development process.
I'm not a mathematician, so I can't describe it accurately, but in a few words it uses the same formulation as SVM (a weighted sum of kernels), within a Bayesian framework.
This serves in the first place to be able to get the probabilities of the classification results, which is something totally missing in SVM.
In the second place, the algorithm tries to learn a much more sparse representation of the support vectors, as you can see in the following picture.
When I first read the paper my first thought was just "wow"! This is exactly what I need for my MicroML framework: an ultra-lightweight model which can still achieve high accuracy.
Now that I knew this algorithm, I searched for it in the sklearn documentation: it was not there.
It seems that, since it was patented, they didn't have an implementation.
Fortunately, there is an implementation which follows the sklearn paradigm. You have to install it:
pip install Cython
pip install https://github.com/AmazaspShumik/sklearn_bayes/archive/master.zip
Since the interface is the usual fit / predict one, it is super easy to train a classifier.
from sklearn.datasets import load_iris
from skbayes.rvm_ard_models import RVC
import warnings
# I get tons of boring warnings during training, so turn it off
warnings.filterwarnings("ignore")
iris = load_iris()
X = iris.data
y = iris.target
clf = RVC(kernel='rbf', gamma=0.001)
clf.fit(X, y)
y_predict = clf.predict(X)
The parameters for the constructor are similar to those of the SVC classifier from sklearn:

- kernel: one of linear, poly, rbf
- degree: used if kernel=poly
- gamma: used if kernel=poly or kernel=rbf

You can read the docs from sklearn to learn more.
Now that we have a trained classifier, we have to port it to plain C that compiles on our microcontroller of choice.
I patched my package micromlgen to do the job for you, so you should install the latest version to get it working.
pip install --upgrade micromlgen
Now the export part is almost the same as with an SVM classifier.
from micromlgen import port_rvm
clf = get_rvm_classifier()
c_code = port_rvm(clf)
print(c_code)
And you're done: you have plain C code you can embed in any microcontroller.
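The generated code exposes a predict_rvm function that takes a pointer to the features array (you can see an example of its body later in this post). A minimal usage sketch, assuming you saved the generated code to a file called model.h:

#include "model.h"

// a single iris sample: sepal length, sepal width, petal length, petal width
double features[4] = {5.1, 3.5, 1.4, 0.2};

void setup() {
    Serial.begin(115200);
}

void loop() {
    Serial.print("Predicted class: ");
    Serial.println(predict_rvm(features));
    delay(1000);
}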
To test the effectiveness of this new algorithm, I applied it to the datasets I built in my previous posts, comparing side by side the size and accuracy of both SVM and RVM.
The results are summarized in the next table.
| Dataset | SVM Flash (bytes) | SVM Acc. (%) | RVM Flash (bytes) | RVM Acc. (%) | Delta Flash | Delta Acc. |
|---|---|---|---|---|---|---|
| RGB colors | 4584 | 100 | 3580 | 100 | -22% | -0% |
| Accelerometer gestures (linear kernel) | 36888 | 92 | 7056 | 85 | -80% | -7% |
| Accelerometer gestures (gaussian kernel) | 45348 | 95 | 7766 | 95 | -82% | -0% |
| Wifi positioning | 4641 | 100 | 3534 | 100 | -24% | -0% |
| Wake word (linear kernel) | 18098 | 86 | 3602 | 53 | -80% | -33% |
| Wake word (gaussian kernel) | 21788 | 90 | 4826 | 62 | -78% | -28% |
** the accuracies reported are with default parameters, without any tuning, averaged over 30 runs
As you may see, the results are quite surprising: the RVM models are always smaller than their SVM counterparts (up to -82% of flash), and in half of the cases they achieve the exact same accuracy; on the harder wake word dataset, though, the accuracy drops considerably.
As in any situation, you should test which of the two algorithms works best for your use case, but there are a couple of guidelines you may follow: if flash size is your main constraint, RVM will almost certainly produce a smaller model; if accuracy is your priority, start with SVM and fall back to RVM only if it still scores well enough on your data.
As a reference, here is the code generated for an SVM classifier and an RVM one to classify the IRIS dataset.
uint8_t predict_rvm(double *x) {
double decision[3] = { 0 };
decision[0] = -0.6190847299428206;
decision[1] = (compute_kernel(x, 6.3, 3.3, 6.0, 2.5) - 72.33233 ) * 0.228214 + -2.3609625;
decision[2] = (compute_kernel(x, 7.7, 2.8, 6.7, 2.0) - 81.0089166 ) * -0.29006 + -3.360963;
uint8_t idx = 0;
double val = decision[0];
for (uint8_t i = 1; i < 3; i++) {
if (decision[i] > val) {
idx = i;
val = decision[i];
}
}
return idx;
}
int predict_svm(double *x) {
double kernels[10] = { 0 };
double decisions[3] = { 0 };
int votes[3] = { 0 };
kernels[0] = compute_kernel(x, 6.7 , 3.0 , 5.0 , 1.7 );
kernels[1] = compute_kernel(x, 6.0 , 2.7 , 5.1 , 1.6 );
kernels[2] = compute_kernel(x, 5.1 , 2.5 , 3.0 , 1.1 );
kernels[3] = compute_kernel(x, 6.0 , 3.0 , 4.8 , 1.8 );
kernels[4] = compute_kernel(x, 7.2 , 3.0 , 5.8 , 1.6 );
kernels[5] = compute_kernel(x, 4.9 , 2.5 , 4.5 , 1.7 );
kernels[6] = compute_kernel(x, 6.2 , 2.8 , 4.8 , 1.8 );
kernels[7] = compute_kernel(x, 6.0 , 2.2 , 5.0 , 1.5 );
kernels[8] = compute_kernel(x, 4.8 , 3.4 , 1.9 , 0.2 );
kernels[9] = compute_kernel(x, 5.1 , 3.3 , 1.7 , 0.5 );
decisions[0] = 20.276395502
+ kernels[0] * 100.0
+ kernels[1] * 100.0
+ kernels[3] * -79.351629954
+ kernels[4] * -49.298850195
+ kernels[6] * -40.585178082
+ kernels[7] * -30.764341769
;
decisions[1] = -0.903345464
+ kernels[2] * 0.743494115
+ kernels[9] * -0.743494115
;
decisions[2] = -1.507856504
+ kernels[5] * 0.203695177
+ kernels[8] * -0.160020702
+ kernels[9] * -0.043674475
;
votes[decisions[0] > 0 ? 0 : 1] += 1;
votes[decisions[1] > 0 ? 0 : 2] += 1;
votes[decisions[2] > 0 ? 1 : 2] += 1;
int classVal = -1;
int classIdx = -1;
for (int i = 0; i < 3; i++) {
if (votes[i] > classVal) {
classVal = votes[i];
classIdx = i;
}
}
return classIdx;
}
As you can see, RVM actually only computes 2 kernels and does 2 multiplications. SVM, on the other hand, computes 10 kernels and does 13 multiplications.
This is a recurring pattern, so RVM is much much faster in the inference process.
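Both generated snippets rely on a compute_kernel helper, which is part of the generated code but not shown here: it evaluates the kernel between the input vector and one stored vector, passed as a variadic list of components. A rough sketch of what it could look like for an RBF kernel; GAMMA and FEATURES_DIM are placeholders of mine and the actual generated code may differ:

#include <stdarg.h>
#include <math.h>

#define FEATURES_DIM 4
#define GAMMA 0.001

double compute_kernel(double *x, ...) {
    va_list sv;
    double squaredDistance = 0.0;

    va_start(sv, x);

    // squared euclidean distance between the input and the stored vector
    for (int i = 0; i < FEATURES_DIM; i++) {
        double d = x[i] - va_arg(sv, double);
        squaredDistance += d * d;
    }

    va_end(sv);

    // RBF kernel value
    return exp(-GAMMA * squaredDistance);
}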
micromlgen, and in particular port_rvm, are a work in progress: you may experience some glitches or it may not work in your specific case. Please report any issue on the Github repo.
Want to do image recognition directly on your ESP32, without a PC?
In this post we'll look into a very basic image recognition task: distinguish apples from oranges with machine learning.
Image recognition is a very hot topic these days in the AI/ML landscape. Convolutional Neural Networks really shine at this task and can achieve almost perfect accuracy in many scenarios.
Sadly, you can't run CNNs on your ESP32: they're just too large for a microcontroller.
Since in this series about Machine Learning on Microcontrollers we're exploring the potential of Support Vector Machines (SVMs) at solving different classification tasks, we'll take a look into image classification too.
In a previous post about color identification with Machine learning, we used an Arduino to detect the object we were pointing at with a color sensor (TCS3200) by its color: if we detected yellow, for example, we knew we had a banana in front of us.
Of course such a process is not object recognition at all: yellow may be a banana, or a lemon, or an apple.
Object inference, in that case, works only if you have exactly one object for a given color.
The objective of this post, instead, is to investigate if we can use the MicroML framework to do simple image recognition on the images from an ESP32 camera.
This is much more similar to the tasks you do on your PC with CNN or any other form of NN you are comfortable with. Sure, we will still apply some restrictions to fit the problem on a microcontroller, but this is a huge step forward compared to the simple color identification.
As with any respectable introductory machine learning project about image classification, our task will be to distinguish an orange from an apple.
I have to admit that I rarely use NNs, so I may be wrong here, but from the examples I read online it looks to me that feature engineering is not a fundamental task with NNs.
Those few times I used CNNs, I always used the whole image as input, as-is. I didn't extract any features from them (e.g. a color histogram): the CNN worked perfectly fine with raw images.
I don't think this will work best with SVM, but in this first post we're starting as simple as possible, so we'll be using the RGB components of the image as our features. In a future post, we'll introduce additional features to try to improve our results.
I said we're using the RGB components of the image. But not all of them.
Even at the lowest resolution of 160x120 pixels, a raw RGB image from the camera would generate 160x120x3 = 57600 features: way too much.
We need to reduce this number to the bare minimum.
How many pixels do you think are necessary to get reasonable results in this task of classifying apples vs oranges?
You would be surprised to know that I got 90% accuracy with an RGB image of 8x6!
Yes, that's all we really need to do a good enough classification.
You can distinguish apples from oranges on ESP32 with 8x6 pixels only!
Of course this is a tradeoff: you can't expect to achieve 99% accuracy while keeping the model small enough to fit on a microcontroller. 90% is an acceptable accuracy for me in this context.
You have to keep in mind, moreover, that the features vector size grows quadratically with the image size (if you keep the aspect ratio). A raw RGB image of 8x6 generates 144 features: an image of 16x12 generates 576 features. This was already causing random crashes on my ESP32.
So we'll stick to 8x6 images.
Now, how do you compact a 160x120 image to 8x6? With downsampling.
This is the same technique we used in the post about motion detection on ESP32: we define a block size and average all the pixels inside the block to get a single value (you can refer to that post for more details).
This time, though, we're working with RGB images instead of grayscale, so we'll repeat the exact same process 3 times, one for each channel.
This is the code excerpt that does the downsampling.
uint16_t rgb_frame[HEIGHT / BLOCK_SIZE][WIDTH / BLOCK_SIZE][3] = { 0 };
void grab_image() {
for (size_t i = 0; i < len; i += 2) {
// get r, g, b from the buffer
// see later
const size_t j = i / 2;
// transform x, y in the original image to x, y in the downsampled image
// by dividing by BLOCK_SIZE
const uint16_t x = j % WIDTH;
const uint16_t y = floor(j / WIDTH);
const uint8_t block_x = floor(x / BLOCK_SIZE);
const uint8_t block_y = floor(y / BLOCK_SIZE);
// average pixels in block (accumulate)
rgb_frame[block_y][block_x][0] += r;
rgb_frame[block_y][block_x][1] += g;
rgb_frame[block_y][block_x][2] += b;
}
}
The ESP32 camera can store the image in a few different formats; the two of interest for us are grayscale (1 byte per pixel) and RGB565 (2 bytes per pixel), and there are a couple more available (JPEG, for example) that we won't use here.
For our purpose, we'll use the RGB565 format and extract the 3 components from the 2 bytes with the following code.
config.pixel_format = PIXFORMAT_RGB565;
for (size_t i = 0; i < len; i += 2) {
const uint8_t high = buf[i];
const uint8_t low = buf[i+1];
const uint16_t pixel = (high << 8) | low;
const uint8_t r = (pixel & 0b1111100000000000) >> 11;
const uint8_t g = (pixel & 0b0000011111100000) >> 6;
const uint8_t b = (pixel & 0b0000000000011111);
}
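Putting the two snippets together, the downsampling loop with the RGB565 extraction inlined could look like this; a sketch, where the frame buffer is fetched with esp_camera_fb_get() and WIDTH, BLOCK_SIZE and rgb_frame are the ones used in the previous snippets:

void grab_image() {
    camera_fb_t *frame = esp_camera_fb_get();

    if (!frame)
        return;

    // reset the accumulators
    memset(rgb_frame, 0, sizeof(rgb_frame));

    for (size_t i = 0; i < frame->len; i += 2) {
        // 2 bytes per pixel in RGB565
        const uint16_t pixel = (frame->buf[i] << 8) | frame->buf[i + 1];
        const uint8_t r = (pixel & 0b1111100000000000) >> 11;
        const uint8_t g = (pixel & 0b0000011111100000) >> 6;
        const uint8_t b = (pixel & 0b0000000000011111);

        // position of the pixel in the downsampled image
        const size_t j = i / 2;
        const uint16_t x = j % WIDTH;
        const uint16_t y = j / WIDTH;
        const uint8_t block_x = x / BLOCK_SIZE;
        const uint8_t block_y = y / BLOCK_SIZE;

        // accumulate: divide by BLOCK_SIZE * BLOCK_SIZE afterwards if you want the average
        rgb_frame[block_y][block_x][0] += r;
        rgb_frame[block_y][block_x][1] += g;
        rgb_frame[block_y][block_x][2] += b;
    }

    esp_camera_fb_return(frame);
}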
Now that we can grab the images from the camera, we'll need to take a few samples of each object we want to recognize.
Before doing so, we'll linearize the image matrix to a 1-dimensional vector, because that's what our prediction function expects.
#define H (HEIGHT / BLOCK_SIZE)
#define W (WIDTH / BLOCK_SIZE)
void linearize_features() {
size_t i = 0;
double features[H*W*3] = {0};
for (int y = 0; y < H; y++) {
for (int x = 0; x < W; x++) {
features[i++] = rgb_frame[y][x][0];
features[i++] = rgb_frame[y][x][1];
features[i++] = rgb_frame[y][x][2];
}
}
// print to serial
for (size_t i = 0; i < H*W*3; i++) {
Serial.print(features[i]);
Serial.print('\t');
}
Serial.println();
}
Now you can setup your acquisition environment and take the samples: 15-20 of each object will do the job.
To train the classifier, save the features for each object in a file, one features vector per line. Then follow the steps on how to train a ML classifier for Arduino to get the exported model.
You can experiment with different classifier configurations.
My features were well distinguishable, so I had great results (100% accuracy) with any kernel (even linear).
One odd thing happened with the RBF kernel: I had to use an extremely low gamma value (0.0000001). Can anyone explain why? I usually go with a default value of 0.001.
The model produced 13 support vectors.
I did no feature scaling: you could try it if you're classifying more than 2 classes and getting poor results (a simple on-device option is sketched below).
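If you want to give scaling a try without touching the Python side, the simplest option is a per-sample normalization: divide every component by the largest one, both when you collect the samples and before calling predict(), so that training and inference see the same ranges. A sketch (note this is a simplification, not the per-feature standardization you would get from sklearn's StandardScaler):

void scale_features(double *vector, size_t numFeatures) {
    double maxFeature = 1;

    // find the largest component...
    for (size_t i = 0; i < numFeatures; i++)
        if (vector[i] > maxFeature)
            maxFeature = vector[i];

    // ...and bring everything into the 0-1 range
    for (size_t i = 0; i < numFeatures; i++)
        vector[i] /= maxFeature;
}

In this project you would call it as scale_features(features, H * W * 3), right after filling the array in linearize_features().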
If you followed all the steps above, you should now have a model capable of detecting if your camera is pointed at an apple or an orange, as you can see in the following video.
The little white object you see at the bottom of the image is the camera, taped to the desk.
Did you think it was possible to do simple image classification on your ESP32?
This is not full-fledged object recognition: it can't label objects while you walk as Tensorflow can do, for example.
You have to carefully craft your setup and be as consistent as possible between training and inferencing.
Still, I think this is a fun proof-of-concept that can have useful applications in simple scenarios where you can live with a fixed camera and don't want to use a full Raspberry Pi.
In the next weeks I plan to finally try TensorFlow Lite for Microcontrollers on my ESP32, so I'll do a comparison between it and this example and report the results.
Now that you can do image classification on your ESP32, can you think of a use case you will be able to apply this code to?
Let me know in the comments, we could even try to realize it together if you need some help.
Check the full project code on Github
When I first ran a Machine learning project on my Arduino Nano (old generation), it already felt like a big achievement. I mean, that board has only 32 Kb of program space and 2 Kb of RAM, and you can buy a Chinese clone for around 2.50 $.
It already opened the path to embedded machine learning at a new scale, given the huge amount of microcontrollers ready to become "intelligent".
But it was not enough for me: after all, the MicroML generator exports plain C that should run on any embedded system, not only on Arduino boards.
So I set out to test whether I could go even smaller and run it on the #1 of tiny chips: the Attiny85.
MicroML exports plain C that could run anywhere, not only on Arduino boards.
No, I couldn't.
The generated code makes use of a variadic function, which doesn't seem to be supported by the Attiny compiler in the Arduino IDE.
So I had to come up with an alternative implementation to make it work.
Fortunately I had already experimented with a non-variadic version when first writing the porter, so it was a matter of refreshing that algorithm and trying it out.
Guess what? It compiled!
So I tried porting one of my earlier tutorials (the color identification one) to the Attiny and...
Boom! Machine learning on an Attiny85!
Here's a step-by-step tutorial on how you can do it too.
We're going to use the RGB components of a color sensor (TCS3200 in my case) to infer which object we're pointing it at. This means our features are going to be 3-dimensional, which leads to a really simple model with very high accuracy.
You must do this step on a board with a Serial interface, like an Arduino Uno / Nano / Pro Mini. See the original tutorial for the code of this step.
This part is exactly the same as the original, except for a single parameter: you will pass platform='attiny' to the port function.
from sklearn.svm import SVC
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = SVC(kernel='linear').fit(X, y)
c_code = port(classifier, classmap=classmap, platform='attiny')
print(c_code)
At this point you have to copy the printed code and import it in your project, in a file called model.h.
Since we don't have a Serial interface, we will blink a LED a number of times dependent on the prediction result.
#include "model.h"
#define LED 0
void loop() {
readRGB();
classify();
delay(1000);
}
void classify() {
for (uint8_t times = predict(features) + 1; times > 0; times--) {
digitalWrite(LED, HIGH);
delay(10);
digitalWrite(LED, LOW);
delay(10);
}
}
Here we are: put some colored object in front of the sensor and see the LED blink.
On my machine, the sketch requires 3434 bytes (41%) of program space and 21 bytes (4%) of RAM. This means you could actually run machine learning in even less space than what the Attiny85 provides.
This model in particular is so tiny you can run it even on an Attiny45, which has only 4 Kb of flash and 256 bytes of RAM.
I'd like you to look at the RAM figure for a moment: 21 bytes. 21 bytes is all the memory you need to run a Machine learning algorithm on a microcontroller. This is the result of the implementation I chose: the least RAM overhead possible. I challenge you to go any lower than this.
21 bytes is all the memory you need to run a Machine learning algorithm on a microcontroller
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github
In this project the features are going to be the Fast Fourier Transform of a fixed number of analog readings from a microphone (32 in the final sketch), taken starting from when a loud sound is detected, sampled at intervals of 5 millis.
The microphone we're going to use is a super simple device: it produces an analog signal (0-1024) based on the sound it detects.
When working with audio you almost never want to use the raw readings, since they're hardly useful on their own. Instead you often go with the Fourier Transform, which extracts the frequency information from a time signal. That's going to become our features vector: let's see how in the next step.
First of all, we start with raw audio data. The following plot is me saying random words.
#define MIC A0
#define INTERVAL 5
void setup() {
Serial.begin(115200);
pinMode(MIC, INPUT);
}
void loop() {
Serial.println(analogRead(MIC));
delay(INTERVAL);
}
For the Fourier Transform to work, we need to provide as input an array of values both positive and negative. analogRead() returns only positive values, though, so we need to translate them.
int16_t readMic() {
// translate the 0-1023 reading to a signed interval centered on 0
return (analogRead(MIC) - 512) >> 2;
}
As in the tutorial about gesture classification, we'll start recording the features when a word is beginning to be pronounced. Also in this project we'll use a threshold to detect the start of a word.
To do this, we first record a "background" sound level, that is the value produced by the sensor when we're not talking at all.
float backgroundSound = 0;
void setup() {
Serial.begin(115200);
pinMode(MIC, INPUT);
calibrate();
}
void calibrate() {
for (int i = 0; i < 200; i++)
backgroundSound += readMic();
backgroundSound /= 200;
Serial.print("Background sound level is ");
Serial.println(backgroundSound);
}
At this point we can detect the start of a word when the measured sound level exceeds the background one by a given threshold.
// adjust as per your need
// it will depend on the sensitivity of you microphone
#define SOUND_THRESHOLD 3
void loop() {
if (!soundDetected()) {
delay(10);
return;
}
}
bool soundDetected() {
return abs(readMic() - backgroundSound) >= SOUND_THRESHOLD;
}
As for the gestures, we'll record a fixed number of readings at a fixed interval.
Here a tradeoff arises: you want a decent number of readings to be able to accurately describe the words you want to classify, but not too many, otherwise your model is going to be too large to fit on your board.
I made some experiments, and I got good results with 32 samples at 5 millis interval, which covers ~150 ms of speech.
#define NUM_SAMPLES 32
#define INTERVAL 5
float features[NUM_SAMPLES];
double featuresForFFT[NUM_SAMPLES];
void loop() {
if (!soundDetected()) {
delay(10);
return;
}
captureWord();
printFeatures();
delay(1000);
}
void captureWord() {
for (uint16_t i = 0; i < NUM_SAMPLES; i++) {
features[i] = readMic();
delay(INTERVAL);
}
}
void printFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(float);
for (int i = 0; i < numFeatures; i++) {
Serial.print(features[i]);
Serial.print(i == numFeatures - 1 ? '\n' : ',');
}
}
Here we are with the Fourier Transform. When implemented in software, the most widely used implementation of the FT is called the Fast Fourier Transform (FFT), which is, as you may guess, a fast implementation of the FT.
Luckily for us, there exists a library for Arduino that does FFT.
And it is so easy to use that we only need a single line to get usable results!
#include <arduinoFFT.h>
arduinoFFT fft;
void captureWord() {
for (uint16_t i = 0; i < NUM_SAMPLES; i++) {
featuresForFFT[i] = readMic();
delay(INTERVAL);
}
fft.Windowing(featuresForFFT, NUM_SAMPLES, FFT_WIN_TYP_HAMMING, FFT_FORWARD);
for (int i = 0; i < NUM_SAMPLES; i++)
features[i] = featuresForFFT[i];
}
You don't need to know what the Windowing function actually does (I don't either): what matters is that it extracts meaningful information from our signal. Since it overwrites the features array, after calling that line we have what we need to feed to our classifier.
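As a side note, the snippet above only applies the windowing step. If you want the actual frequency magnitudes as features, the same library can compute the full transform; a sketch, assuming the classic arduinoFFT 1.x API (a second array is needed for the imaginary part):

double vImag[NUM_SAMPLES];

void computeSpectrum() {
    // the imaginary part must be zeroed before every transform
    for (int i = 0; i < NUM_SAMPLES; i++)
        vImag[i] = 0;

    fft.Windowing(featuresForFFT, NUM_SAMPLES, FFT_WIN_TYP_HAMMING, FFT_FORWARD);
    fft.Compute(featuresForFFT, vImag, NUM_SAMPLES, FFT_FORWARD);
    fft.ComplexToMagnitude(featuresForFFT, vImag, NUM_SAMPLES);

    // featuresForFFT now holds the magnitude of each frequency bin
    for (int i = 0; i < NUM_SAMPLES; i++)
        features[i] = featuresForFFT[i];
}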
At this point, record 10-15 samples for each word and save them to a file, one for each word.
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h.
In this project on Machine learning we're not achieving 100% accuracy easily.
Audio is quite noisy, so you should experiment with a few parameters for the classifier and choose the ones that perform best. I'll showcase a few examples:
Here's an overview table of the 3 tests I did.
| Kernel | No. support vectors | Avg. accuracy |
|---|---|---|
| Linear | 22 | 87% |
| Poly 3 | 29 | 91% |
| RBF | 36 | 94% |
Of course the one with the RBF kernel would be the most desirable since it has a very high accuracy: 36 support vectors, though, will produce a model too large to fit on an Arduino Nano.
So you're forced to pick the one with the highest accuracy that fits on your board: in my case it was the one with the linear kernel.
#include "model.h"
void loop() {
if (!soundDetected()) {
delay(10);
return;
}
captureWord();
Serial.print("You said ");
Serial.println(classIdxToName(predict(features)));
delay(1000);
}
And that's it: word classification through machine learning on your Arduino board! Say some word and see the classification result on the Serial monitor.
Here's me testing the system (English is not my first language, so forgive my bad pronunciation). The video quality is very low, I know, but you get the point.
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github
We all are used to GPS positioning: our device will use satellites to track our position on Earth. GPS works very well and with a very high accuracy (you can expect only a few meters of error).
But it suffers a problem: it needs Line of Sight (a clear path from your device to the satellites). If you're not in an open place, like inside a building, you're out of luck.
The task of detecting where you are when GPS localization is not an option is called indoor positioning: it could be in a building, an airport, a parking garage.
There are lots of different approaches to this task (Wikipedia lists more than 10 of them), each with a varying level of commitment, difficulty, cost and accuracy.
For this tutorial I opted for one that is both cost-efficient, easy to implement and readily available in most of the locations: WiFi indoor positioning.
In this tutorial about Machine learning on Arduino we're going to use Wifi indoor positioning to detect in which room of our house / office we are. This is the most basic task we can accomplish and will give us a feeling of the level of accuracy we can achieve with such a simple setup.
On this basis, we'll construct more sophisticated projects in future posts.
To accomplish this tutorial, you really need 2 things:
If you're doing this at home or in your office, there's a good chance your neighbours have WiFi networks in their apartments you can leverage. If you live in an isolated countryside, sorry, this will not work for you.
So, how exactly does Wifi indoor positioning work in conjunction with Machine learning?
Let's pretend there are 5 different WiFi networks around you, like in the picture below.
As you can see there are two markers on the map: each of these markers will "see" different networks, with different signal strengths (a.k.a RSSI).
As you move around, those numbers will change: each room will be identified by the unique combination of the RSSIs.
The features for this project are going to be the RSSIs (Received signal strength indication) of the known WiFi networks. If a network is out of range, it will have an RSSI equal to 0.
Before actually recording the sample data to train our classifier, we need to do some preliminary work. This is because not all networks will be visible all the time: we have to work, however, with a fixed number of features.
First of all we need to enumerate all the networks we will encounter during the inference process.
To begin, we take a "reconnaissance tour" of the locations we want to predict and log all the networks we detect. Load the following sketch and take note of all the networks that appear on the Serial monitor.
#include <WiFi.h>
void setup() {
Serial.begin(115200);
WiFi.mode(WIFI_STA);
WiFi.disconnect();
}
void loop() {
int numNetworks = WiFi.scanNetworks();
for (int i = 0; i < numNetworks; i++) {
Serial.println(WiFi.SSID(i));
}
delay(3000);
}
Now that we have a bunch of SSIDs, we need to assign each SSID a fixed index, from 0 to MAX_NETWORKS.
You can implement this part as you like, but in this demo I'll make use of a class I wrote called Array (you can see the source code and an example on Github), which implements 2 useful functions:

- push() to add an element to the array
- indexOf() to get the index of an element

See how to install the Eloquent library if you don't have it already installed.
At this point we populate the array with all the networks we saved from the reconnaissance tour.
#include <eDataStructures.h>
#define MAX_NETWORKS 10
using namespace Eloquent::DataStructures;
double features[MAX_NETWORKS];
Array<String, MAX_NETWORKS> knownNetworks("");
void setup() {
Serial.begin(115200);
WiFi.mode(WIFI_STA);
WiFi.disconnect();
knownNetworks.push("SSID #0");
knownNetworks.push("SSID #1");
knownNetworks.push("SSID #2");
knownNetworks.push("SSID #3");
// and so on
}
The second step is to convert the scan results into a features vector. Each feature will be the RSSI of the given SSID, in the exact order we populated the knownNetworks array.
In practice:
features[0] == RSSI of SSID #0;
features[1] == RSSI of SSID #1;
features[2] == RSSI of SSID #2;
features[3] == RSSI of SSID #3;
// and so on
The code below will do the job.
void loop() {
scan();
printFeatures();
delay(3000);
}
void scan() {
int numNetworks = WiFi.scanNetworks();
resetFeatures();
// assign RSSIs to feature vector
for (int i = 0; i < numNetworks; i++) {
String ssid = WiFi.SSID(i);
uint16_t networkIndex = knownNetworks.indexOf(ssid);
// only create feature if the current SSID is a known one
if (!isnan(networkIndex))
features[networkIndex] = WiFi.RSSI(i);
}
}
// reset all features to 0
void resetFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(double);
for (int i = 0; i < numFeatures; i++)
features[i] = 0;
}
void printFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(features[0]);
for (int i = 0; i < numFeatures; i++) {
Serial.print(features[i]);
Serial.print(i == numFeatures - 1 ? '\n' : ',');
}
}
Grab some recordings by just staying in a location for a few seconds and save the serial output to a file; then move to the next location and repeat: 10-15 samples for each location will suffice.
If you do a good job, you should end up with distinguishable features, as shown in the plot below.
If some weak, far-away networks appear and disappear between scans, they will add noise to your features; in that case you can clamp RSSIs below a minimum value to 0:
// replace
features[networkIndex] = WiFi.RSSI(i);
// with
#define MIN_RSSI -90 // adjust to your needs
features[networkIndex] = WiFi.RSSI(i) > MIN_RSSI ? WiFi.RSSI(i) : 0;
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h
.
#include "model.h"
void loop() {
scan();
classify();
delay(3000);
}
void classify() {
Serial.print("You are in ");
Serial.println(classIdxToName(predict(features)));
}
Move around your house/office/whatever and see your location printed on the serial monitor!
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github
We're going to use the accelerations along the 3 axes (X, Y, Z) coming from an IMU to infer which gesture we're playing. We'll use a fixed number of recordings (NUM_SAMPLES), starting from the first detection of movement.
This means our feature vectors are going to be of dimension 3 * NUM_SAMPLES, which can become too large to fit in the memory of the Arduino Nano. We'll start with a low value for NUM_SAMPLES to keep it as lean as possible: if your classifications suffer from poor accuracy, you can increase this number.
First of all, we need to read the raw data from the IMU. This piece of code will be different based on the specific chip you use. To keep things consistent, we'll wrap the IMU logic in 2 functions: imu_setup and imu_read.
I'll report a couple of example implementations, for the MPU6050 and the MPU9250 (these are the chips I have at hand). You should save whichever code you use in a file called imu.h.
#include "Wire.h"
// library from https://github.com/jrowberg/i2cdevlib/tree/master/Arduino/MPU6050
#include "MPU6050.h"
#define OUTPUT_READABLE_ACCELGYRO
MPU6050 imu;
void imu_setup() {
Wire.begin();
imu.initialize();
}
void imu_read(float *ax, float *ay, float *az) {
int16_t _ax, _ay, _az, _gx, _gy, _gz;
imu.getMotion6(&_ax, &_ay, &_az, &_gx, &_gy, &_gz);
*ax = _ax;
*ay = _ay;
*az = _az;
}
#include "Wire.h"
// library from https://github.com/bolderflight/MPU9250
#include "MPU9250.h"
MPU9250 imu(Wire, 0x68);
void imu_setup() {
Wire.begin();
imu.begin();
}
void imu_read(float *ax, float *ay, float *az) {
imu.readSensor();
*ax = imu.getAccelX_mss();
*ay = imu.getAccelY_mss();
*az = imu.getAccelZ_mss();
}
In the main .ino file, we dump the values to Serial monitor / plotter.
#include "imu.h"
#define NUM_SAMPLES 30
#define NUM_AXES 3
// sometimes you may get "spikes" in the readings
// set a sensible value to truncate too large values
#define TRUNCATE 20
double features[NUM_SAMPLES * NUM_AXES];
void setup() {
Serial.begin(115200);
imu_setup();
}
void loop() {
float ax, ay, az;
imu_read(&ax, &ay, &az);
ax = constrain(ax, -TRUNCATE, TRUNCATE);
ay = constrain(ay, -TRUNCATE, TRUNCATE);
az = constrain(az, -TRUNCATE, TRUNCATE);
Serial.print(ax);
Serial.print('\t');
Serial.print(ay);
Serial.print('\t');
Serial.println(az);
}
Open the Serial plotter and make some movement to have an idea of the range of your readings.
Due to gravity, we get a stable value of -9.8 on the Z axis at rest (you can see this in the previous image). Since I'd like to have almost 0 at rest, I created a super simple calibration procedure to remove this fixed offset from the readings.
double baseline[NUM_AXES];
double features[NUM_SAMPLES * NUM_AXES];
void setup() {
Serial.begin(115200);
imu_setup();
calibrate();
}
void loop() {
float ax, ay, az;
imu_read(&ax, &ay, &az);
ax = constrain(ax - baseline[0], -TRUNCATE, TRUNCATE);
ay = constrain(ay - baseline[1], -TRUNCATE, TRUNCATE);
az = constrain(az - baseline[2], -TRUNCATE, TRUNCATE);
}
void calibrate() {
float ax, ay, az;
for (int i = 0; i < 10; i++) {
imu_read(&ax, &ay, &az);
delay(100);
}
baseline[0] = ax;
baseline[1] = ay;
baseline[2] = az;
}
Much better.
Now we need to check if motion is happening. To keep it simple, we'll use a naive approach that looks for a high value in the acceleration: if a threshold is exceeded, a gesture is starting.
If you did the calibration step, a threshold of 5 should work well. If you didn't calibrate, you have to come up with a value that suits your needs.
#include "imu.h"
#define ACCEL_THRESHOLD 5
void loop() {
float ax, ay, az;
imu_read(&ax, &ay, &az);
ax = constrain(ax - baseline[0], -TRUNCATE, TRUNCATE);
ay = constrain(ay - baseline[1], -TRUNCATE, TRUNCATE);
az = constrain(az - baseline[2], -TRUNCATE, TRUNCATE);
if (!motionDetected(ax, ay, az)) {
delay(10);
return;
}
}
bool motionDetected(float ax, float ay, float az) {
return (abs(ax) + abs(ay) + abs(az)) > ACCEL_THRESHOLD;
}
If no motion is happening, we don't take any action and keep watching. If motion is happening, we print the next NUM_SAMPLES readings to Serial.
void loop() {
float ax, ay, az;
imu_read(&ax, &ay, &az);
ax = constrain(ax - baseline[0], -TRUNCATE, TRUNCATE);
ay = constrain(ay - baseline[1], -TRUNCATE, TRUNCATE);
az = constrain(az - baseline[2], -TRUNCATE, TRUNCATE);
if (!motionDetected(ax, ay, az)) {
delay(10);
return;
}
recordIMU();
printFeatures();
delay(2000);
}
void recordIMU() {
float ax, ay, az;
for (int i = 0; i < NUM_SAMPLES; i++) {
imu_read(&ax, &ay, &az);
ax = constrain(ax - baseline[0], -TRUNCATE, TRUNCATE);
ay = constrain(ay - baseline[1], -TRUNCATE, TRUNCATE);
az = constrain(az - baseline[2], -TRUNCATE, TRUNCATE);
features[i * NUM_AXES + 0] = ax;
features[i * NUM_AXES + 1] = ay;
features[i * NUM_AXES + 2] = az;
delay(INTERVAL);
}
}
void printFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(features[0]);
for (int i = 0; i < numFeatures; i++) {
Serial.print(features[i]);
Serial.print(i == numFeatures - 1 ? '\n' : ',');
}
}
Record 15-20 samples for each gesture and save them to a file, one for each gesture. Since we're dealing with highly dimensional data, you should collect as many samples as possible, to average out the noise.
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h.
In this project on Machine learning, differently from the previous and simpler ones, we're not achieving 100% accuracy easily. Motion is quite noisy, so you should experiment with a few parameters for the classifier and choose the ones that perform best. I'll showcase a few examples:
Now that we selected the best model, we have to export it to C code. Here comes the catch: not all models will fit on your board.
The core of SVM (Support Vector Machines) are the support vectors: each trained classifier will be characterized by a certain number of them. The problem is: if there are too many, the generated code will be too large to fit in your flash.
For this reason, instead of selecting the best model on accuracy, you should make a ranking, from the best performing to the worst. For each model, starting from the top, you should import it in your Arduino project and try to compile: if it fits, fine, you're done. Otherwise you should pick the next and try again.
It may seem a tedious process, but keep in mind that we're trying to infer a class from 90 features in 2 Kb of RAM and 32 Kb of flash: I think this is an acceptable tradeoff.
We're fitting a model to infer a class from 90 features in 2 Kb of RAM and 32 Kb of flash!
I'll report a few figures for different combinations I tested.
| Kernel | C | Gamma | Degree | Vectors | Flash size | RAM (b) | Avg. accuracy |
|---|---|---|---|---|---|---|---|
| RBF | 10 | 0.001 | - | 37 | 53 Kb | 1228 | 99% |
| Poly | 100 | 0.001 | 2 | 12 | 25 Kb | 1228 | 99% |
| Poly | 100 | 0.001 | 3 | 25 | 40 Kb | 1228 | 97% |
| Linear | 50 | - | 1 | 40 | 55 Kb | 1228 | 95% |
| RBF | 100 | 0.01 | - | 61 | 80 Kb | 1228 | 95% |
As you can see, we achieved a very high accuracy on the test set for all the classifiers: only one, though, fitted on the Arduino Nano. Of course, if you use a larger board, you can deploy the others too.
A note on the RAM column: all the values are equal because the RAM usage of this implementation is independent from the number of support vectors and only depends on the number of features.
#include "model.h"
void loop() {
float ax, ay, az;
imu_read(&ax, &ay, &az);
ax = constrain(ax - baseline[0], -TRUNCATE, TRUNCATE);
ay = constrain(ay - baseline[1], -TRUNCATE, TRUNCATE);
az = constrain(az - baseline[2], -TRUNCATE, TRUNCATE);
if (!motionDetected(ax, ay, az)) {
delay(10);
return;
}
recordIMU();
classify();
delay(2000);
}
void classify() {
Serial.print("Detected gesture: ");
Serial.println(classIdxToName(predict(features)));
}
Here we are: it has been a long post, but now you can classify gestures with an Arduino Nano and 2 Kb of RAM.
No fancy Neural Networks, no Tensorflow, no 32-bit ARM processors: plain old SVM on plain old 8 bits with 97% accuracy.
Here's a short demo of me playing 3 gestures and getting the results on the serial monitor.
On my machine, the sketch targeted at the Arduino Nano (old generation) requires 25310 bytes (82%) of program space and 1228 bytes (59%) of RAM. This means you could actually run machine learning in even less space than what the Arduino Nano provides. So, the answer to the question Can I run machine learning on Arduino? is definitely YES.
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github
For our task we'll use a simple push button as input and a fixed number of samples taken at a fixed interval (100 ms), starting from the first detection of the button press. I chose to record 30 samples for each letter, but you can easily customize the value as per your needs.
With 30 samples at 100 ms frequency, we'll have 3 seconds to "type" the letter and on the Serial monitor will appear a sequence of 0s and 1s, representing if the button was pressed or not; the inference procedure will translate this sequence into a letter.
As a reference, here are a couple of examples of what we'll be working with.
// A (•‒)
0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
// D (‒••)
0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1
// E (•)
0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
To the bare minimum, we'll need a push button and two wires: one to ground and the other to a digital pin. Since in the example we'll configure the pin as INPUT_PULLUP, we'll read 0 when the button is pressed and 1 when not.
All we need to do is detect a press and record the following 30 samples of the digital pin:
#define IN 4
#define NUM_SAMPLES 30
#define INTERVAL 100
double features[NUM_SAMPLES];
void setup() {
Serial.begin(115200);
pinMode(IN, INPUT_PULLUP);
}
void loop() {
if (digitalRead(IN) == 0) {
recordButtonStatus();
printFeatures();
delay(1000);
}
delay(10);
}
void recordButtonStatus() {
for (int i = 0; i < NUM_SAMPLES; i++) {
features[i] = digitalRead(IN);
delay(INTERVAL);
}
}
void printFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(features[0]);
for (int i = 0; i < numFeatures; i++) {
Serial.print(features[i]);
Serial.print(i == numFeatures - 1 ? '\n' : ',');
}
}
Open the Serial monitor and type each letter a few times: try to introduce some variation each time, for example by waiting a few more milliseconds before releasing a dash.
Save the recordings for each letter in a file named after the letter, so you will get meaningful results later on.
You may end with duplicate recordings: don't worry, that's not a problem. I'll paste my recordings for a few letters, as a reference.
// A (•‒)
0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1
0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1
// D (‒••)
0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,1,0,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1
// E (•)
0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
// S (•••)
0,0,0,1,1,1,0,0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,1,1,1,1,0,0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,1,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,1,1,1,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,1,1,1,1,0,0,1,1,1,1,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,1,1,1,1,0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
// T (‒)
0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
If you do a good job, you should end up with quite distinguishable features, as shown in the plot below.
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h.
#include "model.h"
void loop() {
if (digitalRead(IN) == 0) {
recordButtonStatus();
Serial.print("Detected letter: ");
Serial.println(classIdxToName(predict(features)));
delay(1000);
}
delay(10);
}
Type some letter using the push button and see the identified value printed on the serial monitor.
That’s it: you deployed machine learning in 2 Kb!
On my machine, the sketch targeted at the Arduino Nano (old generation) requires 12546 bytes (40%) of program space and 366 bytes (17%) of RAM. This means you could actually run machine learning in even less space than what the Arduino Nano provides. So, the answer to the question Can I run machine learning on Arduino? is definitely YES.
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github
We're going to use the RGB components of a color sensor (TCS3200 in my case) to infer which object we're pointing it at. This means our features are going to be 3-dimensional, which leads to a really simple model with very high accuracy.
We don't need any processing to get from the sensor readings to the feature vector, so the code will be straight-forward: read each component from the sensor and assign it to the features array. This part will vary based on the specific chip you have: I'll report the code for a TCS 230/3200.
#define S2 2
#define S3 3
#define sensorOut 4
double features[3];
void setup() {
Serial.begin(115200);
pinMode(S2, OUTPUT);
pinMode(S3, OUTPUT);
pinMode(sensorOut, INPUT);
}
void loop() {
readRGB();
printFeatures();
delay(100);
}
int readComponent(bool s2, bool s3) {
delay(10);
digitalWrite(S2, s2);
digitalWrite(S3, s3);
return pulseIn(sensorOut, LOW);
}
void readRGB() {
features[0] = readComponent(LOW, LOW);
features[1] = readComponent(HIGH, HIGH);
features[2] = readComponent(LOW, HIGH);
}
void printFeatures() {
const uint16_t numFeatures = sizeof(features) / sizeof(features[0]);
for (int i = 0; i < numFeatures; i++) {
Serial.print(features[i]);
Serial.print(i == numFeatures - 1 ? '\n' : ',');
}
}
Open the Serial monitor and put some colored objects in front of the sensor: move the object a bit and rotate it, so the samples will include different shades of the color.
Save the recordings for each color in a file named after the color, so you will get meaningful results later on.
If you do a good job, you should end up with distinguishable features, as shown in the contour plot below.
For a detailed guide, refer to the tutorial on how to train a ML classifier for Arduino.
from sklearn.ensemble import RandomForestClassifier
from micromlgen import port
# put your samples in the dataset folder
# one class per file
# one feature vector per line, in CSV format
features, classmap = load_features('dataset/')
X, y = features[:, :-1], features[:, -1]
classifier = RandomForestClassifier(n_estimators=30, max_depth=10).fit(X, y)
c_code = port(classifier, classmap=classmap)
print(c_code)
At this point you have to copy the printed code and import it in your Arduino project, in a file called model.h.
#include "model.h"
void loop() {
readRGB();
Serial.println(classIdxToName(predict(features)));
delay(1000);
}
Put some colored object in front of the sensor and see the identified object name printed on the serial monitor.
Given the simplicity of the task, you should easily achieve near 100% accuracy for different colors (I had some trouble distinguishing orange from yellow because of bad illumination). Just be sure to replicate the exact same setup both during training and classification.
That’s it: you deployed machine learning in 2 Kb!
On my machine, the sketch targeted at the Arduino Nano (old generation) requires 5570 bytes (18%) of program space and 266 bytes (12%) of RAM. This means you could actually run machine learning in even less space than what the Arduino Nano provides. So, the answer to the question Can I run machine learning on Arduino? is definitely YES.
Did you find this tutorial useful? Was it easy to follow or did I miss something? Let me know in the comments so I can keep improving the blog.
Check the full project code on Github