Even smaller Machine learning models for your MCU: up to -82% code size
So far we've used SVM (Support Vector Machine) as our main classifier to port a Machine learning model to a microcontroller: but recently I found an interesting alternative which could be waaaay smaller, maintaining a similar accuracy.


The current state

I chose SVM as my main focus of interest for the MicroML framework because I knew the support vector encoding could be very memory efficient once ported to plain C. And it really is.

I was able to port many real-world models (gesture identification, wake word detection) to tiny microcontrollers like the old Arduino Nano (32 KB flash, 2 KB RAM).

The tradeoff of my implementation was to sacrifice flash space (which is usually quite abundant) to save as much RAM as possible, since RAM is usually the most limiting factor.

Because of this tradeoff, if your model grows in size (high-dimensional data or data that is not well separable), the generated code will still fit in RAM but may "overflow" the available flash.

In a couple of my previous posts I warned that model selection might be a required step before deploying a model to an MCU: you should first check that the generated code fits. If it doesn't, you must train another model and hope it ends up with fewer support vectors, since each support vector increases the code size.
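
To give an idea, here is a minimal sketch of that sanity check with plain sklearn (my own addition, on the iris dataset):

from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel='rbf', gamma=0.001).fit(X, y)

# each support vector becomes one kernel computation in the generated C,
# so this count is a rough proxy for the flash footprint
print(sum(clf.n_support_))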

A new algorithm: Relevance Vector Machines

It was by chance that I came across a new algorithm I had never heard of, called the Relevance Vector Machine. It was patented by Microsoft until last year (maybe that's the reason you don't see it in the wild), but now it is free to use as far as I can tell.

Here is the link to the paper if you want to read it; it gives some insights into the development process.

I'm not a mathematician, so I can't describe it accurately, but in a few words: it uses the same formulation as SVM (a weighted sum of kernels), applying a Bayesian model.

This serves, in the first place, to get the probabilities of the classification results, something totally missing in SVM.

In the second place, the algorithm learns a much sparser representation of the support vectors, as you can see in the following picture.

[Image: RVM vs SVM support vectors]
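
For reference, this is the functional form both algorithms share (my summary; see the paper for the real math):

$$ y(\mathbf{x}) = \sum_{i=1}^{N} w_i \, K(\mathbf{x}, \mathbf{x}_i) + b $$

The Bayesian prior drives most of the weights $w_i$ to exactly zero during training; the few samples left with a non-zero weight are the "relevance vectors", the RVM counterpart of support vectors.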

When I first read the paper, my first thought was just "wow"! This is exactly what I need for my MicroML framework: an ultra-lightweight model that can still achieve high accuracy.

Training a classifier

Now that I knew about this algorithm, I searched for it in the sklearn documentation: it was not there.

It seems that, since it was patented, they never added an implementation.

Fortunately, there is an implementation that follows the sklearn paradigm. You have to install it:

pip install Cython
pip install https://github.com/AmazaspShumik/sklearn_bayes/archive/master.zip

Since the interface is the usual fit / predict, it is super easy to train a classifier.

from sklearn.datasets import load_iris
from skbayes.rvm_ard_models import RVC
import warnings

# I get tons of boring warnings during training, so I turn them off
warnings.filterwarnings("ignore")

iris = load_iris()
X = iris.data
y = iris.target
clf = RVC(kernel='rbf', gamma=0.001)
clf.fit(X, y)
y_predict = clf.predict(X)
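
Since the model is Bayesian, you can also ask for class probabilities instead of bare labels. A quick sketch, assuming the skbayes implementation exposes sklearn's standard predict_proba method:

# my assumption: RVC implements the standard sklearn predict_proba API
probabilities = clf.predict_proba(X[:3])
print(probabilities)  # one row per sample, one column per class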

The parameters for the constructor are similar to those of the SVC classifier from sklearn:

  • kernel: one of linear, poly, rbf
  • degree: if kernel=poly
  • gamma: if kernel=poly or kernel=rbf

You can read the docs from sklearn to learn more.
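
To make the list concrete, here are a few example constructions (the parameter values are arbitrary picks of mine, not recommended defaults):

from skbayes.rvm_ard_models import RVC

clf_linear = RVC(kernel='linear')                       # no extra parameters
clf_poly = RVC(kernel='poly', degree=2, gamma=0.01)     # degree and gamma both apply
clf_rbf = RVC(kernel='rbf', gamma=0.001)                # gamma controls the kernel width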

Porting to C

Now that we have a trained classifier, we have to port it to plain C code that compiles on our microcontroller of choice.

I patched my package micromlgen to do the job for you, so you should install the latest version to get it working.

 pip install --upgrade micromlgen

Now the export part is almost the same as with an SVM classifier.

 from micromlgen import port_rvm

 clf = get_rvm_classifier()
 c_code = port_rvm(clf)
 print(c_code)

And you're done: you have plain C code you can embed in any microcontroller.
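
To use it on the board, I save the generated code to a header file next to my sketch. A minimal sketch of the idea (the file name is an arbitrary choice of mine):

# save the classifier to a header file you can #include from your
# Arduino sketch; then call predict_rvm() on an array of features
with open('rvm_model.h', 'w') as file:
    file.write(c_code)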

Performance comparison

To test the effectiveness of this new algorithm, I applied it to the datasets I built for my previous posts, comparing the size and accuracy of SVM and RVM side by side.

The results are summarized in the next table.

| Dataset | SVM flash (bytes) | SVM acc. (%) | RVM flash (bytes) | RVM acc. (%) | Δ flash | Δ acc. |
|---|---|---|---|---|---|---|
| RGB colors | 4584 | 100 | 3580 | 100 | -22% | -0% |
| Accelerometer gestures (linear kernel) | 36888 | 92 | 7056 | 85 | -80% | -7% |
| Accelerometer gestures (gaussian kernel) | 45348 | 95 | 7766 | 95 | -82% | -0% |
| Wifi positioning | 4641 | 100 | 3534 | 100 | -24% | -0% |
| Wake word (linear kernel) | 18098 | 86 | 3602 | 53 | -80% | -33% |
| Wake word (gaussian kernel) | 21788 | 90 | 4826 | 62 | -78% | -28% |

** the accuracies reported are with default parameters, without any tuning, averaged over 30 runs

As you can see, the results are quite surprising:

  • you can achieve up to 82% space reduction on high-dimensional datasets without any loss in accuracy (accelerometer gestures with gaussian kernel)
  • sometimes you may not be able to achieve a decent accuracy (62% at most on the wake word dataset)

As in any situation, you should test which of the two algorithms works best for your use case, but there are a couple of guidelines you may follow (a quick head-to-head sketch follows the list):

  • if you need top accuracy, SVM can probably achieve slightly better performance, provided you have enough space
  • if you need tiny space or top speed, test whether RVM achieves a satisfactory accuracy
  • if both SVM and RVM achieve comparable performance, go with RVM: it's much lighter than SVM in most cases and will run faster
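
Here is a minimal head-to-head sketch (my own addition; it assumes the skbayes classifier plays nicely with sklearn's cross-validation utilities):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from skbayes.rvm_ard_models import RVC

X, y = load_iris(return_X_y=True)

# same kernel settings on both sides for a fair comparison;
# swap load_iris with your own dataset
for name, clf in (('SVM', SVC(kernel='rbf', gamma=0.001)),
                  ('RVM', RVC(kernel='rbf', gamma=0.001))):
    print(name, cross_val_score(clf, X, y, cv=5).mean())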

Size comparison

As a reference, here is the code generated for an SVM classifier and an RVM classifier trained on the IRIS dataset (both rely on a compute_kernel helper, also generated, that I omit here for brevity).

uint8_t predict_rvm(double *x) {
    double decision[3] = { 0 };
    decision[0] = -0.6190847299428206;
    decision[1] = (compute_kernel(x,  6.3, 3.3, 6.0, 2.5) - 72.33233 ) * 0.228214 + -2.3609625;
    decision[2] = (compute_kernel(x,  7.7, 2.8, 6.7, 2.0) - 81.0089166 ) * -0.29006 + -3.360963;
    uint8_t idx = 0;
    double val = decision[0];
    for (uint8_t i = 1; i < 3; i++) {
        if (decision[i] > val) {
            idx = i;
            val = decision[i];
        }
    }
    return idx;
}

int predict_svm(double *x) {
    double kernels[10] = { 0 };
    double decisions[3] = { 0 };
    int votes[3] = { 0 };
    kernels[0] = compute_kernel(x, 6.7, 3.0, 5.0, 1.7);
    kernels[1] = compute_kernel(x, 6.0, 2.7, 5.1, 1.6);
    kernels[2] = compute_kernel(x, 5.1, 2.5, 3.0, 1.1);
    kernels[3] = compute_kernel(x, 6.0, 3.0, 4.8, 1.8);
    kernels[4] = compute_kernel(x, 7.2, 3.0, 5.8, 1.6);
    kernels[5] = compute_kernel(x, 4.9, 2.5, 4.5, 1.7);
    kernels[6] = compute_kernel(x, 6.2, 2.8, 4.8, 1.8);
    kernels[7] = compute_kernel(x, 6.0, 2.2, 5.0, 1.5);
    kernels[8] = compute_kernel(x, 4.8, 3.4, 1.9, 0.2);
    kernels[9] = compute_kernel(x, 5.1, 3.3, 1.7, 0.5);
    decisions[0] = 20.276395502
        + kernels[0] * 100.0
        + kernels[1] * 100.0
        + kernels[3] * -79.351629954
        + kernels[4] * -49.298850195
        + kernels[6] * -40.585178082
        + kernels[7] * -30.764341769;
    decisions[1] = -0.903345464
        + kernels[2] * 0.743494115
        + kernels[9] * -0.743494115;
    decisions[2] = -1.507856504
        + kernels[5] * 0.203695177
        + kernels[8] * -0.160020702
        + kernels[9] * -0.043674475;
    votes[decisions[0] > 0 ? 0 : 1] += 1;
    votes[decisions[1] > 0 ? 0 : 2] += 1;
    votes[decisions[2] > 0 ? 1 : 2] += 1;
    int classVal = -1;
    int classIdx = -1;
    for (int i = 0; i < 3; i++) {
        if (votes[i] > classVal) {
            classVal = votes[i];
            classIdx = i;
        }
    }
    return classIdx;
}

As you can see, the RVM classifier only computes 2 kernels and does 2 multiplications. The SVM classifier, on the other hand, computes 10 kernels and does 11 multiplications.

This is a recurring pattern, so RVM is much, much faster at inference.

Disclaimer

micromlgen, and in particular port_rvm, is a work in progress: you may experience glitches or it may not work for your specific case. Please report any issues on the Github repo.
