{ "version": "https://jsonfeed.org/version/1.1", "user_comment": "This feed allows you to read the posts from this site in any feed reader that supports the JSON Feed format. To add this feed to your reader, copy the following URL -- https://eloquentarduino.github.io/category/programming/arduino-machine-learning/arduino-machine-learning-tutorial/feed/json/ -- and add it your reader.", "home_page_url": "https://eloquentarduino.github.io/category/programming/arduino-machine-learning/arduino-machine-learning-tutorial/", "feed_url": "https://eloquentarduino.github.io/category/programming/arduino-machine-learning/arduino-machine-learning-tutorial/feed/json/", "language": "en-US", "title": "Arduino Machine Learning tutorial – Eloquent Arduino Blog", "description": "Machine learning on Arduino, programming & electronics", "items": [ { "id": "https://eloquentarduino.github.io/?p=1264", "url": "https://eloquentarduino.github.io/2020/10/decision-tree-random-forest-and-xgboost-on-arduino/", "title": "Decision Tree, Random Forest and XGBoost on Arduino", "content_html": "

You will be surprised by how much accuracy you can achieve in just a few kilobytes of resources: Decision Tree, Random Forest and XGBoost (Extreme Gradient Boosting) are now available on your microcontrollers: highly RAM-optimized implementations for super-fast classification on embedded devices.

\n

\"DecisionTree\"

\n

\n

Decision Tree

\n

Decision Tree is without doubt one of the most well-known classification algorithms out there. It is so simple to understand that it was probably the first classifier you encountered in any Machine Learning course.

\n

I won't go into the details of how a Decision Tree classifier trains and selects the splits for the input features: here I will explain how a RAM-efficient port of such a classifier is implemented.

\n

For an introduction visit Wikipedia; for a more in-depth guide visit KDNuggets.

\n

Since RAM is the scarcest resource on the vast majority of microcontrollers, we're willing to sacrifice program space (a.k.a. flash) in favor of memory (a.k.a. RAM): the smart way to port a Decision Tree classifier from Python to C is to "hard-code" the splits in code, without keeping any reference to them in variables.

\n

Here's what it looks like for a Decision Tree that classifies the Iris dataset (you'll find the full listing in the Code listings section at the end of this post).

\n

As you can see, we're using 0 bytes of RAM to get the classification result, since no variable is being allocated. On the other hand, the program space will grow almost linearly with the number of splits.
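
\n

To give you an idea of how such code can be generated, here is a toy sketch in Python (an illustration of the idea only, not the actual generator this post relies on) that walks a trained scikit-learn tree and prints the kind of nested if/else you'll find in the Code listings section:

\n
# toy sketch: turn a trained scikit-learn DecisionTreeClassifier into nested if/else C code\n# (illustration only, not the actual generator this post relies on)\nimport numpy as np\nfrom sklearn.datasets import load_iris\nfrom sklearn.tree import DecisionTreeClassifier\n\nX, y = load_iris(return_X_y=True)\nclf = DecisionTreeClassifier().fit(X, y)\ntree = clf.tree_\n\ndef emit(node=0, indent="  "):\n    # leaf node: return the majority class\n    if tree.children_left[node] < 0:\n        return [indent + "return %d;" % int(np.argmax(tree.value[node]))]\n    # internal node: one hard-coded split, then recurse into both branches\n    lines = [indent + "if (x[%d] <= %f) {" % (tree.feature[node], tree.threshold[node])]\n    lines += emit(tree.children_left[node], indent + "  ")\n    lines += [indent + "} else {"]\n    lines += emit(tree.children_right[node], indent + "  ")\n    lines += [indent + "}"]\n    return lines\n\nprint("int predict(float *x) {")\nfor line in emit():\n    print(line)\nprint("}")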

\n

Since program space is often much greater than RAM on microcontrollers, this implementation exploits its abundance to be able to deploy larger models. How much larger? It depends on the available flash size: many new-generation boards (Arduino Nano 33 BLE Sense, ESP32, ST Nucleo...) have 1 MB of flash, which will hold tens of thousands of splits.

\n

Random Forest

\n

Random Forest is just many Decision Trees joined together in a voting scheme. The core idea is that of "the wisdom of the crowd", such that if many trees vote for a given class (having been trained on different subsets of the training set), that class is probably the true class.

\n

Towards Data Science has a more detailed guide on Random Forest and how it balances the trees with the bagging technique.

\n

As easy as Decision Trees, Random Forest gets the exact same implementation with 0 bytes of RAM required (it actually needs as many bytes as there are classes to store the votes, but that's negligible): it just hard-codes all of its composing trees.
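
\n

If you want to check this from the Python side, a trained scikit-learn RandomForestClassifier literally exposes its composing trees, each of which can be hard-coded with the exact same trick (a quick sketch using the public estimators_ attribute):

\n
# a Random Forest is just a list of Decision Trees (plus a tiny votes[] array in the ported C code)\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.datasets import load_iris\n\nX, y = load_iris(return_X_y=True)\nforest = RandomForestClassifier(n_estimators=3).fit(X, y)\n\n# each estimator is a plain DecisionTreeClassifier: each one gets hard-coded on its own\nfor i, tree in enumerate(forest.estimators_):\n    print("tree #%d has %d nodes" % (i + 1, tree.tree_.node_count))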

\n

XGBoost (Extreme Gradient Boosting)

\n

Extreme Gradient Boosting is "Gradient Boosting on steroids" and has gained much attention from the Machine learning community due to its top results in many data competitions.

\n
    \n
  1. "gradient boosting" refers to the process of chaining a number of trees so that each tree tries to learn from the errors of the previous one (see the toy sketch right after this list)
  2. "extreme" refers to the many software and hardware optimizations that greatly reduce the time it takes to train the model
\n
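
To make the first point concrete, here is a toy regression sketch (plain scikit-learn, just to illustrate the chaining idea, not how XGBoost is implemented internally): each new tree is fit on the residual errors left by the trees before it.

\n
# toy illustration of boosting: each tree fits the residual errors of the ensemble built so far\nimport numpy as np\nfrom sklearn.datasets import load_diabetes\nfrom sklearn.tree import DecisionTreeRegressor\n\nX, y = load_diabetes(return_X_y=True)\nprediction = np.zeros_like(y, dtype=float)\nlearning_rate = 0.1\n\nfor _ in range(50):\n    residuals = y - prediction\n    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)\n    prediction += learning_rate * tree.predict(X)\n\nprint("training MSE:", np.mean((y - prediction) ** 2))
\n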

You can read the original paper about XGBoost here. For a discursive description head to KDNuggets; if you want some more math, refer to this blog post on Medium.

\n

Porting to plain C

\n

If you followed my earlier posts on Gaussian Naive Bayes, SEFR, Relevance Vector Machine and Support Vector Machines, you already know how to port these new classifiers.

\n

If you're new, you will need a couple of things:

\n
    \n
  1. install the micromlgen package with
\n
pip install micromlgen
\n
    \n
  2. (optionally, if you want to use Extreme Gradient Boosting) install the xgboost package with
\n
pip install xgboost
\n
    \n
  3. use the micromlgen.port function to generate your plain C code
\n
from micromlgen import port\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.datasets import load_iris\n\nclf = DecisionTreeClassifier()\nX, y = load_iris(return_X_y=True)\nclf.fit(X, y)\nprint(port(clf))
\n
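
The exact same flow works for the other two classifiers, assuming you installed the packages from the steps above:

\n
# same flow for the other two classifiers\nfrom micromlgen import port\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.datasets import load_iris\nfrom xgboost import XGBClassifier\n\nX, y = load_iris(return_X_y=True)\n\nforest = RandomForestClassifier(n_estimators=10).fit(X, y)\nprint(port(forest))   # paste the output into RandomForest.h\n\nbooster = XGBClassifier(n_estimators=10).fit(X, y)\nprint(port(booster))  # paste the output into XGBoost.h
\n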

You can then copy-paste the C code and import it into your sketch.

\n

Using it in the Arduino sketch

\n

Once you have the classifier code, create a new project named TreeClassifierExample and copy the classifier code into a file named DecisionTree.h (or RandomForest.h or XGBoost.h depending on the model you chose).

\n

Then copy the following to the main .ino file.

\n
#include "DecisionTree.h"\n\nEloquent::ML::Port::DecisionTree clf;\n\nvoid setup() {\n    Serial.begin(115200);\n    Serial.println("Begin");\n}\n\nvoid loop() {\n    float irisSample[4] = {6.2, 2.8, 4.8, 1.8};\n\n    Serial.print("Predicted label (you should see '2': ");\n    Serial.println(clf.predict(irisSample));\n    delay(1000);\n}
\n

Benchmarks

\n

How do the 3 classifiers compare against each other?

\n

We will evaluate a few key points:

    \n
  1. training time
  2. accuracy
  3. needed RAM
  4. needed Flash
\n

for each classifier on a variety of datasets. I will report the RAM and Flash figures for the old-generation Arduino Nano, so you should focus on the relative numbers rather than the absolute ones.
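
\n

If you want to collect similar figures on your own datasets, something along these lines will do; this is only an illustrative sketch, and load_your_dataset() is a placeholder for your own data loading code:

\n
# rough sketch to collect training time and cross-validated accuracy\nfrom time import time\nfrom sklearn.model_selection import cross_val_score\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.ensemble import RandomForestClassifier\nfrom xgboost import XGBClassifier\n\nX, y = load_your_dataset()  # hypothetical loader: replace with your own dataset\n\nfor name, clf in [\n        ("Decision Tree", DecisionTreeClassifier()),\n        ("Random Forest", RandomForestClassifier(n_estimators=10)),\n        ("XGBoost", XGBClassifier(n_estimators=10))]:\n    start = time()\n    scores = cross_val_score(clf, X, y, cv=5)\n    print("%s: accuracy %.3f +/- %.3f, training time %.1f s" % (name, scores.mean(), scores.std(), time() - start))
\n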

| Dataset | Classifier | Training time (s) | Accuracy | RAM (bytes) | Flash (bytes) |
| --- | --- | --- | --- | --- | --- |
| Gas Sensor Array Drift Dataset | Decision Tree | 1.6 | 0.781 ± 0.12 | 290 | 5722 |
| 13910 samples x 128 features | Random Forest | 3 | 0.865 ± 0.083 | 290 | 6438 |
| 6 classes | XGBoost | 18.8 | 0.878 ± 0.074 | 290 | 6506 |
| Gesture Phase Segmentation Dataset | Decision Tree | 0.1 | 0.943 ± 0.005 | 290 | 5638 |
| 10000 samples x 19 features | Random Forest | 0.7 | 0.970 ± 0.004 | 306 | 6466 |
| 5 classes | XGBoost | 18.9 | 0.969 ± 0.003 | 306 | 6536 |
| Drive Diagnosis Dataset | Decision Tree | 0.6 | 0.946 ± 0.005 | 306 | 5850 |
| 10000 samples x 48 features | Random Forest | 2.6 | 0.983 ± 0.003 | 306 | 6526 |
| 11 classes | XGBoost | 68.9 | 0.977 ± 0.005 | 306 | 6698 |
\n

* all datasets are taken from the UCI Machine Learning datasets archive

\n

I'm collecting more data for a complete benchmark, but in the meantime you can see that Random Forest and XGBoost are on par, except that XGBoost takes 5 to 25 times longer to train.

\n

I hadn't used XGBoost before, so I may be missing some tuning parameters, but for now Random Forest remains my favourite classifier.

\n

Code listings

\n
// example IRIS dataset classification with Decision Tree\nint predict(float *x) {\n  if (x[3] <= 0.800000011920929) {\n      return 0;\n  }\n  else {\n      if (x[3] <= 1.75) {\n          if (x[2] <= 4.950000047683716) {\n              if (x[0] <= 5.049999952316284) {\n                  return 1;\n              }\n              else {\n                  return 1;\n              }\n          }\n          else {\n              return 2;\n          }\n      }\n      else {\n          if (x[2] <= 4.950000047683716) {\n              return 2;\n          }\n          else {\n              return 2;\n          }\n      }\n  }\n}
\n
// example IRIS dataset classification with Random Forest of 3 trees\n\nint predict(float *x) {\n  uint16_t votes[3] = { 0 };\n\n  // tree #1\n  if (x[0] <= 5.450000047683716) {\n      if (x[1] <= 2.950000047683716) {\n          votes[1] += 1;\n      }\n      else {\n          votes[0] += 1;\n      }\n  }\n  else {\n      if (x[0] <= 6.049999952316284) {\n          if (x[3] <= 1.699999988079071) {\n              if (x[2] <= 3.549999952316284) {\n                  votes[0] += 1;\n              }\n              else {\n                  votes[1] += 1;\n              }\n          }\n          else {\n              votes[2] += 1;\n          }\n      }\n      else {\n          if (x[3] <= 1.699999988079071) {\n              if (x[3] <= 1.449999988079071) {\n                  if (x[0] <= 6.1499998569488525) {\n                      votes[1] += 1;\n                  }\n                  else {\n                      votes[1] += 1;\n                  }\n              }\n              else {\n                  votes[1] += 1;\n              }\n          }\n          else {\n              votes[2] += 1;\n          }\n      }\n  }\n\n  // tree #2\n  if (x[0] <= 5.549999952316284) {\n      if (x[2] <= 2.449999988079071) {\n          votes[0] += 1;\n      }\n      else {\n          if (x[2] <= 3.950000047683716) {\n              votes[1] += 1;\n          }\n          else {\n              votes[1] += 1;\n          }\n      }\n  }\n  else {\n      if (x[3] <= 1.699999988079071) {\n          if (x[1] <= 2.649999976158142) {\n              if (x[3] <= 1.25) {\n                  votes[1] += 1;\n              }\n              else {\n                  votes[1] += 1;\n              }\n          }\n          else {\n              if (x[2] <= 4.1499998569488525) {\n                  votes[1] += 1;\n              }\n              else {\n                  if (x[0] <= 6.75) {\n                      votes[1] += 1;\n                  }\n                  else {\n                      votes[1] += 1;\n                  }\n              }\n          }\n      }\n      else {\n          if (x[0] <= 6.0) {\n              votes[2] += 1;\n          }\n          else {\n              votes[2] += 1;\n          }\n      }\n  }\n\n  // tree #3\n  if (x[3] <= 1.75) {\n      if (x[2] <= 2.449999988079071) {\n          votes[0] += 1;\n      }\n      else {\n          if (x[2] <= 4.8500001430511475) {\n              if (x[0] <= 5.299999952316284) {\n                  votes[1] += 1;\n              }\n              else {\n                  votes[1] += 1;\n              }\n          }\n          else {\n              votes[1] += 1;\n          }\n      }\n  }\n  else {\n      if (x[0] <= 5.950000047683716) {\n          votes[2] += 1;\n      }\n      else {\n          votes[2] += 1;\n      }\n  }\n\n  // return argmax of votes\n  uint8_t classIdx = 0;\n  float maxVotes = votes[0];\n\n  for (uint8_t i = 1; i < 3; i++) {\n      if (votes[i] > maxVotes) {\n          classIdx = i;\n          maxVotes = votes[i];\n      }\n  }\n\n  return classIdx;\n}
\n

The post Decision Tree, Random Forest and XGBoost on Arduino appeared first on Eloquent Arduino Blog.

\n", "content_text": "You will be surprised by how much accuracy you can achieve in just a few kylobytes of resources: Decision Tree, Random Forest and XGBoost (Extreme Gradient Boosting) are now available on your microcontrollers: highly RAM-optmized implementations for super-fast classification on embedded devices.\n\n\nDecision Tree\nDecision Tree is without doubt one of the most well-known classification algorithms out there. It is so simple to understand that it was probably the first classifier you encountered in any Machine Learning course.\nI won't go into the details of how a Decision Tree classifier trains and selects the splits for the input features: here I will explain how a RAM-efficient porting of such a classifier is implemented.\nTo an introduction visit Wikipedia; for a more in-depth guide visit KDNuggets.\nSince we're willing to sacrifice program space (a.k.a flash) in favor of memory (a.k.a RAM), because RAM is the most scarce resource in the vast majority of microcontrollers, the smart way to port a Decision Tree classifier from Python to C is "hard-coding" the splits in code, without keeping any reference to them into variables.\nHere's what it looks like for a Decision tree that classifies the Iris dataset.\nAs you can see, we're using 0 bytes of RAM to get the classification result, since no variable is being allocated. On the other side, the program space will grow almost linearly with the number of splits.\nSince program space is often much greater than RAM on microcontrollers, this implementation exploits its abundance to be able to deploy larger models. How much large? It will depend on the flash size available: many new generations board (Arduino Nano 33 BLE Sense, ESP32, ST Nucleus...) have 1 Mb of flash, which will hold tens of thousands of splits. \nRandom Forest\nRandom Forest is just many Decision Trees joined together in a voting scheme. The core idea is that of "the wisdom of the corwd", such that if many trees vote for a given class (having being trained on different subsets of the training set), that class is probably the true class.\nTowards Data Science has a more detailed guide on Random Forest and how it balances the trees with thebagging tecnique.\nAs easy as Decision Trees, Random Forest gets the exact same implementation with 0 bytes of RAM required (it actually needs as many bytes as the number of classes to store the votes, but that's really negligible): it just hard-codes all its composing trees.\nXGBoost (Extreme Gradient Boosting)\nExtreme Gradient Boosting is "Gradient Boosting on steroids" and has gained much attention from the Machine learning community due to its top results in many data competitions.\n\n"gradient boosting" refers to the process of chaining a number of trees so that each tree tries to learn from the errors of the previous\n"extreme" refers to many software and hardware optimizations that greatly reduce the time it takes to train the model\n\nYou can read the original paper about XGBoost here. 
For a discursive description head to KDNuggets, if you want some more math refer to this blog post on Medium.\nPorting to plain C\nIf you followed my earlier posts on Gaussian Naive Bayes, SEFR, Relevant Vector Machine and Support Vector Machines, you already know how to port these new classifiers.\nIf you're new, you will need a couple things:\n\ninstall the micromlgen package with \n\npip install micromlgen\n\n(optionally, if you want to use Extreme Gradient Boosting) install the xgboost package with \n\npip install xgboost\n\nuse the micromlgen.port function to generate your plain C code\n\nfrom micromlgen import port\nfrom sklearn.tree import DecisionTreeClassifier\nfrom sklearn.datasets import load_iris\n\nclf = DecisionTreeClassifier()\nX, y = load_iris(return_X_y=True)\nclf.fit(X, y)\nprint(port(clf))\nYou can then copy-past the C code and import it in your sketch.\nUsing in the Arduino sketch\nOnce you have the classifier code, create a new project named TreeClassifierExample and copy the classifier code into a file named DecisionTree.h (or RandomForest.h or XGBoost.h depending on the model you chose).\nThe copy the following to the main ino file.\n#include "DecisionTree.h"\n\nEloquent::ML::Port::DecisionTree clf;\n\nvoid setup() {\n Serial.begin(115200);\n Serial.println("Begin");\n}\n\nvoid loop() {\n float irisSample[4] = {6.2, 2.8, 4.8, 1.8};\n\n Serial.print("Predicted label (you should see '2': ");\n Serial.println(clf.predict(irisSample));\n delay(1000);\n}\nBechmarks\nHow do the 3 classifiers compare against each other?\nWe will evaluate a few keypoints:\n\ntraining time\naccuracy\nneeded RAM\nneeded Flash\n\nfor each classifier on a variety of datasets. I will report the results for RAM and Flash on the Arduino Nano old generation, so you should consider more the relative figures than the absolute ones.\n\n\n\nDataset\nClassifier\nTraining time (s)\nAccuracy\nRAM (bytes)\nFlash (bytes)\n\n\n\n\nGas Sensor Array Drift Dataset \nDecision Tree\n1,6\n0.781 \u00b1 0.12\n290\n5722\n\n\n13910 samples x 128 features\nRandom Forest\n3\n0.865 \u00b1 0.083\n290\n6438\n\n\n6 classes\nXGBoost\n18,8\n0.878 \u00b1 0.074\n290\n6506\n\n\nGesture Phase Segmentation Dataset\nDecision Tree\n0,1\n0.943 \u00b1 0.005\n290\n5638\n\n\n10000 samples x 19 features\nRandom Forest\n0,7\n0.970 \u00b1 0.004\n306\n6466\n\n\n5 classes\nXGBoost\n18,9\n0.969 \u00b1 0.003\n306\n6536\n\n\nDrive Diagnosis Dataset\nDecision Tree\n0,6\n0.946 \u00b1 0.005\n306\n5850\n\n\n10000 samples x 48 features\nRandom Forest\n2,6\n0.983 \u00b1 0.003\n306\n6526\n\n\n11 classes\nXGBoost\n68,9\n0.977 \u00b1 0.005\n306\n6698\n\n\n\n* all datasets are taken from the UCI Machine Learning datasets archive\nI'm collecting more data for a complete benchmark, but in the meantime you can see that both Random Forest and XGBoost are on par: if not that XGBoost takes 5 to 25 times longer to train.\nI've never used XGBoost, so I may be missing some tuning parameters, but for now Random Forest remains my favourite classifier.\nCode listings\n// example IRIS dataset classification with Decision Tree\nint predict(float *x) {\n if (x[3] <= 0.800000011920929) {\n return 0;\n }\n else {\n if (x[3] <= 1.75) {\n if (x[2] <= 4.950000047683716) {\n if (x[0] <= 5.049999952316284) {\n return 1;\n }\n else {\n return 1;\n }\n }\n else {\n return 2;\n }\n }\n else {\n if (x[2] <= 4.950000047683716) {\n return 2;\n }\n else {\n return 2;\n }\n }\n }\n}\n// example IRIS dataset classification with Random Forest of 3 trees\n\nint predict(float *x) {\n 
uint16_t votes[3] = { 0 };\n\n // tree #1\n if (x[0] <= 5.450000047683716) {\n if (x[1] <= 2.950000047683716) {\n votes[1] += 1;\n }\n else {\n votes[0] += 1;\n }\n }\n else {\n if (x[0] <= 6.049999952316284) {\n if (x[3] <= 1.699999988079071) {\n if (x[2] <= 3.549999952316284) {\n votes[0] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n else {\n votes[2] += 1;\n }\n }\n else {\n if (x[3] <= 1.699999988079071) {\n if (x[3] <= 1.449999988079071) {\n if (x[0] <= 6.1499998569488525) {\n votes[1] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n else {\n votes[1] += 1;\n }\n }\n else {\n votes[2] += 1;\n }\n }\n }\n\n // tree #2\n if (x[0] <= 5.549999952316284) {\n if (x[2] <= 2.449999988079071) {\n votes[0] += 1;\n }\n else {\n if (x[2] <= 3.950000047683716) {\n votes[1] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n }\n else {\n if (x[3] <= 1.699999988079071) {\n if (x[1] <= 2.649999976158142) {\n if (x[3] <= 1.25) {\n votes[1] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n else {\n if (x[2] <= 4.1499998569488525) {\n votes[1] += 1;\n }\n else {\n if (x[0] <= 6.75) {\n votes[1] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n }\n }\n else {\n if (x[0] <= 6.0) {\n votes[2] += 1;\n }\n else {\n votes[2] += 1;\n }\n }\n }\n\n // tree #3\n if (x[3] <= 1.75) {\n if (x[2] <= 2.449999988079071) {\n votes[0] += 1;\n }\n else {\n if (x[2] <= 4.8500001430511475) {\n if (x[0] <= 5.299999952316284) {\n votes[1] += 1;\n }\n else {\n votes[1] += 1;\n }\n }\n else {\n votes[1] += 1;\n }\n }\n }\n else {\n if (x[0] <= 5.950000047683716) {\n votes[2] += 1;\n }\n else {\n votes[2] += 1;\n }\n }\n\n // return argmax of votes\n uint8_t classIdx = 0;\n float maxVotes = votes[0];\n\n for (uint8_t i = 1; i < 3; i++) {\n if (votes[i] > maxVotes) {\n classIdx = i;\n maxVotes = votes[i];\n }\n }\n\n return classIdx;\n}\nL'articolo Decision Tree, Random Forest and XGBoost on Arduino proviene da Eloquent Arduino Blog.", "date_published": "2020-10-19T19:31:02+02:00", "date_modified": "2020-12-10T12:26:23+01:00", "authors": [ { "name": "simone", "url": "https://eloquentarduino.github.io/author/simone/", "avatar": "http://1.gravatar.com/avatar/d670eb91ca3b1135f213ffad83cb8de4?s=512&d=mm&r=g" } ], "author": { "name": "simone", "url": "https://eloquentarduino.github.io/author/simone/", "avatar": "http://1.gravatar.com/avatar/d670eb91ca3b1135f213ffad83cb8de4?s=512&d=mm&r=g" }, "tags": [ "microml", "ml", "Arduino Machine learning", "Arduino Machine Learning tutorial" ] } ] }