OpenCV on STM32F7-Discovery


I am one of the developers of the Embox operating system, and in this article I will describe how I managed to run OpenCV on the STM32F746G Discovery board.

If you type something like "OpenCV on STM32 board" into a search engine, you will find quite a few people interested in using this library on STM32 boards or other microcontrollers.
There are several videos that, judging by their titles, should demonstrate exactly that, but in all the videos I saw the STM32 board only captured the image from the camera and displayed the result on the screen, while the image processing itself was done either on an ordinary computer or on a more powerful board (for example, a Raspberry Pi).

Why is it difficult?

The popularity of these search queries is explained by the fact that OpenCV is the most popular computer vision library, which means many developers are familiar with it, and the ability to run desktop-ready code on a microcontroller greatly simplifies development. But why are there still no popular ready-made recipes for solving this problem?

Using OpenCV on small boards is hard for two reasons:

  • Even if you build the library with a minimal set of modules, it simply will not fit into the flash memory of the STM32F7Discovery (even without an OS) because the code is very large (several megabytes of instructions)
  • The library itself is written in C++, which means
    • You need support for the C++ runtime (exceptions, etc.)
    • The LibC/POSIX support usually found in operating systems for embedded systems is not enough: you also need the C++ standard library and the STL (vector, etc.)

Porting to Embox

As usual, before porting any program to an operating system, it is a good idea to try building it the way its developers intended. In our case there are no problems with this: the sources are available on GitHub, and the library builds under GNU/Linux with the usual cmake.

The good news is that OpenCV can be built as a static library out of the box, which makes porting easier. We build the library with the standard config and see how much space it takes. Each module is built as a separate library.

  > size lib/*so --totals
  text data bss dec hex filename
 1945822 15431 960 1962213 1df0e5 lib/
 17081885 170312 25640 17277837 107a38d lib/
 10928229 137640 20192 11086061 a928ed lib/
  842311 25680 1968 869959 d4647 lib/
  423660 8552 184 432396 6990c lib/
 8034733 54872 1416 8091021 7b758d lib/
  90741 3452 304 94497 17121 lib/
 6338414 53152 968 6392534 618ad6 lib/
 21323564 155912 652056 22131532 151b34c lib/
  724323 12176 376 736875 b3e6b lib/
  429036 6864 464 436364 6a88c lib/
 6866973 50176 1064 6918213 699045 lib/
  698531 13640 160 712331 ade8b lib/
  466295 6688 168 473151 7383f lib/
  315858 6972 11576 334406 51a46 lib/
 76510375 721519 717496 77949390 4a569ce (TOTALS)  

As the last line shows, .bss and .data do not take up much space, but the code is more than 70 MiB. Clearly, when the library is linked statically with a specific application, only the object files that the application actually references are pulled in, so the resulting code will be smaller.

Let's try to throw out as many modules as possible while a minimal example (one that, say, simply prints the OpenCV version) still builds. To do that, look at the output of cmake .. -LA and turn off every option that can be turned off.

  -DBUILD_opencv_java_bindings_generator=OFF \
  -DBUILD_opencv_stitching=OFF \
  -DWITH_V4L=OFF \
  <...>

  > size lib/libopencv_core.a --totals
  text data bss dec hex filename
 3317069 36425 17987 3371481 3371d9 (TOTALS)  

On the one hand, this is only one module of the library; on the other hand, it was built without compiler optimization for code size (-Os). About 3 MiB of code is still quite a lot, but it gives hope for success.

Run in the emulator

Debugging is much easier on an emulator, so first let's make sure that the library works on QEMU. As the emulated platform I chose Integrator/CP because, firstly, it is also ARM and, secondly, Embox supports graphics output for this platform.

Embox has a mechanism for building external libraries. Using it, we add OpenCV as a module (passing the same options as for the minimal static-library build), and then add the simplest possible application, which looks like this:


 #include <stdio.h>
 #include <opencv2/core/utility.hpp>

 int main() {
     /* Print the OpenCV build configuration to the console. */
     printf("OpenCV: %s", cv::getBuildInformation().c_str());

     return 0;
 }

We build the system, start it up, and get the expected output.

  root@embox:/# opencv_version
  General configuration for OpenCV 4.0.1 =====================================
  Version control: bd6927bdf-dirty

  Timestamp: 2019-06-21T10:02:18Z
  Host: Linux 5.1.7-arch1-1-ARCH x86_64
  Target: Generic arm-unknown-none
  CMake: 3.14.5
  CMake generator: Unix Makefiles
  CMake build tool: /usr/bin/make
  Configuration: Debug

  CPU/HW features:
  requested: DETECT
  disabled: VFPV3 NEON

  C/C++:
  Built as dynamic libs?: NO
  <The other build parameters follow: the flags used for compilation,
  which OpenCV modules are included in the build, etc.>

The next step is to run some example, preferably one of the standard ones that the developers themselves offer on their site. I chose the Canny edge detector.

The example had to be slightly rewritten to draw the resulting image directly into the framebuffer. This was necessary because the imshow() function draws images through the Qt, GTK and Windows interfaces, which, of course, will not be in the config for STM32. In fact, Qt can also be run on the STM32F7Discovery, but that will be discussed in another article :)

After briefly working out the exact format in which the edge detector stores its result, we get an image.
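To illustrate what "drawing directly into the framebuffer" can look like, here is a minimal sketch. It assumes that the edge map is an 8-bit single-channel image (which is what the Canny detector produces) and that the framebuffer uses 32-bit pixels with the color in the three low bytes. The function name blit_gray_to_xrgb and the pixel layout are assumptions made for illustration; this is not the code from the actual Embox example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copy an 8-bit single-channel image into a 32-bit framebuffer.
// Strides are given in elements (not bytes), so rows with padding work too.
// Assumption: each framebuffer pixel is laid out as 0x00RRGGBB.
void blit_gray_to_xrgb(const std::uint8_t *gray, std::size_t gray_stride,
                       std::uint32_t *fb, std::size_t fb_stride,
                       std::size_t width, std::size_t height) {
    assert(gray != nullptr && fb != nullptr);
    assert(gray_stride >= width && fb_stride >= width);

    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t *src = gray + y * gray_stride;
        std::uint32_t *dst = fb + y * fb_stride;
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint32_t v = src[x];
            // Replicate the gray level into the R, G and B bytes.
            dst[x] = (v << 16) | (v << 8) | v;
        }
    }
}
```

With a cv::Mat holding the edge map, the source pointer, stride, width and height would come from its data, step, cols and rows members; where the framebuffer pointer comes from depends on the graphics driver and is not shown here.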

Original Picture


Run on STM32F7Discovery

The 32F746GDISCOVERY has several hardware memory regions that we can use in one way or another:

  1. 320KiB RAM
  2. 1MiB flash for image
  3. 8MiB SDRAM
  4. 16MiB QSPI NOR flash
  5. MicroSD card slot

An SD card can be used to store images, but for a minimal example this is not very useful.
The display has a resolution of 480x272, so a framebuffer with a depth of 32 bits takes 522,240 bytes. This is more than the size of the RAM, so the framebuffer and the heap (which OpenCV also needs, to store image data and auxiliary structures) are placed in SDRAM, and everything else (memory for stacks and other system needs) goes to RAM.
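The framebuffer arithmetic can be checked with a few lines of code. This is only a sketch using the figures quoted in the text (480x272 pixels, 32 bits per pixel, 320 KiB of RAM); the names framebuffer_bytes, kInternalRamBytes and kFramebufferBytes are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a linear framebuffer without row padding.
constexpr std::size_t framebuffer_bytes(std::size_t width,
                                        std::size_t height,
                                        std::size_t bits_per_pixel) {
    return width * height * (bits_per_pixel / 8);
}

// Figures from the text: 480x272 display, 32-bit depth, 320 KiB of RAM.
constexpr std::size_t kInternalRamBytes = 320 * 1024;
constexpr std::size_t kFramebufferBytes = framebuffer_bytes(480, 272, 32);

static_assert(kFramebufferBytes == 522240,
              "matches the figure quoted in the text");
static_assert(kFramebufferBytes > kInternalRamBytes,
              "the framebuffer does not fit into internal RAM");
```

Both checks are evaluated at compile time, so a wrong resolution or depth would show up as a build error rather than as a corrupted display.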

If you take the minimal config for STM32F7Discovery (throw out the whole network stack, all the commands, make the stacks as small as possible, etc.) and add OpenCV with the examples to it, the memory requirements are as follows:

  text data bss dec hex filename
 2876890 459208 312736 3648834 37ad42 build/base/bin/embox  

For those who are not very familiar with what goes into which section, I'll explain: .text and .rodata hold instructions and constants (roughly speaking, read-only data), .data holds data that can be modified, and .bss holds zero-initialized variables, which nevertheless need space (this section "goes" to RAM).

The good news is that .data/.bss should fit, but .text is a problem: there is only 1 MiB of flash for the image. You could take the picture used by the example out of .text and read it into memory at startup, for example from an SD card, but fruits.png weighs about 330 KiB, so this will not solve the problem: most of .text is OpenCV code.
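The reasoning about .text can be written down as a small budget check. The sketch below only uses numbers quoted in this article (the .text size reported by size, 1 MiB of internal flash, 16 MiB of QSPI flash, roughly 330 KiB for fruits.png); the constant and function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t KiB = 1024;
constexpr std::uint64_t MiB = 1024 * KiB;

// Numbers taken from the text.
constexpr std::uint64_t kTextBytes  = 2876890;    // .text column of `size`
constexpr std::uint64_t kFlashBytes = 1 * MiB;    // internal flash
constexpr std::uint64_t kQspiBytes  = 16 * MiB;   // QSPI flash
constexpr std::uint64_t kImageBytes = 330 * KiB;  // fruits.png, approximate

// True if `size` bytes fit into a region of `capacity` bytes.
constexpr bool fits(std::uint64_t size, std::uint64_t capacity) {
    return size <= capacity;
}

static_assert(!fits(kTextBytes, kFlashBytes),
              ".text does not fit into internal flash");
static_assert(!fits(kTextBytes - kImageBytes, kFlashBytes),
              "moving the picture out of .text is not enough");
static_assert(fits(kTextBytes, kQspiBytes),
              ".text fits into the QSPI flash");
```

Even without the picture, about 2.4 MiB of code and constants remain, which is more than twice the size of the internal flash.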

By and large, only one option is left: putting part of the code into the QSPI flash (it has a special mode that maps its memory onto the system bus, so that the processor can access the data directly). This raises two problems: firstly, the QSPI flash memory is not available immediately after the device reboots (memory-mapped mode has to be initialized separately), and secondly, this memory cannot be "flashed" with the usual bootloader.

As a result, it was decided to link all the code into QSPI and to flash it with a self-written bootloader, which receives the required binary over TFTP.
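For readers unfamiliar with TFTP: a download starts with a read request (RRQ) sent by the client, which here is the bootloader. The sketch below builds such a packet following the layout in RFC 1350 (a 2-byte opcode, the file name, a zero byte, the transfer mode, a zero byte). It only illustrates the protocol and is not taken from the Embox bootloader; make_tftp_rrq is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Build a TFTP read request (RRQ, opcode 1) as laid out in RFC 1350:
//   2-byte opcode | filename | 0 | mode | 0
// "octet" is the mode for binary transfers.
std::vector<std::uint8_t> make_tftp_rrq(const std::string &filename,
                                        const std::string &mode = "octet") {
    std::vector<std::uint8_t> packet;
    packet.push_back(0x00);  // opcode, high byte (network byte order)
    packet.push_back(0x01);  // opcode, low byte: 1 = RRQ
    packet.insert(packet.end(), filename.begin(), filename.end());
    packet.push_back(0x00);  // terminator of the file name
    packet.insert(packet.end(), mode.begin(), mode.end());
    packet.push_back(0x00);  // terminator of the mode string
    return packet;
}
```

The server answers with DATA packets of up to 512 bytes each, every one of which the client acknowledges; a receiver like the bootloader would write each block to the QSPI flash as it arrives.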


The idea of porting this library to Embox appeared about a year ago, but it kept being postponed for various reasons. One of them was support for libstdc++ and the Standard Template Library. C++ support in Embox is beyond the scope of this article, so here I will only say that we managed to bring it to the level this library needs :)

In the end these problems were overcome (at least enough for the OpenCV example to work), and the example ran. It takes the board 40 seconds to find the edges in the picture with the Canny filter. This is, of course, too long (there are some ideas on how to optimize this; if they work out, that could be a separate article).

However, the intermediate goal was to create a prototype showing that running OpenCV on STM32 is possible in principle, and that goal has been achieved, hurray!

tl;dr: step-by-step instructions

0: Download Embox sources, like this:

  git clone && cd ./embox

1: Let's start by building a boot loader that will "flash" a QSPI flash drive.

  make confload-arm/stm32f7cube  

Now you need to configure the network, because we will upload the image via TFTP. To set the IP addresses of the board and the host, edit the conf/rootfs/network file.

Configuration Example:

  iface eth0 inet static
  hwaddress aa:bb:cc:dd:ee:02

gateway is the address of the host from which the image will be downloaded, address is the address of the board.

After that, we build the bootloader:


2: Load the bootloader onto the board in the usual way (sorry for the pun). There is nothing specific here: you do this just as for any other application for STM32F7Discovery. If you don't know how, you can read about it here.
3: Build the image with the config for OpenCV.

  make confload-platform/opencv/stm32f7discovery

4: Extract from the ELF file the sections that need to be written to QSPI into qspi.bin

  arm-none-eabi-objcopy -O binary build/base/bin/embox build/base/bin/qspi.bin \
  --only-section=.text --only-section=.rodata \
  --only-section='.ARM.ex*' \
  --only-section=.data

There is a script in the conf directory that does this, so you can run it

  ./conf/ # Binary needed is build/base/bin/qspi.bin  

5: Using TFTP, load qspi.bin onto the QSPI flash. On the host, copy qspi.bin to the root folder of the TFTP server (usually /srv/tftp/ or /var/lib/tftpboot/; packages for the corresponding server are available in most popular distributions, usually called tftpd or tftp-hpa; sometimes you need to run systemctl start tftpd.service to start it).

  # tftpd option
  sudo cp build/base/bin/qspi.bin /srv/tftp
  # option for tftp-hpa
  sudo cp build/base/bin/qspi.bin /var/lib/tftpboot

In Embox (i.e., in the bootloader), you need to execute the following command (we assume that the server has the address

  embox> qspi_loader qspi.bin

6: Using the goto command, "jump" into QSPI memory. The exact address depends on how the image is linked; you can see it with the command mem 0x90000000 (the start address is stored in the second 32-bit word of the image). You also need to set the stack with the -s flag; the stack address is stored at 0x90000000. Example:

  embox> mem 0x90000000
  0x90000000: 0x20023200 0x9000c27f 0x9000c275 0x9000c275
              ↑          ↑
              stack      start
              address    address

  embox> goto -i 0x9000c27f -s 0x20023200 # The -i flag is needed to disable interrupts during system initialization

  <From here on, the output comes not from the bootloader but from the image with OpenCV>
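What the mem output shows is the beginning of the image: the first 32-bit word is the initial stack pointer and the second is the start address. As a sketch of how a loader could pull these two values out of a raw image and sanity-check them, consider the code below. It assumes little-endian words and the 16 MiB QSPI window at 0x90000000 mentioned in this article; the names BootVector, parse_boot_vector and in_qspi_window are hypothetical and do not come from Embox.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct BootVector {
    std::uint32_t initial_sp;  // word 0: initial stack pointer
    std::uint32_t entry;       // word 1: start address
};

// Read a little-endian 32-bit word from a byte buffer.
inline std::uint32_t read_le32(const std::uint8_t *p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Extract the first two words of an image.
inline BootVector parse_boot_vector(const std::uint8_t *image,
                                    std::size_t size) {
    assert(image != nullptr && size >= 8);
    return BootVector{read_le32(image), read_le32(image + 4)};
}

// True if addr lies in the memory-mapped QSPI window used in the text:
// base 0x90000000, 16 MiB long.
constexpr bool in_qspi_window(std::uint32_t addr) {
    return addr >= 0x90000000u &&
           addr - 0x90000000u < 16u * 1024u * 1024u;
}
```

For the dump above this yields 0x20023200 for the stack and 0x9000c27f for the start address. The start address is odd because on Cortex-M the lowest bit of a code address marks Thumb mode.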

7: Run the example

  embox> edges 20

and enjoy the 40-second edge search :)

If something goes wrong, open an issue in our repository, write to the mailing list, or leave a comment here.
