I recently built and installed OpenCV with CMake, using the option WITH_TBB=ON, on a Raspberry Pi 3.
The code I want to accelerate is basically a circle detection using HoughCircles.
Unfortunately, the CPU usage is the same as it was before TBB was enabled, somewhere below 30%.
Why is that? I assumed HoughCircles is highly parallelizable, according to this paper: http://www.ijcsi.org/papers/IJCSI-9-6-3-481-486.pdf.
EDIT:
Looking at the source code mentioned by matman (see [here](https://github.com/opencv/opencv/blob/master/modules/imgproc/src/hough.cpp)), at line 1058 there are two for-loops which, as far as I can see, just fill up the accumulator. How can they be parallelized with parallel_for_, and would that actually help speed up the detection?
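To make the question concrete, here is a minimal sketch of what I have in mind. It is not the OpenCV code: `EdgePoint`, `voteRange` and `fillAccumulatorParallel` are hypothetical names, the voting is a simplified gradient-based scheme, and `std::thread` stands in for `parallel_for_` so the example needs nothing but the standard library. Each worker votes into its own private accumulator for a slice of the edge points, and the private accumulators are summed at the end, so no two threads ever write to the same cell.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical edge point: pixel position plus unit gradient direction.
struct EdgePoint { int x, y; float gx, gy; };

// Vote for candidate centres along the gradient line of points [begin, end).
static void voteRange(const std::vector<EdgePoint>& pts,
                      std::size_t begin, std::size_t end,
                      int width, int height, int minR, int maxR,
                      std::vector<int>& accum)
{
    for (std::size_t i = begin; i < end; ++i) {
        for (int sign = -1; sign <= 1; sign += 2) {      // both directions
            for (int r = minR; r <= maxR; ++r) {
                const int cx = static_cast<int>(std::lround(pts[i].x + sign * r * pts[i].gx));
                const int cy = static_cast<int>(std::lround(pts[i].y + sign * r * pts[i].gy));
                if (cx < 0 || cx >= width || cy < 0 || cy >= height)
                    continue;                            // vote falls outside the image
                ++accum[static_cast<std::size_t>(cy) * width + cx];
            }
        }
    }
}

// Split the edge points across numThreads workers, each with a private
// accumulator, then merge the partial results into one accumulator.
std::vector<int> fillAccumulatorParallel(const std::vector<EdgePoint>& pts,
                                         int width, int height,
                                         int minR, int maxR,
                                         unsigned numThreads)
{
    numThreads = std::max(1u, numThreads);
    const std::size_t cells = static_cast<std::size_t>(width) * height;
    std::vector<std::vector<int>> partial(numThreads, std::vector<int>(cells, 0));

    const std::size_t chunk = (pts.size() + numThreads - 1) / numThreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(pts.size(), t * chunk);
        const std::size_t end   = std::min(pts.size(), begin + chunk);
        workers.emplace_back([&, begin, end, t] {
            voteRange(pts, begin, end, width, height, minR, maxR, partial[t]);
        });
    }
    for (auto& w : workers)
        w.join();

    // Merge step: this part is sequential and costs O(numThreads * cells).
    std::vector<int> accum(cells, 0);
    for (const auto& p : partial)
        for (std::size_t i = 0; i < cells; ++i)
            accum[i] += p[i];
    return accum;
}
```

With `parallel_for_`, I assume the body of the lambda would become the loop body invoked for each `cv::Range` of edge points. What I cannot judge is whether the extra memory and the merge step would eat up the gain on a small image.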