May 10, 2017

Realtime Facial Recognition Processing in Python on Windows

We’ve started working on a few machine learning projects and have thoroughly enjoyed diving into Python. One area we really wanted to explore was facial recognition, using dlib, a state-of-the-art machine learning library.

On Ubuntu it’s very easy to install, with just a few “apt-get” commands. Installing it on Windows 10 is considerably more complicated.
The very first step involves downloading dlib, packages, C++ compilation tools and Boost libraries, ensuring you have the proper 32- or 64-bit libraries….

No, just kidding. Don’t even think about doing this. Uninstall everything related to Python. Go to Continuum, then download and install Anaconda. Then go back to Anaconda, find and install a package that is already built, and you are ready to go. This will save you around a billion hours, give or take.

After that, it was a matter of downloading the facial recognition package and starting to code our Python trainer and video rendering system.

We found a live stream of President Trump’s inauguration, which provided a great test platform for our trainer. The subjects we were looking for were all facing forward (for the most part) throughout the event. The video was picked solely for research, and our use of the stream neither endorses nor criticizes anything in it. It was the easiest freely available video we could find with subjects that our training system would easily recognize.

We opted to use the “screengrab” method for processing the video, as it allowed us to process any video being played on the computer, rather than just the video input line. (Note: when you display the frame in a window to show our augmented reality view, you have to convert the image with “COLOR_BGR2RGB” again to make it look normal.)
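
To illustrate the screengrab idea and the color note, here is a minimal sketch rather than our actual code. It assumes Pillow (`ImageGrab`), NumPy and OpenCV are installed; the function names, the capture region and the window title are made up for the example. Pillow hands back pixels in RGB order while OpenCV's `imshow` expects BGR, so the red and blue channels are swapped before display (`cv2.COLOR_BGR2RGB` performs that same swap).

```python
import numpy as np


def swap_red_blue(frame):
    """Reverse the channel axis: RGB -> BGR or BGR -> RGB (the same operation)."""
    return np.ascontiguousarray(frame[:, :, ::-1])


def grab_frame(bbox=None):
    """Capture the screen, or the region `bbox`, as an RGB NumPy array."""
    from PIL import ImageGrab  # imported lazily: needs a desktop session
    return np.array(ImageGrab.grab(bbox=bbox).convert("RGB"))


def preview(bbox=(0, 0, 800, 600)):
    """Show the captured region in a window until 'q' is pressed.

    The region and window title are illustrative assumptions.
    """
    import cv2
    while True:
        rgb = grab_frame(bbox)
        # ... face detection and annotation would happen here ...
        cv2.imshow("preview", swap_red_blue(rgb))  # imshow expects BGR
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cv2.destroyAllWindows()
```

Calling `preview()` on a desktop session should show the top-left 800x600 region of the screen; press `q` to quit.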

The system is a bit slow to render, even on our monster workstation, so we’ll need to investigate either using CUDA or tweaking the code a bit.
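
We have not settled on a fix yet. As one example of what tweaking the code could mean, the sketch below runs the expensive detection step only on every Nth frame and reuses the last result in between, a common trick that does not depend on any particular library. The `detect` callable, the `every_n` parameter and the `(top, right, bottom, left)` box layout are assumptions for illustration.

```python
def scale_box(box, factor):
    """Map a (top, right, bottom, left) box found on a shrunken frame
    back to full-size coordinates. The box layout is an assumption."""
    return tuple(int(round(value * factor)) for value in box)


def throttled(frames, detect, every_n=3):
    """Yield (frame, detections), calling `detect` only on every
    `every_n`-th frame and reusing the previous result otherwise."""
    last = []
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            last = detect(frame)  # the slow step
        yield frame, last
```

If detection also runs on a frame shrunk to half size, `scale_box(box, 2)` maps each box back onto the original frame before drawing.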

Our next project will involve turning our system into a web service that analyzes a feed and returns results to a command-and-control (C&C) drone system. This would allow us to have all the video recognition processing done on a very fast machine, while the drone’s C&C computer only has to make basic decisions based on what is found in the feed. (Insert “find Sarah Connor” joke here.)
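
As a rough sketch of that split, and only under assumptions since this service does not exist yet: the fast machine exposes one HTTP endpoint that accepts an image and replies with a small JSON summary, so the drone side only has to parse a few fields. It uses Python's standard `http.server`; `recognize`, `build_response`, the JSON fields and the port are hypothetical placeholders.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def recognize(image_bytes):
    """Hypothetical placeholder for the recognition step.
    A real version would return the names found in the image."""
    return []


def build_response(names):
    """Reduce recognition results to the small payload the C&C side needs."""
    return {"count": len(names), "names": sorted(names)}


class AnalyzeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the posted image, run recognition, reply with JSON.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        payload = build_response(recognize(image_bytes))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Run the service until interrupted (host and port are assumptions)."""
    HTTPServer((host, port), AnalyzeHandler).serve_forever()
```

The C&C computer would then POST a frame and branch on `count` or `names`, leaving all the heavy lifting to the server.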

Note: a great resource for video processing/learning in Python is at ( Big shout-out to Harrison Kinsley for his excellent work on GTAV self-driving. It’s a really fun series that we all watch!