Author(s): Jan Werth
Originally published on Towards AI. Photo by Victor Freitas on Unsplash

Table of Contents
Introduction
Prerequisite
Eye Detection via Cascaders
– Cascaders
– Scalefactor and Min_Neighbour
Find Matching Eye Pairs
– Matching eyes step by step with code
– Less than three eyes
– Draw facebox from eyebox
Conclusions

Update
Dear reader, I wrote this article at the beginning of the COVID-19 pandemic, in 2022. Unfortunately, I used a licensed image for my analysis, which prevented me from publishing the article. After years, I finally found the owner of this particular image and thankfully gained their permission to use it for this article. Thanks a lot to NDTV for allowing me to use their copyrighted image for this analysis. NDTV provides the latest news from India and around the world. The information in this article might not be as fresh as it was back in 2022; however, it still explains how to approach a data science problem and guides you through it step by step.

Introduction
Lately, we created a face-mask detection algorithm to run on our embedded hardware, the phyBOARD Pollux. There are many ways to create a face-mask detection algorithm, such as:
– using the TensorFlow object detection API trained with and without images of masked faces, or
– training a cascade classifier with and without images of masked faces, or
– combining a facial detection model with a subsequent mask detection model.

There are cases where it is not advisable or possible to use a deep neural network, and a more lightweight solution such as the OpenCV cascaders is more suitable. One such case is the implementation on an embedded system with limited computational capabilities. For demonstration purposes, we chose the latter method: detecting the face with an OpenCV cascade classifier and following up with a trained MobileNet mask classifier (not described here).

What we noticed was that the classic Haar cascade classifiers (frontal_face, frontal_face_alt, etc.) did not work properly when the person wears a mask. Now, there would be the option to train a model like MTCNN or a haarcascader to detect faces with masks. However, we decided to use an eye detection haarcascader and optimize it to create the face bounding boxes for further detection.

Prerequisite
The entire code can be found here: JanderHungrige/Face_via_eye_detection on github.com.

You will need:
– Python 3.x
– opencv-python
– matplotlib
– numpy

I recommend Anaconda or Python 3's virtualenv to create a virtual environment. The libraries can then be installed via pip; the exact command is included as a comment in the setup sketch further below.

Eye Detection via Cascaders
Look at me [edited by author, with permission NDTV, all rights reserved by NDTV]

Cascaders
The OpenCV library has several well-working cascade classifiers, which are widely used in, e.g., facial recognition tasks. The main cascaders can be found in the opencv/opencv repository (Open Source Computer Vision Library) on github.com.

As mentioned in the introduction, we want to detect faces. However, if a person wears a mask, the classic frontal_face cascader and its derivatives fail. Therefore, we use haarcascade_eye.xml in this post and adapt it to give us facial bounding boxes. You can also use others, such as haarcascade_eye_tree_eyeglasses.xml or haarcascade_lefteye_2splits.xml. As the name suggests, this cascader is optimized to detect eyes.

First, we import our libraries and load the cascader.
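A minimal setup sketch, assuming the cascade XML files bundled with opencv-python (exposed via cv2.data.haarcascades) and a placeholder image file name:

```python
# pip install opencv-python matplotlib numpy

import cv2
import matplotlib.pyplot as plt
import numpy as np  # listed in the prerequisites; not strictly needed in this snippet

# Load the eye cascade shipped with opencv-python;
# cv2.data.haarcascades points to the folder holding the bundled XML files.
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

# Load the test image and convert it to grayscale
# (the cascade works on single-channel images).
img = cv2.imread("masked_faces.jpg")  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
```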
Scalefactor and Min_Neighbour
There are two parameters for the cascader: the scalefactor and the min_neighbour.

scalefactor: The haarcascader slides a window over the image to find the Haar features (e.g., of a face) it was trained on. The window size is fixed during training. To allow finding Haar features of different sizes (e.g., a large face in the foreground), the image is scaled down after each iteration, presenting a different cutout to the sliding window (an image pyramid). The scalefactor defines the percentage by which the image is scaled down between iterations. The downscaling continues until the rescaled image hits the model's input dimensions (x or y). For more details, check out the Viola-Jones algorithm.

min_neighbour: During the sliding-window analysis, the algorithm finds many false positives. To reduce them, the min_neighbour parameter determines how many times an object needs to be detected across the scaling iterations to count as a true positive. So, setting it to, e.g., 2 means that, over all scaling iterations, a face has to be detected at least twice in the same region to count as a detection of a face.

Influence of the parameters scalefactor and min_neighbour on the detection [edited by author, with permission NDTV, all rights reserved by NDTV]

In the image (you probably have to zoom in a bit), you can see that an increase in the min_neighbours variable leads to fewer “false eye” detections. With an increase in the scalefactor, fewer non-eye objects are detected as an eye. The idea is to find the sweet spot between both parameters. For our task, scalefactor=1.1 and min_neighbour=5 seem to work best.

Find Matching Eye Pairs
The most difficult part of retrieving the correct bounding boxes of the face from the eyes is to correctly match the eye pairs. Therefore, we first want to see in which order the eyes are detected. Let us mark the eyes with the index at which they are detected.
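A sketch of this labeling step, reusing eye_cascade, img, and gray from the setup sketch above; the box color, font, and matplotlib display are illustrative choices:

```python
# Detect the eyes with the sweet-spot parameters discussed above.
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Draw every detection and annotate it with the index at which it was found.
annotated = img.copy()
for i, (x, y, w, h) in enumerate(eyes):
    cv2.rectangle(annotated, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(annotated, str(i), (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

# matplotlib expects RGB, while OpenCV stores images as BGR.
plt.imshow(cv2.cvtColor(annotated, cv2.COLOR_BGR2RGB))
plt.axis("off")
plt.show()
```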
This results in marked and labeled eyes:

Labeled eyes [edited by author, with permission NDTV, all rights reserved by NDTV]

You can see that the cascader first moves from left to right before changing the height. This can result in the index sequence not matching the eye pairs, as seen below.

Cascader movement [edited by author, with permission NDTV, all rights reserved by NDTV]

So, the first idea was to look for similarities in the y-axis, proximity in the x-axis, and similar box sizes. However, depending on how a person's face is shown in an image, the consideration of only one value would lead to mistakes. Below, we see that, e.g., with a tilted head, the y-value […]
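As an illustration of that first idea only (the article's step-by-step matching follows later), the three criteria could be combined into a single score per candidate pair; the normalization, weights, and threshold below are assumptions made for this sketch:

```python
import itertools

def pair_score(eye_a, eye_b):
    """Lower is better: similar y position, close in x, and similar box size."""
    xa, ya, wa, ha = eye_a
    xb, yb, wb, hb = eye_b
    y_diff = abs(ya - yb) / max(ha, hb)       # similarity on the y-axis
    x_dist = abs(xa - xb) / max(wa, wb)       # proximity on the x-axis (in eye widths)
    size_diff = abs(wa - wb) / max(wa, wb)    # similarity of the box size
    return y_diff + 0.25 * x_dist + size_diff

# Score every possible pair and greedily keep the best, non-overlapping ones.
candidates = sorted(
    (pair_score(eyes[i], eyes[j]), i, j)
    for i, j in itertools.combinations(range(len(eyes)), 2)
)
used, pairs = set(), []
for score, i, j in candidates:
    if i not in used and j not in used and score < 1.5:  # threshold is an assumption
        pairs.append((i, j))
        used.update((i, j))

print(pairs)  # indices of the detections matched as eye pairs
```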