For many of us, facial recognition in digital images may seem like one of Facebook and Google’s recent parlor tricks to make it easier for you to “tag” your friends in vacation photos. But if you do any work in privacy law, ethics, etc., the spread of facial recognition technology may open up some interesting policy implications and research opportunities. Here, we dig a little deeper into how facial recognition technologies work “under the hood”.
Facebook, your iPhone, Google…they all seem to know where the faces are. You snap a picture and upload it to Facebook? It instantly recognizes a face and tells you to tag your friends. And when it does that, it’s actually just being polite; it already has a pretty good sense of which friends it’s recognized–it’s just looking to you to confirm.
If you’re like me, you’ve probably had some combination of reactions, ranging from “Awesome!” to “Well, that’s kinda creepy…” to “How the heck does it do that?” And if you do any work in privacy law, ethics, etc., the spread of facial recognition technology may be more than a mere parlor trick to you. It has major policy implications, and will likely open up a lot of interesting research opportunities.
But how the heck does it work? Well, we dug into this a bit recently to find out…
Fortunately, there happens to be a very nice open source library called OpenCV that we can use to explore some of the various facial recognition algorithms that are floating around out there. OpenCV can get pretty labyrinthine pretty fast, so you may also want to dig into a few wonderful tutorials (see “Resources” below) that are emerging on the subject.
We explored an algorithm called Eigenfaces, along with a nifty little method called Haar Cascades, to get a sense of how algorithms can be trained to recognize faces in a digital image and match them to unique individuals. These are just a few algorithms among many, but the exploration should give you a nice idea of the kinds of problems that need to be tackled in order to effectively recognize a face.
But first, let’s jump to the punchline! When it’s all said and done, here’s what it does:
And here’s how it does it, in both layman’s terms and in code snippets:
First, create two sets of images. The first will be a set of “negative” images of human faces. These are images of generic people, but not those that we want our algorithm to be able to recognize. (Note: Thanks to Samaria & Harter–see “Resources” below–for making these images available to researchers and geeks to use when experimenting with facial recognition!)
The second is a set of “positive” images of the faces that we want our algorithm to be able to recognize. We’ll need about 10-15 snapshots of each person we want to be able to recognize, and we can easily capture these using our computer’s webcam. And for simplicity’s sake, we’ll make sure all of these images are the same size and are nicely zoomed into the center of each face, so we don’t have to take size variation or image quality into account for now.
Then, feed all of these images into the OpenCV Eigenfaces algorithm:
    import os

    # config, walk_files and prepare_image are this project's own helpers
    faces, labels = [], []
    pos_count, neg_count = 0, 0

    # Read all positive images; each user's label is their index in USER_LIST
    USER_LIST = os.listdir(config.POSITIVE_DIR)
    for user in USER_LIST:
        for filename in walk_files(config.POSITIVE_DIR + "/" + user, '*.pgm'):
            faces.append(prepare_image(filename))
            labels.append(USER_LIST.index(user))
            pos_count += 1
    # Read all negative images
    for filename in walk_files(config.NEGATIVE_DIR, '*.pgm'):
        faces.append(prepare_image(filename))
        labels.append(config.NEGATIVE_LABEL)
        neg_count += 1
    print('Read {} positive images and {} negative images.'.format(pos_count, neg_count))
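The loading step leans on a file-walking helper that isn't shown in this post. As a rough sketch of what such a helper might look like, here is one way to collect image paths and labels using only the standard library. The names `load_labeled_paths` and the default `negative_label=-1` are assumptions made for illustration, not part of the original project.

```python
import fnmatch
import os


def walk_files(directory, pattern="*"):
    """Yield paths of files under `directory` whose names match `pattern`."""
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(root, name)


def load_labeled_paths(positive_dir, negative_dir, negative_label=-1):
    """Return (paths, labels, users) with one integer label per image path.

    Hypothetical helper: each sub-folder of positive_dir is one user, and
    that user's label is the folder's position in the sorted user list.
    """
    users = sorted(
        d for d in os.listdir(positive_dir)
        if os.path.isdir(os.path.join(positive_dir, d))
    )
    paths, labels = [], []
    for index, user in enumerate(users):
        for path in walk_files(os.path.join(positive_dir, user), "*.pgm"):
            paths.append(path)
            labels.append(index)
    for path in walk_files(negative_dir, "*.pgm"):
        paths.append(path)
        labels.append(negative_label)
    return paths, labels, users
```

Sorting the user folders keeps the label-to-name mapping stable between runs, which matters if you train once and recognize later.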
Next, “train” the Eigenfaces algorithm to recognize whose faces are whose. It does this by mathematically figuring out all the ways the negative and positive faces are similar to each other and essentially ignoring this information as “fluff” that’s not particularly useful to identifying individuals. Then, it focuses on all of the ways the faces are different from each other, and uses these unique variations as key information to predict whose face is whose. So, for example, if your friend has a unibrow and a mole on their chin and you don’t, the Eigenfaces algorithm would latch onto these as meaningful ways of identifying your friend. The exact statistics of this are slightly over my head, but for those of you who are into that kind of thing, principal component analysis is the “special statistical sauce” that powers this process.
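    import cv2
    import numpy as np

    # Train model
    print('Training model...')
    # Newer OpenCV releases name this cv2.face.EigenFaceRecognizer_create()
    model = cv2.face.createEigenFaceRecognizer()
    model.train(np.asarray(faces), np.asarray(labels))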
When training the model, we can also examine an interesting byproduct–the “mean Eigenface”. This is essentially an abstraction of what it means to have an entirely “average face”, according to our model:
Kind of bizarre, huh?
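For the curious, the core idea can be sketched in a few lines of NumPy. This is a toy illustration of principal component analysis plus a nearest-match lookup, not OpenCV's actual implementation, and the function names and the tiny 16-pixel "faces" are made up for the example. It computes the mean face, strips it out, keeps the strongest directions of variation (the "eigenfaces"), and then matches a new face to the closest training face in that reduced space.

```python
import numpy as np


def train_eigenfaces(faces, num_components):
    """Learn a mean face and the top principal components ("eigenfaces").

    faces: array-like of shape (n_images, n_pixels), one flattened image per row.
    """
    faces = np.asarray(faces, dtype=np.float64)
    mean_face = faces.mean(axis=0)
    # Remove what all faces share, leaving only how each one differs.
    centered = faces - mean_face
    # Rows of vt are orthonormal directions, strongest variation first.
    _u, _s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:num_components]
    # Each training face becomes a short vector of weights.
    weights = np.dot(centered, eigenfaces.T)
    return mean_face, eigenfaces, weights


def predict(face, mean_face, eigenfaces, weights, labels):
    """Return (label, distance) of the closest training face in eigenface space."""
    centered = np.asarray(face, dtype=np.float64) - mean_face
    projected = np.dot(centered, eigenfaces.T)
    distances = np.linalg.norm(weights - projected, axis=1)
    best = int(np.argmin(distances))
    return labels[best], float(distances[best])


# Toy data: two "people", three noisy 16-pixel snapshots each.
rng = np.random.RandomState(0)
person_a = np.zeros(16)
person_a[:8] = 1.0
person_b = np.zeros(16)
person_b[8:] = 1.0
training = [person_a + 0.01 * rng.randn(16) for _ in range(3)]
training += [person_b + 0.01 * rng.randn(16) for _ in range(3)]
training_labels = [0, 0, 0, 1, 1, 1]

mean_face, eigenfaces, weights = train_eigenfaces(training, num_components=2)
label, distance = predict(person_b, mean_face, eigenfaces, weights, training_labels)
print(label, distance)
```

Note that the second value is a distance, so smaller numbers mean a closer match.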
Now, the real test: we need to be able to recognize these faces from a webcam feed. And unlike our training images, our faces may not be well-centered in our video feed. We may have people moving around or off kilter, so how do we deal with this? Enter…the Haar Cascade!
The Haar Cascade will scan through our webcam feed and look for “face-like” objects. It does this by taking a “face-like” geometric template, and scanning it across each frame in our video feed very, very quickly. It examines the edges of the various shapes in our images to see if they match this very basic template. It even stretches and shrinks the template between scans, so it can detect faces of different sizes, just in case our face happens to be very close up or very far away. Note that the Haar Cascade isn’t looking for specific individuals’ faces–it’s just looking for “face-like” geometric patterns, which makes it relatively efficient to run:
    haar_faces = cv2.CascadeClassifier(config.HAAR_FACES)

    def detect_single(image):
        """Return bounds (x, y, width, height) of detected face in grayscale image.
        If no face or more than one face are detected, None is returned.
        """
        faces = haar_faces.detectMultiScale(image,
                    scaleFactor=config.HAAR_SCALE_FACTOR,
                    minNeighbors=config.HAAR_MIN_NEIGHBORS,
                    minSize=config.HAAR_MIN_SIZE,
                    flags=cv2.CASCADE_SCALE_IMAGE)
        if len(faces) != 1:
            return None
        # detectMultiScale returns a list of boxes; hand back the single one
        return faces[0]
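To get a feel for what a single "template" check involves, here is a small NumPy sketch of one Haar-like feature scanned across an image at more than one window size. It is a simplification under stated assumptions: a real cascade combines many such features, learned from training data, and rejects unpromising windows in stages. The function names and the 8x8 test image are invented for the example.

```python
import numpy as np


def integral_image(gray):
    """Summed-area table with a zero row and column prepended."""
    gray = np.asarray(gray, dtype=np.int64)
    table = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    table[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle at (x, y), in constant time."""
    return table[y + h, x + w] - table[y, x + w] - table[y + h, x] + table[y, x]


def two_rect_feature(table, x, y, w, h):
    """Haar-like feature: bottom half of the window minus the top half.

    A large positive value means a dark band sitting above a bright band.
    """
    half = h // 2
    top = rect_sum(table, x, y, w, half)
    bottom = rect_sum(table, x, y + half, w, half)
    return bottom - top


def scan(gray, sizes, step=1):
    """Slide square windows of several sizes; return the best (score, x, y, size)."""
    table = integral_image(gray)
    height, width = np.asarray(gray).shape
    best = None
    for size in sizes:
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                # Divide by the window area so different sizes are comparable.
                score = two_rect_feature(table, x, y, size, size) / float(size * size)
                if best is None or score > best[0]:
                    best = (score, x, y, size)
    return best


# Toy image: mid-gray background with a dark band directly above a bright band.
image = np.full((8, 8), 128, dtype=np.uint8)
image[2:4, 2:6] = 0
image[4:6, 2:6] = 255
best = scan(image, sizes=(4, 6))
print(best)
```

The integral image is what keeps this cheap: once the table is built, the sum of any rectangle costs four lookups, no matter how large the rectangle is.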
Once the Haar Cascade has identified a “face-like” thing in the video feed, it crops off that portion of the video frame and passes it back to the Eigenfaces algorithm. The Eigenfaces algorithm then churns this image back through its classifier. If the image matches the unique set of statistically identifying characteristics of one of the users we trained it to recognize, it will spit out their name. If it doesn’t recognize the face as someone from the group of users it was trained to recognize, it will tell us that, too!
    # Test face against model.
    # predict() reports a distance as "confidence": lower means a closer match.
    label, confidence = model.predict(crop)
    if label >= 0 and confidence < config.POSITIVE_THRESHOLD:
        print('Recognized ' + USER_LIST[label])
    else:
        print('Did not recognize face!')
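The decision rule is easy to get backwards, because the recognizer's "confidence" is really a distance: a smaller number is a better match. Here is a minimal, standalone sketch of that rule. The function name, the example threshold of 2000.0, and the use of -1 as the label for negative images are all assumptions for illustration; the right threshold depends on your own images.

```python
def describe_match(label, distance, users, threshold, negative_label=-1):
    """Turn a (label, distance) prediction into a human-readable message.

    A face counts as recognized only if the label points at a known user
    AND the distance is below the threshold (lower distance = closer match).
    """
    is_known_user = label != negative_label and 0 <= label < len(users)
    if is_known_user and distance < threshold:
        return "Recognized " + users[label]
    return "Did not recognize face!"


users = ["alice", "bob"]
print(describe_match(0, 1500.0, users, threshold=2000.0))
print(describe_match(0, 3500.0, users, threshold=2000.0))
print(describe_match(-1, 900.0, users, threshold=2000.0))
```

Raising the threshold makes the system more willing to call a match, at the cost of more false positives.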
Interested in exploring this further with a class or as part of a research project? Get in touch and we’re happy to help you on your way!
Resources
- Sobel, B. (11 June 2015). “Facial recognition technology is everywhere. It may not be legal.” The Washington Post. https://www.washingtonpost.com/news/the-switch/wp/2015/06/11/facial-recognition-technology-is-everywhere-it-may-not-be-legal/
- Meyer, R. (24 July 2014). “Anti-Surveillance Camouflage for Your Face”. The Atlantic. http://www.theatlantic.com/technology/archive/2014/07/makeup/374929/
- “How does Facebook suggest tags?” Facebook Help Center. https://www.facebook.com/help/122175507864081
- “OpenCV Tutorials, Resources, and Guides”. PyImageSearch. http://www.pyimagesearch.com/opencv-tutorials-resources-guides/
- “Face Recognition with OpenCV: Eigenfaces”. OpenCV Docs 2.4. http://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#eigenfaces
- “Face Detection Using Haar Cascades”. OpenCV Docs 3.1.0. http://docs.opencv.org/3.1.0/d7/d8b/tutorial_py_face_detection.html
- “Raspberry Pi Face Recognition Treasure Box”. Adafruit tutorials. https://learn.adafruit.com/raspberry-pi-face-recognition-treasure-box/overview
- F. Samaria & A. Harter. “Parameterisation of a stochastic model for human face identification” 2nd IEEE Workshop on Applications of Computer Vision December 1994, Sarasota (Florida).