The History of Facial Recognition Technologies: How Image Recognition Got So Advanced
Part I of the series: Should Facial Recognition Technologies be Banned? We take a look at the history of facial recognition, from nascency to present-day capabilities.
This article is the first in a three-part series on a currently hot-button topic: the calls for greater regulation of facial recognition software, and the resulting public conversation and government scrutiny.
Before we discuss the technology, the calls for regulation, and the resulting bans (which will be covered in the next few articles), we first need to take a step back and look at the history of how the technology came to be. That history gives context to how facial recognition works today, and to how it is being used in technological applications.
The birth of a facial recognition methodology: using numerical measures of facial features
Facial recognition has seen many iterations, with roots in the 1960s, when it was implemented manually by Woodrow Wilson Bledsoe. Bledsoe is widely considered the father of facial recognition for developing a system that classified photos of faces using a RAND tablet, a graphical computer input device. With this device, Bledsoe manually recorded the coordinate locations of facial features such as a person's mouth, nose, eyes, and even hairline.
Equipped with this manual log of faces, the system could plot a new photograph against the database and identify the individual with the closest numerical resemblance. While this served as a foundation, and as proof that facial recognition was a viable biometric, it was severely hindered by the technology of the period: the available processing power was inadequate for scaling and refining the technique.
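The core idea of matching by "closest numerical resemblance" can be sketched as a nearest-neighbor search over landmark coordinates. The database entries, landmark positions, and function names below are all illustrative inventions, not Bledsoe's actual data or method:

```python
import math

# Hypothetical database: each person maps to manually recorded (x, y)
# coordinates of facial landmarks (eyes, nose, mouth), in the spirit of
# Bledsoe's RAND-tablet measurements.
database = {
    "alice": [(30, 40), (70, 40), (50, 60), (50, 80)],
    "bob":   [(28, 45), (72, 45), (50, 65), (50, 85)],
}

def total_distance(a, b):
    """Sum of Euclidean distances between corresponding landmarks."""
    return sum(math.dist(p, q) for p, q in zip(a, b))

def closest_match(landmarks):
    """Return the identity whose stored landmarks are numerically closest."""
    return min(database, key=lambda name: total_distance(database[name], landmarks))

print(closest_match([(29, 44), (71, 44), (50, 64), (50, 84)]))  # prints "bob"
```

Even this toy version hints at why the approach struggled to scale: every query must be compared against every stored face, and the landmarks had to be recorded by hand.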
Facial recognition was incrementally refined throughout the 1970s by the likes of Goldstein, Harmon, and Lesk, but largely remained a manually computed process.
The leap from manual computation for facial recognition to a computer-assisted approach using Eigenfaces
It wasn’t until the late ’80s and early ’90s that significant developments would be made in the field, in the form of an application of linear algebra. In what became known as the Eigenface approach, Sirovich and Kirby worked with low-dimensional representations of facial images. They demonstrated that feature analysis over a set of face images could produce a set of basis features, and established that fewer than one hundred values were needed to accurately code a normalized image of a face.
The Eigenface method is still used today as a basis for many deep learning algorithms, and it paved the way for modern facial recognition solutions.
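The mathematical machinery behind Eigenfaces is principal component analysis: subtract the mean face, find the directions of greatest variation, and describe each face with a handful of projection coefficients. Here is a minimal sketch using random data as a stand-in for real face images (the dataset, dimensions, and choice of k are all illustrative):

```python
import numpy as np

# Toy stand-in for a face dataset: 6 "images" of 64 pixels each, flattened.
# In practice these would be normalized grayscale face photographs.
rng = np.random.default_rng(0)
faces = rng.normal(size=(6, 64))

# 1. Subtract the mean face so components capture variation, not the average.
mean_face = faces.mean(axis=0)
centered = faces - mean_face

# 2. SVD of the centered data: rows of Vt are the "eigenfaces"
#    (principal components of pixel space).
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
k = 3                       # keep only the top-k eigenfaces
eigenfaces = Vt[:k]

# 3. Each face is now coded by just k numbers: its projection coefficients.
coeffs = centered @ eigenfaces.T        # shape (6, 3)

# 4. A face can be approximately reconstructed from those k coefficients.
reconstruction = mean_face + coeffs[0] @ eigenfaces
print(coeffs.shape)  # (6, 3)
```

This is the sense in which "fewer than one hundred values" can code a face: recognition then reduces to comparing short coefficient vectors instead of full images.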
The modern-day game-changers spurred on by the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
ImageNet is essentially a democratized image dataset for machine learning research. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a yearly competition that evaluates how accurately algorithms can classify the images in its repository.
In the early 2010s, a good classification error rate was around 25%. In 2012, AlexNet, a deep convolutional neural network (CNN), bested that result with an error rate of 15.3%, winning that year's ILSVRC. This was a game-changer: it was the first time such results had been achieved, beating the next-best entry that year by more than 10 percentage points.
Subsequent image processing solutions improved on AlexNet's results. In 2013, ZFNet, also a CNN, achieved an error rate of 14.8%. In 2014, GoogLeNet/Inception achieved an error rate of 6.67%. In 2015, ResNet brought the error rate down further, to 3.6%.
With this, machines could theoretically detect and classify images as well as, or better than, human beings, albeit within a fixed image database and without the ability to contextualize what they see.
Computer processing of images has become progressively more powerful, in no small part thanks to AlexNet. Today, machines can, in this narrow sense, identify images with a higher degree of accuracy than a human can.
The strides made in recent history can be attributed to a changing approach in image processing. Researchers gradually moved away from hand-coded techniques and presets toward deep neural networks and machine learning. This shift brought image processing, identification, and classification to the unparalleled levels of accuracy we have today, accuracy we can now apply to facial recognition.
The role of Moore’s Law in image processing
In the year AlexNet won the ILSVRC, it used two Nvidia GeForce GTX 580 GPUs to achieve those results. While the approaches to processing images keep getting refined, the improving efficiency and power of graphics processing units (GPUs) also increase the efficacy of how images are processed, and by extension, how faces are detected.
Moore's law, in its revised 1975 form, has correctly predicted that the number of transistors we can cram into the same amount of space doubles roughly every two years. Each doubling means roughly twice the raw computing power of the generation before, which follows an exponential curve.
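The exponential nature of that curve is easy to underestimate, so here is a quick back-of-the-envelope sketch (the starting transistor count is illustrative, not a real chip's figure):

```python
# Moore's-law growth: transistor count doubling every `doubling_period` years.
def transistors(start_count, start_year, year, doubling_period=2):
    return start_count * 2 ** ((year - start_year) / doubling_period)

# Twenty years of doubling every two years is 2**10, i.e. a 1024x increase.
print(transistors(1_000_000, 2000, 2020) / 1_000_000)  # 1024.0
```

That compounding is why hardware that was state-of-the-art when AlexNet trained looks modest only a few years later.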
Advanced facial recognition technology is spurred on by breakthroughs in image processing
It turns out that the advanced image recognition capabilities we have today, where machines can identify images and faces better than humans, essentially boil down to two key factors:
- The exponential increase of computing resources at the same cost.
- Incremental strides from research labs with how we process images.
To summarize the points above, the following explainer video gives an overview of how rapidly facial recognition is evolving:
Now that you know what powers facial recognition and how we got there, look out for our next article, where we go through exactly what facial recognition technology is capable of, and the many applications of such a solution.