In recent years, facial recognition applications have experienced rapid growth. Beyond the everyday use of unlocking smartphones, facial recognition is also commonly applied in office and factory access control, electronic locks in buildings and apartments, and even in areas like retail and financial customer experience optimization. In this article, we take an in-depth look at the principles of facial recognition, how it is optimized for various application scenarios and edge devices, the underlying technical details, and a range of promising applications.
Facial recognition (or face recognition) is a biometric technology for identity verification that uses computer vision to convert facial images into a set of facial feature values. These feature values are then compared with those stored in a database; if the similarity between the two sets exceeds a predefined threshold, the system identifies them as the same person. There are two primary applications: 1:1 verification (confirming that a person matches the ID they hold) and 1:N identification (recognizing a target individual within a database). In recent years, new-generation facial recognition systems have adopted deep neural network (DNN) technology, significantly improving recognition accuracy and adaptability to facial changes such as occlusion. This advancement has led to widespread adoption of facial recognition across numerous applications.
Facial recognition has numerous advantages, with the five most critical being:
The recognition accuracy of next-generation facial recognition engines has surpassed that of human eyes. For instance, FaceMe® has a false recognition rate of less than one in a million.
It can complete facial identity verification within 0.5 seconds, making it extremely convenient.
Compared to other authentication methods (such as fingerprint scanning), facial recognition is more convenient and hygienic due to its zero-contact nature.
By integrating with network surveillance cameras and access control systems, it can strictly regulate personnel access, protecting company assets and personnel safety. It also strengthens system login procedures, helping prevent attacks by hackers.
Facial recognition is flexible and can be applied in many domains, such as security, finance, retail, offices, healthcare, and the public sector.
Face detection is the first step: the engine confirms the presence of faces in a live camera feed, a video recording, or still image captures. The whole field of view is scanned for any area containing a full or even partial human face. Fast and precise face detection is critical to the performance of the entire facial recognition process. FaceMe® can detect more than one face simultaneously, count how many faces are present, and process each of them individually.
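FaceMe®'s detector is proprietary; the minimal sketch below uses OpenCV's bundled Haar cascade only to illustrate this step of scanning a frame and returning a bounding box for each face found (frame.jpg is a placeholder image path).

```python
import cv2

# Load OpenCV's bundled frontal-face Haar cascade (illustrative, not the FaceMe engine).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

frame = cv2.imread("frame.jpg")                    # a still image or a decoded video frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)     # detection runs on grayscale
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

print(f"{len(faces)} face(s) detected")
for (x, y, w, h) in faces:                         # one bounding box per detected face
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```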
Handled by another deep neural network model (referred to as the AI model), this step converts the detected face data into vector values in a high-dimensional space, known as facial feature values. Due to the nature of vectors, the similarity between two facial feature values can be assessed by calculating the distance between the two vectors. The task of the AI model is to group feature values belonging to the same person as closely as possible in the spatial domain while separating those of different individuals.
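The exact feature format and similarity metric FaceMe® uses are proprietary; as a hedged sketch of the general principle, face embeddings are often compared with cosine similarity, where two vectors from the same person score close to 1.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two facial feature vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings as they might come out of a feature-extraction model.
emb_enrolled = np.random.rand(1024)   # stored during enrollment
emb_live = np.random.rand(1024)       # extracted from the camera frame

score = cosine_similarity(emb_enrolled, emb_live)
print(f"similarity = {score:.3f}")    # compared against a tuned threshold downstream
```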
During the training process, the AI model utilizes a vast amount of training data (over tens of millions), a large number of computation units (from hundreds of millions to billions), and extremely high-dimensional feature values (over 1024 dimensions). This allows the AI model to learn how to accurately analyze facial features and classify faces in a high-dimensional space. The training process often requires hundreds to thousands of GPU hours and involves iterative adjustments and optimizations of numerous model parameters and training settings, making it an extremely challenging task.
By calculating the similarity between two facial feature values and comparing that similarity to a predefined threshold, it can be determined whether the two faces belong to the same person. In the context of eKYC (electronic Know Your Customer), identity verification refers to comparing the face on an ID with the face in front of the camera to complete the verification process, which is an application of 1:1 facial match comparison.
Additionally, if the facial feature values of faces appearing in a frame are compared against multiple facial feature values stored in a database, this constitutes 1:N facial recognition. Common applications include access control and attendance systems. Beyond the traditional method of comparing all feature values one by one, FaceMe® also provides a fast search algorithm that significantly reduces the number of comparisons required, thereby accelerating the recognition speed.
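A minimal, hedged sketch of 1:N identification using brute-force cosine matching with NumPy is shown below; the database size, vector dimension, and threshold are illustrative assumptions, and FaceMe®'s fast-search algorithm (which prunes the number of comparisons) is not reproduced here.

```python
import numpy as np

# Hypothetical enrolled database: one L2-normalized feature vector per person.
db = np.random.rand(10_000, 512).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)

def identify(query: np.ndarray, threshold: float = 0.70):
    """Brute-force 1:N search: index of the best match, or None if below threshold."""
    q = query / np.linalg.norm(query)
    scores = db @ q                       # cosine similarity against every enrolled face
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else None

match = identify(np.random.rand(512).astype(np.float32))
print("matched person id:", match)
```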
During the comparison and search process, only feature values are needed for operation, which eliminates the need to store facial images in the database. Furthermore, FaceMe® enhances security and privacy by encrypting feature values with AES-256 encryption.
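FaceMe®'s key management and template layout are not public; the sketch below only illustrates the general idea of encrypting a stored feature vector at rest with AES-256 (here AES-256-GCM via the Python cryptography package), so that no facial image ever needs to be kept in the database.

```python
import os
import numpy as np
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)          # in production, keep this in a key vault
aesgcm = AESGCM(key)

feature = np.random.rand(512).astype(np.float32)   # hypothetical feature template
nonce = os.urandom(12)                             # must be unique per encryption
ciphertext = aesgcm.encrypt(nonce, feature.tobytes(), None)

# Decrypt only when a comparison is needed.
restored = np.frombuffer(aesgcm.decrypt(nonce, ciphertext, None), dtype=np.float32)
```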
Besides the primary functions of face detection and identity verification, facial recognition technology also includes additional features such as liveness detection (anti-spoofing), deepfake detection, and mask detection.
Common methods used to bypass facial recognition systems include using photographs or videos of a person's face to impersonate them, making liveness detection critical in facial recognition applications.
What is Liveness Detection? Liveness detection refers to the use of computer vision algorithms to determine whether the person being identified is a real person, thereby preventing identity theft.
Liveness detection can be classified according to the camera module used:
Using 2D Cameras for Liveness Detection
2D cameras (such as webcams and smartphone front-facing cameras) can perform liveness detection through both active and passive methods. Active liveness detection verifies user liveness through interaction, prompting the user to perform head gestures (nodding or shaking) or facial expressions (blinking or opening the mouth). Passive liveness detection uses deep learning algorithms to analyze the face in front of the camera and determines liveness from variations in light and shadow, geometric changes during camera movement, skin texture, and subtle facial movements. Since 2D liveness detection does not require specialized camera modules, it has a lower implementation cost and is easier to adopt, making it suitable for eKYC applications.
In 2022, the FaceMe® SDK achieved iBeta Level 2 certification and ISO 30107-3 compliance with a perfect score in Presentation Attack Detection (PAD) testing. In 2023, it excelled in the liveness PAD section of the latest Face Analysis Technology Evaluation (FATE) conducted by the National Institute of Standards and Technology (NIST) in the United States. Notably, in the convenience testing category, it achieved the top ranking among 82 facial recognition algorithms worldwide.
Using 3D Depth Cameras for Liveness Detection
With 3D depth cameras, algorithms can analyze both the facial image and the depth of field in front of the camera, effectively blocking most flat attacks, such as printed photos or videos displayed on screens. As a result, 3D liveness detection is faster and more intuitive compared to 2D methods. However, the cost of 3D cameras is higher. FaceMe® supports several 3D depth cameras, including Intel® RealSense™, 3D structured light cameras on iPads and iPhones, and Himax.
Using IR+RGB Camera Modules for Liveness Detection
IR+RGB camera modules are commonly used in facial recognition for access control, attendance devices, and Microsoft’s Windows Hello. This type of module contains two optical lenses: one that captures visible light (RGB) and another that captures infrared light (IR). Since materials like paper and tablet screens absorb or filter infrared light, this effectively prevents such attacks. Compared to 3D depth cameras, IR+RGB camera modules offer significant cost advantages while achieving recognition speeds and accuracies close to those of 3D depth cameras, which is why they are widely adopted.
Deepfakes are images or videos generated by AI tools that portray real or fictional people and make it appear as if they said or did something that never actually happened. Most of the time they are created for deceptive purposes. In eKYC or facial recognition login scenarios, perpetrators may try to use deepfakes to impersonate a target individual.
When hackers use deepfake video attacks, FaceMe®’s anti-spoofing detection can successfully block these attempts, regardless of whether they involve 2D, 3D, or IR+RGB cameras. However, if hackers employ a combined attack that uses camera signals to bypass anti-spoofing detection, additional measures are required. This is where FaceMe®’s newly launched deepfake detection feature comes into play, utilizing a specially designed, developed, and trained model to detect whether the image signals are generated by deepfake technology. This capability helps prevent various impersonation attacks using deepfake technology, ensuring the information security of eKYC, login, and transaction processes.
FaceMe® can accurately identify individuals even while they are wearing masks, achieving a recognition rate of up to 98.21%. When individuals are wearing masks, facial recognition allows for quick identity verification while maintaining access control and security management efficiency.
To improve vehicle inspection quality, Toyota Japan has integrated facial recognition technology into its vehicle inspection system. Certified inspectors need to be verified by their system to ensure quality control, and they can quickly complete identity verification even when wearing full protective gear including safety glasses, masks, and helmets.
The accuracy of facial recognition is assessed by its false non-match rate (FNMR) and false match rate (FMR): a good engine keeps both low, with the FMR extremely low. The false match rate (FMR) indicates how often a person is incorrectly matched to someone else, whereas the false non-match rate (FNMR) indicates how often two images of the same person fail to match.
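As a small worked example of these two metrics (with made-up similarity scores, not vendor data), FMR and FNMR can be estimated from evaluation pairs at a chosen decision threshold.

```python
import numpy as np

def fmr_fnmr(genuine_scores, impostor_scores, threshold):
    """FMR  = share of impostor (different-person) pairs wrongly accepted.
       FNMR = share of genuine (same-person) pairs wrongly rejected."""
    genuine = np.asarray(genuine_scores)
    impostor = np.asarray(impostor_scores)
    fmr = float(np.mean(impostor >= threshold))
    fnmr = float(np.mean(genuine < threshold))
    return fmr, fnmr

# Hypothetical scores from a handful of test pairs.
fmr, fnmr = fmr_fnmr([0.91, 0.84, 0.62], [0.18, 0.40, 0.73], threshold=0.70)
print(f"FMR={fmr:.2f}, FNMR={fnmr:.2f}")
```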
3 Factors Affecting Facial Recognition Accuracy
The performance of a recognition engine depends on the technology of the vendor that developed it. This includes the design of the AI model architecture, the volume of training data, the diversity of training data (adequately covering various races, genders, ages, facial angles, lighting conditions, and facial obstructions), and the tuning of parameters during the training process.
Image quality encompasses camera resolution (e.g., 720p), shutter speed, focusing ability, and image artifacts such as noise.
This includes the size of the face, lighting conditions on the face (e.g., no backlighting or overexposure), the angle of the face (e.g., a frontal view with rotation less than 50 degrees), and whether the face is partially obstructed (e.g., wearing a mask or sunglasses).
Among these factors, the capability of the facial recognition engine is the most critical. In order to evaluate the performance of different recognition engines, rigorous testing methods and large datasets are needed. For example, the National Institute of Standards and Technology (NIST) conducts Facial Recognition Technology Evaluations (FRTE) using standardized metrics, execution environments, and API specifications to compare various facial recognition algorithms. The NIST FRTE dataset includes a wide variety of facial image types, with volumes reaching millions in each category.
The FaceMe® SDK ranks among the top performers in the globally recognized NIST FRTE 1:1 and 1:N tests. In the VISA-Border 1:1 photo test, it achieved a 99.83% correct recognition rate at a false match rate of one in a million. By comparison, Face ID on smartphones offers approximately 96% recognition accuracy at a false match rate of one in a million, and Windows Hello on Windows systems provides a 95% recognition rate at a false match rate of one in a hundred thousand, showing that FaceMe® delivers a more accurate and reliable facial recognition algorithm.
Facial recognition technology can be implemented in two ways: "cloud-based facial recognition services" (e.g., AWS) and "edge device facial recognition." Each method has its own advantages, but edge device facial recognition offers several key benefits. With edge device facial recognition, the software runs on the device itself, eliminating the wait for facial images to be uploaded to a cloud database. This allows for real-time face detection and recognition, resulting in excellent performance and fast recognition speeds. In fact, most facial recognition applications on edge devices perform face detection and feature extraction directly on the device. When conducting face matching against a database, whether the facial data is stored on the edge device or in a cloud database, the matching is done on extracted facial feature values. For example, with FaceMe®, the feature file size is only 3KB to 5KB, so the time required for data upload, matching, and response is significantly shorter than that of cloud facial recognition. Typically, the matching itself can be completed in microseconds.
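To give a rough feel for why matching on compact feature values is fast, the hedged sketch below times a brute-force comparison of one query against 100,000 stored templates; the sizes and timings are machine-dependent assumptions, not FaceMe® benchmarks.

```python
import time
import numpy as np

db = np.random.rand(100_000, 512).astype(np.float32)   # hypothetical enrolled templates
db /= np.linalg.norm(db, axis=1, keepdims=True)

query = np.random.rand(512).astype(np.float32)
query /= np.linalg.norm(query)

start = time.perf_counter()
best = int(np.argmax(db @ query))                       # single matrix-vector product
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"best match {best} found in {elapsed_ms:.2f} ms")
```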
In recent years, the computing power of AI edge devices has greatly increased, enabling facial recognition applications across more scenarios, such as smart locks, mobile devices, point-of-sale systems (POS), interactive kiosks, and digital signage.
CyberLink’s FaceMe® AI facial recognition engine is a facial recognition SDK (Software Development Kit) specifically developed for edge devices. It can be flexibly integrated into various edge computing devices and supports many different chips and operating systems. FaceMe® has demonstrated impressive performance in the globally recognized NIST FRTE facial recognition technology evaluation report. It can be easily implemented in various IoT application scenarios, providing a secure, reliable, and highly accurate facial recognition solution.
To deploy facial recognition applications on edge devices, several factors need to be considered. Below, we discuss key aspects related to chips, operating systems, and AI models:
When implementing facial recognition on edge devices, choosing the right chip for the application scenario is crucial, as this choice affects both cost and performance. The selection of chips can be categorized as follows:
ARM architecture-based systems-on-chip (SoCs) are characterized by low power consumption and low heat generation, making them suitable for most lightweight AIoT device needs.
Major chip manufacturers like MediaTek, Qualcomm, and NXP have taken the lead in edge AI computing by integrating new AI Processing Units (APUs) or Neural Processing Units (NPUs) into ARM-based SoCs. This enhances AI computing speed while optimizing performance and reducing power consumption, making them the preferred choice for implementing facial recognition on edge devices.
GPUs offer exceptional performance, enabling them to execute larger, more computationally demanding AI models. Their processing power allows them to handle video streams from hundreds of cameras simultaneously, reducing the number of workstations needed for large-scale security applications and thereby significantly lowering costs.
Intel CPUs benefit from a comprehensive and mature supply chain. If you wish to purchase a ready-made computer, consider the Intel NUC series, which balances size and power consumption. With the neural network acceleration capabilities of Intel OpenVINO, these devices can efficiently run facial recognition algorithms at the edge. For industrial applications or other specific requirements, you can also look into industrial computers that use Intel CPUs.
Each type of chip has compatible operating systems (OS), and a good facial recognition engine should support a variety of chip and operating system combinations.
CyberLink's FaceMe® supports a wide range of operating systems, offering multiple cross-platform solutions that accommodate over ten combinations of operating systems and chips:
In addition to the combinations of operating systems and chips, FaceMe® has the option to enable hardware acceleration, which enhances the computational speed of the FaceMe® deep learning algorithms. This includes support for NVIDIA CUDA™, cuDNN, TensorRT, NVIDIA Jetson, Intel OpenVINO™, MediaTek NeuroPilot, NXP NPU, and Qualcomm SNPE (GPU/DSP).
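FaceMe® exposes these accelerators through its own SDK; as a generic, hedged illustration of the same pattern, ONNX Runtime lets an application prefer GPU or NPU execution providers (TensorRT, CUDA, OpenVINO) and fall back to the CPU when they are unavailable. The model file name below is a placeholder.

```python
import onnxruntime as ort

# Preferred accelerators in order; only those actually available on the device are used.
preferred = ["TensorrtExecutionProvider", "CUDAExecutionProvider",
             "OpenVINOExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in preferred if p in ort.get_available_providers()]

# "face_model.onnx" is a placeholder for any face-recognition model file.
session = ort.InferenceSession("face_model.onnx", providers=providers)
print("running on:", session.get_providers())
```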
Facial recognition applications are quite diverse, and different scenarios involve different considerations such as hardware costs, shooting angles, and recognition accuracy. Therefore, a good facial recognition vendor should offer different recognition models tailored to specific scenarios and hardware. For example, applications like smart locks only require frontal face recognition, allowing a lighter facial recognition model to be implemented on lower-cost devices.
FaceMe® provides three recognition models to meet various application needs, accuracy levels, and cost requirements. Feel free to contact us for more information about the recognition models to receive more customized and professional implementation recommendations!
Currently, thanks to its high accuracy and excellent customer experience, facial recognition technology has been implemented across various industries and scenarios. Here is an introduction to a few use cases, highlighting the key aspects of each.
Main application categories include:
Facial recognition holds significant market potential, and the maturation of both software and hardware technologies is accelerating its growth. Companies can greatly improve efficiency and enhance user experiences as a result. However, this growth also brings the need for comprehensive and rigorous regulatory standards in both the commercial and public sectors, along with user education. These measures are essential to address public concerns and enable consumers to embrace this AI biometric technology.