1. Introduction
Adversarial machine learning, AI-powered identity spoofing attacks, and, more generally, failures of machine learning models are emerging critical problems that regularly attract the attention of the media and society. Surprisingly, however, the response of academic research, particularly concerning authentication systems, lags behind. The objective of this document is to describe the current state of this front. Accordingly, the goals of this paper are threefold: first, to present the essential notions of adversarial machine learning; second, to survey the existing knowledge and directions for preventing adversarial machine learning attacks; and third, to present the related challenges posed by AI-powered identity spoofing attacks in the context of authentication systems. The paper is organized as follows. Section 2 introduces the fundamentals of machine learning and adversarial machine learning, together with the basic concepts underlying identity authentication. Sections 3 and 4 form the core of the paper: Section 3 examines adversarial attacks against authentication systems, and Section 4 discusses techniques for detecting such attacks. Section 5 presents mechanisms for preventing adversarial attacks during authentication. Finally, Section 6 draws conclusions and discusses open challenges. The intended audience includes business managers with a modest background in adversarial machine learning; we therefore provide enough essential background to connect and understand the current status and organization of the existing work.
1.1. Background and Motivation
With advancements in the field of artificial intelligence (AI), there have been growing concerns about the potential risks of AI being employed for attacks, for example, to automate the theft of cryptocurrencies, to create realistic but fake multimedia content that spreads fake news, and to facilitate large-scale identity spoofing and social engineering attacks. This paper is motivated by a new AI-powered identity spoofing attack, namely the voice impersonation attack, which allows an adversary without physical access to the victim's speech to train a model that accurately impersonates the victim's voice, masquerading as the legitimate user of an automatic speaker verification (ASV) system. Furthermore, a successful voice impersonation attack model can be reused to attack online or in-the-wild systems, e.g., the voice authentication system of a financial, health care, or even national institution.
Therefore, to provide practice-driven contributions and real-world recommendations that concretely help safety-critical businesses and organizations assess their risks and improve their ASV systems, this paper pursues the following objectives. (1) Comprehensive voice impersonation: this paper is the first to discuss and tackle voice impersonation threats targeting human-computer speaker recognition systems; previous works only consider closely related problems such as the background sound insertion attack and the ASV attack. (2) Data property exploration: through the first voice impersonation attack campaign using both genuine victim voice data and publicly available parallel data, we conduct the first empirical study of the following data presence property: when voice impersonation becomes possible, across a range of racial groups, given that only a small amount of the victim's genuine data is available to the adversary. The results are important for data collection, model design, and assessing the security performance of the target system.
1.2. Scope and Objectives
Authentication is an essential security function that aims to control access to a system or to a specific service by determining the degree of trustworthiness of the party requesting access. Since the beginning of computer communications, user identification has relied on passwords, the original authentication tokens, which persist in virtually all complex systems to the present day. However, passwords have been the target of numerous attacks due to factors such as their volatility, memorability, inherent characteristics, and management.
Considering all of these factors, we propose a practical, machine learning-enhanced authentication system that specializes in detecting live human presence in cooperative scenarios, such as service access control targeted by attackers. More precisely, the experimental results show the potential of combining linear discriminant analysis, principal component analysis, and the support vector machine algorithm to implement efficient authentication mechanisms founded on an encrypted 3D hand vein biometric system, achieving high accuracy, sensitivity, and positive predictive value, and performing well at reducing biases associated with multiple, uncontrolled contributory factors.
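As a rough illustration of such a pipeline, the following sketch chains PCA, LDA, and an SVM with scikit-learn; the random feature vectors, user labels, and dimensionality choices are placeholder assumptions standing in for decrypted 3D hand-vein features, not the actual system described above.

```python
# Minimal sketch of a PCA + LDA + SVM authentication pipeline (illustrative only).
# X and y are placeholders for flattened hand-vein feature vectors and user identities.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))      # placeholder vein-feature vectors
y = rng.integers(0, 5, size=200)     # placeholder enrolled user identities

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

pipeline = Pipeline([
    ("pca", PCA(n_components=50)),                          # reduce dimensionality / noise
    ("lda", LinearDiscriminantAnalysis(n_components=4)),    # maximize class separation
    ("svm", SVC(kernel="rbf", probability=True)),           # final accept/reject classifier
])
pipeline.fit(X_train, y_train)
print(classification_report(y_test, pipeline.predict(X_test)))
```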
2. Fundamentals of Adversarial Machine Learning
In this section, we introduce the fundamentals of Adversarial Machine Learning. First, we define machine learning and the usual definition of a model. Then, we define the standard method for training a machine learning model; that is, gradient descent with backpropagation. With these definitions in place, we can introduce adversarial machine learning and subsequently introduce adversarial spoofing attacks, jumpstarting our discussion towards secure machine learning models.
2.1. Introduction to Machine Learning
Machine learning is defined as the field of study that gives computers the capability to learn without being explicitly programmed. There are three approaches to machine learning. Supervised learning models are trained using labeled data, i.e., input data paired with target outputs; classification and regression are examples of this approach. Unsupervised learning models are trained using unlabeled data and use the input data alone to learn its underlying structure; clustering and association are examples of this approach. Semisupervised learning combines both labeled and unlabeled data during learning, typically a small labeled set together with a larger unlabeled set.
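As a toy illustration of these three paradigms, the sketch below fits a supervised classifier, an unsupervised clustering model, and a semi-supervised label-propagation model on synthetic scikit-learn data; the dataset and model choices are assumptions made purely for demonstration.

```python
# Toy illustration of supervised, unsupervised, and semi-supervised learning
# on synthetic data (all datasets and models here are assumed for demonstration).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.semi_supervised import LabelPropagation

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: learn a mapping from labeled inputs to outputs.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Unsupervised: discover structure (clusters) without using any labels.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Semi-supervised: hide most labels (-1) and let structure fill the gap.
y_partial = y.copy()
y_partial[np.random.default_rng(0).random(len(y)) < 0.9] = -1
semi = LabelPropagation().fit(X, y_partial)

print("supervised accuracy:     ", clf.score(X, y))
print("cluster sizes:           ", np.bincount(clusters))
print("semi-supervised accuracy:", semi.score(X, y))
```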
Let $D$ denote the input data, $T$ the target data, and $\mathrm{confModel}$ the confidence model, written $\mathrm{confModel}(D, T)$. The trained model is then obtained as
$$\mathrm{Model}^{*}(D) = L\big(\mathrm{Model}(D),\ \arg\max_{T}\{\mathrm{Model}_1(D), \mathrm{Model}_2(D), \dots, \mathrm{Model}_N(D)\}\big),$$
where $L(\cdot)$ is the objective function used to train the model. The purpose of training a model is to make predictions about the real world. The performance of a model can be evaluated by means of related statistical measures. However, classification accuracy often falls short of providing a good measure of confidence in the model's predictions.
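The sketch below, written in PyTorch under assumed data and architecture, shows the standard training loop that minimizes an objective by gradient descent with backpropagation and then reports both accuracy and mean softmax confidence, illustrating that the two measures need not coincide.

```python
# Sketch of training a small classifier by gradient descent with backpropagation,
# then contrasting accuracy with softmax confidence (data and sizes are assumptions).
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)                      # placeholder input data D
y = (X[:, 0] + X[:, 1] > 0).long()            # placeholder targets T

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
objective = nn.CrossEntropyLoss()             # stands in for the objective L
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(200):
    optimizer.zero_grad()
    loss = objective(model(X), y)             # forward pass
    loss.backward()                           # backpropagation
    optimizer.step()                          # gradient descent update

with torch.no_grad():
    probs = torch.softmax(model(X), dim=1)
    accuracy = (probs.argmax(dim=1) == y).float().mean()
    confidence = probs.max(dim=1).values.mean()
print(f"accuracy={accuracy:.3f}  mean confidence={confidence:.3f}")
```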
2.2. Basic Concepts and Definitions
In this section, we introduce and formally define the basic concepts necessary to understand and address modern identity fraud, highlighting the differences between traditional and AI-powered attacks, and clarifying the relationship between individual and institutional fraud, in which the same individual may attempt to defraud more than one institution. A scheme used to qualify and authenticate a user, providing the user with an identity and access to a set of claimed attributes, is more generally called an authenticator. A password-based authenticator, for example, is verified when a user enters a password and, through a cryptographic verification procedure, the system to which the authenticator is presented is convinced of its correctness.
The attributes that an attested and verified entity asserts are usually called its claimed attributes. The discount levels offered for using a credit card, for example, depend on the user's claimed attributes. These attributes are derived from information generated by an attribute provider, which certifies each attribute according to trust procedures; the certification is signed and agreed upon with the certificate authorities, which challenge and confirm the user's attributes. A user's claimed attribute is valid as long as the attestation and verification procedures are performed correctly and the user is correctly identified.
2.3. Types of Adversarial Attacks
A common modus operandi of an adversarial attack on classification, regression, and decision-making problems is to inject an artifact into the observation in the hope that the artifact alters the decision made, or at least prevents the classification network from consistently making the correct decision. The attacker can push the network toward general failure by making it abandon or refuse to make a decision, amplifying the adversarial attack and thereby making the decision-maker behave chaotically. Alternatively, the attacker can keep the adversarial attack minimal so that the decision-maker hesitates over the decision and, in doing so, either accepts failure through refusal or fatigues and thus accepts biased outcomes.
There are several types of adversarial attacks: analog, digital, and combined attacks. In a digital adversarial attack, the network's input is distorted in the pixel domain, for example by convolution kernels, scaling, rotation, normal or uniform noise, or blurring, since such distortions demonstrably induce high failure rates by changing the way the network classifies the image. In an analog adversarial attack, the attacker alters an actual physical object so that it is visually classified into an incorrect class. Combined adversarial attacks sit between the digital and analog cases: the attacker embeds some form of digital pixel-level manipulation into the real-world anomalous object to be presented, thereby effecting an incorrect classification.
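To make the digital case concrete, the following sketch applies noise, blur, and rotation to an input and checks whether a small classifier's prediction flips; the tiny untrained CNN and the random image are assumptions used only to show the mechanics.

```python
# Sketch of digital input distortions (noise, blur, rotation) used to probe a
# classifier for label flips; the tiny CNN and random image are assumptions.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

image = torch.rand(1, 3, 32, 32)               # placeholder benign input
clean_label = model(image).argmax(dim=1)       # class assigned to the clean input

distortions = {
    "gaussian_noise": image + 0.1 * torch.randn_like(image),
    "blur": TF.gaussian_blur(image, kernel_size=5),
    "rotation": TF.rotate(image, angle=15.0),
}
for name, distorted in distortions.items():
    flipped = model(distorted).argmax(dim=1) != clean_label
    print(f"{name}: prediction flipped = {bool(flipped)}")
```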
3. Adversarial Attacks in Authentication Systems
An adversary's goal in the field of authentication is to gain one or more unfair advantages over a legitimate user. Firstly, depending on the information used by authentication mechanisms, adversaries will inevitably favor certain types of data, such as biometrics, passwords, facial images, transcripts of personal conversations, or others. Secondly, the adversary will favor interactions and exchanges that are not costly and require little concentration. Given the number of AI-based authentication mechanisms, an intelligent way for an adversary to identify the kind of information used, the necessary degree of concentration, the natural interaction, the time constraints, the type of machine learning model, the metrics, and the background data used is to closely observe and study the operation of each system separately. Even so, an adversary may end up confused among AI-based anti-spoofing facial recognition systems: some systems authenticate users with a simple picture of the face, whereas others require detailed 3D capture and even liveness detection.
One of the key components of authentication algorithms is an activity that seeks to control the rate at which users can be tested or verified; preventing an adversary from capturing and identifying this activity is not an easy task. The basic principle of a CAPTCHA is to be easy for a human to decode and very difficult for a machine. Defending a machine learning-based authentication model against individual denial-of-service (DoS) attacks involves the concepts and methods of numerical transformations and encryption. However, dealing with AI-based adversaries requires combined techniques of reinforcement learning and adversarial learning: no single, natural anti-spoofing rule can resist the advances of AI. Combined rules guarantee the collection of fresh, real data for training and keep leakage under control. Reinforcement learning acts as the first rule and triggers human intervention when an increase in recognition errors becomes apparent. With this safeguard in place, the system is likely to be difficult to bypass by an adversary, even if the model in use is familiar to the adversary.
3.1. Traditional Authentication Methods
When a user interacts with a computing system, one of the primary concerns is to ensure the user is an authorized person; this process is called authentication. Secret or private objects used to prove identity are called authentication data. A legitimate entity is verified by showing a valid credential, such as a password, a cryptographic smart card, biometric data, or a location, or by performing an action or possessing a unique object. An authentication system requires one or more types of authentication data that need to be kept secret by the entity; the higher the required security level, the more authentication factors should be combined. The most common authentication factors include:
1. Something the user knows, such as passwords, passphrases, and personal identification numbers.
2. Something the user possesses, such as physical security tokens, key fobs, and contactless cards.
3. Something the user is, i.e., biometric traits such as fingerprints, faces, voices, and handwritten signatures.
4. Something the user does, i.e., behavioral characteristics such as keystroke dynamics, voice recognition, and signature dynamics.
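The sketch below combines two of these factors, something the user knows (a salted password hash) and something the user possesses (a TOTP-style one-time code); the secrets, key-derivation parameters, and acceptance policy are illustrative assumptions rather than a recommended deployment.

```python
# Sketch combining two authentication factors: a stored password hash
# ("something you know") and a TOTP-style one-time code ("something you have").
# Secrets, iteration counts, interval, and digit count are illustrative assumptions.
import hashlib, hmac, os, struct, time

def totp(secret: bytes, interval: int = 30, digits: int = 6) -> str:
    """Derive a time-based one-time code from a shared device secret."""
    counter = struct.pack(">Q", int(time.time()) // interval)
    mac = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
    return str(code).zfill(digits)

def authenticate(password: str, code: str, salt: bytes, stored_hash: bytes,
                 device_secret: bytes) -> bool:
    """Accept only if both the knowledge factor and the possession factor check out."""
    knows = hmac.compare_digest(
        hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000), stored_hash)
    has = hmac.compare_digest(code, totp(device_secret))
    return knows and has

salt, device_secret = os.urandom(16), os.urandom(20)
stored = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 100_000)
print(authenticate("hunter2", totp(device_secret), salt, stored, device_secret))
```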
3.2. Vulnerabilities to Adversarial Attacks
Most of today's authentication systems are built upon well-established traditional machine learning algorithms. These traditional algorithms, despite being fast and easy to implement, have a serious weakness with respect to adversarial attacks: a small, carefully crafted change to the input data, a so-called adversarial example, can cause them to make mistakes, even when the attacker has only distilled knowledge about the algorithms. Because of this susceptibility, traditional machine learning algorithms are not considered robust and secure enough for authentication systems. Adversarial learning attacks are identified as one of the existing threats to machine learning, and biometric systems, a class of traditional performance-based authentication mechanisms, are considered less discriminating and less robust against such attacks.
Convolutional neural networks (CNNs) have achieved impressive performance in biometric identification and verification tasks. In general, a CNN-based identity verification model is first trained on a large-scale labeled dataset in a supervised manner to extract discriminative features for biometric identity representation and authentication. However, recent findings on the vulnerability of CNNs in object and facial recognition against adversarial perturbations have exposed serious security and robustness issues when integrating CNNs with biometric authentication systems. It has been reported that a CNN-based object classifier can be easily deceived when an adversarial perturbation is imperceptibly added to benign images. The existence of such vulnerabilities in deep learning classifiers seriously threatens the reliability and strength of biometric-based identity authentication. The enormous feature abstraction capacity of deep CNNs, which captures and processes high-level semantic content, makes them extremely sensitive to adversarial manipulation. The reported results indicate that, once the natural image manifold is violated, a CNN loses its ability to differentiate between authentic objects of the same class and adversarially perturbed objects from different classes. In simple terms, an image of a stop sign may be detected as a yield sign when adversarial manipulation is added to the stop sign's pixels. Even though the adversarial perturbation is barely perceptible, this type of deception is fatal for biometric identity authentication. The vulnerability of CNNs to adversarial manipulation also imposes severe cybersecurity risks on biometric surveillance and forensic investigation.
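A classic way to demonstrate this sensitivity is the fast gradient sign method (FGSM); the sketch below applies it to a small untrained CNN on a random image purely to show the mechanics, so the model, input, and perturbation budget are assumptions and the result does not reproduce the reported stop-sign example.

```python
# Sketch of an FGSM-style adversarial perturbation against a small CNN.
# The network, input image, and epsilon are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
model.eval()

image = torch.rand(1, 3, 64, 64, requires_grad=True)   # benign input
label = model(image).argmax(dim=1)                      # class assigned to the clean input

loss = nn.functional.cross_entropy(model(image), label)
loss.backward()                                         # gradient of the loss w.r.t. the pixels

epsilon = 0.03                                          # small perturbation budget
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("clean prediction:      ", int(label))
print("adversarial prediction:", int(model(adversarial).argmax(dim=1)))
```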
4. Detecting Adversarial Attacks in Authentication Systems
To our knowledge, the proposed approach is the first to describe and leverage textual mutation strategies and adversarially unmasked distributions to detect adversarial attacks in authentication systems. Detecting and blocking adversarial attacks during inference in the real world is of great importance, yet testing real-world security systems against adversarial networks is not sustainable, as such deployments may prove harmful. Thus, part of the experiments has been designed to validate the proposed approach and then to train the authentication system so that it is ready to handle authentication requests in the real world. The experiments show state-of-the-art performance of the proposed approach: the generated adversarial examples can be detected with only a small drop in delivery adequacy. Further study provides various lessons and interesting results. The security of a detection method can be assessed through a coupled-DNN attack. Adversarial attack and detection can be regarded as an arms race; however, evolving detectors have a chance to win this race. The first-line defense, a deep learning-based adversarial detector, helps to cope with adversarial attacks without changing the generation strategy. Our work showed that these adversarial examples can be detected with high confidence, even though some of them cannot defeat the recognizer. In the early attack stage, we heavily explored mutation examples, which aimed to directly mislead the detector.
4.1. Anomaly Detection Techniques
Anomaly-based detection systems attempt to create a profile of normal user behavior and then identify deviations from that baseline. Developments in deep learning have allowed researchers to adapt this approach using RNNs or LSTMs to model sequential data, and unsupervised techniques such as autoencoders to learn a representation of normal activity. At the time of writing, anomaly detection techniques have seen relatively low adoption despite remarkable performance, owing to several technical challenges, including noisy data labels, scalability, choice of architecture, training complexity, detection accuracy in the presence of adversarial attacks, and a lack of explainability.
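The sketch below illustrates the autoencoder variant of this idea on synthetic feature vectors: the model is trained only on assumed "normal" authentication records and then flags inputs whose reconstruction error exceeds a simple threshold; the data, architecture, and 3-sigma rule are all assumptions.

```python
# Sketch of autoencoder-based anomaly detection: train on "normal" login features,
# flag inputs with high reconstruction error. Data and threshold are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
normal = torch.randn(1000, 16)            # placeholder normal behaviour features
anomalous = torch.randn(50, 16) * 3 + 4   # placeholder deviating behaviour

autoencoder = nn.Sequential(nn.Linear(16, 4), nn.ReLU(), nn.Linear(4, 16))
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-2)

for _ in range(300):                      # learn to reconstruct normal activity only
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(autoencoder(normal), normal)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    err_normal = ((autoencoder(normal) - normal) ** 2).mean(dim=1)
    err_anom = ((autoencoder(anomalous) - anomalous) ** 2).mean(dim=1)
    threshold = err_normal.mean() + 3 * err_normal.std()   # simple 3-sigma rule
    print("flagged anomalies:", int((err_anom > threshold).sum()), "of", len(anomalous))
```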
To the best of our knowledge, no research has investigated applying anomaly detection techniques powered by recent machine learning and/or deep learning models to protect authentication systems from impersonation attacks. It is also unclear whether unsupervised techniques work on their own or whether they need to be combined with supervised models that rely on human feedback. We aim to fill this research gap and study how machine learning methods, whether purely data-driven or based on human-designed features extracted from raw data, could monitor the authenticity of authentication attempts and detect those generated by pure AI agents. Our results could be a stepping stone toward making authentication systems secure while keeping verification times very low, which is critical in scenarios such as border control or airport check-in, and in settings with ambiguous operational risk.
4.2. Machine Learning-based Approaches
A popular approach in adversarial machine learning is poisoning the training data, which is often collected through normal APIs. These poisoned training datasets are then used to build the ML models. Therefore, by creating poisoned datasets, training attack models on them, and then using those models in the attack process, adversaries can adapt their attacks to many ML-based solutions and behave as if the deployed defenses did not exist. Nonetheless, training dedicated attack models for such purposes may not even be necessary. The deployed ML models, with their sensitive structure and cooperative identity characteristics, already help the adversary to generate, train, and exploit models through the normal APIs and their regular update process, which may be dual-use, sensitive, biased, and vulnerable by default.
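A minimal sketch of the training-data poisoning idea is shown below, using a simple label-flipping attack on a synthetic scikit-learn task; the dataset, poisoning rate, and classifier are assumptions chosen only to show how poisoned labels degrade the learned model.

```python
# Sketch of training-data poisoning via label flipping on a toy dataset.
# Dataset, poisoning rate, and model choice are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip = rng.random(len(poisoned)) < 0.25          # flip 25% of the training labels
poisoned[flip] = 1 - poisoned[flip]
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

print("clean test accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned test accuracy:", poisoned_model.score(X_test, y_test))
```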
Having obtained this understanding and the trained structure of the deployed models, adversaries need only a short attack process, already supported by many tools and platforms, to convert that understanding into fast, dynamic, and profitable attacks, regardless of the ML layer. Successful examples in face recognition and verification applications include deep interpretable-discriminative models that generate and exploit system-specific attributes from face images, as well as generative models that learn system-specific face representations for attacking black-box identity verification solutions.
5. Preventing Adversarial Attacks in Authentication Systems
In Section 3, we showed that deep learning classifiers are vulnerable to adversarial machine learning attacks and that identity spoofing attacks can be countered using CAPTCHA mechanisms. In this section, we propose mechanisms to prevent identity spoofing during authentication. The first CAPTCHA is a personalized, text-based mechanism that integrates limited but useful pre-selected personal information. The second CAPTCHA is an image-based mechanism that requires a human to recognize and interact with simple images with the aid of a graphical reminder. We also discuss how to mitigate adversarial attacks using fingerprinting techniques.
Authentication solutions can be divided into two categories: the first carefully incapacitates the adversarial manipulations themselves, and the second enhances the generalization of classifiers by hardening the input distribution. The design of the CAPTCHA-as-Proof-of-Backup model begins with a list of pre-defined personal property questions. Each user is asked about their specific pre-selected personal objects and is expected to pick out or recognize them. We use these predefined personal items to create personal CAPTCHAs, and the user chooses the specific answers during the registration phase.
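A minimal sketch of the registration and challenge flow for such a personalized CAPTCHA is given below; the question pool, the hashing of answers, and the single-question challenge policy are assumptions, not the full mechanism proposed here.

```python
# Sketch of a personalized CAPTCHA: store hashed answers to pre-selected personal
# questions at registration, then verify a challenge response later.
# Storage format, hashing, and the question pool are illustrative assumptions.
import hashlib
import os
import random

def _hash(answer: str, salt: bytes) -> bytes:
    return hashlib.sha256(salt + answer.strip().lower().encode()).digest()

def register(selected):
    """selected maps each pre-selected personal question to the user's answer."""
    profile = {}
    for question, answer in selected.items():
        salt = os.urandom(16)
        profile[question] = (salt, _hash(answer, salt))
    return profile

def challenge(profile):
    """Pick one of the user's own pre-selected questions at random."""
    return random.choice(list(profile))

def verify(profile, question, response):
    salt, stored = profile[question]
    return _hash(response, salt) == stored

profile = register({"Colour of your first bicycle?": "red",
                    "Object kept on your office desk?": "globe"})
question = challenge(profile)
print(question, "answered correctly:",
      verify(profile, question, "red") or verify(profile, question, "globe"))
```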
5.1. Adversarial Training and Robust Models
Much effort has been invested in adversarial training as a preventative defense against adversarial attacks. Adversarial training has been successful on specific, focused supervised learning tasks, particularly in computer vision. It entails training a model on adversarial examples, performing one or more rounds of adversarial retraining on specified features. With each round of adversarial retraining, the model builds an understanding of the environment and its error surfaces, posing updated problems for subsequent rounds of robustness improvement. Adversarial retraining may follow curriculum learning mechanisms that select progressively more difficult sparring partners from the adversary set, or may use methods such as heuristic search, gradient descent-based search, or meta-learning.
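The sketch below shows a bare-bones adversarial training loop in PyTorch, where FGSM examples are crafted against the current model at each step and mixed with clean data; the synthetic data, perturbation budget, and clean/adversarial mixing ratio are assumptions standing in for the curriculum and search strategies mentioned above.

```python
# Sketch of adversarial training: at each step, craft FGSM examples from the
# current model and train on a mix of clean and adversarial inputs.
# Data, epsilon, and the clean/adversarial mix are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1024, 20)
y = (X[:, :2].sum(dim=1) > 0).long()

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epsilon = 0.1

for epoch in range(100):
    # Inner step: generate adversarial examples against the current model.
    X_adv = X.clone().requires_grad_(True)
    nn.functional.cross_entropy(model(X_adv), y).backward()
    X_adv = (X_adv + epsilon * X_adv.grad.sign()).detach()

    # Outer step: update the model on clean and adversarial batches together.
    optimizer.zero_grad()
    loss = (0.5 * nn.functional.cross_entropy(model(X), y)
            + 0.5 * nn.functional.cross_entropy(model(X_adv), y))
    loss.backward()
    optimizer.step()

print("final mixed loss:", float(loss))
```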
While adversaries have successfully overcome the robustness of adversarially trained models in supervised learning tasks, especially in the computer vision domain, adversarial training has yet to be applied to actions of real-world consequence. Using unsupervised learning or reinforcement learning to train models has been proposed as a way to instill robustness against adversarial attacks, but this direction remains to be explored. In the current state of machine learning, however, training a model to fool an adversary, who may be actively learning the model's error surface, remains a potential offensive avenue for those not yet directly facing significant real-world problems with adversarially trained models.
5.2. Countermeasures and Defense Strategies
We briefly discuss some defensive measures that can potentially be taken to protect these systems from adversarial machine learning attacks:
– Collect more data: One of the keys to machine learning is obtaining enough data to accurately represent the feature space that the model will encounter in the real world. However, this is often not feasible, particularly in authentication systems, where unique events happen within a short time period and privacy constraints are far from negligible. Forcing user responses in order to collect more candidate event-impostor pairs for training simulations would be feasible from a technical perspective but impractical from a user-experience perspective.
– Train on absence similarly to presence: By adjusting the impostor class generation process and disentangling presence and absence learning, i.e., creating a distinct feature space for absence, the adversarial machine learning-driven exploitation of similarity scoring models can potentially be neutralized.
– Leverage more complex non-convex decision boundaries: Employing more complex, non-convex decision boundaries, such as tree ensembles or deep neural networks, can potentially evade the adversarial machine learning-driven exploitation of simple models (see the sketch below).
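As a small illustration of the last point, the sketch below compares a linear model with a tree ensemble on the same synthetic task under a small input perturbation; the data and the noise-based probe are assumptions and only a crude stand-in for a real adversarial evaluation.

```python
# Sketch contrasting a linear decision boundary with a non-convex tree ensemble
# under small input perturbations. Data and perturbation scale are assumptions,
# and random noise is only a crude stand-in for a real adversarial evaluation.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)
rng = np.random.default_rng(0)
X_perturbed = X + 0.05 * rng.standard_normal(X.shape)   # small input shift

for name, model in [("linear", LogisticRegression()),
                    ("forest", RandomForestClassifier(random_state=0))]:
    model.fit(X, y)
    print(name,
          "clean acc:", round(model.score(X, y), 3),
          "perturbed acc:", round(model.score(X_perturbed, y), 3))
```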