Overview
Copyright detection systems are among the most widely used machine learning systems in industry, and the security of these systems is of foundational importance to some of the largest companies in the world. Examples include YouTube’s Content ID, which has resulted in more than 3 billion dollars in revenue for copyright holders, and Google Jigsaw, which has been developed to detect and remove videos that promote terrorism or jeopardize national security.
Despite their importance, copyright detection systems have gone largely unstudied by the ML security community.
It is well-known that many machine learning models are susceptible to so-called “adversarial attacks,” in which an attacker evades a classifier by making small perturbations to inputs.
Our paper, available below, examines the susceptibility of industrial copyright detection tools to adversarial attacks. After surveying a range of copyright detection schemes, we present a simple attack on music recognition systems. We show that relatively small perturbations are enough to fool such a system in the white-box case, and that, with larger perturbations, our attack transfers to industrial systems including AudioTag and YouTube’s Content ID.
Proof of concept: attacking an audio fingerprinting system
Fingerprinting systems are generally proprietary, and few publications exist on effective fingerprinting methods. While many modern industrial systems likely use neural nets, a well-known fingerprinting scheme, which was developed by the creators of “Shazam,” uses hand-crafted features. In a nutshell, the algorithm converts an audio clip into a spectrogram, which is a 2D image showing the frequency content of the clip over time. Then, the locations of local maxima in the spectrogram are identified, and used for fingerprinting. A song is recognized by extracting pairs of nearby local maxima, called “hashes,” and comparing them to the hashes in a database of known songs.
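To make this pipeline concrete, here is a minimal sketch of the peak-pair idea in Python. The window size, fan-out, and threshold below are illustrative choices, not the parameters of Shazam or of any deployed system.

```python
# A minimal sketch of peak-pair fingerprinting (illustrative parameters only).
import numpy as np
from scipy import signal
from scipy.ndimage import maximum_filter

def fingerprint_hashes(audio, sr=22050, peak_neighborhood=20, fan_out=5):
    # 1. Spectrogram: frequency content of the clip over time.
    _, _, spec = signal.spectrogram(audio, fs=sr, nperseg=1024)
    log_spec = 10 * np.log10(spec + 1e-10)

    # 2. Local maxima: a point is a peak if it equals the max of its neighborhood
    #    and stands out above the average energy.
    local_max = maximum_filter(log_spec, size=peak_neighborhood) == log_spec
    peaks = np.argwhere(local_max & (log_spec > log_spec.mean()))
    peaks = peaks[np.argsort(peaks[:, 1])]  # sort peaks by time index

    # 3. Hashes: pair each peak with a few peaks that follow it in time.
    hashes = []
    for i, (f1, t1) in enumerate(peaks):
        for f2, t2 in peaks[i + 1 : i + 1 + fan_out]:
            hashes.append((int(f1), int(f2), int(t2 - t1)))  # (freq1, freq2, time delta)
    return hashes
```

Matching a query clip then amounts to looking up its hashes in a database of known songs and checking for a consistent run of matches.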
To demonstrate the vulnerability of this copyright detection system, we built a differentiable “Shazam” prototype in TensorFlow, and then crafted adversarial music using a simple gradient-based attack. We chose this model as the basis of our demonstration for two reasons. First, non-adversarially-trained neural networks are known to be quite easy to attack, while this system (which uses hand-crafted features) presents a harder target. Second, this hand-crafted system is one of the few published models that is known to have seen industrial use.
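The sketch below illustrates the kind of gradient-based attack we mean. It assumes a differentiable stand-in for the hash extractor, represented here by a hypothetical function `differentiable_hash_strength`; it is not our exact TensorFlow prototype, and the step count, learning rate, and perturbation bound are illustrative.

```python
# A minimal sketch of a gradient-based attack on a differentiable fingerprinter.
# `differentiable_hash_strength` is a hypothetical smooth approximation of the
# peak/hash extraction above, not the exact model from the paper.
import tensorflow as tf

def attack(audio, differentiable_hash_strength, steps=500, lr=1e-3, eps=0.05):
    audio = tf.convert_to_tensor(audio, dtype=tf.float32)
    delta = tf.Variable(tf.zeros_like(audio))          # perturbation to learn
    opt = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            perturbed = audio + delta
            # Loss: total strength of the fingerprint hashes; driving this down
            # removes the peaks the matcher relies on.
            loss = tf.reduce_sum(differentiable_hash_strength(perturbed))
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))
        # Keep the perturbation small so the clip still sounds like the song.
        delta.assign(tf.clip_by_value(delta, -eps, eps))

    return audio + delta
```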
Our demonstration starts from the original (unperturbed) audio clip below.
White-box attacks
We start by demonstrating a white-box attack, in which the attacker knows the exact model being used for detection. We fool the fingerprinting system by perturbing the audio to remove fingerprint elements (i.e., the “hashes” recognized by the algorithm). In practice, the number of hashes needed to recognize a song depends on the precision/recall tradeoff chosen by the developers of the classifier; a rough way to estimate the fraction of hashes removed is sketched after the clips below. Here’s an audio clip with 90% of the hashes removed.
Here’s an audio clip with 95% of the hashes removed.
Here’s an audio clip with 99% of the hashes removed.
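For reference, here is a rough way one could estimate the fraction of hashes a perturbation removes, reusing the `fingerprint_hashes` sketch above. This is an illustrative measurement, not the evaluation code from the paper.

```python
# Estimate what fraction of the original hashes a perturbation removes,
# using the fingerprint_hashes sketch defined earlier.
def fraction_removed(original_audio, perturbed_audio, sr=22050):
    before = set(fingerprint_hashes(original_audio, sr=sr))
    after = set(fingerprint_hashes(perturbed_audio, sr=sr))
    surviving = before & after
    return 1.0 - len(surviving) / max(len(before), 1)
```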
Transfer attacks on the AudioTag music recognition system
We wanted to see whether our attack would transfer to black-box industrial systems. This is possible, although larger perturbations are required because our simple surrogate model is likely quite dissimilar from commercial systems (which probably rely on neural nets rather than hand-crafted features). We were able to evade detection for numerous copyrighted songs, including “Signed, Sealed, Delivered” by Stevie Wonder and “Tik Tok” by Kesha.
First, we tried attacking the online AudioTag service. As a baseline, we tried fooling the detector by adding random white noise.
Here’s a clip that contains enough random Gaussian noise to fool AudioTag:
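For completeness, here is a minimal sketch of this white-noise baseline. The signal-to-noise ratio is an illustrative parameter, not the value used for the clip above.

```python
# Add Gaussian white noise to a clip at a chosen signal-to-noise ratio (in dB).
import numpy as np

def add_white_noise(audio, snr_db=10.0, seed=0):
    rng = np.random.default_rng(seed)
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise
```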