

It’s well-known folklore that reinforcement learning with human feedback (RLHF), the standard post-training paradigm, reduces “alignment,” the degree to which a pre-trained model has learned features of reality as it actually exists. Quoting from the abstract of the 2024 paper, Mitigating the Alignment Tax of RLHF (alternate link):
LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting pretrained abilities, which is also known as the alignment tax.
I went into this with negative expectations; I recall being offended in high school that The Flashbulb was artificially sped up, unlike my heroes of neoclassical guitar and progressive-rock keyboards, and I’ve felt that their recent thoughts on newer music-making technology have been hypocritical. That said, this was a great video and I’m glad you shared it.
Ears and eyes are different. We deconvolve visual data in the brain, but our ears actually perform a Fourier decomposition with physical hardware. As a result, psychoacoustics is a real and non-trivial science, used e.g. in MP3, which limits what an adversary can do to frustrate classification or learning, because the result still has to sound like music in order to get any playtime among humans. Meanwhile I’m always worried that these adversarial groups are going to accidentally propagate something like McCollough stripes, a genuine cognitohazard that causes edges to become color-coded in the visual cortex for (up to) months after a few minutes of exposure; it’s a kind of possible harm that fundamentally defies automatic classification by definition.
HarmonyCloak seems like a fairly boring adversarial tool for protecting the music industry from the music industry. Their code is incomplete and likely never going to get properly published; again we’re seeing an industry-capture research group taking and not giving back to the Free Software community. I think all of the demos shown here are genuine, but he fully admits that this is a compute-intensive process which I estimate is going to slide back out of affordability by the end of 2026. This is going to stop being effective as soon as we get back into AI winter, but I’m not going to cry for Nashville.
I really like the two attacks shown near the end, starting around 22:00. The first attack, if genuinely not audible to humans, is likely a Mosquito-style frequency that is above hearing range and physically vibrates the components of the microphone. Hofstadter and the Tortoise would be proud, although I’m concerned about the potential long-term effects on humans. The second attack is again adversarial but specific to models on home-assistant devices which are trained to ignore some loud sounds; I can’t tell spectrographically whether that’s also done above hearing range or not. I’m reluctant to call for attacks on home assistants, but they’re great targets.
Fundamentally this is a video that doesn’t want to talk about how musicians actually rip each other off. The “tones and rhythms” that he keeps showing with nice visualizations have been machine-learnable for decades, ranging from beat-finders to frequency-analyzers to chord-spellers to track-isolators built into our music editors. He doubles down on copyright despite building businesses that profit from Free Software. And, most gratingly, he talks about the Pareto principle while ignoring that the typical musician is never able to make a career out of their art.