> Clearly with LLMs, bulletproof denials are ~impossible due to the way LLMs work.
As a scientist who repeatedly ran into the classifier-based denials: it appears Anthropic’s strategy to make denials more robust, at the cost of many false positives, was to have a separate classifier processing both input and output tokens, at an extremely simple, almost keyword-search level. One weakness of this approach is that it only catches things that use the right keywords: it is in some sense weak exactly where an LLM-based classifier would be stronger.
Work on abstract, closer-to-CS algorithms that used chemistry terminology were blocked immediately, while work directly relevant to chemistry/biology experiments, writing code to process images from a very specific microscopy setup relevant primarily to biological samples, was never blocked at all, because it happened to never use relevant keywords.
That’s consistent with this situation: finding and fixing bugs in the context of looking for bugs perhaps happened to never use words like ‘exploit’ or ‘cybersecurity’.
But you think that Anthropic of all companies would realize this, so why did they do it that way? Did they literally take the first suggestion Mythos gave them to add these guardrails - wouldn't be surprising, seeing the state of the leaked Claude Code codebase.
The reported commit [1] suggests to me that it was an account compromise of some sort, not orphan+adopt: the committer is the same in git, but the contact email changes in the PKGBUILD.
This doesn't necessarily seem 'more elaborate': it is attempting to be better obfuscated against automated checks at the cost of being very obvious to anyone doing even a cursory review of the install scripts. It's also likely something that would be caught instantly by even an extremely naive LLM, as seems to have been the case here. There's simply no legitimate reason why an install script would ever do something like this:
I'm not certain that the git committer tells you the full story. I don't believe the AUR enforces that the git commit email is the same as the current maintainer email. So this could have been an orphan package, adopted by a malicious user, generated a malicious commit with the previous maintainer's git info.
Unfortunately, I don't see a way of viewing the ownership history of a package in the AUR. I know you get emails with ownership changes if you're subscribed to a package, but I don't see this info in the web interface anywhere.
> They're packaged using the same tools and technology, with the intent that they can be easily validated and promoted to core stuff in the future.
That's perhaps the intent ideally, but in practice, it feels like AUR tends to be (a) niche, esoteric things that will never be anywhere outside of AUR, even if they could, or (b) installation methods for proprietary/otherwise non-open packages that can't be.
The latter seems to a major popular use of AUR: sorting packages by popularity or votes comes up with lists that seem to be mostly these. And that's likely a significant draw for non-technical users. If you want to install things like Dropbox, Chrome, VS Code, Minecraft, Zoom, Slack... they all show up in AUR. By their nature (usually extracting packages from upstream installation methods), they tend to be more complicated than generic AUR packages. They are also often quite a bit more convenient than using the upstream packages, which might not interface well with Archlinux, might only be available with installation methods that clobber things, might be deb/rpm only, etc.
I wonder if it would make sense to have a more trusted/vetted repository of these sorts of scripts, separate from core repositories but also not as free-for-all as AUR. That might go a long way toward keeping non-technical users from being drawn to AUR.
>`rua` and other similar CLIs make it really easy to review the packages before installing them from AUR too, and if you are doing banking on the same computer, you really have no excuse not to review the software you depend on.
What review should users do?
It appears that, in some cases, these were adding npm as a dependency and installing atomic-lockfile, and in others, these were adding bun and installing js-digest. This was a mass attack against mostly low-use/orphaned/etc packages where maintainership was taken over or a different user uploaded a new version (itself a very simple, low-notice, low-oversight process), and many of the packages clearly had no connection to Node.js at all, so a user who knew enough about each package, and knew what npm was, might notice the oddity in the package, if they reviewed every line of the PKGBUILD, then reviewed the install scripts.
But legitimate AUR packages for packages connected to Node.js also use npm, for example, and at times, use npm install. A user would have to be familiar enough with Archlinux's build system to understand the difference between each part (eg, build() vs install scripts). They'd have to review every PKGBUILD, every install script, and every patch of every AUR package they install. For packages that actually do use npm/bun, they'd have to be familiar enough to know what uses were legitimate and what uses were not, and might have to be up to date on compromised dependencies. And this is still considering a mass attack that was not particularly hidden. Attacks could be made much harder to find.
Asking a user to safely review an AUR package essentially seems like it is asking them to fully understand not just the build process, and programming language, of the upstream package, but also all details of Archlinux's build system. They need to learn how to do this with, as far as I can tell, no real guidance: AUR itself, and the wiki's page on it, just warn that users should carefully review the PKGBUILD and install scripts, without giving any substantial guidance on what to look for or how to review anything. The warnings feel much more like liability-reduction than an attempt to be helpful.
At that point, what is AUR actually offering that installing the upstream package isn't? It feels like the suggested 'safe' way of using AUR would make it just as much work for the user, and require just as much knowledge, as either installing the upstream directly, or even making a package for it.
There is perhaps some room for LLM analysis here: Opus 4.8, Kimi latest, and even Qwen3.6 27B quickly catch at least the current round of malicious packages in my tests. But a motivated attacker could make that more difficult, or dangerous. And a user could also just have those models install the upstream package, with less risk. If they want to use pacman for management, they could likely even have those LLMs generate a package, with less risk.
Not all tools are made for inexperienced people. Not everything is idiot proof. This is OK!
In my experience using the AUR:
1. when you first install the package you can read the build script (and you should). These are in a very standard structure, and if the one you are reading is weird and complicated consider not installing it. No one is forcing you to. Almost every build script I read just downloads a build from a tagged github release.
2. when you get an upgrade you are shown the diff. For almost every AUR package I use this is literally just changing the $VERSION variable and the subsequent $HASH of the download. It is trivial to see if anything (in the AUR script) is happening that is sneaky.
It's really not that scary. And if it's considered scary, there are literally dozens of other linux distros (not to mention Windows or MacOS) you could be using instead.
I'm not asking for myself. Yes, I understand the build process, and know what to check. I've also written PKGBUILDs before and have had packages in AUR. I'm sure you understand it too, as well as many people here.
But many users don't. As far as I can tell, there is very little actual guidance about what to look for, not even to the extent of what you explain here, on the wiki. Users are told to check the PKGBUILD, and warned about AUR-helpers being dangerous, but in practice, it seems AUR-helpers are widely used, and many users likely just click through PKGBUILDs they won't be able to understand.
And, again, this attack was a relatively obvious one. Other attacks could be made much harder to notice.
Worse, distributions like CachyOS are being broadly promoted to a user base who can't be reasonably expected to check over AUR packages themselves. Unlike ArchLinux, those sometimes do seem to promote AUR-helpers. In some cases, those distributions are apparently including AUR-sourced packages in their actual repositories.
Questions about these topics often result in typical Archlinux hostility. And in some sense, that's understandable: there are other distributions that most users should be using, and the frustration of people using Archlinux who shouldn't be is wearing. It is nice to have a distribution that offers the flexibility and space for experimentation that Archlinux does. It's one of the reasons I use it on some of my machines, while at the same time recommending against most others using it.
To some extent, this is just a wide cultural difficulty with Linux, and there isn't a clear answer. On one hand, you want enough gatekeeping to keep users away from potentially dangerous systems they have no interest in understanding, and that they'll rely on without understanding in situations where they shouldn't. On the other, you don't want to keep out users who are interested in learning.
> But many users don't. As far as I can tell, there is very little actual guidance about what to look for, not even to the extent of what you explain here, on the wiki. Users are told to check the PKGBUILD, and warned about AUR-helpers being dangerous, but in practice, it seems AUR-helpers are widely used, and many users likely just click through PKGBUILDs they won't be able to understand.
That's where the whole "Not everything is idiot proof" thing comes in. The distribution is pushing the responsibility on users to vet what they do, across everything, not just installing AUR packages, so naturally this also applies to installing 3rd party software.
If you don't know what to look out for, maybe don't install stuff you don't know what it will do. Sucks as an answer if the distribution is looking to "Make it as easy as possible for every user" but that's not Arch Linux ultimately, it does ask you to care about things like that, if you don't want to, it might not be the OS for you. And that's of course OK and not something bad. I know this sounds like gatekeeping, but it's more of a culture difference than anything, and probably not even a problem.
> distributions like CachyOS are being broadly promoted to a user base who can't be reasonably expected to check over AUR packages themselves
That'd suck, but not the impression I've got from CachyOS. There is a FAQ entry that seems to get the gist of AUR correct, that it's basically random software from random users, nothing is assumed safe: https://wiki.cachyos.org/cachyos_basic/faq/#aur-safety-pract...
> this is just a wide cultural difficulty with Linux, and there isn't a clear answer
I don't think "a answer" is needed here. What some read as "gatekeeping" and "Arch Linux hostility" is in reality just a difference of culture, and that's not a bad thing. Some distributions are for making things "easy for newcomers" or some focus on "best UI and UX" and others "most barebones for experienced users to setup themselves", and all of them as valid as the other. The tricky (and slow/time consuming) part is that you have to try a bunch before you find which one(s) aligns with your own perspectives and ideas.
Ultimately, users can learn best together with distributions that align with how they think and want to work.
>What some read as "gatekeeping" and "Arch Linux hostility" is in reality just a difference of culture, and that's not a bad thing.
Oddly enough, when I was writing that, I wasn't thinking about Arch, but Ubuntu. Years ago, I can remember a situation of a PPA being used for developing something I was involved in somehow, and while the PPA specifically noted that users shouldn't use it, they just did anyway, because they wanted what they saw as the latest and greatest versions of those packages. When the PPA owner added a package that set the default wallpaper to a warning about adding the PPA and updating all packages from it blindly, the users blamed them, rather than understanding the message. At the same time, I was actually using that repository legitimately, and it was useful.
So 100%, I agree that it's highly dangerous that the distro's the next tranche of people unfamiliar with linux (gamers dissatisfied with Windows) move over with, are based on hecking Arch. It feels like a massive upcoming footgun.
I think the issue is those repos being based on Arch though, not Arch itself.
To be fair, among all the linux users I know, no one except developers/cs-adjacent would actually get hit by this. The point is that "noob users" use packages that are, to put it short, maintained by a big company. Or it's something that's there in the official repos. And the big companies always maintain their own supply chain till the end, i.e they maintain their aur packages or their curl | bash endpoint themselves. So it ends up being alright.
Stuff that tinkerers use is often some random fork of a fork of a gitHub repo, maintained by someone else, and the aur package maintained by a fourth person. That's where the mess is. Thankfully, these are also the users you can expect to read a pkgbuild diff.
> At that point, what is AUR actually offering that installing the upstream package isn't?
It produces package files that pacman can use. Sure, you can curl|sh or whatever, but that's a good way to litter stuff all over that you can't track or uninstall cleanly.
The same sort of review you'd do if a stranger sends over a project and says "compile and run this" and you actually want whatever it's supposed to do, so you start looking through it.
> It appears that, in some cases, these were adding npm as a dependency and installing atomic-lockfile, and in others, these were adding bun and installing js-digest
That's very suspicious if the package you're about to install doesn't seem to actually need those things. Since "AUR === random strangers on the internet with zero trust", then you need to pay attention to those sort of things.
> Asking a user to safely review an AUR package essentially seems like it is asking them to fully understand not just the build process, and programming language, of the upstream package, but also all details of Archlinux's build system.
Yes, indeed. Same as if you come across a random C++ project on GitHub with 2 stars, do you just pull down the source and compile willy-nilly? Probably not, you carefully inspect it can actually do what you want, how it does it, and so on. AUR is basically like GitHub in this case, zero peer-reviews and users fully responsible for whatever they install.
> At that point, what is AUR actually offering that installing the upstream package isn't?
PKGBUILDs, so you don't have to write them yourself. Not more, not less, just a central place for random strangers to share PKGBUILDs that may or may not work for others.
I hear you, but consider xz. I'm a professional with decades of experience and I'd be lying if I said I'd have caught that. How long would an audit have taken, realistically? You're not wrong, but I don't think the GP is, either.
Yeah, xz found its way to official repos, that's way more disturbing and scary that this (faux) issue about malware on AUR/user-generated websites.
I don't review updates to official packages on Arch, I don't think most people have time to do so, it's just way too much. Things change when we talk about AUR though, as those aren't vetted, those you need to take the time to review, otherwise you're basically installing completely unreviewed software from strangers on the internet.
Europe does not consistently have KYC for phone service, at least for mobile connections. Normal phone companies in Ireland don't ask for information when buying SIMs (physical ones, at least). Some eSIM providers in Europe don't ask for information at all, and accept cryptocurrency payments. (I'm also aware that some other European countries have very different requirements, up to actually needing copies of identification.)
More widely, however, there do seem to be differences that I don't know the details of. VOIP seems quite different (I use it for my old phones): DID numbers in the US seem extremely cheap and available instantly, with little information, while European ones seem to have an actual verification process and prices that would make large-scale spamming difficult.
As an additional anecdote, I've never heard of a number-porting/2FA attack using social engineering or other methods in Ireland - but we have our own unique issues now with Robocalls and phishing on WhatsApp and SMS.
It appears that the blocking here is of a very different nature than for Opus. Whereas with Opus the blocks seem to be for messages it deems potentially harmful, for Fable, it appears the blocking is simply anything that falls within "topics related to cybersecurity, biology and chemistry, or distillation attempts".
So yes, straightforward biology work will get blocked, because the intention is that any biology work should get blocked. As a scientist, this is perhaps the most useless model I've ever tried.
Different EU countries seem to heavily vary on this point. I’ve seen everything from requirements for id scans and addresses to esims that accept cryptocurrency as payment.
The safety gates on this are extreme, and seem considerably wider than "cybersecurity and biology"; they seem to make it essentially unusable for scientists in a number of fields. I have, so far, been bumped back to Opus on 100% of my prompts.
It appears it can be tripped by things as simple as a mention of equilibrium, or anything involving something that looks like chemical kinetics, even at an abstract level. Even touching basic open source packages in my field will trigger it.
Edit: looking at the model card, it appears that chemistry in its entirety is also included in the banned topics; it's just the announcement that mentions only cybersecurity and biology. It also appears that the intent is to ban chemistry and biology entirely, rather than just banning messages deemed high risk.
This does surprise me, because you'd think that even if they crank up the filter's sensitivity at the expense of specificity, an LLM company wouldn't simply design a filter that triggers on keywords in a completely unrelated context.
Smart classifiers are slow and susceptible to jailbreaking themselves, dumb classifiers are fast but dumb so they need to be either overzealous or useless. Same story as with Gemini's guardrails.
Can you share an example? I've been happily using Fable this afternoon and it just seems like the usual upgrade so far with no interruption to my (fairly standard) SWENG problems.
Basically anything that could potentially make money besides software work seems to be banned.
Software work has actual competitors, and the biggest hypemakers for Anthropic are part of this group so it makes sense to allow it despite them losing money from it.
I've got experience in medicine and finance so I've tried even the mildest biology/medicine and it doesn't give anything, math heavy finance seems to be included in the cybersecurity?
>For example, Ben Franklin didn't consider Germans "white" (famously describing them as "swarthy" [1]). Through different waves, different European ethnic groups became "white". The Irish didn't become "white" in the US until 100-150 years ago. Arabs became "white" in 1915 [2].
To be a bit pedantic, you're combining two different senses of 'white' there: in the US, being culturally considered white by others was distinct from being legally considered white by the government, because from the Naturalization Act of 1790 onward, naturalization was legally limited to "free white persons" (subject to some later modifications, though "free white person" remained a category). From a legal perspective, the Irish (and Germans) were always considered white, and there was never any question (if I recall correctly, for Mexicans, there was at some time advice that legal whiteness should actually be determined based directly on skin color). From a cultural perspective, views could of course be more varied, and there's the complexity that views and discrimination could certainly be based on factors other than seeing people as white or non-white, even at a racial level for the types of people who made distinctions within whiteness. And of course, cultural views remain varied: I've been around some very waspy wasps in the US who, knowing my Greek name, probably didn't consider me to be entirely white.
By the time of the case you cite, the legal question more broadly had become a mess of different and sometimes contradictory decisions, in part because, when it actually needed to be litigated, 'scientific racism' and 'common knowledge' could go in very different directions, as one might expect from something so arbitrary and contrived. Dow was decided on scientific racism lines, and made some Arabs white (particularly Syrians), though there were later cases with the opposite determination, in part because Dow was just Fourth Circuit, but also at times including arguments like 'from a place not bordering the Mediterranean'. Thind, on the other hand, was determined on common knowledge lines, and can bizarrely be summarized as 'being Aryan and very racist does not make a person white' (the entire set of pre-Thind cases around Indians, actually, often have wild and arbitrary decisions; I recall that one lower court decision could be summarized as 'this scientific argument seems dubious, but the guy seems like an upstanding character, so it seems fair to say he's white').
As a scientist who repeatedly ran into the classifier-based denials: it appears Anthropic’s strategy to make denials more robust, at the cost of many false positives, was to have a separate classifier processing both input and output tokens, at an extremely simple, almost keyword-search level. One weakness of this approach is that it only catches things that use the right keywords: it is in some sense weak exactly where an LLM-based classifier would be stronger.
Work on abstract, closer-to-CS algorithms that used chemistry terminology were blocked immediately, while work directly relevant to chemistry/biology experiments, writing code to process images from a very specific microscopy setup relevant primarily to biological samples, was never blocked at all, because it happened to never use relevant keywords.
That’s consistent with this situation: finding and fixing bugs in the context of looking for bugs perhaps happened to never use words like ‘exploit’ or ‘cybersecurity’.
reply