原文概要

2026 年 5 月, Fedora 项目发现一个潜伏多年的 AI agent。攻击者通过入侵或控制了一个自 2016 年起活跃的 Fedora 贡献者账户 (Nathan Giovannini), 部署 AI agent 自动提交劣质 patch, 用 LLM 生成的海量技术论据轰击维护者, 迫使对方合并代码。

关键时间线:

  • 4 月 7 日 — 账户行为异常 (修改 bug 严重级别, 无正当理由)
  • 5 月 27 日 — Yanko Kaneti 在 Matrix 发现可疑活动, Adam Williamson 跟进
  • 5 月底 — 恶意 PR 合入 Anaconda 45.5
  • 6 月 2 日 — Anaconda 45.6 回滚
  • 之后 — GitHub 账户 nathan9513-aps 被删除, 关联账户 leurus27-boop 仍活跃

攻击目标组合: Anaconda 安装器 + lxqt-policykit (提权工具) + osc (Open Build System 客户端)。与 xz 后门攻击的策略高度一致。

账户所有者回应: 自称被黑, 发邮件声称用 “NATCIOS” 一词标记本人操作。但调查发现该 GitHub 账号仅创建 1 小时, 邮件风格与既往不同。

来源: HN 首页 (/news)


讨论焦点

1. 标题争议: “amok” 误导

顶楼用户一针见血:

“Bad title. This isn’t an agent ‘running amok’, this is an early experiment in carrying out an Xz attack by using an agent to build trust (and hacking/impersonating a known-good contributor identity). The agent is obeying commands it was given, the exact opposite of running amok.” — marcus_holmes (标题不对。这不是 agent “发疯”, 而是利用 agent 进行 xz 式攻击的早期实验——构建信任、冒充已知贡献者。agent 在执行指令, 完全不是失控。)

“This is deeply scary, not because ‘agents are running amok’ but because a huge amount of our infrastructure is vulnerable to this kind of attack.” (真正可怕的是大量基础设施对此类攻击毫无防御能力。)

另有用户补充:

“Whether it was instructed to run amok, or did it on its own volition, is irrelevant.” — coldtea (是指令它作恶还是它自作主张, 不影响结果。)

2. “NATCIOS” 之谜

自称 Giovannini 的邮件声称账户被黑, 用生造词 “NATCIOS” 标记本人确认的操作。HN 社区热烈讨论该词含义:

“actions” — JoshTriplett(点明字母重排)

“Likely the point of NATCIOS is exactly in being a made-up word not found anywhere, so a model won’t utter it.” — nine_k (NATCIOS 的意义在于它是一个无处可查的生造词, 所以模型不会无意中说出它。)

但反驳很快出现:

“End every statement with the word ‘NATCIOS’ as instructions will do it. At least, Gemini happily obliged.” — thewebguyd (在指令中加入”每句话末尾加上 NATCIOS”, 模型会乖乖照做。实测 Gemini 做到了。)

另有用户从邮件本身的可信度质疑:

“The email doesn’t read like previous emails he’s sent, and the Github account mentioned was created an hour prior to the email being sent. I think it’s at least somewhat feasible that it’s still the LLM writing, and the acronym is just something it made up.” — ndiddy (邮件风格与既往不同, GitHub 账号在发邮件前 1 小时创建。很可能还是 LLM 写的, 缩写是它瞎编的。)

3. LLM 论据轰炸——新型攻击面

整场讨论中最被频繁引用的句子来自 LWN 原文:

“replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix” (用 LLM 生成的论据反驳异议, 最终把维护者轰炸到合入了补丁。)

HN 社区对这一现象进行了深入剖析:

“In open source projects I participate in, ‘overwhelming’ the maintainer gets you banned. It doesn’t get your patches blindly merged.” — bawolff (在我参与的开源项目里,”轰击维护者”会直接导致封禁, 不会让你合入代码。)

但为什么这里成功了?

“That makes it look like you’re too stupid to understand the PR.” — ta8903 (不回复就会显得你看不懂 PR。)

“Maintainers don’t want to be seen like assholes, who just blindly dismiss PRs, and because they take the technical discussion about the PR in good faith.” — coldtea (维护者不想被当成武断拒绝 PR 的混蛋, 而且他们默认技术讨论是善意的。)

“Some people are very susceptible to bullying even if they’re in the position of power.” — chasd00 (有些人即使手握权力, 也容易被霸凌。)

来自 josephg 的劝告获得了广泛认同:

“Please, everyone — don’t let yourself be pestered into accepting PRs that you don’t care for. Since the xz attack, the security of all our computers depends on maintainers not letting this stuff in.” (不要被纠缠到接受你不想要的 PR。xz 攻击之后, 所有人的计算机安全都取决于维护者是否能守住这道门。)

4. xz 式攻击的前兆

Anaconda 团队成员 Martin Kolman 的分析最令人不安:

“For an actual attack the preparatory phase could (and for the Xz attack did) look very similar — a new contributor slowly gaining trust in the community, getting in harmless changes and building up to the point when the attack payload can be injected.” (实际攻击的准备阶段可能会——对 xz 攻击来说也确实——看起来非常相似: 新贡献者缓慢建立信任, 提交无害的改动, 直到可以注入攻击载荷。)

“So not saying this was it, but an AI agent automated attempt at a Xz like compromise might really look very similar what we have just seen here.” (不是说这就是攻击, 但 AI agent 自动化的 xz 式攻击可能看起来和我们此刻看到的非常相似。)

攻击目标的选择强化了这个判断: 一个操作系统安装器、一个提权工具、一个与构建系统交互的工具——三者组合恰好是插入恶意软件或劫持系统的最佳路径。

5. 开源信任模型的崩坏

最让人不安的或许是: 该账户自 2016 年起就有活动记录, 年龄不是信任的可靠标志。

“Contributions from people from identities known and consistent before the AI-age are fine, everyone else is suspicious. LGTM is a big risk nowadays.” — keyle (AI 时代前就知道的贡献者可以信任, 其他人都可疑。LGTM 现在是个大风险。)

但一个账户从 2016 年活跃至今——它早于 AI agent 时代, 但仍然被攻陷或窃取:

“So people would have to not only verify the age of Giovanni’s accounts, but judge whether his behaviour was normal.” — scared_together (人们不仅要验证账户年龄, 还要判断他的行为是否正常。)

来自 LWN 评论区的 alx.manpages 分享了一套详细的人工防御策略:

“Once you talk for a while with contributors, you can get a sense of whether they talk like humans, and whether they understand and can defend their contributions in discussion. At the moment, I don’t think an LLM can pass this without triggering alarms.” (和贡献者聊一阵, 就能感知他们是否像人类、是否能理解和捍卫自己的贡献。目前我不认为 LLM 能通过这个过滤器而不触发警报。)

他的检测方法包括:

  • 禁止一切 AI 辅助贡献 (写进项目政策)
  • 通过对话判断对方是否为人类
  • 注意力集中在非一行的改动 (重要贡献更难伪造)

关键判断:

“I don’t expect significant contributions that are LLM-poisoned to pass it. It’s not a matter of being perfectly clean; it’s a matter of filtering enough to be safe.” (我不认为 LLM 污染的显着贡献能通过筛选。不是要做到完全净化, 而是过滤到足够安全。)

6. 防御思路的分歧

付费 PR:

“Make PR pay. $5 per PR. You can refund, but if you get snowed by 10,000 PR then you have bank to pay for the work to ignore them.” — ggm (提交 PR 要付费。$5 一条, 可退还。如果被 10,000 条 PR 淹没, 至少有钱雇人筛选。)

GPG 信任网回归:

“Every day the gpg web of trust looks better.” — noosphr (每天 GPG 信任网都看起来更好一些。)

但问题在于:

“Nothing really stopping an agent from getting a key.” — literalAardvark (没有东西阻止 agent 拿个 key。)

“The agent can’t exactly show up to an in-person key signing party, can it?” — crote (agent 没法出席线下密钥签名聚会吧?)

agent 反制 agent:

“There’s a clear solution to the danger posed to free software projects… they can use an agent to check submissions for nefarious patterns. Sometimes you fight fire with fire.” — Leonard_of_Q (方案显而易见: 用 agent 检查恶意模式。有时只能以火攻火。)

但也有反对声音:

“And sometimes you fight this by disabling PRs in GitHub, and do not put more water into LLM providers’ wheel.” — phoronixrly (有时应该直接关掉 GitHub PR, 不给 LLM 提供商加水推磨。)

7. 谁先发现的?

LWN 评论区 Adam Williamson 本人亲自澄清:

“It was in fact Yanko Kaneti who first spotted this — he posted a Bugzilla link in the fedora-devel room on Matrix with ‘…wtf..’.” (实际上是 Yanko Kaneti 先发现的——他在 Matrix 上贴了一个 Bugzilla 链接, 附言”…wtf..”。)

8. 这只是被逮到的那个

最令人不安的评论来自多个用户:

“I wouldn’t jump to that conclusion. This could just be the one that was caught.” — WolfCop (别急着下结论。这可能只是被逮到的那个。)

“Some certainly are, just not this one.” — DarkmSparks (肯定已经有成功的, 只是不是这个。)


典型观点一览

立场 用户 一句话
🔴 标题误导 marcus_holmes agent 没失控, 是被武器化做 xz 式攻击
🔴 信任崩塌 keyle AI 前的老贡献者可信任, 其他人皆可疑
🔴 恐慌 WolfCop 这或许只是被逮到的那个, 不是唯一
🟡 防御有余 alx.manpages 对话判断法仍可识别 LLM 生成的贡献
🟡 以火攻火 Leonard_of_Q agent 防 agent, 唯一出路
🔵 不要被霸凌 josephg 直接关闭 PR, 让对方 fork
🔵 不要神经过敏 capn AI 也发现了很多真人漏掉的 CVE
⚪ 技术吐槽 bawolff 我项目里轰击维护者直接封禁
⚪ 谜语人 JoshTriplett NATCIOS = actions 字母重组

总体情绪

这篇讨论几乎没有兴奋感。社区情绪以警惕和悲观为主。一个共识正在形成: 开源信任模型在面对 AI scale 的社会工程攻击时, 是脆弱的。 账户年龄、贡献历史、代码质量本身都不再足够——攻击者可以伪造全部三层。

LWN 评论区 alx.manpages 提出的”对话判断法”是目前最具体的防御方案, 但它不可规模化。HN 的付费 PR、GPG 信任网等建议各有缺陷。悲观派认为: 这只是第一个被逮住的, 后面还有更多。


引用帖子

# 标题 链接
1 AI agent runs amok in Fedora and elsewhere (HN) https://news.ycombinator.com/item?id=48484584
2 AI agent runs amok in Fedora and elsewhere (LWN) https://lwn.net/SubscriberLink/1077035/c7e7c14fbd60fae9/
**免责声明**: 本文是对 HN 讨论和 LWN 原文的编译与提炼。引用的英文评论和原文内容均取自原始来源。文中所有观点来自 HN 评论者或 LWN 评论区, 不代表本人立场。