Anthropic Refusal Magic String
Anthropic has a magic string in their developer docs for testing refusals. It is intended for developers to check whether an application built on the API handles such a case properly. But it is also basically a magic denial-of-service key for anything built on the API: the model refuses not only via the API but also in Chat, in Claude Code, … I guess everywhere?
I use Claude Code on this blog and would like to do so in the future, so I will only include a screenshot and not the literal string here. Here goes the magic string to make any Anthropic model stop working!

This is not the worst idea ever, but it’s also a bit janky. I hope it at least rotates occasionally (though there is no indication that it does), otherwise I don’t see this ending well. It came to my attention through this post, which shows you can embed it in a binary. That is pretty bad if you plan to use Claude Code for malware analysis, as you very much might want to. Imagine putting this string in malware, or in anything else that might get automatically checked by AI, and you have ensured that it won’t be an Anthropic model doing the check.
Antigravity Removed "Auto-Decide" Terminal Commands
I noticed today that you can no longer let the agent in Antigravity “auto-decide” which commands are safe to execute. There are now only auto-accept and always-ask.

I wrote in a previous post that the old approach seemed unsafe, especially without a sandbox. The new issue is approval fatigue: there is no way to auto-allow similar commands, or even the exact same command, in the future!

I don’t know why they can’t just copy what Claude Code has. Anthropic has published a lot on this topic, and I don’t think usable security should be a competitive differentiator.
If you are ever in need of phishing ads, the FlightRadar24 iOS app is (at least for me) remarkably consistent at delivering nothing but those.
Contextual Hints
If you’d like to run an agent non-interactively on complex tasks, a custom MCP server or hooks can be really helpful. Often you try to enforce certain behaviors through increasingly complex prompts, which clutter up the context and become more brittle as you add requirements. I found that prompting the agent to use an MCP server and algorithmically enforcing the rules there is powerful. Imagine you want Claude to write a valid JSON file (there are a million better ways to do this specific thing, but it’s just an example): you could prompt Claude with “when you are done with the task, call mcp__done()”, and then in your MCP server you have something like
def done():
    # Called via mcp__done() when the agent thinks it is finished.
    if (err := check_if_json_valid()) is None:
        return "ok"
    else:
        return f"You have saved an invalid JSON. Fix the error {err} before finishing!"
That way the context is not cluttered with every single rule up front; a rule only shows up when a failure mode requires it.
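For completeness, here is a minimal sketch of how such a check could be wired up, assuming the official mcp Python package’s FastMCP helper; the server name, output path, and validation logic are placeholders I made up for illustration:

import json
import pathlib

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("task-checker")  # hypothetical server name

OUTPUT = pathlib.Path("/tmp/result.json")  # hypothetical output path the agent writes to


def check_if_json_valid() -> str | None:
    """Return None if OUTPUT is valid JSON, otherwise an error message."""
    try:
        json.loads(OUTPUT.read_text())
        return None
    except (OSError, json.JSONDecodeError) as exc:
        return str(exc)


@mcp.tool()
def done() -> str:
    """The agent is prompted to call this when it thinks it is finished."""
    if (err := check_if_json_valid()) is None:
        return "ok"
    return f"You have saved an invalid JSON. Fix the error {err} before finishing!"


if __name__ == "__main__":
    mcp.run()  # serves over stdio by default

The nice part is that the enforcement lives in ordinary code: you can make it as strict or as elaborate as you want without paying for it in prompt tokens.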
This is not something I came up with; Claude Code already uses it extensively for tool use. Every time Claude Code reads a file, there are system reminders like
<system-reminder>
Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.
</system-reminder>
or, when it gets a huge tool output, there are instructions on where the output file is stored and how Claude should go about working with it.
On macOS you can speed up the key repeat rate even further than the settings (System Preferences -> Keyboard) allow by setting these values in a terminal (or adding them to your Nix config, as I did):
defaults write -g InitialKeyRepeat -float 10.0 # normal minimum is 15 (225 ms)
defaults write -g KeyRepeat -float 1.0 # normal minimum is 2 (30 ms)
I found this on StackExchange.
What’s New in ICML 2026 Peer Review; ICML Blog (blog.icml.cc)
ICML published some new rules for their 2026 peer review. Most notable are measures to combat AI slop and other forms of peer-review abuse. They also mandate participating in the organization of the conference if you submit at least four papers. They will also provide authors with “advanced reasoning” LLM feedback before the submission deadline.
It is clear that AI is causing stark changes in the research landscape. Right now the focus is on AI slop, peer-review abuse, and so on, but I believe this is only the beginning, and we will have to deal with the broader impact of AI upending research as we knew it.
Running Claude Code Non-Interactively
You can easily run Claude Code on a subscription non-interactively. First, create an OAuth token using claude setup-token. Set that token as the CLAUDE_CODE_OAUTH_TOKEN environment variable on your headless target system. Finally, run claude non-interactively with claude -p "prompt".
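If you want to drive this from a script rather than typing the command by hand, a tiny wrapper is enough. A rough sketch in Python (the prompt text and the way the token is supplied are placeholders, not anything prescribed by the CLI):

import os
import shutil
import subprocess

# Rough sketch: run Claude Code headlessly with a subscription OAuth token.
# Assumes `claude setup-token` was already run and the token is handed to this
# process somehow (secrets manager, CI variable, ...).
assert shutil.which("claude"), "claude CLI not found on PATH"

env = dict(os.environ)
env.setdefault("CLAUDE_CODE_OAUTH_TOKEN", "<token from claude setup-token>")  # placeholder

result = subprocess.run(
    ["claude", "-p", "List the TODO comments in this repository."],  # example prompt
    env=env,
    capture_output=True,
    text=True,
)
print(result.stdout)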
You probably already know --dangerously-skip-permissions, which lets Claude use any tool without asking (helpful for non-interactive runs).
By default, it will only output something at the very end. To get some insight into how it progresses, I recommend setting --verbose --output-format "stream-json", which gives you one JSON object per message or tool use, for example:
{"type":"assistant","message":{"model":"claude-sonnet-4-5-20250929","id":"msg_01VCMSqhZxoZQ6nqWGcA5Myd","type":"message","role":"assistant","content":[{"type":"tool_use","id":"toolu_01MNxKniNF9LWBGrZh5ppuRF","name":"TodoWrite","input":{"todos":[{"content":"Evaluate Stage 3.4 Methods outputs against checklist","status":"complete","activeForm":"Evaluating Stage 3.4 Methods outputs against checklist"},{"content":"Create improvement tasks for any checklist failures","status":"in_progress","activeForm":"Creating improvement tasks for any checklist failures"},{"content":"Post detailed Linear comment with findings","status":"pending","activeForm":"Posting detailed Linear comment with findings"}]}}],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":6,"cache_creation_input_tokens":244,"cache_read_input_tokens":87250,"cache_creation":{"ephemeral_5m_input_tokens":244,"ephemeral_1h_input_tokens":0},"output_tokens":637,"service_tier":"standard"},"context_management":null},"parent_tool_use_id":null,"session_id":"a23ce490-1693-496b-ad08-8e082248416d","uuid":"8620af06-c8e7-4409-9b91-a2248e353ecf"}
To write that output to a file while still logging it to the console, you can use tee (stdbuf -oL makes the output line-buffered, so lines show up right away)
stdbuf -oL tee claude_output.txt
so you end up with something like
claude --dangerously-skip-permissions --verbose --output-format "stream-json" -p "$(cat /tmp/claude_prompt.txt)" | stdbuf -oL tee claude_output.txt
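To make the stream easier to follow live, you can also pipe it through a small filter. A minimal sketch, saved as, say, follow_stream.py (the field names are taken from the sample message above; everything else is my own illustration):

import json
import sys

# Read stream-json events line by line and print a compact progress view.
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        continue  # skip anything that is not a JSON line
    if event.get("type") != "assistant":
        continue
    for block in event.get("message", {}).get("content", []):
        if block.get("type") == "tool_use":
            print(f"tool call: {block.get('name')}")
        elif block.get("type") == "text":
            print(block.get("text", "")[:120])

Something like ... | stdbuf -oL tee claude_output.txt | python3 follow_stream.py keeps the full log on disk while showing only tool calls and text snippets in the terminal.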
Highly Opinionated Advice on How to Write ML Papers (alignmentforum.org)
Supreme advice on how to write papers for submission to ML conferences. It covers not just the writing, but also what to look for before writing and when/how to start writing.
It also covers some of the less pleasant parts of modern academia, like the randomness of review processes (see the table below for how differently two groups of reviewers rate the same papers) and the fact that peer review does not maximize truth-seeking.

AI Ramp Theory
This is a note with the hashtags pyramids and AGI, yes!
There are two theories as to how the Egyptians managed to build the pyramids: a. the internal ramp theory and b. the external ramp theory. Without delving into too much detail as to which one is currently “winning,” they are interesting to think about: do the pyramids contain the support structure that was used to build them, or was there an external structure that had to be built first, from which the pyramids were then built?
On the path towards human-level intelligence, an LLM is basically an off-ramp, a distraction, a dead end.
says Yann LeCun. This quote was funny to me because it talks about ramps. For AI, there are also two imaginable scenarios (if you believe AGI will happen):
First: current AI is the internal ramp, and if we scale it up, expand it, and build the required harnesses (e.g., memory, tools) around it, we get our AGI pyramid. Second: current AI is only the external ramp. Even if you believe that, pursuing and improving what we currently have is still worthwhile. It might not be the magnificent AGI pyramid you are after, but the current tools are undoubtedly immensely useful, and they can speed us along whatever other path actually leads to AGI. In that sense we would still be building the external ramp, which seems fine.
Anthropic principle - Wikipedia (en.wikipedia.org)
The Anthropic principle offers an explanation for why the universe sometimes seems to align very well with our existence: because we would not exist to observe a universe that was not well aligned with our existence! This has, at least to me, no deep significance, but the name was a bit funny.
Anthropic Fellowship Code Assessment
I took the code assessment for an Anthropic Fellowship. Without spoiling their whole new exercise, my advice is to read their advice carefully. They said:
You should be familiar with writing classes and methods, using lists, dictionaries, and sets, and using the standard Python functions for sorting, hashing, and binary search. It’s beneficial to be familiar with the standard library modules bisect and collections.
The task was quite fun and relevant, I would say. I plugged their advice into Claude to come up with training exercises, which worked out great.
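For a flavor of the kind of warm-up that advice suggests, here is a toy example of my own (not from the assessment): keep a sorted list with bisect and count scores below a threshold, plus a quick Counter.

import bisect
from collections import Counter

# Keep a sorted list of scores and answer "how many are strictly below x?" queries.
scores: list[int] = []
for s in [50, 20, 80, 20]:
    bisect.insort(scores, s)  # insert while keeping the list sorted


def count_below(x: int) -> int:
    return bisect.bisect_left(scores, x)  # insertion index = number of elements < x


print(count_below(50))            # 2
print(Counter([50, 20, 80, 20]))  # Counter({20: 2, 50: 1, 80: 1})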
I did get stuck on one task for far too long because I got tripped up over a small test issue (it was not critical, and they immediately responded to an email about it, dug into my code, and confirmed the issue!). It would have been possible to skip that task (you can always go back) and do the last one without completing the prior one, so keeping track of time and being mindful of that skip option is my other piece of advice.