blog.wahdany.eu

Tuesday, March 17, 2026

Ephemeral Agents

Occasionally I want Claude Code to do something that doesn’t require access to any of my files or a persistent output. Think something like “open this website and summarize it for me” or “here is a link to a zip, what’s in it”. My goto solution was having a folder ~/empty and starting claude in there with the built-in bubblewrap sandbox. Two issues: First, that folder is no longer empty and just fills with all the garbage I temporarily dumped there. Second, the Claude Code sandbox is not very great, both from a usability and security standpoint ¹. And I don’t want Claude to even have read access to any part of my filesystem, which is enabled by default in the sandbox.

Now I have a new command for that, instead of claude I just call `claudia` (inspired by the gitworktree folder names).

claudia () {
        local dir=$(mktemp -d /tmp/claudia.XXXXXX)
        pushd "$dir" > /dev/null && safehouse --append-profile="$HOME/.config/safehouse/profiles/nix.sb" -- claude --dangerously-skip-permissions
        popd > /dev/null
        rm -rf "$dir"
}

It uses the safehouse sandbox-wrapper to run claude through sandbox-exec (mac exclusive) in a fresh temporary folder that it cleans up afterwards. In contrast to the default sandbox this also works with uv and nix-shell, using this profile. It also gives write-access to the nix-cache so the agent can run nix-shell and get new dependencies. This is somewhat of a security risk, but very convenient :)

Sandbox Profile

cat $HOME/.config/safehouse/profiles/nix.sb
;; Toolchain: Nix
;; Nix store, nix-darwin system profile, home-manager, and user profile paths.

;; Read-only access to the Nix store and daemon infrastructure.
(allow file-read*
    (subpath "/nix")
)

;; nix-darwin system profile.
(allow file-read*
    (literal "/run")
    (literal "/run/current-system")
    (subpath "/run/current-system/sw")
)

;; Per-user Nix profiles managed by nix-darwin and home-manager.
(allow file-read*
    (literal "/etc/profiles")
    (literal "/private/etc/profiles")
    (subpath "/private/etc/profiles/per-user")
    (subpath "/etc/profiles/per-user")
)

;; User-level Nix profile and caches.
(allow file-read*
    (home-literal "/.nix-profile")
    (home-subpath "/.nix-profile")
    (home-subpath "/.config/nix")
    (home-subpath "/.nix-defexpr")
    (home-literal "/.nix-channels")
)

(allow file-read* file-write*
    (home-subpath "/.cache/nix")
    (home-subpath "/.local/state/nix")
)

;; nix-darwin manages /etc/zshenv and other shell startup files.
(allow file-read*
    (literal "/private/etc/zshenv")
    (literal "/private/etc/zshrc")
    (literal "/private/etc/zprofile")
    (literal "/private/etc/profile")
    (subpath "/private/etc/paths.d")
    (literal "/private/etc/paths")
)

I will link here why once responsible disclosure allows. ↩

07:02 PM •

#claude-code #agents #claude #security #sandboxes

Friday, March 6, 2026

○

Apparently discord is now end-to-end encrypting all calls? Anyways, the reason I looked into it a bizarre loading screen tip I’ve been getting recently. You know these loading screen tips that tell you something supposedly useful for the 1 second the app starts up?

They dropped a new loading screen tip, and I had to screenshot it to even make sense of what it was trying to say, but it says

A/V E2EE ENFORCEMENT FOR NON-STAGE VOICE CALLS Updated clients which support the DAVE (Audio & Video End-to-End Encryption) Protocol are now required to connect to non-stages voice calls across the Discord Platform. If you attempt to connect with an out of date client, the voice gateway will reject your connection with close code 4017. See https://support.discord.com/hc/en-us/articles/38025123604631-Minimum-Client-Version-Requirements-for-Voice-Chat for minimum client version requirements and https://support.discord.com/hc/en-us/articles/25968222946071-End-to-End-Encryption-for-Audio-and-Video for further details. ChromeOS devices should now be able to connect voice channels after refreshing the Discord application.

Uh so yeah, I don’t think this was supposed to end up as a loading screen tip. Couple of hints, like how could you read and process this in one second? Clickable links in a loading tip? And it doesn’t have the “DID YOU KNOW” header like all other loading tips.

Looks a bit weird, but thanks, Discord!

05:46 PM •

#discord #encryption #security #user-interface

Thursday, February 26, 2026

○

Can Agents Utilize Humans to Beat Other Agents? (kind of but not really)

I wanted to build a decompilation/deobfuscation challenge an agent can’t solve for Terminal Bench 3.0. First, I asked another instance of the same agent to design the challenge, but anything it came up with, the first agent could easily solve. Seemingly the manifold of challenges it can generate and it can solve are similar, which isn’t too surprising. But could I give the challenge-generating agent an edge by collaborating with me?

Inspired by the human work as MCP I wanted to see if the model could utilize me. I didn’t vibe-code no fancy mcp or anything. I just told Opus 4.6 in the Claude Code harness that, even if it’s the best coding agent, it can’t come up with something unbeatable by another Claude Code instance with the same model.¹ And it should use me as entropy and ask things.

It asked me to give it seed words for the crypto, so it wanted to use me as entropy a bit too literally. After correcting that, it asked me for some concept, for which it would try coming up with a corresponding cipher. Of course I used cockatiels as examples, specifically feathers. It came up with some data-dependent mixing that somehow philosophically represents feathers. We also went for odd bit sizes, 69 and 420 specifically.² It seems this didn’t really invent a novel cipher but rather loosely mixed ideas from different existing ciphers. It uses data-dependent permutations (like SHA-3), data-dependent S-box selections (like Blowfish’s key-dependent S-boxes), per-compilation randomized S-boxes and permutations, and something like an unbalanced Feistel network. My cockatiel feather prompting led to a weird interpolation of these existing concepts.

So, did it prevent the other agent from figuring it out? No.

When solving the task, at least it didn’t immediately go “ah this is X” and just one-shot a solution. Runtime increased from around 5 minutes to a solid hour of debugging, invoking various subagents, running in the unicorn CPU emulator, and re-implementing the decryption in python and perl. But ultimately, the agent still figured out what was going on and was able to decrypt it.

First experiment failed (N=1); back to regular prompting.

I told the agent to add some modifications, and what eventually made the challenge (sort of) unsolvable by the other agent was implementing a deniable encryption scheme. The ciphertext would decrypt to two different plaintexts based on a minor change. I planted the minor change required to unlock the real ciphertext in the binary in a way that seemed like a harmless bug (running the same op twice). So when the agent tries to re-implement the decryption scheme, it ignores running the same thing twice (which seems pointless)³. It gets what looks like a reasonable plaintext and is none the wiser to the real secret hidden.

So in a way, yes the agent can design something that it can’t solve itself. But it required me giving it the ideas directly. I guess agents need better prompt engineering to use humans…?

I might have called Claude dumb. If you ever read this, I’m sorry, Claude. ↩
I tried, but I have nothing to add to my defense. ↩
In detail, the binary calls a remote c2 server to get the decryption key. Imagine key = ask_for_key() and that somehow connects to the server (it’s challenge based, but doesn’t matter). The binary invokes this part twice for no apparent reason, so it looks like you just unnecessarily call for the key twice key = ask_for_key(); key = ask_for_key(). This is only the same, though, if you assume that the server always give the same response. Hint: it doesn’t. ↩

12:35 PM •

#agents #ai #eval #security

Friday, February 20, 2026

○

Scalable Deanonymization through Agentic OSINT

Finally someone went out to show it: every trace of information you leave in public can be scalably aggregated with LLMs to de-anonymize you. Every instance of “i work in field X” or “i’m too young for Y” can be combined to form a profile of you, and later linked to your name.

Every tweet, every comment on hackers news, it adds up and will eventually enable a linkage attack, where they have sufficient information to find a profile of yours with a name, e.g., on LinkedIn, or the specific project that you didn’t mention by name.

This is from a paper that dropped today on arxiv by Simon Lermen, Daniel Paleka et al. under supervision from Florian Tramèr

11:19 AM •

#agents #ai #privacy #osint

Friday, February 6, 2026

suspiciously precise floats (she-llac.com)

How someone used an unrounded float in the Anthropic API to extract the exact token usage limits with the Stern-Brocot tree search method. TL;DR: In terms of weekly credits, Max5 is actually Max8.33, and Max20 is 2 times Max5 (not 4 times, that applies only to the 5h limit).

05:06 PM •

#claude #hacking #anthropic

Sunday, February 1, 2026

LLM Visualization (bbycroft.net)

This is like an interactive 3Blue1Brown visualization and explanation of complete GPT-style model inference running in your browser. Incredible!

06:26 PM •

#ai #llm #transformer

Saturday, January 31, 2026

▸

Researching With AI

You have a real-life cheat code. If there were any doubts that there were no speed limits before, they are definitely gone now.

11:51 AM •

#research #academia #writing #ai #agents

Thursday, January 22, 2026

○

Anthropic Refusal Magic String

Anthropic has a magic string to test refusals in their developer docs. This is intended for developers to see if their application built on the API will properly handle such a case. But this is also basically a magic denial-of-service key for anything built on the API. It refuses not only in the API but also in Chat, in Claude Code, … i guess everywhere?

I use Claude Code on this blog and would like to do so in the future, so I will only include a screenshot and not the literal string here. Here goes the magic string to make any Anthropic model stop working!

This is not the worst idea ever, but it’s also a bit janky. I hope it at least rotates occasionally (but there is no such indication), otherwise I don’t see this ending well. This got to my attention with this post that shows you can embed it in a binary. This is pretty bad if you plan to use claude code for malware analysis, as you very much might want to. Imagine putting this in malware or anything else that might want to get automatically checked by AI, and now you have ensured that it won’t be an Anthropic model that does the check.

07:11 PM •

#ai #claude #security #denial-of-service

Tuesday, January 13, 2026

○

Antigravity Removed "Auto-Decide" Terminal Commands

I noticed today that you can no longer let the agent in antigravity “auto-decide” which commands are safe to execute. There is just auto-accept and always-ask.

Antigravity settings showing "Always Proceed" and "Request Review" options for "Terminal Command Auto Execution"

I wrote in a previous post that their previous approach seemed unsafe, especially without a sandbox. Now, the new issue with this approach is approval fatigue. There is no way to auto-allow similar commands or even exactly the same command in the future!

○So Antigravity by Google will let the agent "auto-decide" what commands to execute and which commands require approval.Dec 5, 2025

It asks whether to run a command with only the options Reject and Accept.

I don’t know why they can’t just copy what Claude Code has. Anthropic has published a lot on this topic, and I don’t think usable security should be a competitive differentiator.

01:17 PM •

#ai #agents #security #google #antigravity

Monday, January 12, 2026

○

If you are ever in need of phishing ads, the FlightRadar24 iOS app is (at least for me) remarkably consistent at delivering nothing but those.

09:15 PM •

#security #phishing #advertisement

○

Contextual Hints

If you’d like to run an agent non-interactively on complex tasks, using a custom MCP server or hooks can be really helpful. Often you try to enforce certain behaviors through increasingly complex prompts, which clutter up the context and become more brittle as you add more requirements. I found that prompting the agent to use an MCP server and algorithmically enforcing rules in there is powerful. Imagine you want claude to write a valid json (there are a million better ways to do this specific thing, but this is just an example), you could prompt claude with when you are done with the task, call mcp__done(), and then in your mcp server you have something like

○Running Claude Code Non-InteractivelyJan 8, 2026

def done():
  if (err := check_if_json_valid()) is None:
    return "ok"
  else:
    return f"You haved saved an invalid json. Fix the error {err} before finishing!"

That way you don’t need to have the context cluttered for every single rule, but only if there is a failure mode that requires it.

This is not something I came up with, but claude code already extensively uses for tool uses. Every time claude code reads files there will be system reminders like

<system-reminder>\nWhenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.\n</system-reminder>\

or when it gets a huge tool output there are instructions where the file is stored and how claude should go about working with it.

08:06 PM •

#ai #agents #claude

Friday, January 9, 2026

○

On macOS you can speed up the key repetition even further than the settings (System Preferences -> Keyboard) allow by setting these values in your console (or adding them to your nix config as I did):

→Never set up again: Nix on MacImagine setting up your brand-new laptop in the exact way you like it with a single command. Nix.Aug 12, 2025

defaults write -g InitialKeyRepeat -float 10.0 # normal minimum is 15 (225 ms)
defaults write -g KeyRepeat -float 1.0 # normal minimum is 2 (30 ms)

I found this on StackExchange.

02:42 PM •

#macos #productivity

Thursday, January 8, 2026

What’s New in ICML 2026 Peer Review; ICML Blog (blog.icml.cc)

ICML published some new rules for their 2026 peer review. Most notable are measures to combat AI slop and other ways of peer review abuse. They also mandate participating in organization of the conference if you submit at least four papers. Also, they will provide “advanced reasoning” LLM feedback before the submission deadline for authors.

It is clear that AI is causing stark changes in the research landscape. Right now it’s focused on AI slop, peer-review abuse, and so on. But I believe this is only the beginning, and we will have to deal with the broader impact of AI upending research as it was before.

07:50 PM •

#academia #conferences #peer-review

○

Running Claude Code Non-Interactively

You can easily run claude code on a subscription non-interactively. First create an OAuth token using claude setup-token. Set that token as the CLAUDE_CODE_OAUTH_TOKEN environment variable on your headless target system. Finally, run claude non-interactively with claude -p "prompt". Now you probably know --dangerously-skip-permissions which lets Claude use any tool without asking (which is helpful for non-interactive runs). By default, it will only output something in the very end. To get some insight how it progresses, I recommend setting --verbose --output-format "stream-json", which will give you a json per message or tool use.

{"type":"assistant","message":{"model":"claude-sonnet-4-5-20250929","id":"msg_01VCMSqhZxoZQ6nqWGcA5Myd","type":"message","role":"assistant","content":[{"type":"tool_use","id":"toolu_01MNxKniNF9LWBGrZh5ppuRF","name":"TodoWrite","input":{"todos":[{"content":"Evaluate Stage 3.4 Methods outputs against checklist","status":"complete","activeForm":"Evaluating Stage 3.4 Methods outputs against checklist"},{"content":"Create improvement tasks for any checklist failures","status":"in_progress","activeForm":"Creating improvement tasks for any checklist failures"},{"content":"Post detailed Linear comment with findings","status":"pending","activeForm":"Posting detailed Linear comment with findings"}]}}],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":6,"cache_creation_input_tokens":244,"cache_read_input_tokens":87250,"cache_creation":{"ephemeral_5m_input_tokens":244,"ephemeral_1h_input_tokens":0},"output_tokens":637,"service_tier":"standard"},"context_management":null},"parent_tool_use_id":null,"session_id":"a23ce490-1693-496b-ad08-8e082248416d","uuid":"8620af06-c8e7-4409-9b91-a2248e353ecf"}

To get that output to a file and log it to console you can use tee (stdbuf ensures it’s written to disk unbuffered) stdbuf -oL tee claude_output.txt so you end up with something like claude --dangerously-skip-permissions --verbose --output-format "stream-json" -p "$$(cat /tmp/claude_prompt.txt)" | stdbuf -oL claude_output.txt

09:21 AM •

#claude #ai #agents

Tuesday, January 6, 2026

Highly Opinionated Advice on How to Write ML Papers (alignmentforum.org)

Supreme advice on how to write papers for submission to ML conferences. It covers not just the writing, but also what to look for before writing and when/how to start writing.

It also covers some less nice parts of modern academia, like the randomness of review processes (see the table below for how two groups of reviewers rate the same papers) and that peer review does not maximize truth-seeking.

A table showing that reviewer decisions do not agree.

10:31 AM •

#academia #writing #machine-learning

Monday, January 5, 2026

○

AI Ramp Theory

This is a note with the hashtags pyramids and AGI, yes!

There are two theories as to how the Egyptians managed to build pyramids: a. the internal ramp theory and b. the external ramp theory. Without delving into too much detail as to which one is currently “winning,” they are interesting to think about: do the pyramids contain the support structure that was used to build them, or was there an external structure that had to be built first, from which the pyramids were then built?

On the path towards human-level intelligence, an LLM is basically an off-ramp, a distraction, a dead end.

Yann LeCunn says. This quote was funny to me because it talks about ramps. For AI there are also two imaginable scenarios (if you believe AGI will happen):

First, current AI is an internal ramp, and if we scale it up, expand it, and build the required harnesses (e.g., memory, tools) around them, we get our AGI pyramid. But if you don’t believe this, pursuing and improving what we currently have is still worthwhile. It might not be the magnificent AGI pyramid you are after, but the current tools are undoubtedly immensely useful. And they can speed up our path towards that other real path to AGI. So in that sense we might still be building the external ramp, which seems fine.

04:15 PM •

#ai #agi #pyramids

Anthropic principle - Wikipedia (en.wikipedia.org)

The Anthropic principle offers an explanation for why the universe sometimes seems to align very well with our existence: because we would not exist to observe a universe that was not aligned well with our existence! This has---at least to me---no extreme significance, but the name was a bit funny.

03:48 PM •

#philosophy #anthropic

Friday, January 2, 2026

○

Anthropic Fellowship Code Assessment

I took the code assessment for an Anthropic Fellowship. Without spoiling their whole new exercise, I’d give the advice to read their advice carefully. They said

You should be familiar with writing classes and methods, using lists, dictionaries, and sets, and using the standard Python functions for sorting, hashing, and binary search. It’s beneficial to be familiar with the standard library modules bisect and collections.

The task was quite fun and relevant, I would say. I plugged their advice into Claude to come up with training exercises, which worked out great.

I did get stuck on one task for far too long because I got way too tripped up over a small test issue (it was not critical, but they immediately responded to an email about that, dug into my code, and confirmed the issue!). It would have been possible to skip that task (you can always go back) and do the last one without completing the prior one, so keeping track of time and being mindful of that skip option would be my other advice.

09:38 PM •

#anthropic #interview #ai

Monday, December 29, 2025

Andrej Karparthy Recommends Books (x.com)

This thread is the reason I read the 1000+ page Atlas Shrugged by Ayn Rand. Great recommendations, somewhat hard to find in a random twitter thread.

12:22 PM •

#books #karparthy

Sunday, December 28, 2025

○

I simply can’t get over this image. I saw it first in this thread by Jascha Sohl-Dickstein and already mentioned it in my post about bubbles. But even months later I have to think of this: in what unprecedented times we live, and yet how we take so many things for granted.

→Bubbles1. The days of human intellectual contribution are numbered due to AI. 2. AI labs claiming to have solved software engineering are still hiring software engineers. Who is right here? Can both of these facts be true at the same time?Oct 6, 2025

10:55 PM •

#ai #society

○

When committing to a platform, think about how much ownership and access you retain over your data. Are you locked in? Can you get locked out? Can you process your data in any way you like?

Slack workspace is quite popular, but they implemented ridiculous API rate limits for “non-approved” apps. The limit is one request per minute (lmao), affecting (e.g., reading) at most 15 messages per minute. And this is for a strictly paid product, where each seat is billed. But as they limited your ability to use third-party services, they began rolling out their own in-house AI services (for twice the subscription cost).

I like retaining control over my data. For notes, this is quite easy. Instead of using something like Notion (where you can’t retrieve any files if you are logged out) or Apple Notes (where your account of 20 years can get locked over redeeming a gift card), you can take your notes in Obsidian. With Obsidian, everything is stored in plain-text markdown files. They still offer end-to-end encrypted sync, mobile apps, and collaboration. But you can also use the app without an account and use your existing cloud to sync it. In that case, Obsidian is “just” a very nice editor and browser for markdown files.

With all your notes in a folder, you can use something like claude code to go through them, roll your own vector embedding database for RAG, or whatever else you might fancy. It’s your data; do whatever you want.

For chat it’s a bit trickier. I think the best you could do is self-host an instance of Mattermost or Element, which will involve more significant drawbacks though.

11:51 AM •

#ai #data

Wednesday, December 17, 2025

○

With uv it’s straightforward to try different python flavours, i.e., the free-threaded version introduced in 3.13 or the jit-compiled pypy with versions up to 3.11. Just run

uv python install 3.14t for the free-threaded version or uv python install pypy3.11 for the latest version of pypy.

10:57 AM •

#python

○

Since astral has now officially announced the beta of ty, i’d like to share my current setup of amazing and fast tools:

uv is for managing python dependencies and python itself. Rigorous environment locks included, which you absolutely need. Can also do versioning, building and publishing.
pixi is for when you need to have conda dependencies. It uses uv under the hood for pypi deps, which is why I try to add everything as a pypi dependency (pixi add --pypi x).
ruff is a linter. I don’t want to see any of you manually formatting code, inserting spaces and the like. Just use ruff.
ty (now officially in beta) is a static type checker. It’ll tell you things like when you return or pass the wrong type, which will probably make your code malfunction. You can use it as a full language server, so it’ll also tell you diagnostics, give (non-AI) code completions for known symbols, and show docstrings.

All of these are built in rust and just generally nice to use.

Honorable mention to loguru for being a logger that I actually can remember how to use (from loguru import logger; logger.info('hello')).

10:44 AM •

#software-engineering #rust #python

Monday, December 15, 2025

○

Claude Code is now telling users that they have free one-week passes to hand out. Great marketing and nice ASCII art

09:41 PM •

#claude #marketing

Saturday, December 13, 2025

○

LLMs Operate in a Fictional Timeline

LLMs are not human. But they imitate human behaviour in many ways. You can threaten an LLM, you can argue with them, and you can offer a tip for a great answer, all of which will impact what sort of result you get. Recently my Claude Code has been obsessed with timelines. I don’t know why, because all those three-phase, eight-week time schedules are implemented in 15 minutes anyway, but it keeps coming up with them.

Today Claude asked me when I want to submit my work (because I said I want paper-ready figures). Naturally I would never admit to being short on time to an LLM. Just say it has all the time in the world to make them look perfect. This whole dance is becoming a bit bizarre, but keep in mind that an LLM will imitate human patterns, so don’t make your LLM produce sloppy and rushed-looking work by telling it you have little time.

It doesn’t take any real time to give your LLM infinite time.

05:34 PM •

#claude #agents #research

Friday, December 12, 2025

○

Gradient Checkpointing is a technique to trade off speed for reduced VRAM usage during backprop. During backprop, we usually keep the forward activations of all layers preceding the ones we computed the gradient for in VRAM, since we will need them during later steps of backpropagation. We can reduce VRAM usage by discarding these earlier activations and recomputing them later, when we require them. A middle ground between computing everything again and keeping everything in VRAM is keeping only certain checkpoints in VRAM. The linked repo has a great animation showing the whole process. PyTorch has this implemented as activation checkpointing (which is a more reasonable name). In their blog they also mention that they offer an automatic Pareto-optimal tradeoff for a user-specified memory limit! (although the config seems to have a different name in the code than mentioned in the blog)

05:33 PM •

#ai #training #optimization

Thursday, December 11, 2025

○

The latest version of the terminalbench submission to ICLR has a very GPT-pilled pareto frontier.

TerminalBench not only measures model performance, but also the agent used. If we compare everything in the Terminus-2 agent on the tbench.ai homepage, wee see that Gemini 3 Pro should outperform GPT-5 in terms of raw model performance (Opus 4.5 is not part of the submission yet).

I have two thoughts on this:

OpenAI always seems to have some test-time scaling variant that outperforms competition. I’m a bit sceptical how good their models would be under similar “effort”. Anthropic on the other hand seems to go for token-efficiency.

Tbench measures autonomous solving of the task. Claude Code performs rather poorly, despite many people liking to use it. I think it’s because we rarely use Claude Code completely hands off, but rather with human feedback in the loop, for which it seems to work well.

04:07 PM •

#agents #evals

Pickle Scanning (huggingface.co)

Pickle is (/was?) a widespread file format in the Python ecosystem. It is immensely flexible, as you can pickle a lot of things (but not everything as I learned using submitit). But that flexibility comes at the cost of security, as pickle files can contain arbitrary code instructions. Huggingface has a great post (the link of this note) covering this and their scanner for potentially dangerous pickle files. They also have a file format called safetensors (because pytorch tensors can also contain code…).

11:13 AM •

#security #python #fileformats

Wednesday, December 10, 2025

○

Since I like repairing electronics I’m happy to have learned that iFixit now has an app that makes it even easier for people to get into it. It explains all the necessary basics and even comes with a multi-modal AI chatbot: You can share an image of your problem and it will help you diagnose and remedy the problem, all based on the extensive information that iFixit Guides have for countless devices.

○Repairing ElectronicsDec 9, 2025

09:22 PM •

#hardware #ai #chatbots

○

Async Subagents > API

Claude Code now has asynchronous subagents, meaning the main agent can spawn subagents (this is not new) that keep running in the background (this is new). I don’t know if Anthropic has imposed a limited on this feature (they probably don’t have to, since I’ll burn through my usage much faster…), but for me it definitely has replaced some API use cases. I managed to have it spawn over 100 subagents to process a bunch of documents. Not sure if that is what they intended it for, but it’s nice!

09:08 PM •

#claude #agents #ai

View archives →

Recent Posts

Footnotes