So Antigravity by Google lets the agent “auto-decide” which commands to execute directly and which commands require approval. It also does not use a sandbox. It didn’t take very long for the first Reddit post to arrive about an agent deleting a whole drive. Meanwhile, Claude Code is going in the complete opposite direction: a rigorous permission system with a sandbox on top. Anthropic explains this in more detail on their blog, but basically they argue that you need both filesystem and network sandboxing, because bypassing one would also mean bypassing the other (it’s trivial on Linux because everything is a file, but it holds more generally).

Just running npm run build will trigger a sandbox request if a telemetry request is being made. git commit needs to use the non-sandbox fallback, because it signs the commit with my key, which is not available from within the sandbox. They always offer a sensible “always allow” option because they are acutely aware of approval fatigue. It’s a good approach and makes me feel a lot safer.

If you want to set up PostHog on your website, you usually have to install a package, add the imports, etc. PostHog has fully integrated an AI agent into their install process, so you invoke an AI agent as the installation step.

Bubbles

1. The days of human intellectual contribution are numbered due to AI. 2. AI labs claiming to have solved software engineering are still hiring software engineers. Who is right here? Can both of these facts be true at the same time?

Why I Pay for Search

I think Google as a company is doing great, and I do support many of the things they do. But search isn’t one of them. Here are a couple of reasons why I prefer using Kagi, a paid search engine. So much so that I have invested in the company in their second round.

💡

This post assumes you care about search in the first place. There will be a separate post on why I do, which I will link here.

Kagi sources their results from all major search engines (Google, Bing, Yandex, …) as well as niche providers and tools (Wolfram Alpha, Yelp, Open Meteo, …).

Homepage of Kagi.

This article is split into tangible benefits, which save me time every day and make Kagi a measurably better search engine for me, and intangible benefits that I usually forget about but make me feel a bit nice.

Tangible Benefits

Just results.

No ads. No quick questions. No AI answers. No sponsored results.

Just results that matter.

A Kagi query for ‘cockatiels.’ It shows results (sites, videos, images) and a Wikipedia panel on the right.

Customization

I can go in and control how the results are generated. Useful and trusted sources can be upranked or pinned to the top, so they will always show up when your query gets a hit on them.

Kagi reveals detailed results like the number of trackers on the site and the domain registration date. It also allows you to up- or downrank sites, including blocking them completely from appearing in your results.

Awful and useless sites like the Apple discussions forum can be downranked or, as they should be, completely blocked from your results. Some stats: the most-blocked websites are Pinterest, Fox News, and TikTok. The most-pinned websites are Wikipedia, the Mozilla developer website, and Reddit.

Integrated features

Summarize, open in the Web Archive, or ask an LLM right from search.

Options for the search result and a website summary directly in Kagi.

AI, when I want it

Add a question mark or press the quick answer button, and you will get an AI summary with citations. No question mark means no automatic AI summary.

A query for ‘cockatiels?’ shows an AI quick answer with citations.

Perplexity included!

Kagi Assistant is a notes-inspired chat interface. By default, the AI model can search the internet, giving you basically the complete Perplexity experience in Kagi! You can also upload files and define custom agents with different models and system prompts.

Asking the Mistral small model in Kagi Assistant lets it use Kagi search as a tool and answer with citations.

Context Control

Google does a great job at personalization. But what if I don’t want personalization and instead want to know what other people’s results would be? Kagi lets me control the context, e.g., letting me search like a user from a different country!

The country the user appears to be searching from can be controlled or set to a neutral international setting.

Great, e.g., for vacation planning. Just set the country to your destination country, and now you can search like normal instead of having to append the country to every query.

Searching for ‘train fare’ with the country setting ‘Japan’ gives you results relevant to Japan. Shocker!

Tools for Powerusers

Build advanced queries right in the search UI! This is very much identical to the query builder that Google offers. But on Kagi it’s right on the start page and not buried three clicks away. Power-searchers rejoice!

Spiritual Benefits

Bringing back the small web.

There might be a longer post about this at some point, but basically, the internet has changed. I am old enough - maybe barely, but still - to remember the old internet of small and unpolished blogs and forums. It was an innocent place where you could open up a much cleaner-looking Google, search for ‘Bitcoin faucet,’ and get some free Bitcoin. No one used their real name, and you could telnet into the White House server and use it to send emails. Today’s internet feels hostile in comparison: tracking, blogspam, captchas, ads, and aggressive, opaque affiliate marketing.

Kagi is trying to reconnect users and bring back the small web with their small web index. Turn on their small web lens (a feature I neglected to mention, but it’s essentially a couple of predefined filters on your results), and I promise you will find results you usually wouldn’t.

Screenshot of a website called cockatiel cottage.

These pages will be kind of shitty by many of today’s measurements and often badly optimized. Like sometimes they even load a lot slower. But they feel human. It’s not something I use regularly (hence a spiritual benefit), but it’s nice. I like humans doing their unique thing.

Privacy

They don’t log your queries or profile you. Right from the search results, before opening a page, they show you a warning symbol if a page contains many trackers.

A page with many trackers will have a warning symbol on the right.

Such pages are also downranked automatically.

Kagi shows the number of trackers and which trackers are present on the website.

Kagi is paid, so you do need to authenticate for every query. But you can still use it without tying a query (which is not logged anyway) to your account. The technology behind that is based on RFC-standardized privacy passes. Here is their blog post. Essentially it lets you generate anonymous tokens, each of which can be used for one search query. There is a browser extension that handles this process automatically.
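To make the “anonymous token” idea a bit more concrete, here is a toy sketch of blind signing, the kind of building block one of the standardized Privacy Pass token types relies on. This is not Kagi’s implementation, and I am not claiming which variant they use. The tiny textbook RSA key, the function names, and the in-memory `spent` set are all assumptions for illustration and are nowhere near secure.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key built from the textbook primes 61 and 53 (assumption, insecure).
N = 61 * 53   # public modulus
E = 17        # public exponent
D = 2753      # private exponent, known only to the issuer

def token_digest(token: bytes) -> int:
    """Map a random token to an integer below N."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N

def blind(m: int) -> tuple[int, int]:
    """Client: hide m behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r

def issuer_sign(blinded: int) -> int:
    """Issuer: sign for a logged-in account without ever seeing m."""
    return pow(blinded, D, N)

def unblind(blinded_sig: int, r: int) -> int:
    """Client: strip r to obtain an ordinary signature on m."""
    return (blinded_sig * pow(r, -1, N)) % N

spent: set[int] = set()  # hypothetical store of already-redeemed tokens

def redeem(token: bytes, sig: int) -> bool:
    """Search endpoint: accept a valid, unused token exactly once."""
    m = token_digest(token)
    if pow(sig, E, N) != m or m in spent:
        return False
    spent.add(m)
    return True

token = secrets.token_bytes(16)
blinded, r = blind(token_digest(token))
sig = unblind(issuer_sign(blinded), r)
print(redeem(token, sig))  # True: first use of a validly signed token
print(redeem(token, sig))  # False: the same token cannot be replayed
```

The point of the blinding step is that the issuer only ever sees `blinded`, so it cannot later link the redeemed token back to the account that requested it.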

When using Privacy Pass, Kagi doesn’t know your account and instead signals that you are authenticated anonymously.

Public Benefit Corporation

Kagi has been a public benefit corporation since 2024. Their statement says:

Kagi is committed to creating a more human-centric and sustainable web that benefits individuals, communities, and society as a whole, with a transparent business model that aligns the incentives of everyone involved.

Maybe I should hate that as an investor. But I’m also a stupid enough investor to really like putting society first.

Transparency

Stats! Check out how many (paying) users they have or how many queries are performed each day.

The number of users, all of whom pay.

The number of queries per day.

AI filters

I do like that they are very mindful and proactive when it comes to AI. They have a feature that lets you filter AI-generated images.

When searching for images, Kagi allows you to filter for AI-generated ones.

As an AI researcher, I am hardly an AI hater, quite the opposite. But I do think we need to find ways to navigate the broad changes introduced by AI, and - while I have some technical concerns - at least they are trying something!

No Free Lunch

It’s clear that Kagi invests a lot into making search better, adding useful features for the end user, and keeping the experience clean. This relentless focus is not free. It’s not something that fits, e.g., Google’s business model, where someone else pays for you to see ads. Kagi’s focus on your experience as the searcher has a price. And that price is exactly $10 per month (for unlimited searches). I don’t expect this to be the best deal for everyone, so Kagi will not replace Google in my opinion (and Google does far more than search anyway). But for me, a researcher with thousands of searches per month, it is well worth it to get the best experience available.

Disclaimer: I am an investor in Kagi.

If you think you deserve the best search experience there is, I highly recommend that you try Kagi.com. The first 100 searches are free.

R1 writes an essay about `Strawberry`

I was setting up OpenRouter to benchmark a bunch of different LLMs on some specific tasks we have at Nunc.

I noticed there’s an option to chat with various models and it even comes with some demos. Naturally, I clicked on the Strawberry Test.

A new chat window. It has a text field and a send button. Above them are three demo options: “9.9 vs 9.11. Which one is larger?”, “Strawberry Test. How many r’s are in the word” and “Poem Riddle. Compose a 12-line poem”.

In the chat the question “How many r’s are in the word strawberry” has been sent. The model has reasoned for 58.1s and responds, mentioning there are 3 instances of the letter “r”.

It got the right answer, which is nice. But it took well over a minute, despite the model generating at 244 tokens per second. And I got only 84 output tokens, which by themselves should take about a third of a second, not more than a minute. So yeah, as the UI shows, the model spent quite some time “reasoning” about what to answer. Luckily for us, the reasoning traces are public.
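The back-of-the-envelope math, as a quick sketch. The two numbers are the ones from the screenshot, and the constant generation rate is a simplifying assumption.

```python
# Numbers from the OpenRouter UI; a constant generation rate is assumed.
TOKENS_PER_SECOND = 244
ANSWER_TOKENS = 84

def generation_seconds(tokens: int, rate: float = TOKENS_PER_SECOND) -> float:
    """Time needed to emit `tokens` at a constant rate in tokens per second."""
    return tokens / rate

answer_time = generation_seconds(ANSWER_TOKENS)
print(f"{answer_time:.2f}s")  # 0.34s
```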

Lo and behold, the traces of R1, 1515 tokens or 894 words about Strawberry:

Okay, let me try to figure out how many times the letter “r” appears in the word “strawberry.” First, I need to make sure I remember the correct spelling of “strawberry.” Sometimes, people might misspell it, so I’ll take a moment to think. It starts with “straw,” like the thing you drink from, and then “berry.” So combining those, it’s “strawberry”.
Now, let’s break it down letter by letter. Let me write out each letter one by one and check for “r”s.
S-T-R-A-W-B-E-R-R-Y.
Wait, let me slow down. Starting with the first letter:
S - no
T - no
R - yes, that’s the first “r”
A - no
W - no
B - no
E - no
R - second “r”
R - third “r”
Y - no
Wait, does “strawberry” have three “r”s? Let me check again. Sometimes when I spell it, I might confuse the number of letters. Let me write it out again more carefully:
S, T, R, A, W, B, E, R, R, Y.
Wait, that’s 10 letters? Let’s count the letters again. S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). So according to this breakdown, there’s an “R” at the 3rd position, then again at the 8th and 9th positions. That would make three “r”s. Hmm, but that doesn’t seem right. Let me check another way.
The word “strawberry” is a compound of “straw” and “berry.” Let’s check each part separately.
“Straw” is spelled S-T-R-A-W. Here, there’s one “r”.
“Berry” is B-E-R-R-Y. In “berry,” there are two “r”s.
So combining them: “straw” + “berry” = strawberry. Adding the “r” from “straw” and the two “r”s from “berry” would give a total of three “r”s.
But wait, when combining “straw” and “berry,” do any letters get dropped? Sometimes, compound words might drop a letter, but in this case, “strawberry” is spelled S-T-R-A-W-B-E-R-R-Y. So between “straw” and “berry,” the letters are all retained. So “straw” ends with W, and “berry” starts with B. Therefore, all letters are present, so the total number of “r”s is indeed three.
But just to be absolutely sure, let me spell it out again: S-T-R (1)-A-W-B-E-R (2)-R (3)-Y. Yes, that’s three “r”s. Wait, but I think a lot of people might think there are two “r”s. Maybe I’m overcounting? Let me check a dictionary or think of how it’s pronounced. Sometimes the pronunciation can be misleading. “Strawberry” is pronounced “STRaw-ber-ee,” and sometimes the “r” in “berry” is doubled. But in spelling, “berry” has two “r”s. So when combined with “straw,” it’s still two “r”s in the “berry” part and one in the “straw” part, totaling three.
But hold on, when I actually write “strawberry,” is that the correct spelling? Let me think: S-T-R-A-W-B-E-R-R-Y. So after W comes B from “berry,” then E, then double R, and Y. So yes, there are two “r”s in the “berry” part. Together with the one “r” in “straw,” that’s three “r”s in total.
Alternatively, maybe “strawberry” is spelled with only one “r” in the “berry” part? Let me visualize the word. If I write it down: strawberry – s t r a w b e r r y. Yes, that’s right, “berr” part has two “r”s. So the “berry” component is spelled B-E-R-R-Y, so two “r”s. Therefore, in “strawberry,” combining “straw” (with one R) and “berry” (with two Rs) gives three Rs total.
But wait another second. Let me check an alternative approach by counting the letters step by step:
S
T
R (1)
A
W
B
E
R (2)
R (3)
Y
So yes, three Rs. Therefore, the answer should be three. Hmm. But I remember sometimes people miss it as two. Let me confirm with a different method. Maybe typing the word and checking each letter:
S T R A W B E R R Y
Breaking it into parts:
Straw: S T R A W
Berry: B E R R Y
Yes, “straw” has one R, “berry” has two. So combined, three Rs.
Alternatively, maybe in the actual spelling, there is only one R in “berry”? Let’s think. The fruit is “berry,” which is B-E-R-R-Y. So for example, “berry” has two Rs. So if that’s correct, then “strawberry” would have three Rs.
Alternatively, maybe it’s spelled “strawbery” – no, that’s incorrect. The correct spelling is with two Rs in “berry,” hence “strawberry” with three Rs.
So the conclusion is three Rs. Wait, but let me get an external confirmation in my mind. For example, if I think of the word “strawberry” and how it’s taught in schools, the correct spelling includes two Rs in “berry,” leading to three in total. Another way to remember is that “berry” is spelled with two Rs, so when you add “straw,” you get three Rs.
Alternatively, maybe I should check how many syllables are in “strawberry.” It’s pronounced as straw-ber-ry, which might be three syllables, but that may not directly relate to the number of letters.
Alternatively, using a mnemonic: “Straw has an R, berry has two Rs, so together three Rs.”
Hmm, yes, that seems to fit. So I think the correct answer is three. But just to make absolutely sure, I’ll write the word out again: S T R A W B E R R Y. Scanning each letter:
S - T - R (1) - A - W - B - E - R (2) - R (3) - Y
Yes, three Rs. Therefore, the answer is 3.

Did you read it all? It seems a bit repetitive at first glance, right? But then again, humans can kind of only count up to three or four, so this is close to our limit before we’d have to enter reasoning mode ourselves. Maybe AI will also get up to four without reasoning, who knows.
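For comparison, here is the same check done deterministically, as a minimal sketch in plain Python. The variable names are mine.

```python
word = "strawberry"

# 1-based positions of every "r", matching the way the trace counts letters.
r_positions = [i for i, ch in enumerate(word, start=1) if ch == "r"]

print(len(r_positions), r_positions)  # 3 [3, 8, 9]
```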

The Benchmark

In totally unrelated news, our benchmark for the business task we want to solve shows that, in terms of utility, we don’t really care too much about which model we actually use. The outliers are due to API issues, which makes the results all the more realistic, I’d say. So while AGI or whatever would sure be nice, I think most business tasks look something like this.

Benchmark showing very similar performance for around 15 different models. There are some outliers.