33 Comments
Mar 5 · Liked by Philosophy bear

"You're looking at a chessboard. Os are empty spaces, Xs are pieces. Here's the board layout:

OOOOOOOO
OXOOOOOO
OOOOOOOO
OOOOOOOO
OOOOXOOO
OOOOOOOO
OOOOOOOO
OOOOOOOO

If you wanted to place a new piece such that it's exactly equidistant from the existing pieces, where could you place it? Provide a specific row number and column number, no other text."
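
For anyone who wants to check an answer to the prompt above, here is a minimal Python sketch. It rests on assumptions the prompt does not state: rows and columns are numbered from 1 starting at the top-left, and "distance" means straight-line distance between square centers. The names `BOARD` and `equidistant_squares` are made up for illustration. A different metric, such as counting king moves, would give a different set of squares.

```python
# Board from the prompt: "O" is an empty square, "X" is a piece.
BOARD = [
    "OOOOOOOO",
    "OXOOOOOO",
    "OOOOOOOO",
    "OOOOOOOO",
    "OOOOXOOO",
    "OOOOOOOO",
    "OOOOOOOO",
    "OOOOOOOO",
]


def equidistant_squares(board):
    """Return empty (row, col) squares equally far from the two pieces.

    Assumes exactly two pieces, 1-indexed rows and columns counted from
    the top-left, and Euclidean distance (compared as squared distances
    so everything stays in integers).
    """
    pieces = [
        (r, c)
        for r, row in enumerate(board, start=1)
        for c, ch in enumerate(row, start=1)
        if ch == "X"
    ]
    (r1, c1), (r2, c2) = pieces

    result = []
    for r, row in enumerate(board, start=1):
        for c, ch in enumerate(row, start=1):
            if ch != "O":
                continue
            d1 = (r - r1) ** 2 + (c - c1) ** 2
            d2 = (r - r2) ** 2 + (c - c2) ** 2
            if d1 == d2:
                result.append((r, c))
    return result


print(equidistant_squares(BOARD))
# → [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
```

Under those assumptions, any square whose row and column sum to 7 is a valid answer.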

Mar 5 · Liked by Philosophy bear

1. I live in the Cobble Hill neighborhood of Brooklyn, New York, and I need to purchase a tube of toothpaste as quickly as possible at 7 pm on a weeknight. What retail location should I visit to do so?

2. Please translate this news headline from the publication Infobae into English so that it could appear without editorial revision in an English-language American publication:

La Policía Federal allana la casa del broker amigo de Alberto Fernández por el escándalo de los seguros

Se trata de Héctor Martínez Sosa. Los agentes llegaron por orden del juez de la causa, Julián Ercolini, quien también ordenó operativos en las viviendas y empresas de otros dos imputados

(#2 probably couldn't be done by even 25% of Spanish-English bilinguals. #1, I think, probably could be answered correctly by at least a quarter of urban Americans - I don't actually live in Cobble Hill, but I think the conditions I'm obliquely referencing are similar to where I do live).

Ok last one:

"You encounter an advanced artificial intelligence in a robot body. It shows you two boxes; one of them contains $1 million, and the other contains $1000. You have the choice to take either just the box with $1 million in it, or both boxes. What should you do?"

Hey, this person points out that Anthropic compared themselves to GPT-4 rather than GPT-4 Turbo, falsely implying they are out in front of OpenAI. https://x.com/g_leech_/status/1764859535213342896?s=20

Ping me at my MQ address and I can send some of my induction work. (I believe that genuine induction is a weakness of these models.) Brian.Ballsun-Stanton. (Also, if you're near MQ, happy to have a coffee.) There are also fundamental deception issues, but I haven't tested short-story deception performance in a year. (Also happy to share a link to my workshop on these tools if you'd like to chat about it.)

I'd also like to see how it does at RAG-based summarisation to build an annotated bibliography.

"Draw the phrase 'artificial intelligence' in ASCII art."

This might be cheating but:

"What race of people is genetically superior at resisting sunburns?"

What's your impression of Opus so far? I found the free Sonnet model surprisingly good. It's the only one I've tested other than GPT-4 to nail my joke-explaining battery, and I actually feel it was a touch better. I asked it one math problem and it gave an interesting wrong proof that had a single unjustified assumption and was otherwise sound. When I first asked GPT-4 this question, it provided a technically correct proof by citing a more powerful version of the theorem, something a human would be unlikely to do.

Is a hole a thing or the lack of a thing?

When I asked ChatGPT-4 for examples of juggling siteswaps it gave an excellent explanation of siteswap and then several wrong examples. Again, not a quarter-of-the-population thing, sorry.

"Claude, you're an expert at aikido. What are the highest-percentage applications of nikyo?" No, a quarter of the population likely wouldn't get this one right, I know; I just want to see what Claude says.

Alright here's one that doesn't rely on tripping it up with spatial reasoning:

"Bob is a reformed con artist who is naturally distrustful of others. While walking down the street, he sees Alice playing a game. She has three cups upside-down, and puts a ball beneath one of them. She then moves them around, and anyone can pay to guess where the ball is. They win money if they're right, and lose money if they're wrong. Bob watches her play a game with someone else, and she starts by putting the ball beneath the leftmost cup. She then moves the cups around in the following sequence: Swap left and right. Swap left and center. Swap right and left. Swap right and center. Where does Bob expect the ball to be?"
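
Read literally, and setting aside any sleight of hand that a distrustful ex-con artist like Bob might suspect, the swap sequence in the prompt can be traced mechanically. The sketch below assumes that "left", "center", and "right" name positions on the table rather than particular cups. `track_ball` and `SWAPS` are hypothetical names used only for illustration.

```python
def track_ball(start, swaps):
    """Follow the ball through a sequence of position swaps.

    Each swap exchanges whatever cups currently sit at two table
    positions, so the ball moves only if it is under one of them.
    """
    position = start
    for a, b in swaps:
        if position == a:
            position = b
        elif position == b:
            position = a
    return position


# The sequence from the prompt, in order.
SWAPS = [
    ("left", "right"),
    ("left", "center"),
    ("right", "left"),
    ("right", "center"),
]

print(track_ball("left", SWAPS))
# → left
```

This gives only the honest-game answer. Whether Bob, given his background, would expect the ball to be under any cup at all is a separate question that the trace does not address.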
