I have access to Claude-3 Opus, a (seemingly) considerably more advanced model than GPT-4. Ask it anything
This model is, by the scores, considerably more advanced at reasoning than GPT-4:
Let’s zoom in on perhaps the most impressive result here, the GPQA score of 50%. That score rises to 60% if you set up the prompt right. The creator of the dataset here marvels at the result:
For comparison, graduate students working in their own domain, with Google access, average 65% on this test, while graduate students from other sciences, also with Google access, average only 34%. A computer verging on the reasoning capability of a graduate student in every specific science is a huge deal.
In my limited experience, the model’s factual knowledge also seems better than GPT-4’s, although the scores above don’t reflect much of a difference.
Anyway, I bought access.
Hit it with any questions you have and I’ll relay them. If you ever wanted to burst the AI hype bubble, now’s your chance.
I’ll be very impressed if anyone can come up with a question that, in my judgment:
At least 25% of the general population would get right, and
Claude-3 Opus gets wrong.
Or any other questions that illuminate something interesting about the strengths and weaknesses of this model.
Please let me know if you know of any job opportunities in Sydney!
"You're looking at a chessboard. Os are empty spaces, Xs are pieces. Here's the board layout:
OOOOOOOO
OXOOOOOO
OOOOOOOO
OOOOOOOO
OOOOXOOO
OOOOOOOO
OOOOOOOO
OOOOOOOO
If you wanted to place a new piece such that it's exactly equidistant from the existing pieces, where could you place it? Provide a specific row number and column number, no other text."
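For anyone who wants to check an answer to the chessboard question, here is a minimal Python sketch that brute-forces every empty square. The question does not say how rows and columns are numbered or which notion of distance is meant, so the sketch assumes 1-indexed rows and columns counted from the top-left corner and tries three common metrics. The names `BOARD`, `METRICS`, `find_pieces` and `equidistant_squares` are made up for illustration.

```python
# Board copied from the question: O = empty square, X = piece.
BOARD = [
    "OOOOOOOO",
    "OXOOOOOO",
    "OOOOOOOO",
    "OOOOOOOO",
    "OOOOXOOO",
    "OOOOOOOO",
    "OOOOOOOO",
    "OOOOOOOO",
]

# Assumption: the question does not name a metric, so we offer three.
# Each takes the absolute row and column offsets between two squares.
METRICS = {
    # Squared Euclidean distance keeps everything in integers.
    "euclidean": lambda dr, dc: dr * dr + dc * dc,
    # Manhattan (taxicab) distance.
    "manhattan": lambda dr, dc: dr + dc,
    # Chebyshev distance, i.e. the number of king moves.
    "chebyshev": lambda dr, dc: max(dr, dc),
}


def find_pieces(board):
    """Return (row, col) for every X, 1-indexed from the top-left."""
    return [
        (r, c)
        for r, row in enumerate(board, start=1)
        for c, cell in enumerate(row, start=1)
        if cell == "X"
    ]


def equidistant_squares(board, metric):
    """Return every empty square whose distance to all pieces is the same."""
    pieces = find_pieces(board)
    result = []
    for r in range(1, len(board) + 1):
        for c in range(1, len(board[0]) + 1):
            if (r, c) in pieces:
                continue  # a new piece cannot go on an occupied square
            distances = {metric(abs(r - pr), abs(c - pc)) for pr, pc in pieces}
            if len(distances) == 1:
                result.append((r, c))
    return result


if __name__ == "__main__":
    for name, metric in METRICS.items():
        print(name, equidistant_squares(BOARD, metric))
```

Under these assumptions the pieces sit at row 2, column 2 and row 5, column 5, and the straight-line (Euclidean) answers are the six squares whose row and column sum to 7: (1, 6), (2, 5), (3, 4), (4, 3), (5, 2) and (6, 1). The other two metrics accept those squares as well, plus some additional ones, so a model's reply can be right under one reading of "equidistant" and wrong under another.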
1. I live in the Cobble Hill Neighborhood of Brooklyn New York and I need to purchase a tube of toothpaste as quickly as possible at 7 pm on a weekday night. What retail location should I visit to do so?
2. Please translate this news headline from the publication Infobae into English so that it could appear without editorial revision in an English-language American publication:
La Policía Federal allana la casa del broker amigo de Alberto Fernández por el escándalo de los seguros
Se trata de Héctor Martínez Sosa. Los agentes llegaron por orden del juez de la causa, Julián Ercolini, quien también ordenó operativos en las viviendas y empresas de otros dos imputados
(#2 probably couldn't be done by even 25% of Spanish-English bilinguals. #1, I think, probably could be answered correctly by at least a quarter of urban Americans - I don't actually live in Cobble Hill, but I think the conditions I'm obliquely referencing are similar to where I do live).