One of the main reasons I do the pelican thing is that it's making fun of the industry:
1. The smartest model in the world still draws pelicans riding bicycles worse than a five-year-old.
2. It highlights how absurd the task of comparing these models is. Oh, so it scored 78 on Terminal Bench 2.1? It also drew a crap pelican.
> So you are saying in the end of that piece that chess players came out stronger?
That was an off-the-cuff remark on the podcast which I included in the transcript. It's not my overall thesis.
> The smartest model in the world still draws pelicans riding bicycles worse than a five-year-old.
The bicycle is something that humans are famously bad at drawing: https://www.washingtonpost.com/news/wonk/wp/2016/04/18/the-h...