Sam Altman unexpectedly brings his timelines to AGI forward, while OpenAI backtrack on superintelligence. None of these changes were heralded, but they are significant. Plus the new year brings new assessments of the true capability of models to automate 'large swathes of the economy'. I'll give my prediction on that front for 2025, announcement a new Simple Bench competition, and showcase Kling 1.6 vs Veo 2 vs Sora, and much more.
wandb.me/simple-bench
(Colab): https://colab.research.google.com/drive/1AVijcPnEkl8Gy_754XbRdG5m7Q5-9slg?usp=sharing
TheAgentCompany Paper: https://arxiv.org/pdf/2412.14161v1
Sam Altman Major Interview: https://www.bloomberg.com/features/2025-sam-altman-interview/?srnd=phx-ai
OpenAI Agent Coming Jan 2025: https://www.theinformation.com/articles/why-openai-is-taking-so-long-to-launch-agents?rc=sy0ihq
Altman Singularity: https://x.com/sama/status/1875603249472139576
Altman Original Timeline: https://www.youtube.com/watch?v=7dCPytNTnjk&t=621s
https://www.ft.com/content/34a7a082-e685-4e02-bca7-61ff89d99ed2
OpenAI Original Emails: https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman-and-openai-blog
DeepMind Sky News 2014 Article: https://news.sky.com/story/google-buys-uk-intelligence-firm-deepmind-10419783
Altman Blog Reflections: https://blog.samaltman.com/reflections
OpenAI Changes Who Gets AGI: https://openai.com/index/why-our-structure-must-evolve-to-advance-our-mission/?s=09
OpenAI 5 Levels: https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai
Altman 2015: https://blog.samaltman.com/machine-intelligence-part-1
OpenAI React to Anthropic: https://www.theinformation.com/articles/how-anthropic-got-inside-openais-head?rc=sy0ihq
Microsoft $100B Definition: https://www.theinformation.com/articles/microsoft-and-openai-wrangle-over-terms-of-their-blockbuster-partnership?rc=sy0ihq
Epoch Scramble for Task Benchmark: https://x.com/tamaybes/status/1876692639363612919
GPQA Progress: https://epoch.ai/data/ai-benchmarking-dashboard
Task Length Crucial for ARC-AGI: https://anokas.substack.com/p/llms-struggle-with-perception-not-reasoning-arcagi
RL Environment Tweet: https://x.com/vedantmisra/status/1876327518157807990
Jason Wei Talk: https://www.youtube.com/watch?v=yhpjpNXJDco
Miles Brunda