AI 2027 is a new forecast from the AI Futures Project. It predicts the rise of AGI in 2027 and superintelligence soon after. I’ll be honest: this is the kind of thing that I normally would’ve laughed at and completely ignored. The only reason I even bothered reading it is because one of the authors, Daniel Kokotajlo, made a similar set of predictions in 2021. It’s hard not to see the 2021 predictions as prescient even though I think a lot of the predictions are a little bit off in terms of timing and/or details. These kinds of small discrepancies could add up and be quite relevant for a prediction as near term as AGI in 2027. The new forecast is quite long, but they do have a summary, and there’s also a New York Times article (non-paywalled link) about the piece.
One of the researchers on the forecasting team is a Harvard student named Romeo Dean, and he recently gave a talk at a Harvard AI Safety Team (HAIST) event. We spoke for a while afterwards and discussed a ton of ideas. He encouraged me to write about why I’m skeptical. I definitely don’t remember everything we talked about, but here’s my attempt.
There are three main reasons I’m skeptical about AGI in general and especially about AGI arriving in 2027.
- Existence
- Hurdles and Liftoff
- Reinforcement Learning, Search Spaces, and Human Knowledge
Existence
The prediction presumes that AGI is possible on computer hardware. I agree that we have a proof of concept in the human brain. However, the human brain is quite different from anything we’ve implemented in computers: the two have completely different strengths, failure cases, and processes. Even if you can create an AI-powered superhuman coder or an AI-powered superhuman AI researcher, none of that matters if AGI simply isn’t possible in silico.
Hurdles and Liftoff
One point we talked about was whether AI research looks more like a smooth function or a step function. Yes, if you zoom out enough, things always look smooth, but the distinction is important. Suppose that the AI-powered AI researcher falls short and isn’t as powerful as predicted. If research is smooth, then things don’t change all that much: progress advances more slowly, and the timelines get longer i.e. AGI still arrives, just later than predicted. However, if research is more like a step function, then the AI-powered researcher could get stuck — unable to make further progress because it can’t get past a critical hurdle in the process. I think both of us were leaning towards progress being more discontinuous.
A bit more on the concrete side, there seems to be this common structure in AI debates. An AI-optimist will say that further progress towards AGI is possible because some capability, say memory, is missing from existing models, and an AI-powered researcher would be able to add in those capabilities and get us closer to AGI. The AI skeptic responds that those missing capabilities are precisely why we won’t get an AI-powered superhuman researcher. In other words, liftoff is not free: really significant progress has to be made well before AIs can do any sort of recursive self-improvement. And that work to get off the ground has to be done by humans.
Reinforcement Learning, Search Spaces, and Human Knowledge
My main contention is about reinforcement learning (RL). To me, there are roughly three branches of RL: approaches like AlphaZero for games, RL for LLMs, and RL for robotics. The approaches in these three domains are very different, and I think the reason is that they have very different search spaces. Games like chess have a high branching factor (i.e. many possible next moves, roughly 35 on average for chess), and that branching factor is part of why the game is challenging for humans, but the set of legal moves at each step is still finite. Predicting the next token for a language model is a more open-ended task: the vocabulary of possible tokens for modern models is typically on the order of 100,000. However, robotics is clearly the most open-ended of the three, since you have to predict continuous vectors i.e. you’re making a multi-dimensional choice where each dimension can take on a continuum of values. The prevailing approaches for each reflect the different search spaces.
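To make the contrast concrete, here is a minimal sketch of the three kinds of action spaces using gymnasium’s space types (requires the gymnasium package). This is illustrative only: the chess size follows the AlphaZero-style 8×8×73 move encoding, the vocabulary size is a round number, and the 7-dimensional robot action is an assumed joint-torque command.

```python
# Illustrative only: three RL "search spaces" side by side.
from gymnasium.spaces import Discrete, Box

# Board game: a finite (if large) set of encoded moves per position.
chess_actions = Discrete(4672)    # AlphaZero-style 8x8x73 move encoding

# LLM: one token per step from a fixed vocabulary (~100k for modern models).
llm_actions = Discrete(100_000)

# Robotics: a continuous, multi-dimensional command, e.g. 7 joint torques.
robot_actions = Box(low=-1.0, high=1.0, shape=(7,))

for name, space in [("chess", chess_actions), ("llm", llm_actions), ("robot", robot_actions)]:
    print(name, space, space.sample())
```

The first two are enormous but discrete; the last one isn’t even countable, and that gap is what the rest of this section leans on.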
For games like chess and Go, researchers try to remove all human priors and have the models learn purely on their own. In addition, the rewards given to the model are sparse: it is never told whether an individual move is good, just whether or not the overall game was won. On the other hand, LLMs start with strong priors from pretraining and supervised finetuning i.e. LLMs don’t do RL-style training from scratch. They already have a good idea of what sorts of things work from pretraining on internet data and then being finetuned on curated datasets for important tasks like following instructions. LLMs can go through two forms of reinforcement learning.
The first is reinforcement learning from human feedback, or RLHF. A copious amount of data is collected on which responses humans prefer, and then reward models are trained to reflect those preferences. Then an LLM is trained against the reward model to be more likely to produce responses that are pleasing to humans. I think this is a fairly dense reward, and it relies a ton on human knowledge, but I think both are totally fine in this case because the task IS being useful to humans.
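For reference, the reward-model half of RLHF usually boils down to a pairwise preference loss of the Bradley–Terry flavor. The sketch below is a toy version under assumed inputs (a stand-in linear scorer over fake pooled response embeddings), not any lab’s actual implementation.

```python
# Toy reward-model training step for RLHF-style preference learning.
import torch
import torch.nn.functional as F

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Push the score of the human-preferred response above the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Stand-in "reward model": a linear head over fake pooled response embeddings.
reward_head = torch.nn.Linear(16, 1)
chosen_emb, rejected_emb = torch.randn(8, 16), torch.randn(8, 16)

loss = preference_loss(reward_head(chosen_emb), reward_head(rejected_emb))
loss.backward()  # gradients flow into the reward model's parameters
print(float(loss))
```

In practice the LLM is then optimized against this learned scorer (commonly with PPO or a variant), rather than against raw human labels directly.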
The second is reasoning training. This is the sort of approach used for OpenAI’s o1 or DeepSeek’s R1. Models are trained to answer math and coding questions with answers that can be automatically verified (e.g. just check whether the response is “4” for “2 + 2 =”). Correctness is used as a reward to improve the model’s problem-solving ability. However, even with strong priors, the sparse reward from the automated grader is not enough. Models exhibit artifacts like using both English and Chinese in a response, because the grader doesn’t care about anything else as long as you get to the right answer. You could even have a completely nonsensical chain of reasoning that just happens to land on the right answer. To fix this problem, DeepSeek first trains a model with pure RL. This creates a model that is good at spitting out the right answer but has those artifacts we mentioned. Then humans go through and curate good responses. Finally, the base model is trained on only these good responses. Other approaches to solving this problem include Process Reward Models (PRMs), which try to give additional rewards for responses that are more coherent. Approaches like this do work, but they have hit limits so far i.e. the vaunted recursive improvement hasn’t taken off yet.
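To show how little the grader sees, here is a toy verifiable reward of the kind described above. The answer-extraction format is an assumption for illustration; real graders are more careful, but the point stands: the reward only looks at the final answer, not the reasoning that produced it.

```python
# Toy outcome-only (verifiable) reward for reasoning-style RL.
import re

def extract_final_answer(response: str) -> str:
    # Assumes the model was prompted to end with "Answer: <value>".
    match = re.search(r"Answer:\s*(\S+)", response)
    return match.group(1) if match else ""

def verifiable_reward(response: str, ground_truth: str) -> float:
    # 1 if the final answer matches, 0 otherwise; everything else is ignored.
    return 1.0 if extract_final_answer(response) == ground_truth else 0.0

# A coherent solution and a nonsensical one earn the same reward.
print(verifiable_reward("2 + 2 = 4, so the Answer: 4", "4"))   # 1.0
print(verifiable_reward("苹果 banana 紫色... Answer: 4", "4"))   # 1.0
```

This is exactly the loophole that language-mixing and incoherent chains of reasoning slip through.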
The final case is robotics. In this domain, the entire problem is about imbuing as much human knowledge as possible. Reward functions are extremely complex, and they’re repeatedly engineered to address the weaknesses of prior models. Data collected from humans is leveraged whenever possible.
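A hand-shaped robotics reward typically looks something like the sketch below. Every term and weight here is an assumption made up for illustration, but the pattern is the real point: terms get stacked on over time to patch whatever the previous policy learned to exploit.

```python
# Illustrative hand-shaped reward for a reaching task; terms and weights are made up.
import numpy as np

def shaped_reward(dist_to_goal, joint_velocities, torques, fell_over):
    reward = -1.0 * dist_to_goal                                   # make progress toward the goal
    reward -= 0.01 * float(np.sum(np.square(joint_velocities)))   # discourage jerky motion
    reward -= 0.001 * float(np.sum(np.square(torques)))           # discourage wasted energy
    if fell_over:
        reward -= 100.0                                            # patch for a failure mode seen in earlier runs
    return reward

print(shaped_reward(0.5, np.zeros(7), np.zeros(7), fell_over=False))  # -0.5
```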
The conventional wisdom in reinforcement learning (and machine learning more broadly) is that the simplest rewards should be used whenever possible AND that encoding priors based on human knowledge can keep models from learning better behavior on their own. I believe the reason this conventional wisdom doesn’t hold for language-model post-training or robotics is their search spaces. It’s also why robotics relies more on human knowledge than language models do: it has an even bigger search space.
If this is the case, then human data is a critical bottleneck i.e. in order for the model to continue improving at any task, we need humans to produce guidance. There is no recursive loop driven entirely by a model itself. Now, I want to be clear: I’m sure that you could scale up models and use a PRM to get notable improvements. I’m also sure that the bump from the PRM will be bigger than in past iterations — purely because the models are bigger. I just think that it’ll still hit a limit — a new limit, but still a limit.
An addendum to the issue of search spaces is episode length. For a chess game, a model might make (on the order of) 40 moves before getting a result. The model must then deduce which of those moves led to that result i.e. it must attribute credit for the reward to each of the moves. As the sequence gets longer, it becomes harder and harder to tell which moves were instrumental and which moves didn’t really matter. This is also part of why language modeling and robotics require denser rewards: you need more reward signals spread out across more moves. This is particularly important for tasks like coding and research, since they’re long-context problems with many moves.
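This credit-assignment problem is easy to see with a single terminal reward. In the toy calculation below (an illustration, not anything from the forecast), a 40-move game with one win/loss reward at the end gives nearly identical discounted returns to every early move, so the learner gets almost no signal about which moves actually mattered.

```python
# Toy illustration of credit assignment with a single terminal reward.
def discounted_returns(rewards, gamma=0.99):
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# 40 moves, reward only at the end (win = +1).
rewards = [0.0] * 39 + [1.0]
returns = discounted_returns(rewards)
print([round(x, 3) for x in returns[:3]])   # first moves: ~0.676, 0.683, 0.689
print(round(returns[-1], 3))                # last move: 1.0
```

Denser, per-step rewards sidestep this by giving each move its own signal, which is what shaped robotics rewards and PRMs try to provide.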
Afterword
I did watch an interview with Daniel on Dwarkesh Patel’s fantastic podcast. They were joined by an author named Scott Alexander, who helped turn the AI 2027 forecast into a narrative story. Based on the interview, I will admit that I think there’s one key reason why Daniel’s (and Scott’s) projections differ so much from my projections (or the average person’s projections). That reason is very different views on how much AI has progressed in the last couple of years (really since ChatGPT). They are much more bullish on this. Additionally, I think I’m much, much more skeptical of any trendlines or benchmarks — I just think that you get very little information from any of this stuff. For example, we were told over and over again that just making the models bigger would solve everything because of a few data points on a scaling laws graph. Then GPT 4.5 was underwhelming, and everyone switched to “scaling test time compute” without ever addressing that miss.
What I’m trying to say is that while there are the more conceptual disagreements I tried to lay out above, the real big disagreement might be more fundamental. I do want to say that I’m very happy about all of the people making concrete predictions: for example, Dario’s prediction about AI writing 90% of code within 3-6 months. I’m hoping that we can hold people’s feet to the fire with such clear predictions — part of why I think feet will be held to the fire is the sheer amount of capital invested. What I’d hate is to see AI go down the same route as self-driving cars, where every year they tell us that a full solution will be here in 2 years. Anyway, thanks for reading.