Nathan Labenz challenges the idea that AI progress is flatlining, arguing that the perception of diminishing returns is misleading. In his view, the leap from GPT-4 to GPT-5 is as large as the leap from GPT-3 to GPT-4, but the steady stream of incremental releases in between has dulled people's sense of the scale of progress.
OpenAI's GPT-4o update went overboard on flattery, showing that AI doesn't always follow its system prompt. These models aren't like a toaster or an obedient genie; they're something weirder and more alien.
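To make the "system prompt" notion concrete, here is a minimal sketch using the OpenAI Python SDK. The model name and instruction text are illustrative assumptions, not the actual configuration behind the incident; the point is that the system message sets standing instructions the model is supposed to, but doesn't always, obey.

```python
# Minimal sketch of setting a system prompt via the OpenAI Python SDK.
# Model name and instruction text are illustrative assumptions only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # hypothetical model choice for illustration
    messages=[
        # The system message carries standing instructions for the model.
        {"role": "system", "content": "Be direct and honest; do not flatter the user."},
        {"role": "user", "content": "What do you think of my business plan?"},
    ],
)
print(response.choices[0].message.content)
```

A sycophantic model may still shower the user with praise despite that system message, which is exactly the gap between instruction and behavior being described here.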
GPT-4.5 scored 65% on the SimpleQA benchmark, a significant jump from the roughly 50% scored by the o3 models. SimpleQA measures knowledge of esoteric facts, so the gain points to a real improvement in the model's factual knowledge.
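For context on how such a score is derived, here is a hedged sketch of SimpleQA-style grading: each short factual answer is marked correct, incorrect, or not attempted against a gold answer, and the headline number is the fraction of all questions answered correctly. The exact-match grader below is a simplified stand-in for the benchmark's model-based grading, and the sample items are invented for illustration.

```python
# Sketch of SimpleQA-style scoring. The real benchmark uses an LLM grader;
# this exact-match check is a simplified stand-in for illustration.
from dataclasses import dataclass

@dataclass
class Item:
    question: str
    gold: str        # reference answer
    predicted: str   # model's answer ("" if it declined to answer)

def grade(item: Item) -> str:
    """Classify one answer as correct / incorrect / not_attempted."""
    if not item.predicted.strip():
        return "not_attempted"
    # Naive normalization; the actual benchmark grades semantically.
    if item.predicted.strip().lower() == item.gold.strip().lower():
        return "correct"
    return "incorrect"

def accuracy(items: list[Item]) -> float:
    """Headline score: share of all questions answered correctly."""
    return sum(grade(it) == "correct" for it in items) / len(items)

# Invented sample items, purely to show the computation.
items = [
    Item("Capital of Burkina Faso?", "Ouagadougou", "Ouagadougou"),
    Item("Year the Hubble telescope launched?", "1990", "1989"),
    Item("Who composed 'The Planets'?", "Gustav Holst", ""),
]
print(f"accuracy: {accuracy(items):.0%}")  # 1 of 3 correct -> 33%
```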
Vishal Misra reflects on the pace of LLM development, noting that GPT-3 was a nice parlor trick, but with advancements like ChatGPT and GPT-4 the technology has become polished and far more capable.