At Waymark, we're experimenting with reinforcement fine-tuning on open-source models like Qwen. Even if fine-tuning yields improvements, we might still opt for commercial models like GPT-5 for ease of operation and upgrades.
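For context, here is a minimal sketch of what reinforcement fine-tuning an open-weight model can look like, using Hugging Face TRL's GRPO trainer with a small Qwen checkpoint. The toy prompt dataset and length-based reward below are illustrative placeholders, not Waymark's actual task or reward design:

```python
# Hypothetical sketch of reinforcement fine-tuning (RFT) with TRL's GRPOTrainer.
# The dataset and reward function are placeholders, not Waymark's real setup.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt set; a real run would use a task-specific prompt distribution.
train_dataset = Dataset.from_dict(
    {"prompt": ["Write a one-line product tagline for a bakery."] * 64}
)

def reward_conciseness(completions, **kwargs):
    # Placeholder reward: prefer completions near 20 words.
    # A production reward would score actual task success.
    return [-abs(20 - len(c.split())) for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small open-weight stand-in
    reward_funcs=reward_conciseness,
    args=GRPOConfig(output_dir="qwen-rft"),
    train_dataset=train_dataset,
)
trainer.train()
```

The trade-off in the summary above follows directly: a loop like this gives control over the reward and the weights, while a commercial API shifts the operational and upgrade burden to the provider.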
Recent advances in AI include gold-medal performance at the International Mathematical Olympiad by pure reasoning models, without access to external tools. This marks a significant leap from what GPT-4 could accomplish in mathematics and underscores how quickly AI capabilities are progressing.
Nathan Labenz discusses the challenges of the GPT-5 launch, noting that early technical problems created a poor first impression: the model router was broken, causing queries to default to a less capable model, which fueled negative perceptions.
Nathan Labenz argues that while AI progress may be perceived as slowing down, the leap from GPT-4 to GPT-5 is significant, comparable to the leap from GPT-3 to GPT-4. He attributes the perception of stagnation to the incremental releases between major versions, which may have dulled the impact of the cumulative advance.
Vishal Misra reflects on the pace of development in LLMs, noting that GPT-3 felt like a nice parlor trick, but with advances like ChatGPT and GPT-4, the technology has become polished and far more capable.