Vishal Misra's work on understanding LLMs is profound. He has developed models that reduce the complex, high-dimensional space of LLMs to a geometric manifold, allowing us to predict where reasoning can move within that space. This approach reflects how humans simplify the complex universe into manageable forms for reasoning.
Because science is iterative, LLMs must engage in simulations, theoretical calculations, and experiments if they are to discover scientific insights.
The debate over LLMs versus other reasoning models highlights LLMs' limitations in understanding real-world context and in predicting the future.
Large Language Models (LLMs) create Bayesian manifolds during training. They confidently generate coherent outputs while traversing these manifolds, but veer into 'confident nonsense' when they stray from them.
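As a rough illustration of this idea (not Misra's actual methodology), one crude proxy for how far a piece of text sits from a model's manifold is the average per-token negative log-likelihood the model assigns to it. The model choice (gpt2) and the two example sentences below are assumptions made for the sketch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative choice of a small open model; any causal LM works the same way.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def mean_nll(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model,
    used here as a crude proxy for distance from the model's manifold."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Each position t is predicted from the tokens before it.
    return torch.nn.functional.cross_entropy(
        logits[0, :-1], ids[0, 1:], reduction="mean"
    ).item()

print(mean_nll("Water boils at one hundred degrees Celsius at sea level."))
print(mean_nll("Colorless green ideas sleep furiously on the boiling moon."))
```

Higher values suggest text in regions where the model is consistently surprised; this is a hedged illustration of the on-manifold/off-manifold intuition, not a definitive measure.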
LLMs develop deep representations of the world because their training process incentivizes them to: predicting the next token accurately is easier with a good internal model of the world.
Integrating geometric reasoning with LLMs can improve how atoms and design geometries are represented, benefiting scientific research.
Using LLMs and VLMs in robotics offers a way to build common sense into robotic systems, allowing them to make reasonable guesses about likely outcomes without first having to learn from their own mistakes.
LLMs respond differently to prompts depending on their information content. A prompt that carries a lot of information constrains the model, producing a low-entropy next-token distribution and therefore more precise outputs, because it narrows the realm of possibilities.
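A minimal sketch of that relationship, assuming a made-up four-word vocabulary and invented next-token distributions (the prompts in the comments are hypothetical examples):

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in bits) of a next-token distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # ignore zero-probability tokens
    return float(-(p * np.log2(p)).sum())

# Made-up next-token distributions over a tiny four-word vocabulary.
# A highly informative prompt ("The chemical symbol for gold is") pins the
# model down; a vague prompt ("My favourite thing is") leaves it uncertain.
informative_prompt = [0.97, 0.01, 0.01, 0.01]  # low prediction entropy
vague_prompt       = [0.25, 0.25, 0.25, 0.25]  # high prediction entropy

print(shannon_entropy(informative_prompt))  # ~0.24 bits
print(shannon_entropy(vague_prompt))        # 2.0 bits
```

The lower the entropy of the next-token distribution, the fewer plausible continuations remain, which is what "more precise outputs" means here.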
At the core of every LLM, regardless of its complexity or training method, is the generation of a probability distribution over the next token. Given a prompt, the model selects the next token from this distribution, appends it, and repeats the process iteratively.
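A minimal sketch of that loop, assuming the Hugging Face transformers API with gpt2 as an arbitrary small model; the prompt, temperature, and generation length are illustrative choices:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids
for _ in range(10):
    with torch.no_grad():
        logits = model(ids).logits[0, -1]        # distribution over the next token
    probs = torch.softmax(logits / 0.8, dim=-1)  # temperature 0.8
    next_id = torch.multinomial(probs, 1)        # sample one token
    ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1)  # append and repeat

print(tok.decode(ids[0]))
```

Greedy decoding, beam search, and nucleus sampling are all just different rules for picking from this same per-step distribution.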
The most impactful models for understanding LLMs, according to Martin, are those created by Vishal Misra. His work, including a notable talk at MIT, not only explores how LLMs reason but also offers reflections on human reasoning.