PortalsOS

Related Posts

More or Less: #119 OpenAI Sora vs. TikTok: C...

The debate over LLMs versus other kinds of reasoning models highlights the limitations of LLMs in understanding real-world context and predicting future events.

Even with a trillion parameters, LLMs cannot represent the entire matrix of possible prompts and responses. They interpolate based on training data and new prompts to generate responses, acting more like Bayesian models than stochastic parrots.
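
To make the Bayesian framing concrete, here is a minimal toy sketch (my construction, not anything from the episode): a bigram model with a uniform Dirichlet prior. Its posterior predictive interpolates between the prior and the observed counts, so even a context it has never seen yields a sensible distribution rather than a memorized lookup.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the rat".split()
vocab = sorted(set(corpus))

# Bigram counts observed during "training".
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def posterior_predictive(prev, alpha=1.0):
    """P(next | prev) under a symmetric Dirichlet(alpha) prior over the vocab."""
    c = counts[prev]
    total = sum(c.values()) + alpha * len(vocab)
    return {w: (c[w] + alpha) / total for w in vocab}

print(posterior_predictive("the"))  # seen context: counts dominate the prior
print(posterior_predictive("rat"))  # unseen context: falls back to the prior
```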

Large Language Models (LLMs) create Bayesian manifolds during training. They confidently generate coherent outputs while traversing these manifolds, but veer into 'confident nonsense' when they stray from them.
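
A rough way to see the manifold claim in code (an assumed framing, not Misra's own demo): score text by its average surprise under a small trained model. Text that stays near the training distribution scores low; shuffled, "off-manifold" text scores high, even though a sampler fed such a context would still emit fluent-looking tokens.

```python
import math
from collections import Counter, defaultdict

train = "the cat sat on the mat and the dog sat on the rug".split()
vocab = set(train)

bigrams = defaultdict(Counter)
for a, b in zip(train, train[1:]):
    bigrams[a][b] += 1

def avg_neg_log_prob(words, alpha=0.1):
    """Average surprise per bigram; low = on-manifold, high = off-manifold."""
    nll = 0.0
    for a, b in zip(words, words[1:]):
        c = bigrams[a]
        p = (c[b] + alpha) / (sum(c.values()) + alpha * len(vocab))
        nll -= math.log(p)
    return nll / (len(words) - 1)

print(avg_neg_log_prob("the dog sat on the mat".split()))  # familiar: low surprise
print(avg_neg_log_prob("mat the on dog sat the".split()))  # shuffled: high surprise
```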

Dwarkesh Podcast: Richard Sutton – Father of RL ...

LLMs are criticized for lacking a true world model because they predict human-like responses rather than actual outcomes.

The matrix abstraction for LLMs pictures a gigantic matrix in which each row corresponds to a possible prompt and each column to a vocabulary token, the row holding the probability of each token coming next. Despite its astronomical size, this matrix is extremely sparse: nearly all possible prompts never arise, so the vast majority of entries are zero.
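
A sketch of how that sparse matrix might be realized (the prompts and numbers are illustrative, not from the talk): store only the rows and entries that carry probability mass, and treat everything absent as an implicit zero.

```python
from collections import defaultdict

VOCAB_SIZE = 50_000  # columns: one per token (illustrative figure)
# Rows: one per possible prompt. With a 1,000-token context there are
# VOCAB_SIZE ** 1000 rows -- far too many to materialize, hence sparsity.

# prompt -> {token: probability}; every absent row or entry is an implicit zero.
# (Row masses below don't sum to 1; the remainder sits on omitted tokens.)
matrix = defaultdict(dict)
matrix["the sky is"] = {"blue": 0.82, "clear": 0.11, "falling": 0.01}
matrix["2 + 2 ="] = {"4": 0.97, "5": 0.01}

def prob(prompt, token):
    """Look up P(token | prompt) without storing the zeros."""
    return matrix[prompt].get(token, 0.0)

print(prob("the sky is", "blue"))    # stored entry
print(prob("the sky is", "banana"))  # implicit zero
```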

a16z Podcast: Columbia CS Professor: Why LLM...

LLMs respond differently depending on how much information a prompt carries. A high-information prompt narrows the space of plausible continuations, so the prediction entropy of the next-token distribution drops and the output becomes more precise.
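
A small illustration of the two entropies at play (the distributions are invented for the example): the Shannon entropy of the next-token distribution measures how constrained the continuation is, and a more informative prompt drives it down.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a {token: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Vague prompt ("Write about ..."): many continuations remain plausible.
vague = {"the": 0.25, "a": 0.25, "it": 0.25, "some": 0.25}
# Informative prompt ("The capital of France is ..."): few continuations survive.
specific = {"Paris": 0.90, "France": 0.07, "Lyon": 0.03}

print(f"vague prompt:    {entropy(vague):.2f} bits")     # 2.00
print(f"specific prompt: {entropy(specific):.2f} bits")  # ~0.56
```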

LLMs can perform few-shot learning by forming the right posterior distribution over next tokens from the examples supplied in the prompt. The mechanism is identical whether we call it in-context learning or plain continuation.
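
A toy demonstration of that claim (my construction, standing in for a real LLM): a "model" whose next-token posterior is built from bigram statistics of the conditioning text itself. Given in-context examples it picks up their pattern with no parameter updates; the interface is ordinary continuation.

```python
from collections import Counter, defaultdict

def p_next(tokens):
    """Posterior over the next token, from bigram counts in the prompt itself."""
    counts = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    dist = counts[tokens[-1]]
    total = sum(dist.values())
    return {t: c / total for t, c in dist.items()} if total else {}

# In-context examples followed by a query, all in one prompt.
few_shot = "A -> 1 B -> 2 C ->".split()
print(p_next(few_shot))  # mass lands on digits: the '->' pattern was picked up
```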

At the core of every LLM, regardless of its complexity or training method, is the creation of a distribution over the next token. Given a prompt, the model predicts that distribution, selects a token from it, appends it to the context, and repeats the process iteratively.
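
A minimal sketch of that loop (the logit table below is a hand-written stand-in for a trained network): convert per-token scores into a distribution with a softmax, sample one token, feed it back, and repeat.

```python
import math
import random

random.seed(0)

LOGITS = {  # toy "model": context token -> logits over candidate next tokens
    "<s>": {"the": 2.0, "a": 1.0},
    "the": {"cat": 1.5, "dog": 1.4, "end": 0.1},
    "a":   {"cat": 1.0, "dog": 1.0, "end": 0.2},
    "cat": {"sat": 2.0, "end": 0.5},
    "dog": {"sat": 1.8, "end": 0.5},
    "sat": {"end": 3.0},
}

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    exps = {t: math.exp(v / temperature) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def generate(max_tokens=10):
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = softmax(LOGITS[tokens[-1]])   # distribution over the next token
        nxt = random.choices(list(dist), weights=list(dist.values()))[0]
        if nxt == "end":
            break
        tokens.append(nxt)                   # feed the sample back and repeat
    return " ".join(tokens[1:])

print(generate())
```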

According to Martin, the most impactful models for understanding LLMs are those created by Vishal Misra. His work, including a notable talk at MIT, not only explores how LLMs reason but also offers reflections on human reasoning.