Summary of deep dive into LLMs like ChatGPT from Andrej Karpathy ⚡️
We often fear the unknown simply because we don't know it yet. Often the simplest thing we can do is begin anywhere and learn!

Andrej Karpathy probably needs no introduction, but if you are hearing the name for the first time: he is a world-renowned expert in AI and deep learning who knows his stuff, having led major projects at Tesla and OpenAI.
This video shines a light on how LLMs are trained, what they are good for, and what they are not, and I found it highly practical. It does not explain the math or the underlying algorithms, but it gives you a solid intuition about the current landscape.
What did I take away from this?
Models are extremely good at brainstorming, generating ideas, and writing code, but by design they act like stochastic parrots, often imitating data provided by human experts.
We should not blindly trust what they generate or deploy them in mission-critical areas, as they hallucinate. While we should use them for inspiration, we should *always review and verify their outputs before* using them for production-grade work. They are wonderful tools in our toolbelt, and we should leverage them to make ourselves more productive.
What are the major steps?
Pre-training: LLMs are first trained on large amounts of web text to form a base model. Text is converted into tokens, and the model's parameters are knobs tuned during training; given a context of tokens, the model uses them to produce a probability distribution over the next token.
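The pre-training step above can be sketched with a toy example. The vocabulary, token ids, and "parameters" below are all made up for illustration — a real tokenizer uses subword units (BPE) and a real model has billions of learned weights — but the shape of the idea is the same: text becomes token ids, and the model maps a context to a probability distribution over the next token.

```python
# Toy vocabulary: word -> token id (real tokenizers split subwords, not words).
vocab = {"the": 0, "cat": 1, "sat": 2, "ran": 3}

def tokenize(text):
    """Convert human-readable text into a list of token ids."""
    return [vocab[word] for word in text.split()]

# The "parameters" as a lookup table: context tokens -> next-token distribution.
# In a real LLM this mapping is computed by a neural network.
next_token_probs = {
    (0, 1): {2: 0.7, 3: 0.3},  # after "the cat": "sat" 70%, "ran" 30%
}

def predict_next(token_ids):
    """Return the most likely next token id for the given context."""
    probs = next_token_probs[tuple(token_ids[-2:])]
    return max(probs, key=probs.get)

ids = tokenize("the cat")
print(ids)                # [0, 1]
print(predict_next(ids))  # 2, i.e. "sat"
```

In practice the model samples from this distribution rather than always taking the argmax, which is one reason the same prompt can yield different outputs.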
Supervised fine-tuning (SFT): human labelers sit down and write ideal responses to questions, which the model then learns to imitate. This turns a base model into an assistant model like GPT-4o.
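A minimal sketch of the imitation objective, with assumed numbers: the model is trained to assign high probability to each token of the labeler's ideal response, i.e. to minimize the average negative log-likelihood (cross-entropy) over those tokens.

```python
import math

# Hypothetical per-token probabilities the model assigns to the tokens of a
# human-written ideal response (made-up values for illustration).
predicted_probs = [0.9, 0.6, 0.8]

# Average negative log-likelihood: lower means the model imitates the
# labeler's response more closely.
loss = -sum(math.log(p) for p in predicted_probs) / len(predicted_probs)
print(f"{loss:.3f}")  # 0.280
```

Training nudges the parameters so these probabilities rise, driving the loss toward zero.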
RL (reinforcement learning) is the next evolution: the model generates many candidate outputs, which are scored, either automatically against verifiable answers (math, code) or by a reward model trained on human feedback (RLHF) for unverifiable domains like humor or writing. Reasoning models such as OpenAI o1 and o3 emerge from this stage; they produce chains of thought in which they check their work and backtrack if required to improve accuracy.
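The "verifiable domain" case can be sketched very simply (made-up problem and samples): candidate answers are checked against a known ground truth, and the correct ones receive reward, so no human judge or reward model is needed. In unverifiable domains, the exact-match check below would be replaced by a reward model's score.

```python
# Ground truth for a hypothetical math problem the model was asked to solve.
correct_answer = 42

# Hypothetical sampled model outputs (final answers extracted from each
# chain of thought).
candidates = [41, 42, 40, 42]

# Verifiable reward: 1.0 for a correct answer, 0.0 otherwise. Training then
# reinforces the reasoning traces that led to high-reward answers.
rewards = [1.0 if c == correct_answer else 0.0 for c in candidates]
print(rewards)  # [0.0, 1.0, 0.0, 1.0]
```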
Tips
You can ask models to use web search or to run code to perform better at tasks like math, rather than relying on raw token prediction.
Tools
Tiktokenizer: convert human-readable text into tokens that the model uses
Hyperbolic: cloud to run models
Together AI: cloud to run models
Hugging Face: download data sets
LM Studio: Download local models
Stay up to date with news
I highly recommend watching the full video:
Deep Dive into LLMs like ChatGPT
Thank you, Andrej, for putting this out into the world.
#AI #GenAI #LLM