Summary of deep dive into LLMs like ChatGPT from Andrej Karpathy ⚡️
We often fear the unknown simply because we don't know it yet. Often the simplest thing we can do is begin anywhere and learn!

Andrej Karpathy probably needs no introduction, but if you are hearing the name for the first time: he is a world-renowned expert in AI and deep learning who knows his stuff, having led major projects at Tesla and OpenAI.
This video shines a light on how LLMs are trained, what they are good for, and what they are not, and I found it highly practical. It does not explain the math or the underlying algorithms, but it gives you a solid intuition about the current landscape.
What did I take away from this?
Models are extremely good at brainstorming, generating ideas, and writing code, but by design they act like stochastic parrots, often imitating data provided by human experts.
We should not blindly trust what they generate or deploy them in mission-critical areas, as they hallucinate. While we should use them for inspiration, we should *always review and verify their outputs before* using them for production-grade work. They are wonderful tools in our toolbelt, and we should leverage them to make ourselves more productive.
What are the major steps?
Pre-training: LLMs are first trained on large amounts of web text to form a base model. Text is converted into tokens, and the model's parameters are knobs tuned during training; given a context of tokens, the model uses them to produce a probability distribution over the next token.
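The pre-training step above can be sketched with a toy example. The vocabulary, token ids, and "parameters" below are all made up for illustration — a real tokenizer uses subword units (BPE) and a real model has billions of learned weights — but the shape of the idea is the same: text becomes token ids, and the model maps a context to a probability distribution over the next token.

```python
# Toy vocabulary: word -> token id (real tokenizers split subwords, not words).
vocab = {"the": 0, "cat": 1, "sat": 2, "ran": 3}

def tokenize(text):
    """Convert human-readable text into a list of token ids."""
    return [vocab[word] for word in text.split()]

# The "parameters" as a lookup table: context tokens -> next-token distribution.
# In a real LLM this mapping is computed by a neural network.
next_token_probs = {
    (0, 1): {2: 0.7, 3: 0.3},  # after "the cat": "sat" 70%, "ran" 30%
}

def predict_next(token_ids):
    """Return the most likely next token id for the given context."""
    probs = next_token_probs[tuple(token_ids[-2:])]
    return max(probs, key=probs.get)

ids = tokenize("the cat")
print(ids)                # [0, 1]
print(predict_next(ids))  # 2, i.e. "sat"
```

In practice the model samples from this distribution rather than always taking the argmax, which is one reason the same prompt can yield different outputs.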
Supervised fine-tuning (SFT): human labelers sit down and write ideal responses to questions, which the model then learns to imitate. This turns a base model into an assistant model like GPT-4o.
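A minimal sketch of the imitation objective, with assumed numbers: the model is trained to assign high probability to each token of the labeler's ideal response, i.e. to minimize the average negative log-likelihood (cross-entropy) over those tokens.

```python
import math

# Hypothetical per-token probabilities the model assigns to the tokens of a
# human-written ideal response (made-up values for illustration).
predicted_probs = [0.9, 0.6, 0.8]

# Average negative log-likelihood: lower means the model imitates the
# labeler's response more closely.
loss = -sum(math.log(p) for p in predicted_probs) / len(predicted_probs)
print(f"{loss:.3f}")  # 0.280
```

Training nudges the parameters so these probabilities rise, driving the loss toward zero.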
RL (reinforcement learning) is the next evolution: the model generates many candidate outputs, which are scored, either automatically against verifiable answers (math, code) or by a reward model trained on human feedback (RLHF) for unverifiable domains like humor or writing. Reasoning models such as OpenAI o1 and o3 emerge from this stage; they produce chains of thought in which they check their work and backtrack if required to improve accuracy.
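The "verifiable domain" case can be sketched very simply (made-up problem and samples): candidate answers are checked against a known ground truth, and the correct ones receive reward, so no human judge or reward model is needed. In unverifiable domains, the exact-match check below would be replaced by a reward model's score.

```python
# Ground truth for a hypothetical math problem the model was asked to solve.
correct_answer = 42

# Hypothetical sampled model outputs (final answers extracted from each
# chain of thought).
candidates = [41, 42, 40, 42]

# Verifiable reward: 1.0 for a correct answer, 0.0 otherwise. Training then
# reinforces the reasoning traces that led to high-reward answers.
rewards = [1.0 if c == correct_answer else 0.0 for c in candidates]
print(rewards)  # [0.0, 1.0, 0.0, 1.0]
```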
Tips
You can ask models to use web search or to run code to perform better at tasks like math, rather than relying on raw token prediction.
Tools
Tiktokenizer: convert human-readable text into tokens that the model uses
Hyperbolic: cloud to run models
Together AI: cloud to run models
Hugging Face: download data sets
LM Studio: Download local models
Stay up to date with news
I highly recommend watching the full video:
Deep Dive into LLMs like ChatGPT
Thank you, Andrej, for putting this out into the world.
#AI #GenAI #LLM