LLM Starter Pack

Here is a minimal set of resources for getting started with large language models. Both include practical, hands-on components that you can work through yourself, and both span multiple modalities (video, text, community). The two are in different styles but come from the same person, so they build on each other.

Andrej Karpathy’s GPT from Scratch

Andrej takes a teacher-student approach here and shows how to train a small GPT on your own computer. The dataset is interesting but small enough to train on locally. The codebase is also small and easy to read, and in two hours you’ll see every piece of an end-to-end model in code.

This won’t get you to a ChatGPT-style model, and he explains why in the video.

Andrej Karpathy’s State of GPT

This second video is a conference presentation in which Andrej explains all the parts it takes to make a production-grade chat model. Hearing it from the same person means the vocabulary aligns, which makes it simpler to connect what you learned in the first video with this one.