Notes by Sarah Chieng | Reference Video | Reference Slides
Note:
Hi there! It’s Sarah. This page started as my personal notes on Karpathy’s 1hr “Intro To Large Language Models” video. I thought it was a great, thorough, and beginner-friendly introduction to LLMs, so I compiled and polished my notes to share :) It’s about a 5 minute read, and of course if you have time, I’d encourage watching the actual video!
A couple of important things to note — I have:
Also, big thanks to Kudzo Ahegbebu and Harper Carroll for providing feedback on the content, factual correctness, and structure of this page :) This document is a work in progress and will ideally become a helpful resource for anyone wanting to learn more about LLMs. Any feedback is appreciated (💌 [email protected])
Table of Contents:
An LLM is a type of neural network that specializes in processing, understanding, and generating human language.
Given a sequence of words, the LLM predicts the next word.
Doing this well requires the network’s parameters to encode a lot of knowledge about the world
All of this knowledge is compressed into the parameters
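To make the next-word-prediction idea concrete, here is a minimal toy sketch (not from the video, and nothing like a real LLM internally): the hand-written probability table below stands in for what a real model’s billions of learned parameters would encode, and prediction is just picking the most likely continuation of the context.

```python
# Toy stand-in for an LLM: a tiny lookup table of learned
# next-word probabilities. In a real LLM, this knowledge is
# compressed into billions of parameters, not an explicit table.
next_word_probs = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "is": 0.1},
    ("cat", "sat"): {"on": 0.8, "down": 0.2},
}

def predict_next(context):
    """Return the most probable next word for a two-word context."""
    probs = next_word_probs[tuple(context)]
    return max(probs, key=probs.get)

print(predict_next(["the", "cat"]))  # -> sat
```

A real LLM does the same thing in spirit, but computes a probability over its entire vocabulary for any context, using a neural network instead of a lookup table.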
**An example of an open-source LLM: [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf)**