🇬🇧 Limited Time — UK Only·🎓 Free Learning for 1 Month·🤖 Free AI Training Included·📚 4,000+ Lessons · 35,000+ Quizzes·🏆 GCSE Mocks · Olympiad Papers·⚡ Selected Students Only · Limited Places·🎁 Free Value Worth £2,000
🤖Free Course

AI & ML

Build intelligent apps with machine learning and large language models.

20 Lessons · Advanced

Lessons

1

Data and simple patterns

AI often starts with data: numbers, lists, categories. A simple 'pattern' might be: the average of a list, or the most common item. Here we work with a list of numbers and find the average—like a tiny step toward what ML does with lots of data.
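
The two patterns mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own exercise: the `scores` list is made-up sample data, and `statistics.mode` from the standard library is used here as one convenient way to find the most common item.

```python
import statistics

def average(numbers):
    """Return the average (sum divided by count) of a non-empty list."""
    return sum(numbers) / len(numbers)

scores = [4, 8, 6, 8, 9]  # made-up sample data

print(average(scores))          # 7.0
print(statistics.mode(scores))  # 8, the value that appears most often
```

Real ML works with far more data, but the idea is the same: reduce many values to a summary that describes them.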

OpenAI has not published the size of GPT-4's training data; its predecessor GPT-3 was trained on roughly 300 billion tokens of text
The ImageNet dataset (about 14 million labelled images) powered the 2012 breakthrough in computer vision
Garbage in, garbage out: biased training data produces biased models
2

Simple prediction rule

A very simple 'model' is a rule we write by hand. For example: if score >= 80 then predict 'pass'. Machine learning later learns such rules from data. Here we write a small rule and use it on a few examples.
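
A hand-written rule like the one above might look as follows. This is a sketch under assumptions: the name `predict`, the constant `PASS_MARK`, and the `'fail'` label for scores below the threshold are illustrative choices, since the text only names the `'pass'` case.

```python
PASS_MARK = 80  # threshold from the example rule: score >= 80

def predict(score):
    """Apply the hand-written rule to one score."""
    # 'fail' is an assumed label for everything below the threshold.
    return "pass" if score >= PASS_MARK else "fail"

for score in [95, 80, 79]:
    print(score, predict(score))
```

A learning algorithm would pick the value of `PASS_MARK` from examples instead of having it typed in by hand.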

Linear regression is one of the simplest ML algorithms: it fits a straight line to data
Decision trees are easy-to-understand models for classification; random forests combine many trees to improve accuracy
Overfitting: the model memorises training data but fails on new data because it is too complex for the dataset
3

Loops over data

AI programs often loop over lots of data: for item in data: ... We might count, sum, or check a condition. Here we loop over a list and collect a result—like building a simple 'dataset' of answers.
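
One loop can count, sum, and check a condition at the same time. The sketch below assumes a made-up list of numbers and a hypothetical helper called `scan`; the labels `'big'` and `'small'` and the limit of 10 are illustrative only.

```python
def scan(data, limit=10):
    """Loop once over data: sum it, count large items, and label each item."""
    total = 0
    over_limit = 0
    labels = []
    for item in data:
        total += item                 # running sum
        if item > limit:              # check a condition
            over_limit += 1           # count matches
            labels.append("big")
        else:
            labels.append("small")
    return total, over_limit, labels

total, over_limit, labels = scan([3, 12, 7, 20, 5])
print(total)       # 47
print(over_limit)  # 2
print(labels)      # ['small', 'big', 'small', 'big', 'small']
```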

Training GPT-3 cost an estimated $4.6 million in compute, by one outside estimate: the training loop ran for weeks on thousands of GPUs
Adam (published in 2014) is a default optimiser for training many modern neural networks
Batch size: 32 or 64 is a common starting point, a balance between stability and training speed
4

Mean and median

Mean is the average (sum / count). Median is the middle value when sorted, or the average of the two middle values when the count is even. Both summarise a list of numbers. AI often uses these to understand data.
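
Both statistics can be written by hand, which also shows why they can disagree. This is a minimal sketch with made-up numbers; for a list with an even count it averages the two middle values, which is the usual convention and also what Python's `statistics.median` does.

```python
def mean(values):
    """Average: sum divided by count."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted list (average of the two middle values if even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [1, 2, 3, 4, 100]   # one very large value
print(mean(data))          # 22.0, pulled up by the 100
print(median(data))        # 3, unaffected by the 100
print(median([1, 2, 3, 4]))  # 2.5
```

The median is often preferred when a few extreme values would distort the mean.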

5

Counting by category

Group data by category and count. Example: count how many fruits are 'apple', 'banana', etc. This is like building a simple histogram—useful before training a classifier.
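
Counting by category needs only a dictionary. The sketch below uses a made-up list of fruits and a hypothetical helper `count_by_category`, then prints a rough text histogram.

```python
def count_by_category(items):
    """Return a dict mapping each category to how many times it appears."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

fruits = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = count_by_category(fruits)
print(counts)  # {'apple': 3, 'banana': 2, 'cherry': 1}

# A simple text histogram: one '#' per item in the category.
for name, n in counts.items():
    print(f"{name:<8}{'#' * n}")
```

Counts like these show whether some categories are much rarer than others, which is worth knowing before training a classifier.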

6

Thresholds and rules

Many simple AI rules use a threshold: if score > 70 then 'pass'. You can combine several rules. This is the idea behind decision rules and decision trees.
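
Combining two thresholds already gives a tiny decision tree. In this sketch the `score > 70` rule comes from the text above, while the `attendance` feature, its 0.8 threshold, and the `'review'` and `'fail'` labels are assumptions added for illustration.

```python
def decide(score, attendance):
    """Two nested threshold checks, like a very small decision tree."""
    if score > 70:                 # first rule, from the example above
        if attendance >= 0.8:      # second rule (assumed for illustration)
            return "pass"
        return "review"
    return "fail"

print(decide(85, 0.9))   # pass
print(decide(85, 0.5))   # review
print(decide(70, 1.0))   # fail, because 70 is not greater than 70
```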

7

Lists of records

Data for AI often comes as a list of records (dicts): each item has the same keys (e.g. age, score, result). You can loop and filter or aggregate by key.
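
A list of records can be filtered and aggregated with ordinary loops or comprehensions. The three records below are made up; the keys `age`, `score`, and `result` follow the example in the text.

```python
records = [
    {"age": 15, "score": 82, "result": "pass"},
    {"age": 16, "score": 64, "result": "fail"},
    {"age": 15, "score": 91, "result": "pass"},
]

# Filter: keep only the records whose result is 'pass'.
passed = [r for r in records if r["result"] == "pass"]

# Aggregate by key: average score of the filtered records.
average_pass_score = sum(r["score"] for r in passed) / len(passed)

print(len(passed))         # 2
print(average_pass_score)  # 86.5
```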

8

Min, max, and simple scaling

Sometimes we scale numbers to a range (e.g. 0–1). One way: (x - min) / (max - min). This can help when comparing different features in data.
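
The scaling formula above translates directly into code. This sketch uses a hypothetical helper `min_max_scale`; returning zeros when every value is the same is an assumed choice to avoid dividing by zero, not something the text specifies.

```python
def min_max_scale(values):
    """Scale values to the range 0-1 using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so max - min is 0; return zeros (assumed choice).
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

print(min_max_scale([10, 20, 30, 50]))  # [0.0, 0.25, 0.5, 1.0]
```

After scaling, a feature measured in the thousands and one measured in fractions sit on the same 0 to 1 range.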

9

Majority vote

A simple way to 'combine' several answers is majority vote: count each label and pick the one that appears most. This is the idea behind simple ensemble methods.
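
Majority vote can be sketched with `collections.Counter` from the standard library. The votes below are made up, and the helper also returns the winner's share of the votes, an extra added here to show how strong the agreement is.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label and the fraction of votes it received."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

votes = ["cat", "dog", "cat", "cat", "dog"]  # e.g. answers from five simple rules
winner, share = majority_vote(votes)
print(winner)  # cat
print(share)   # 0.6
```

If two labels tie, `Counter.most_common` returns the one that was seen first, so a real system would need an explicit tie-breaking rule.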

10

Distance between points

In 2D, the distance between (x1,y1) and (x2,y2) is sqrt((x2-x1)**2 + (y2-y1)**2). Similar points have small distance. This idea is used in nearest-neighbour and clustering.
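
The distance formula above, plus a one-line nearest-neighbour search built on it, might look like this. The points are made up and `nearest` is a hypothetical helper name.

```python
import math

def distance(p, q):
    """Straight-line distance between two 2D points given as (x, y) tuples."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

def nearest(target, points):
    """Return the point with the smallest distance to target."""
    return min(points, key=lambda p: distance(target, p))

print(distance((0, 0), (3, 4)))                   # 5.0
print(nearest((1, 1), [(0, 0), (5, 5), (2, 1)]))  # (2, 1)
```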

11

Train vs test idea

In ML we often split data: use part to 'train' (learn) and part to 'test' (check how well it works on new data). Here we simulate: take 80% of a list as train, 20% as test.
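
An 80/20 split can be simulated as below. Shuffling before the split is an assumption added here (it avoids the test part always being the last items); the fixed `seed` just makes the shuffle repeatable. The function name mirrors a common convention but is written by hand, not imported from a library.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items, then cut it into train and test parts."""
    shuffled = list(items)                 # copy, so the original is unchanged
    random.Random(seed).shuffle(shuffled)  # repeatable shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

data = list(range(10))
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

Every item lands in exactly one of the two parts, so the test part really is data the 'training' never saw.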

12

What are features?

Features are the inputs we use to make a prediction (e.g. age, score, colour). We often store them as numbers or categories. Good features help the model; bad ones don't.
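
To show how numbers and categories can sit together, here is a sketch that turns one record into a list of numeric features. The record, the fixed `COLOURS` list, and the one-hot encoding of the colour are all illustrative assumptions; one-hot encoding is just one common way to represent a category as numbers.

```python
COLOURS = ["red", "green", "blue"]  # assumed set of possible categories

def to_features(record):
    """Turn a record into numbers: age, score, then one 0/1 flag per colour."""
    one_hot = [1 if record["colour"] == c else 0 for c in COLOURS]
    return [record["age"], record["score"]] + one_hot

print(to_features({"age": 15, "score": 82, "colour": "green"}))
# [15, 82, 0, 1, 0]
```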

13

Simple vs complex rules

A very simple rule (e.g. always predict 'yes') might miss real patterns: this is underfitting, or high bias. A very complex rule might fit noise in the training data: this is overfitting, or high variance. In practice we try to find a balance.

14

Accuracy

Accuracy = correct predictions / total predictions. After we have predictions and true answers, we count how many match and divide by total. It's the simplest metric for classification.
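
The accuracy formula can be sketched as a short function. The predictions and true answers below are made up; the length check is an added safeguard.

```python
def accuracy(predictions, truths):
    """Fraction of predictions that match the true answers."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)

predicted = ["pass", "fail", "pass", "pass", "fail"]
actual    = ["pass", "fail", "fail", "pass", "fail"]
print(accuracy(predicted, actual))  # 0.8 (4 of 5 match)
```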

15

Prompts and LLMs

LLMs take text (a prompt) and generate more text. The prompt tells the model what to do. In code we might call an API with a prompt. Here we simulate: a function that 'responds' based on keywords.
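
The simulation described above might be sketched like this. `fake_llm` is a hypothetical stand-in, not a real model or API: it only checks for keywords, and the keywords and replies are made up.

```python
def fake_llm(prompt):
    """Pretend 'model': picks a canned reply based on keywords in the prompt."""
    text = prompt.lower()
    if "hello" in text:
        return "Hi there! What would you like to learn?"
    if "average" in text:
        return "To find an average, add the numbers and divide by how many there are."
    return "I'm not sure. Can you give me more detail?"

print(fake_llm("Hello, model"))
print(fake_llm("How do I compute an AVERAGE?"))
print(fake_llm("Tell me a joke"))
```

A real LLM generates its reply token by token rather than looking it up, but the shape is the same: text goes in, text comes out.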

16

Fairness and ethics

AI can reflect biases in data. If training data is unfair, the model might be unfair too. We should think about who is affected and whether the system is fair. Checking data and results helps.

17

Data pipeline idea

Real AI often has a pipeline: load data → clean (fix missing, errors) → transform (features) → train → evaluate. Here we do a tiny pipeline: load a list, filter, then compute a stat.
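
A tiny pipeline of this kind can be written as three small functions. Everything here is illustrative: the hard-coded scores stand in for loaded data, and treating `None` or values outside 0 to 100 as bad entries is an assumed cleaning rule.

```python
def load():
    """Stand-in for loading data; None and -1 represent missing or bad entries."""
    return [72, None, 85, -1, 90, 64]

def clean(raw):
    """Keep only real scores in the range 0-100 (assumed cleaning rule)."""
    return [x for x in raw if x is not None and 0 <= x <= 100]

def summarise(values):
    """Compute one statistic: the mean."""
    return sum(values) / len(values)

cleaned = clean(load())
print(cleaned)             # [72, 85, 90, 64]
print(summarise(cleaned))  # 77.75
```

Keeping each step in its own function makes it easy to swap one step without touching the others.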

18

Text as tokens

LLMs see text as tokens (pieces: words or subwords). Longer text = more tokens. We can simulate by splitting a string into words and counting.
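
Splitting on whitespace gives a rough feel for tokens. This is only a simulation: real LLM tokenisers split text into subword pieces, so their counts usually differ from a simple word count. `count_tokens` is a hypothetical helper name.

```python
def count_tokens(text):
    """Rough token count: number of whitespace-separated words."""
    return len(text.split())

short = "AI starts with data"
longer = "AI often starts with data such as numbers, lists and categories"

print(short.split())         # ['AI', 'starts', 'with', 'data']
print(count_tokens(short))   # 4
print(count_tokens(longer))  # 11
```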

19

Embeddings idea

An embedding turns text (or other data) into a list of numbers so that similar things have similar numbers. We don't build one here—we just understand: same idea as 'encode as a vector' for comparison.
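
To make 'similar things have similar numbers' concrete, the sketch below compares hand-made vectors with cosine similarity, one common way to compare embeddings. The three vectors are invented for illustration and do not come from any real model.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors: close to 1 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 'embeddings', written by hand for this example only.
toy = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(cosine_similarity(toy["cat"], toy["dog"]))  # high: similar vectors
print(cosine_similarity(toy["cat"], toy["car"]))  # much lower: different vectors
```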

20

AI concepts recap

You've seen: data and patterns, simple rules, loops over data, mean/median, counting by category, thresholds, lists of records, scaling, majority vote, distance, train/test split, features, simple vs complex rules, accuracy, prompts, fairness, pipelines, tokens, embeddings. These are building blocks for real ML and LLMs.

Start learning AI & ML free today

AI tutoring · quizzes · projects · works on any device