Week 2 ยท Session 2 ยท 2 hours

Data Rules Everything

Four multiplications and an average per step — that’s the entire engine of deep learning. You’ll leave saying: “I did the math the machine does — by hand.”

Begin โ†’
Today you will

See that it was never magic

โœ๏ธ

Gradient descent by hand

Do with a pencil what every GPU does billions of times a second.

๐Ÿ“‰

Meet overfitting

The U-curve โ€” why a model that aces practice can fail the real test.

โš–๏ธ

Measure real bias

Produce bias on purpose with a data filter, then catch it with a metric.

The Arc

Your step-by-step for today

The engine of deep learning

Gradient descent, step by step

Our data: x = [1,2,3,4], y = [2.1, 3.9, 6.2, 7.8] (so y โ‰ˆ 2x). Our model: prediction = w ยท x. We start at w = 0 with learning rate 0.05. Watch w climb toward 2 all by itself.

The point: subtracting a negative gradient moved w UP โ€” downhill in loss can mean uphill in w. The gradient knows the direction; you just obey the minus sign. That’s the whole loop.
Learning-rate chaos โ€” predict, then reveal

The one knob that can break everything

The learning rate is your step size downhill. Tap each to see what it does to training.

The U-curve

Why acing practice can be a trap

As you make a model more powerful, its error on the training data always falls. But its error on new data falls, hits a sweet spot, then rises โ€” the model started memorizing noise instead of learning patterns. That’s overfitting.

Train error only ever falls โ€” why is that a trap? Because a model that memorized the answer key looks perfect on practice and fails the real exam. The only honest grade is the test split it never saw.
Bias with teeth

How a “94% accurate” model hides real harm

In the notebook you deliberately starve digit 8 down to just 5 training examples, then measure. Tap to reveal what the metrics say.

The sentence of the night: “Overall accuracy hid the harm; per-group metrics exposed it.”
Law #2 ยท earned today
Nobody has to intend bias for it to be real.

You produced it on purpose with a data filter, and fixed it with data. Which is exactly why “we didn’t mean to” is not a defense at a company that never measured per-group.

This already happened

Real bias, real consequences

These are real. For each, ask the Week-2 question: which metric, computed for which group, would have exposed this BEFORE launch? Tap to open.

๐Ÿ“

Badge: Data Scientist

Earn it: 3 gradient-descent steps by hand matching the notebook; U-curve explained; recall-for-8 measured before & after the fix.

๐ŸŽ’ Mission โ€” before next session (1โ€“2 hrs)
  • One NB2 stretch: the blame-detective (which digit absorbed the 8s?) OR the tipping-point plot (recall vs. examples kept).
  • Verify one AI answer against two independent sources; stamp it accept / revise / reject.
  • Confirm your Week 3 corpus is saved and reachable from your laptop.