A Shallow Dive into Deep Learning

Issue 6

Last week, I shared a very basic use case for organizations looking to dip their toes into the AI water, so to speak: creating a knowledge base for a fictional company using ChatGPT-4o, which lets new users safely experiment with iterative questioning, explore the tool’s interface, and more.

This week, I will take a step back and explain how tools like ChatGPT-4o work by first defining three key components:

  1. Deep Learning (TODAY)

  2. Neural Networks

  3. Large Language Models (what tools like ChatGPT-4o are)

This trio is kind of like a chaotic kitchen staffed by millions of tiny, forgetful chefs. Deep learning is the head chef, barking orders and setting the menu. The neural networks are the line cooks, frantically passing ingredients back and forth and adjusting recipes on the fly. And the LLM is the final dish - a linguistic soufflé that somehow emerges from the chaos, rising impressively despite being made from digital word-scraps and statistical seasonings.

Sometimes it's a perfect, fluffy masterpiece.

Other times it collapses into nonsensical word soup.

But either way, the tiny chefs keep cooking, learning from each success and failure, eternally striving to serve up the perfect response.

But alas, I am getting ahead of myself….

SO…DEEP LEARNING

Deep learning is a type of artificial intelligence (AI) that teaches computers to learn from and make decisions based on large amounts of data. It's inspired by the way the human brain works, using structures called neural networks.

Here’s a simple breakdown:

  • Neural Networks: Think of neural networks as a series of interconnected nodes (like neurons in the brain) that process data. These nodes are organized in layers: an input layer, one or more hidden layers, and an output layer.

  • Learning from Data: Deep learning models are trained using vast amounts of data. For example, to teach a model to recognize images of cats, you would feed it thousands of cat images. The model learns to identify patterns and features (like whiskers, ears, etc.) that distinguish cats from other objects.

  • Layers of Abstraction: Each layer in the neural network extracts different levels of features from the data. The first layer might detect simple features like edges, the next layer might detect shapes, and deeper layers might recognize complex concepts like faces or objects.

  • Training Process: The model makes predictions and compares them to the actual outcomes. It then adjusts its internal parameters (weights) to improve accuracy. This process is repeated many times, gradually improving the model's performance.
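To make that training loop concrete, here is a deliberately tiny sketch in Python. The "model" is a single weight, and we teach it that the right answer is y = 2 × x. This is an illustration I've constructed, not how any real tool is implemented; actual deep learning models juggle millions of weights arranged in layers, but the predict-compare-adjust loop is the same idea.

```python
# Toy version of the training process described above.
# The entire "model" is one weight, w. The data says the truth is y = 2 * x.

# Training data: inputs paired with the actual outcomes we want to match.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0              # the model's internal parameter, starting out clueless
learning_rate = 0.05

for epoch in range(200):               # repeat the process many times
    for x, target in data:
        prediction = w * x             # 1. the model makes a prediction
        error = prediction - target    # 2. compare it to the actual outcome
        w -= learning_rate * error * x # 3. nudge the weight to reduce the error

print(round(w, 3))  # w has crept very close to 2.0
```

Each pass, the weight gets nudged in whatever direction shrinks the error, and after enough repetitions it settles near the right answer. Scale that loop up by a few hundred million weights and you have, roughly, how these models learn.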

It’s not a miracle, nor is it super efficient.

Try to imagine it like this: You teach a dog to fetch by showing it millions of videos of other dogs fetching.

At first, it might just stare at the screen, but eventually, it starts to understand that fetching involves running, grabbing the stick, and bringing it back.

Soon enough, your dog is fetching like a pro, even if it has no idea why it's doing it—just like a neural network learning to recognize patterns without really “knowing” what those patterns mean!

Tiffany Perkins-Munn—Head of Marketing Data & Analytics with JP Morgan—has a good short YouTube video discussing it.

Tomorrow we will dive a bit deeper into Neural Networks—all those individual eager and clueless puppies.

For more, visit NorthLightAI.com.