
Deep learning weekly piece: the differences between AI, ML, and DL

1. Alan Turing -> Artificial Intelligence (AI)

Alan Turing was a mathematician, logician, philosopher, Cambridge fellow (at age 22), ultra-long-distance runner, and the cryptographer who helped break the Enigma cipher in WW2. He also laid the foundations of the modern computer and of artificial intelligence.

His work permeated into wider public knowledge in the 1950s. This gave birth to the idea of “General AI”: could computers possess the same characteristics as human intelligence, including reasoning, interacting, and thinking like we do? The answer was a resounding “no” (at least not yet).

Therefore, we had to focus on “Narrow AI” — technologies that can accomplish specific tasks such as playing chess, recommending your next Netflix TV show, and identifying spam emails. All of these exhibit parts of human intelligence. But how do they work? That’s machine learning.

2. Artificial Intelligence -> Machine Learning (ML)

At a high level, ML generally means algorithms or models that work in three steps:

  1. data: get a lot of (cleaned) data, with human-defined features (e.g. “age”, “height”, “FICO score”, “is this email spam?” etc.)
  2. training: use the data to “tune” the relative importance of each feature
  3. inference: predict something on new data
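The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming a made-up toy dataset with two hand-picked features and a simple logistic regression trained by gradient descent; the feature choice, the data, and the names `train` and `predict` are all hypothetical and not taken from any real system.

```python
import math

# 1. data: labeled examples with human-defined features (toy values, made up).
# Features: [mentions "Western Union" (0 or 1), number of links in the email]
# Label: 1 = spam, 0 = not spam.
X = [[1, 3], [1, 2], [0, 5], [1, 4], [0, 0], [0, 1], [0, 0], [0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the estimated probability that an example is spam."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def train(X, y, learning_rate=0.3, epochs=2000):
    """2. training: tune one weight per feature with batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict(weights, bias, features) - label
            for i, f in enumerate(features):
                grad_w[i] += error * f
            grad_b += error
        # Move each weight a small step against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = train(X, y)

# 3. inference: score an example the model has not seen before.
new_email = [1, 6]
print(f"P(spam) = {predict(weights, bias, new_email):.2f}")
```

The learned weights are the "relative importance" of each feature: a larger positive weight means that feature pushes the prediction more strongly toward spam.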

An example of this is predicting spam emails: Gmail collects massive amounts of data about what is spam and what isn’t (this is called “labeled data”). The algorithm then identifies common features of spam messages vs. non-spam messages. Finally, it runs on unlabeled data (i.e. new emails) to predict whether they’re spam or not.

When you mark messages as spam, you are training Gmail’s ML algorithms to better predict future spam messages.

ML requires a lot of human intervention, such as manually telling the spam filter what to look for in spam vs. non-spam messages (e.g. look for the words “Western Union”, look for links to suspicious websites, etc.). This feature-based approach also tends to be much less accurate on images, where useful features are hard to define by hand.
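To make the "human intervention" concrete, here is a rough sketch of what hand-written feature extraction might look like. Everything in it is an assumption for illustration: the phrase list, the link pattern, and the function name `extract_features` are hypothetical, and a real spam filter would use far more signals.

```python
import re

# Hand-picked by a human, not learned from data (hypothetical list).
SUSPICIOUS_PHRASES = ("western union",)
# Very rough pattern for links; real filters would also check the target site.
LINK_PATTERN = re.compile(r"https?://\S+")


def extract_features(email_text):
    """Turn raw email text into the features a human decided to look for."""
    text = email_text.lower()
    return {
        "mentions_suspicious_phrase": any(p in text for p in SUSPICIOUS_PHRASES),
        "link_count": len(LINK_PATTERN.findall(text)),
    }


example = "Send the fee via Western Union: http://example.test/pay"
features = extract_features(example)
print(features)
```

Every new trick that spammers come up with means someone has to add another rule like these by hand, which is the limitation the paragraph above describes.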

3. Machine Learning -> Deep Learning

DL is a subset of ML. It is based on neural networks, a conceptual model of the brain that has been around since the 1950s but was largely ignored until recently. That’s because neural networks are very computationally expensive, and it’s only recently that 1) processing has become sufficiently cheap and powerful, through GPUs and FPGAs, and 2) there’s been enough data to feed the DL algorithms.

DL is the focus of previous posts, so I won’t repeat too many details here. At a high level, the word “deep” comes from the fact that DL algorithms are trained and run on deep neural networks, which are just neural networks with (usually) three or more “hidden” layers. For example, each layer may play the role of identifying different features of a picture of an Audi A7.

Abstraction of a DL algorithm (hence trained/run on a deep neural network) to recognize an image of an Audi
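As a rough sketch of what "three hidden layers" means in code, the snippet below pushes a fake 16-pixel "image" through a small network in plain Python. The layer sizes, the random untrained weights, and the names `make_layer` and `forward` are assumptions chosen for illustration, so the output score is meaningless; a real image model would be trained on labeled pictures and would be far larger.

```python
import math
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_layer(n_in, n_out, rng):
    # Random, untrained weights, scaled down so activations stay small.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-1, 1) * scale for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
# 16 input pixels -> three hidden layers of 8 units -> 1 output ("is it an Audi?")
layer_sizes = [16, 8, 8, 8, 1]
layers = [make_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]


def forward(pixels):
    activation = pixels
    for weights, biases in layers[:-1]:      # the three hidden layers
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]             # the output layer
    (logit,) = dense(activation, weights, biases)
    return 1.0 / (1.0 + math.exp(-logit))    # squash to a score between 0 and 1


fake_image = [rng.random() for _ in range(16)]
score = forward(fake_image)
print(f"score = {score:.3f}")
```

Unlike the feature-based ML approach, nothing here tells the network which features to look for: during training, each hidden layer's weights would be adjusted so that the layers learn useful features on their own.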
