Machine Learning – Supervised VS Unsupervised Learning

November 29, 2019 By Stanley Isaacs


Hello! In this video, we’ll provide some basics on supervised and unsupervised learning.

An easy way to begin grasping the concept of supervised learning is by looking directly at the words that make it up. Supervise means to observe and direct the execution of a task, project, or activity. Obviously, we aren’t going to be supervising a person. Instead, we’ll be supervising a machine learning model that might be able to produce classification regions like we see here.

So how do we supervise a machine learning model? We do this by ‘teaching’ the model. That is, we load the model with knowledge so that we can have it predict future instances. But this leads to the next question, which is, how exactly do we teach a model? We teach the model by training it with some
data from a labeled dataset. It’s important to note that the data is labeled.

And what does a labeled dataset look like? Well, it can look something like this. This example is taken from the Iris dataset, which is a famous dataset used for machine learning.

Let’s start by classifying some components of this table. The names up here (Sepal length, Sepal width, Petal length, Petal width, and Species) are called the attributes.

a) The columns are called features, and they hold the data.
b) If you look at a single data point on a plot, it’ll have all of these attributes. That makes a row in this table, or an observation.
c) Looking directly at the values of the data, you can have two kinds.
d) The first is numerical. In machine learning, the most commonly used data is numeric.
e) The second is categorical; that is, it’s non-numeric, because it contains characters rather than numbers. In this case, the Species attribute is categorical, because this dataset is made for classification.
f) Usually, a dataset like this will be put into a .csv file, or comma-separated values file. This file separates observations by new lines and attributes by commas (hence comma-separated).

There are two types of supervised learning,
classification and regression.

Since we know the meaning of supervised learning, what do you think unsupervised learning means? Unsupervised learning is exactly as it sounds. We do not supervise the model; instead, we let the model work on its own to discover information that may not be visible to the human eye. Unsupervised learning uses machine learning algorithms that draw conclusions from unlabeled data.

Unsupervised learning algorithms are generally more difficult to work with than supervised ones, since we know little or nothing about the data or the outcomes to expect. With unsupervised learning, we’re looking to find groups or clusters, estimate density, and reduce dimensionality. In supervised learning, however, we know what kind of data we’re dealing with, since it is labeled.

In comparison to supervised learning, unsupervised learning has fewer tests and fewer models that can be used to check that the outcome of the model is accurate. As such, unsupervised learning creates a less controllable environment, as the machine is creating outcomes for us.

Now, let’s investigate a machine learning
algorithm. Here we can see the output of an algorithm applied to examining poisonous mushrooms. As you can see, it tells us whether a mushroom is poisonous or edible, depending on its features. So, without looking at the data itself, do you think this is a supervised or unsupervised machine learning problem?

The answer is supervised machine learning, as this is an example of classification. That is, it classifies each mushroom under one of two labels: poisonous or edible. Specifically, it does so using a classification tree algorithm.

The biggest difference between supervised
and unsupervised learning is that supervised learning deals with labeled data, while unsupervised learning deals with unlabeled data.

In supervised learning, we have machine learning algorithms for classification and regression. Classification is the sorting of data into categories, learned from labeled examples. Regression is the prediction of trends in labeled data to determine future outcomes. While it’s possible to classify data using regression, covering that now is out of scope for this course.

In unsupervised learning, we have clustering. Clustering is the analysis of patterns and groupings of unlabeled data.

Thanks for watching!
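
As a rough companion to the transcript, the labeled-versus-unlabeled distinction can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the four CSV rows are a tiny sample in the style of the Iris dataset, the column names and the function names (`load_rows`, `predict_nearest`, `two_means`) are made up for this sketch, and the supervised step uses a simple 1-nearest-neighbour rule rather than the classification tree mentioned in the video, purely to keep the example short.

```python
import csv
import io
import math

# Assumed sample: a few labeled rows in the style of the Iris dataset.
# Observations are separated by new lines, attributes by commas.
CSV_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
"""


def load_rows(text):
    """Parse CSV text into (features, label) pairs, one per observation."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        label = record.pop("species")                    # categorical attribute
        features = [float(v) for v in record.values()]   # numerical attributes
        rows.append((features, label))
    return rows


def predict_nearest(train_rows, query):
    """Supervised: reuse the label of the closest labeled observation."""
    nearest = min(train_rows, key=lambda row: math.dist(row[0], query))
    return nearest[1]


def assign(point, centers):
    """Return the index (0 or 1) of the center closest to the point."""
    return 0 if math.dist(point, centers[0]) <= math.dist(point, centers[1]) else 1


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled points into two clusters (basic k-means)."""
    centers = [points[0], points[-1]]  # naive initialisation, fine for a sketch
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            groups[assign(p, centers)].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return [assign(p, centers) for p in points]


rows = load_rows(CSV_TEXT)

# Supervised: labels are available, so we can predict one for a new measurement.
predicted = predict_nearest(rows, [5.0, 3.4, 1.5, 0.2])

# Unsupervised: drop the labels and let the algorithm find groups on its own.
unlabeled = [features for features, _ in rows]
cluster_ids = two_means(unlabeled)

print(predicted)
print(cluster_ids)
```

With this sample, the script prints `setosa` and then `[0, 0, 1, 1]`. The supervised step returns a species name because it was taught with labels, while the clustering step only returns anonymous group numbers, which is the less controllable outcome the transcript describes.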