Complete Introduction to Machine Learning
Introduction
In this article, we will look at what machine learning is, starting from the very basics. We will see how ML came into existence and cover the various terms related to machine learning.
What is learning?
· The acquisition of knowledge or skills through study, experience, or being taught. (Oxford)
· Learning is a key process in human behavior. (psychologydiscussion.net)
· Learning is the process of acquiring new, or modifying existing, knowledge, behaviors, skills, values, or preferences. (Wikipedia)
· Learning has essentially been a human attribute. We perform tasks better when we learn!
· Humans have made machines that can solve problems using human-designed algorithms.
· Now, we are creating algorithms that learn: essentially, algorithms that solve problems better as they gain experience!
Origin of Machine Learning
1943: Neurophysiologist Warren McCulloch and mathematician Walter Pitts created a model of neurons using an electrical circuit, and the first neural network was born.
1950: Alan Turing created the Turing test. For a computer to pass, it has to convince a human that it is a human and not a computer.
1952: Arthur Samuel created the first computer program that could learn as it ran: a program that played the game of checkers.
Early definition of Machine Learning
Tom Mitchell, a professor at Carnegie Mellon University, defined machine learning in his book titled ‘Machine Learning’ as:
A computer program is said to learn from experience E
with respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with experience E.
Examples:
1. To better filter emails as spam or not.
Task – Classifying emails as spam or not.
Performance Measure – The fraction of emails
accurately classified as spam or not spam.
Experience – Observing you label emails as spam or not
spam.
2. A checkers learning problem
Task – Playing checkers game.
Performance Measure – percent of games won against opponents.
Experience – playing practice games against itself.
3. Handwriting Recognition Problem
Task – Recognizing and classifying handwritten words within images.
Performance Measure – percent of words accurately
classified.
Experience – a database of handwritten words with given classifications.
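As a quick illustration of the E/T/P framing, here is a minimal Python sketch for the spam example that computes the performance measure P, i.e. the fraction of emails classified correctly. The labels are invented purely for illustration:

```python
# A minimal sketch of Mitchell's T/P/E framing for the spam-filter example.
# The labels below are invented purely for illustration.

true_labels      = ["spam", "ham", "spam", "ham", "ham"]  # experience E: labels observed from the user
predicted_labels = ["spam", "ham", "ham",  "ham", "ham"]  # task T: the filter's classifications

# Performance measure P: fraction of emails accurately classified.
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
print(f"P = {correct / len(true_labels):.2f}")  # 4 of 5 correct -> P = 0.80
```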
Introduction to Machine Learning
An ML algorithm tries to frame data in the context of a hypothetical function (f).
• That is, given some input variables (input), what is the predicted output variable (output)?
• We represent it as:
Output = f(Input)
OR
OutputVariable = f(InputVariables)
OR
Y = f(X)
Example:
Precipitation = f(WindSpeed, CloudCover%, Temperature)
Car_price = f(make, model, engine, color, ...)
Weight = f(Height), or maybe Weight = f(Height, Age, Gender)
Visibility = f(Distance, FogDensity)
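To make this concrete, here is a minimal sketch of a machine learning the Weight = f(Height) relationship from data. It assumes scikit-learn is installed, and the heights and weights are invented illustrative numbers:

```python
# A minimal sketch of learning Weight = f(Height) from data.
# Assumes scikit-learn; heights/weights are invented illustrative values.
from sklearn.linear_model import LinearRegression

heights = [[150], [160], [170], [180], [190]]  # cm; one row per example
weights = [50, 58, 66, 74, 82]                 # kg; the output variable

f = LinearRegression().fit(heights, weights)   # the algorithm estimates f
print(f.predict([[175]]))                      # predicted weight for 175 cm
```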
Modelling uses machine learning algorithms, in which the machine learns from data just like humans learn from experience.
Machine learning models can be classified into the following three types, based on the task performed and the nature of the output:
Regression: The output variable to be predicted is a continuous variable. For example, the score of a student in a subject.
Classification: The output variable to be predicted is a categorical variable. For example, classifying incoming emails as spam or ham.
Clustering: No predefined notion of a label is allocated to the groups (clusters) formed. For example, customer segmentation.
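The regression/classification distinction is easiest to see side by side. Here is a minimal sketch, assuming scikit-learn, where the same input predicts a continuous score in one case and a category in the other; the toy data is invented:

```python
# A minimal sketch of regression vs classification on the same input.
# Assumes scikit-learn; the toy data is invented for illustration.
from sklearn.linear_model import LinearRegression, LogisticRegression

hours_studied = [[1], [2], [3], [4], [5]]

# Regression: the output (a score) is a continuous variable.
scores = [35, 48, 60, 71, 85]
print(LinearRegression().fit(hours_studied, scores).predict([[6]]))

# Classification: the output is a categorical variable (like spam vs ham).
labels = ["fail", "fail", "pass", "pass", "pass"]
print(LogisticRegression().fit(hours_studied, labels).predict([[6]]))
```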
We can also classify machine learning models into two broad categories:
1. Supervised learning: Past labeled data is used for building the model. Regression and classification algorithms fall under this category.
2. Unsupervised learning: No predefined labels are assigned to the past data. Clustering and association algorithms fall under this category.
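Clustering illustrates the unsupervised case well: no labels are given, and the algorithm discovers the groups by itself. Here is a minimal customer-segmentation sketch, assuming scikit-learn, with invented customer data:

```python
# A minimal sketch of unsupervised learning: customer segmentation
# with k-means. Assumes scikit-learn; the customer data is invented.
from sklearn.cluster import KMeans

# Each row: [annual income, spending score] for one customer. No labels!
customers = [[15, 20], [16, 22], [14, 18], [80, 85], [82, 90], [78, 88]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # cluster assigned to each customer, e.g. [0 0 0 1 1 1]
```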
There are two types of ML algorithms:
1) Parametric algorithms
Algorithms that simplify the function to a known form
are called parametric machine learning algorithms.
A learning model that summarizes data with a set of
parameters of fixed size (independent of the number of training examples) is
called a parametric model.
E.g., the line used in linear regression is represented by the form:
B0 + B1*X1 + B2*X2 = 0
where B0, B1 and B2 are the coefficients of the line that control the intercept and slope, and X1 and X2 are two input variables.
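The key property is that the number of parameters is fixed in advance. A minimal sketch, assuming scikit-learn and NumPy with synthetic data, shows that linear regression learns the same three parameters (B0, B1, B2) whether trained on 10 examples or 10,000:

```python
# A minimal sketch of the fixed-size parameter set of a parametric model.
# Assumes scikit-learn and NumPy; the training data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

for n in (10, 10_000):                        # small vs large training set
    rng = np.random.default_rng(0)
    X = rng.normal(size=(n, 2))               # two input variables X1, X2
    y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1]   # a known linear relationship
    model = LinearRegression().fit(X, y)
    # Always exactly 3 parameters: B0 (intercept) plus B1 and B2.
    print(n, model.intercept_, model.coef_)
```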
Examples of parametric algorithms:
· Logistic Regression – a classification algorithm
· Linear Discriminant Analysis – a dimensionality reduction technique
· Perceptron – an algorithm for supervised learning of binary classifiers
Advantages:
· Simpler: These methods are easier to understand, and their results are easier to interpret.
· Speed: Parametric models are very fast to learn from data.
· Less Data: They do not require as much training data and can work well even if the fit to the data is not perfect.
Limitations:
· Constrained: By choosing a functional form, these methods are highly constrained to that specified form.
· Limited Complexity: The methods are more suited to simpler problems.
· Poor Fit: In practice, the methods are unlikely to match the true underlying mapping function.
2) Non-parametric algorithms
Algorithms that do not make strong assumptions about
the form of the mapping function are called nonparametric machine learning
algorithms.
By not making assumptions, they are free to learn any
functional form from the training data.
Example: the k-nearest neighbors algorithm, which makes predictions based on the k most similar training patterns for a new data instance. The method does not assume anything about the form of the mapping function, other than that patterns which are close are likely to have a similar output variable.
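A minimal k-NN sketch, assuming scikit-learn and using invented 2-D points, shows this: nothing about the form of f is assumed, and predictions come straight from the closest training points:

```python
# A minimal sketch of a non-parametric method: k-nearest neighbors.
# Assumes scikit-learn; the 2-D points and labels are invented.
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]  # training patterns
y = ["a", "a", "a", "b", "b", "b"]                     # their labels

# No functional form is assumed: each prediction is taken from the
# k = 3 most similar (closest) training patterns.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2, 2], [9, 9]]))  # -> ['a' 'b']
```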
Examples of non-parametric algorithms:
· Decision Trees like CART and C4.5
· Naive Bayes
· Support Vector Machines
· Neural Networks
Advantages:
· Flexibility: Capable of fitting a large number of functional forms.
· Power: No assumptions (or weak assumptions) about the underlying function.
· Performance: Can result in higher-performance models for prediction.
Limitations:
· More Data: They require a lot more training data to estimate the mapping function.
· Slower: They are a lot slower to train, as they often have far more parameters to train.
· Overfitting: There is a greater risk of overfitting the training data, and it is harder to explain why specific predictions are made.
- Jay Charole
- Mar, 11 2022