Support Vector Machines (SVM) - Classification
Support Vector Machines (SVMs) are supervised learning algorithms used for classification, regression, and outlier detection. They are among the most powerful models in classical machine learning and are well suited to complex, high-dimensional datasets.
Because SVMs support different kernels (linear, polynomial, Radial Basis Function (RBF), and sigmoid), they can handle different kinds of datasets, both linearly and non-linearly separable.
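To make the four kernels more concrete, here is a minimal pure-Python sketch of the standard formulas behind them. The function names and the default values for `gamma`, `coef0`, and `degree` are illustrative assumptions for this example, not the defaults of any particular library.

```python
import math

def linear_kernel(x, z):
    # Plain dot product <x, z>.
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    # (gamma * <x, z> + coef0) ** degree
    return (gamma * linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    # exp(-gamma * ||x - z||^2): close to 1 for nearby points, close to 0 for distant ones.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    # tanh(gamma * <x, z> + coef0)
    return math.tanh(gamma * linear_kernel(x, z) + coef0)

# Two hand-picked 2-D points, used only for illustration.
x = [1.0, 2.0]
z = [2.0, 0.5]

for name, kernel in [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("rbf", rbf_kernel),
    ("sigmoid", sigmoid_kernel),
]:
    print(f"{name:>10}: {kernel(x, z):.4f}")
```

Each kernel is a different way of scoring how similar two samples are. The non-linear kernels let an SVM separate classes that a straight line cannot.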
While the math behind SVMs is beyond the scope of this notebook, here is the core idea:
The way an SVM works can be compared to a street separating two classes. During training, the SVM looks for the widest possible street (the margin) between the classes, with the decision boundary running down the middle of it. The training points that lie on the edges of the street, or inside it, are called support vectors, hence the name. Only these points determine where the boundary ends up.
Image source: Wikimedia
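The street analogy can be sketched with a few lines of pure Python. The weights `w`, the bias `b`, and the four points below are hypothetical values picked by hand for illustration; they are not learned by an actual SVM solver.

```python
import math

# Hypothetical decision boundary w . x + b = 0 in 2-D (chosen by hand, not trained).
w = [1.0, 1.0]
b = -3.0

# Toy training set: (point, label), with labels -1 and +1.
points = [
    ([0.0, 0.0], -1),
    ([1.0, 1.0], -1),
    ([2.0, 2.0], +1),
    ([3.0, 3.0], +1),
]

def decision_function(x):
    # Signed score: negative on one side of the boundary, positive on the other.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Width of the "street" between the two margin edges: 2 / ||w||.
margin_width = 2 / math.sqrt(sum(wi ** 2 for wi in w))

# Support vectors sit on the margin edges (or inside the street): y * f(x) <= 1.
support_vectors = [x for x, y in points if y * decision_function(x) <= 1 + 1e-9]

print("margin width:", round(margin_width, 4))
print("support vectors:", support_vectors)
```

In this toy setup only the two middle points touch the edges of the street. Moving the two outer points further away would not change the boundary, which is why the support vectors alone define it.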
We are going to use the Iris flower dataset. The dataset contains 3 species: Iris Setosa, Iris Virginica, and Iris Versicolor.
These species are the categories (classes) we want to predict. The features are sepal length, sepal width, petal length, and petal width. All features were measured in centimeters (cm).
There are 50 samples for each species, so we have 150 samples in total. Below are pictures of the 3 species.
from IPython.display import Image

url_setosa = 'http://upload.wikimedia.org/wikipedia/commons/5/56/Kosaciec_szczecinkowaty_Iris_setosa.jpg'
url_virginica = 'http://upload.wikimedia.org/wikipedia/commons/9/9f/Iris_virginica.jpg'
url_versicolor = 'http://upload.wikimedia.org/wikipedia/commons/4/41/Iris_versicolor_3.jpg'

# Order: setosa, virginica, versicolor
urls = [url_setosa, url_virginica, url_versicolor]

def display_image(url):
    # Build an inline notebook image from a single URL
    image = Image(url, width=250, height=200)
    return image
# Displaying Iris Setosa (first URL in the list)
display_image(urls[0])
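The dataset facts stated earlier (150 samples, 4 features, 50 samples per species) can be checked directly. The sketch below assumes scikit-learn is installed and uses its bundled copy of Iris via `load_iris`; the variable names are only illustrative.

```python
from collections import Counter

from sklearn.datasets import load_iris

# Load the copy of the Iris dataset that ships with scikit-learn.
iris = load_iris()
X, y = iris.data, iris.target

# Count how many samples belong to each species name.
species_counts = Counter(str(iris.target_names[label]) for label in y)

print("samples, features:", X.shape)
print("feature names:", iris.feature_names)
print("samples per species:", dict(species_counts))
```

Here `X` holds the four measurements per flower and `y` holds the integer class label for each sample.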