Logistic regression on the Iris data set
Mon, Feb 29, 2016

The Iris data set records four features for each Iris flower:
- sepal length
- sepal width
- petal length
- petal width
Using a three-class logistic regression, the four features can be used to classify the flowers into three species (Iris setosa, Iris virginica, Iris versicolor).
In this Jupyter notebook, each combination of two features was used to classify the species. The number of mis-predicted specimens for each pair is shown below.
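As a rough illustration of the four-feature classification, here is a minimal sketch using scikit-learn. The choice of library, the default regularization and `max_iter=1000` are assumptions and not necessarily the settings used in the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target  # 150 specimens, 4 features, 3 species

# Assumption: default regularization; max_iter is raised above the default
# so the solver has room to converge on unscaled data.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Count mis-predictions on the same data the model was fitted to.
n_wrong = int((model.predict(X) != y).sum())
print(f"{n_wrong} of {len(y)} specimens mis-predicted")
```

The count is measured on the training data itself, so it describes how separable the species are and not how well the model would generalize.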
measure 1 | measure 2 | incorrect predictions |
---|---|---|
sepal length | sepal width | 29 |
sepal length | petal length | 6 |
sepal length | petal width | 8 |
sepal width | petal length | 7 |
sepal width | petal width | 7 |
petal length | petal width | 6 |
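The previous post shows that some combinations of features separate the species more easily than others. Here, every pair that includes a petal measurement gives 6 to 8 incorrect predictions, while the two sepal measurements alone give 29.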
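The pairwise comparison in the table can be sketched as a loop over all six feature pairs. This assumes scikit-learn with default regularization, so the exact counts may differ from the table, which depends on the notebook's own settings.

```python
from itertools import combinations

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target
# Column order of load_iris().data
names = ["sepal length", "sepal width", "petal length", "petal width"]

results = {}
for i, j in combinations(range(4), 2):
    pair = X[:, [i, j]]  # keep only the two selected features
    # Assumption: default regularization, not the notebook's settings
    model = LogisticRegression(max_iter=1000).fit(pair, y)
    results[(names[i], names[j])] = int((model.predict(pair) != y).sum())

for (first, second), wrong in results.items():
    print(f"{first} | {second} | {wrong}")
```

`combinations(range(4), 2)` yields the pairs in the same order as the table rows.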
Logistic regression can also be applied to the data projected onto two principal components, where it mis-predicts five specimens.
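A minimal sketch of that step, assuming the two components are the first two from scikit-learn's `PCA` fitted on the unscaled features. The notebook may scale the data or configure the classifier differently, so the error count here need not match.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

# Project the four features onto the first two principal components.
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)

# Assumption: default regularization, not the notebook's settings
model = LogisticRegression(max_iter=1000).fit(X2, y)
n_wrong = int((model.predict(X2) != y).sum())

print(f"variance explained: {pca.explained_variance_ratio_.sum():.3f}")
print(f"{n_wrong} of {len(y)} specimens mis-predicted")
```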
A mesh drawn over the plot shows the region that the logistic regression assigns to each of the three classes.
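One way to build such a mesh is to predict the class at every point of a fine grid covering the plot. The sketch below assumes the plot is the two-component PCA projection, and the grid step of 0.02 and margin of 0.5 are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X2 = PCA(n_components=2).fit_transform(iris.data)
y = iris.target
model = LogisticRegression(max_iter=1000).fit(X2, y)

# Grid covering the data with a small margin (step and margin are assumptions).
step, margin = 0.02, 0.5
xx, yy = np.meshgrid(
    np.arange(X2[:, 0].min() - margin, X2[:, 0].max() + margin, step),
    np.arange(X2[:, 1].min() - margin, X2[:, 1].max() + margin, step),
)

# Predict a class for every grid point, then restore the grid shape.
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = model.predict(grid_points).reshape(xx.shape)
```

`Z` can then be drawn with `matplotlib.pyplot.pcolormesh(xx, yy, Z)`, with the specimens added on top using `matplotlib.pyplot.scatter`.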