Feature Reduction (Blu) | Stage 2
The scikit-learn (sklearn) Python module includes a data set called Iris. It contains flower part measurements (sepal and petal length and width) for samples of 3 different types of irises, along with the names of those types. The data was created to categorize an iris as a certain type based on its flower part measurements. Using PCA, a feature reduction (dimensionality reduction) technique, we want to reduce the number of features (columns) needed to categorize an iris. First, reduce the data set to 2 dimensions, then draw a scatter plot in which the x and y values are the two reduced dimensions and color shows the iris type (cluster) of each sample. Second, do the same thing with only 1 dimension. Is one dimension enough to categorize an iris into a type?
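One possible way to approach both parts is sketched below. It assumes scikit-learn, NumPy, and matplotlib are installed. The function names `project_iris` and `plot_projection` are hypothetical, and standardizing the features before PCA is an assumption of this sketch rather than a requirement of the exercise.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def project_iris(n_components):
    """Project the Iris measurements onto `n_components` principal components.

    Returns the projected samples, the integer type labels, the type names,
    and the fraction of variance explained by each kept component.
    """
    iris = load_iris()
    # Assumption: scale each feature to zero mean and unit variance first,
    # because PCA is sensitive to the scale of the inputs.
    scaled = StandardScaler().fit_transform(iris.data)
    pca = PCA(n_components=n_components)
    projected = pca.fit_transform(scaled)
    return projected, iris.target, iris.target_names, pca.explained_variance_ratio_


def plot_projection(n_components):
    """Scatter plot of the 1-D or 2-D projection, colored by iris type."""
    import matplotlib.pyplot as plt  # imported here so project_iris works without it

    projected, labels, names, ratio = project_iris(n_components)
    fig, ax = plt.subplots()
    for label, name in enumerate(names):
        mask = labels == label
        x = projected[mask, 0]
        # With a single component there is no second axis, so place the
        # points on the horizontal line y = 0.
        y = projected[mask, 1] if n_components > 1 else np.zeros_like(x)
        ax.scatter(x, y, label=name, alpha=0.7)
    ax.set_xlabel("Component 1")
    ax.set_ylabel("Component 2" if n_components > 1 else "")
    ax.set_title(f"Iris reduced to {n_components} dimension(s), "
                 f"variance kept: {ratio.sum():.2f}")
    ax.legend()
    plt.show()
```

Calling `plot_projection(2)` and then `plot_projection(1)` draws the two plots the exercise asks for. To answer the final question, look at how much the colors overlap along the single axis in the 1-D plot and compare the explained variance shown in the two titles.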
Download data sets here: