ISSN: 2265-6294

Data Analytics with the IRIS Data set – finding the correlations between the products


G.N.V Vibhav Reddy, K Naga Sai Pravallika, M Deekshitha, E Shiva Ganesh, Y Praveen


The term "correlation" refers to a mutual relationship or association between quantities. In almost any business, it is useful to express one quantity in terms of its relationship with others. Given two products X and Y, the task is to find their correlation, i.e., the degree of their linear dependence or independence. X and Y must have equal dimension, and the result is a floating-point number in the range [-1.0, 1.0]. The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. Two of the three species were collected in the Gaspé Peninsula, "all from the same pasture, and picked on the same day and measured at the same time by the same person with the same apparatus." The aim of the project is to identify the differences that distinguish the flower species and to compute the correlations between them.
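
The correlation computation described above can be illustrated with a minimal Python sketch. The abstract does not name a specific coefficient, so the sketch assumes the Pearson correlation coefficient, which measures linear dependence and lies in [-1.0, 1.0]. The function name `pearson_correlation` and the sample vectors are hypothetical and chosen only for illustration; they are not taken from the article or from the Iris data.

```python
import math


def pearson_correlation(x, y):
    """Return the Pearson correlation of two equal-length numeric sequences.

    Assumption: "correlation" in the text means the Pearson coefficient.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have equal dimension")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviations of each observation from its mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]

    # Sum of cross-products and sums of squared deviations.
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    r = sxy / math.sqrt(sxx * syy)
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    return max(-1.0, min(1.0, r))


# Illustrative (made-up) vectors: y rises with x, z falls as x rises.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
z = [10.0, 8.0, 6.0, 4.0, 2.0]

r_xy = pearson_correlation(x, y)  # perfectly linear, increasing
r_xz = pearson_correlation(x, z)  # perfectly linear, decreasing
```

For the Iris data, the same function could be applied to any pair of measurement columns (for example, two of the four flower measurements within one species), provided both columns have the same number of observations.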
