
03. X-Validation

Iris Dataset

The Iris flower data set, or Fisher's Iris data set, is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis.

The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other.

Workflow annotations

Error in evaluation. ERROR: The classifier is being tested on the same dataset used for training.

Hold Out: training set, test set.

X-Validation: k-Fold / Leave-One-Out (LOO): test set, training set.

Apply Normalization (Correct!!!): training set, test set, normalized training set, normalized test set.

Error in Apply Normalization: training set, test set. ERROR: In real-world observation (test set), we don't have information to normalize test data.
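The Hold Out and X-Validation annotations can be illustrated outside the workflow with a small index-splitting sketch. It is not how the workflow's nodes are implemented; the function names `hold_out_indices` and `k_fold_indices`, the 30% test fraction, and the fixed seed are assumptions chosen for illustration. Leave-One-Out is treated as the special case where k equals the number of samples.

```python
import random


def hold_out_indices(n_samples, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) for a single shuffled hold-out split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once so that folds are not ordered by class.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


n = 150  # the Iris data set has 50 samples for each of 3 species
train_idx, test_idx = hold_out_indices(n)
splits = list(k_fold_indices(n, k=10))
loo_splits = list(k_fold_indices(n, k=n))  # Leave-One-Out: one sample per test fold
```

In every split the test indices are disjoint from the training indices, which is what the "Error in evaluation" annotation warns is missing when a classifier is scored on its own training data.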
Node labels in the workflow: Load Iris, train a classification model, evaluate a classification model, evaluate metrics.

Node types in the workflow: ARFF Reader, Shuffle, Partitioning, X-Partitioner, X-Aggregator, Normalizer, Normalizer (Apply), Statistics, Training, Evaluate, Scorer.
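The "Apply Normalization" annotations describe learning the normalization from the training set and then applying the same parameters to the test set. A minimal sketch of that idea follows, assuming min-max scaling; the helper names `fit_min_max` and `apply_min_max` and the toy values are hypothetical and are not taken from the workflow or from the Iris data.

```python
def fit_min_max(rows):
    """Learn per-column (min, max) from the training rows only."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, params):
    """Rescale rows with previously learned parameters; constant columns map to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, params)
        ])
    return scaled


# Hypothetical toy measurements with two features (not real Iris rows).
train = [[5.0, 1.0], [6.0, 2.0], [7.0, 3.0]]
test = [[8.0, 2.0]]

params = fit_min_max(train)                 # parameters come from the training set
train_scaled = apply_min_max(train, params)
test_scaled = apply_min_max(test, params)   # the test set reuses them unchanged

print(test_scaled)  # [[1.5, 0.5]]
```

The first test feature scales to 1.5, outside the range [0, 1], because the test value exceeds the training maximum. That is expected: the test set stands in for unseen observations, so its own minimum and maximum are never used.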
