BasketballCleaningHub version 1.4

Recalculate the "minutes" and "seconds" columns into a single column, counted in seconds. Change the matchup column to make clear whether it is a home or away game.

Remove redundant/useless columns and reorder the rest more intuitively.

Keep only rows that have successful or unsuccessful shots, without unknowns.

Keep only rows that have unknown shots. One CSV contains only the shot ID and the column for eval; the other contains the full data for the unknowns.

Read in the initial data.csv.

Afterthought: this converts everything that's not a number to a number. Another node can be attached to convert back.

Data cleaning: visualize what we have (statistics), delete rows with missing values, delete rows with outliers in the selected features, normalize the selected features, visualize a linear correlation matrix, and partition the data 80-20 for training and testing.

Random forest model: convert number to string so that the shot_made_flag feature can be used as the target, apply the random forest learner and predictor to the features (numerical only), and score the results.

Magic happens. A bunch of models. Some require doubles, so I recast some variables that way.

We should consider dropping some columns based upon the correlation matrix above.
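The first few annotations (combining minutes and seconds, marking home or away games, and separating known from unknown shots) can be sketched outside KNIME in plain Python. This is only an illustration under stated assumptions: the column names `minutes_remaining`, `seconds_remaining`, `matchup`, `shot_made_flag` and `shot_id`, the "@" convention for away games, and the sample rows are all hypothetical and are not taken from the workflow itself.

```python
import csv
import io

# Hypothetical sample data; column names and values are assumptions.
SAMPLE = """shot_id,minutes_remaining,seconds_remaining,matchup,shot_made_flag
1,10,27,LAL @ POR,
2,10,22,LAL vs. UTA,0
3,7,45,LAL @ POR,1
"""


def clean_row(row):
    """Return a cleaned copy of one raw row."""
    # Collapse the two time columns into a single count of seconds.
    total = int(row["minutes_remaining"]) * 60 + int(row["seconds_remaining"])
    # Assumption: an "@" in the matchup string marks an away game.
    venue = "away" if "@" in row["matchup"] else "home"
    return {
        "shot_id": row["shot_id"],
        "seconds_remaining_total": total,
        "venue": venue,
        "shot_made_flag": row["shot_made_flag"],
    }


def split_known_unknown(rows):
    """Separate rows with a recorded shot outcome from rows without one."""
    known = [r for r in rows if r["shot_made_flag"] != ""]
    unknown = [r for r in rows if r["shot_made_flag"] == ""]
    return known, unknown


rows = [clean_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
known, unknown = split_known_unknown(rows)
```

Here `known` would feed the modeling branch, while `unknown` corresponds to the rows written out for evaluation.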
Node labels: Node 1 to Node 21, Node 23 to Node 37, Node 41 to Node 43, Node 47 to Node 49.

Nodes: CSV Reader, Math Formula, Cell Splitter By Position, Rule Engine, Column Filter, Column Resorter, Row Filter, Row Filter, CSV Writer, CSV Writer, Column Filter, CSV Writer, Category To Number, Statistics, Missing Value, Numeric Outliers, Normalizer, Partitioning, Linear Correlation, Random Forest Learner, Random Forest Predictor, Number To String, Scorer, RProp MLP Learner, MultiLayerPerceptron Predictor, Scorer, Column Rename, Partitioning, Denormalizer, Normalizer, PNN Learner (DDA), PNN Predictor, Scorer, Gradient Boosted Trees Learner, Gradient Boosted Trees Predictor, Scorer, Fuzzy Rule Learner, Fuzzy Rule Predictor, Scorer, SVM Learner, SVM Predictor, Scorer
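The data cleaning steps named in the annotations (dropping rows with missing values, normalizing, checking linear correlation, and an 80-20 partition) could look roughly like the following in plain Python. This is a sketch under assumptions, not the workflow's actual configuration: min-max scaling to [0, 1], a seeded random shuffle for the split, and the feature names `shot_distance` and `seconds_left` are all hypothetical choices, and the outlier-removal step is left out.

```python
import math
import random


def drop_missing(rows):
    """Keep only rows where every value is present (not None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def min_max_normalize(rows, columns):
    """Rescale the given columns to [0, 1]; constant columns map to 0.0."""
    out = [dict(r) for r in rows]
    for col in columns:
        lo = min(r[col] for r in rows)
        hi = max(r[col] for r in rows)
        span = hi - lo
        for r in out:
            r[col] = (r[col] - lo) / span if span else 0.0
    return out


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else float("nan")


def partition(rows, train_fraction=0.8, seed=0):
    """Shuffle with a fixed seed, then split into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: two perfectly (negatively) correlated features.
rows = [
    {"shot_distance": float(d), "seconds_left": float(600 - 10 * d),
     "shot_made_flag": d % 2}
    for d in range(10)
]
rows.append({"shot_distance": None, "seconds_left": 30.0, "shot_made_flag": 1})

complete = drop_missing(rows)
scaled = min_max_normalize(complete, ["shot_distance", "seconds_left"])
train, test = partition(scaled)
r = pearson([x["shot_distance"] for x in scaled],
            [x["seconds_left"] for x in scaled])
```

A pair of features with correlation this close to -1 or 1 carries nearly the same information, which is the kind of case the last annotation has in mind when it suggests dropping columns.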
