
BasketballCleaningHub version 1.2

Workflow annotations:

Read in the initial data.csv.

Recalculate the "minutes" and "seconds" columns into a single column, counted in seconds. Change the matchup column to clarify whether it is a home or away game.

Remove redundant or useless columns and reorder the rest more intuitively.

Filter for only the rows that have successful or unsuccessful shots, without unknowns.

Filter for only the rows that have unknown shots. One CSV holds only the shot ID and the column for evaluation; the other holds the full data for the unknowns.

Afterthought: This converts everything that's not a number to a number. Another node can be attached to convert back.

Data cleaning: visualize what we have (statistics), delete rows with missing values, delete rows with outliers in the selected features, normalize the selected features, visualize a linear correlation matrix, and partition the data 80-20 for training and testing.

Random forest model: convert number to string so that the shot_made_flag feature can be used as the target, apply the Random Forest Learner and Predictor to the features (numerical only), and score the results.

Node labels: Node 1 to Node 21, Node 23, Node 24.

Node types: CSV Reader, Math Formula, Cell Splitter By Position, Rule Engine, Column Filter, Column Resorter, Row Filter, Row Filter, CSV Writer, CSV Writer, Column Filter, CSV Writer, Category To Number, Statistics, Missing Value, Numeric Outliers, Normalizer, Partitioning, Linear Correlation, Random Forest Learner, Random Forest Predictor, Number To String, Scorer.
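For readers without KNIME at hand, the time recalculation, home/away rule, and known/unknown split can be sketched in plain Python. This is an illustrative sketch, not the workflow's actual node configuration. The column names "minutes", "seconds", "matchup" and "shot_id", the "@" (away) versus "vs" (home) convention, the empty string marking an unknown shot, and the two sample rows are all assumptions made for the example.

```python
def seconds_total(minutes, seconds):
    """Combine minutes and seconds into one value counted in seconds."""
    return int(minutes) * 60 + int(seconds)


def home_or_away(matchup):
    """Classify a matchup string; the '@' / 'vs' convention is an assumption."""
    if "@" in matchup:
        return "away"
    if "vs" in matchup:
        return "home"
    return "unknown"


def transform_row(row):
    """Return a copy of the row with the combined time and home/away label."""
    out = dict(row)
    out["seconds_total"] = seconds_total(out.pop("minutes"), out.pop("seconds"))
    out["matchup"] = home_or_away(out["matchup"])
    return out


# Made-up sample rows; an empty shot_made_flag stands for an unknown shot.
rows = [
    {"shot_id": "1", "minutes": "10", "seconds": "27",
     "matchup": "LAL @ POR", "shot_made_flag": ""},
    {"shot_id": "2", "minutes": "7", "seconds": "45",
     "matchup": "LAL vs. UTA", "shot_made_flag": "1"},
]

cleaned = [transform_row(r) for r in rows]

# Split into rows with a known outcome and rows with an unknown outcome.
known = [r for r in cleaned if r["shot_made_flag"] != ""]
unknown = [r for r in cleaned if r["shot_made_flag"] == ""]
```

The two lists correspond to the two Row Filter branches: `known` feeds the modelling steps, while `unknown` is what would be written out for evaluation.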
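Three of the data cleaning steps (dropping rows with missing values, normalizing a feature, and the 80-20 partition) can be sketched in plain Python as follows. The annotations do not say which normalization method or sampling strategy the workflow uses, so min-max scaling and a seeded random shuffle are assumptions here, as are the "shot_distance" column name and the sample values. Outlier removal is left out of this sketch.

```python
import random


def drop_missing(rows, columns):
    """Keep only rows where every listed column has a value."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]


def min_max_normalize(rows, column):
    """Rescale one numeric column to the range [0, 1] (assumed method)."""
    if not rows:
        return []
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread, so map it to 0.0 instead of dividing by zero.
    return [{**r, column: (r[column] - lo) / span if span else 0.0} for r in rows]


def partition(rows, train_fraction=0.8, seed=0):
    """Shuffle with a fixed seed, then split into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up sample: ten complete rows plus one row with a missing value.
rows = [{"shot_id": i, "shot_distance": float(i)} for i in range(10)]
rows.append({"shot_id": 10, "shot_distance": None})

complete = drop_missing(rows, ["shot_distance"])
normalized = min_max_normalize(complete, "shot_distance")
train, test = partition(normalized)
```

Fixing the seed keeps the split reproducible between runs, which makes scores from repeated experiments comparable.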
