TEAM_08

I load the IMDb dataset
CSV Reader
I create a sub-dataset containing only Horror movies
Rule-based Row Filter
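
For readers outside KNIME, the Horror filter can be sketched in plain Python. This is only an illustration: the `genres` column name, the `|` separator, and the sample rows are assumptions, since the workflow does not show the actual rule.

```python
def is_horror(row, column="genres", sep="|"):
    """True if 'Horror' is one of the genres listed in the row.

    Assumes (hypothetically) a `genres` column holding values such as
    "Horror|Thriller"; missing values are treated as "not Horror".
    """
    return "Horror" in (row.get(column) or "").split(sep)


# Hypothetical sample rows, for illustration only.
movies = [
    {"movie_title": "A", "genres": "Horror|Thriller"},
    {"movie_title": "B", "genres": "Comedy"},
    {"movie_title": "C", "genres": None},
]

horror_movies = [m for m in movies if is_horror(m)]
```

Splitting on the separator, rather than using a substring match, avoids matching a genre name that merely contains the word.
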
residual = imdb_score - pred_imdb_score
Math Formula
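
The residual formula above can be sketched outside KNIME as follows. The column names come from the annotations; the sample values are made up for illustration.

```python
def add_residual(rows):
    """Return copies of the rows with residual = imdb_score - pred_imdb_score."""
    return [
        {**r, "residual": r["imdb_score"] - r["pred_imdb_score"]}
        for r in rows
    ]


# Hypothetical values, not taken from the dataset.
rows = [
    {"imdb_score": 6.5, "pred_imdb_score": 6.0},
    {"imdb_score": 4.0, "pred_imdb_score": 5.5},
]

with_residuals = add_residual(rows)
```

With this sign convention, a positive residual means the model under-predicted the rating and a negative one means it over-predicted.
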
Distribution of IMDb user ratings for Horror movies
Histogram
Handling missing values
Missing Value
Summary statistics of IMDb user ratings grouped by content rating for horror movies
GroupBy
Relationship between movie duration and IMDb user rating for horror movies
Scatter Plot (JavaScript) (legacy)
residual vs predicted
Scatter Plot
residual distribution
Histogram
split data: train 70% / test 30%, random seed fixed
Table Partitioner
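
A minimal sketch of a reproducible 70/30 split, assuming rows are drawn at random. The seed value `42` is a placeholder: the annotation says the seed is fixed but does not state its value.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows with a fixed seed, then cut at train_fraction.

    The default seed is a hypothetical value chosen for this example.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = partition(list(range(10)))
```

Fixing the seed means the same rows land in the training and test sets on every run, so both models are scored on the same test set.
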
model 1: linear regression scores
Numeric Scorer
model 2: predict on test set
Random Forest Predictor (Regression)
model 2: random forest scores
Numeric Scorer
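
Both scoring steps compare predicted and actual `imdb_score` values on the test set. The sketch below computes common regression metrics of the kind such a scorer reports (R², MAE, MSE, RMSE); it is an illustration under that assumption, not a reproduction of the node's exact output, and the sample numbers are made up.

```python
import math


def regression_scores(actual, predicted):
    """Compute R^2, MAE, MSE and RMSE for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mse = ss_res / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical actual and predicted ratings.
scores = regression_scores([5.0, 6.0, 7.0], [5.0, 6.5, 6.5])
```

Running the same function on the predictions of model 1 and model 2 gives directly comparable numbers, because both models are evaluated against the same test rows.
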
model 2: train random forest regression (target: imdb_score)
Random Forest Learner (Regression)
Comparing IMDb score with log-transformed budget
Scatter Plot (JavaScript) (legacy)
Relationship between movie budget and IMDb user rating for horror movies
Scatter Plot (JavaScript) (legacy)
model 1: train linear regression (target: imdb_score)
Linear Regression Learner
log budget
Math Formula
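
A sketch of the log transform, assuming the natural logarithm. The workflow does not state the base or how missing and non-positive budgets are handled, so both choices here are assumptions.

```python
import math


def log_budget(budget):
    """Natural log of a positive budget.

    Returns None for missing or non-positive values (an assumption made
    for this example), since the logarithm is undefined there.
    """
    if budget is None or budget <= 0:
        return None
    return math.log(budget)
```

Budgets span several orders of magnitude, so the log scale spreads out low-budget films that would otherwise be crowded together at the left edge of a scatter plot.
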
model 1: predict on test set
Regression Predictor
imdb_score vs log_budget
Linear Correlation
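
For two numeric columns such as `imdb_score` and `log_budget`, linear correlation is usually measured with Pearson's coefficient. A minimal sketch, assuming equal-length inputs with no missing values and non-constant columns:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.
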
Missing Value
pred_imdb_score vs imdb_score
Scatter Plot
I rename the prediction column to pred_imdb_score
Column Renamer
I select relevant columns for residual analysis
Column Filter
I remove rows with missing predictions
Row Filter