Regression and Classification Models

Logistic Regression

This workflow shows how to build a basic prediction/classification model using logistic regression.

URL: Logistic Regression Node: Algorithm Settings https://youtu.be/AclQdjxpGA0

Nodes

Component Input (8×)
Component Output (8×)
Column Filter (7×)
Expression (5×)
Bar Chart (4×)

Created by: kathrinmelcher

Created at: 2017-06-20

On NodePit since: 2025-12-12

Last update: 2026-07-13

Created with KNIME version: v5.8.1

Tags: classification, machine learning, prediction, analytics, KNIME, logistic regression, logit, data science

Regression Model

Purpose:

Estimate approximate lifetime box office revenue after release using early audience signals and known film attributes.

Model performance:

R² ≈ 0.76: the model explains about 76% of the variance in revenue, leaving roughly a quarter unexplained.
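As a rough illustration of what the R² figure measures, the sketch below computes the coefficient of determination in plain Python. The function name `r_squared` and the revenue values are hypothetical, chosen only for the example; they are not taken from the workflow or its data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total squared deviation of the actual values from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical revenues in millions (illustrative values only).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

score = r_squared(actual, predicted)
print(f"R^2 = {score:.3f}")
```

A value of 1.0 would mean the predictions match the actual revenues exactly; lower values mean a larger share of the variance is left unexplained.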

Actual vs. predicted revenue:

Points clustered closer to the diagonal indicate more accurate predictions; the spread around it reflects revenue variation the model does not capture.

Residual distribution:

Residuals centered near zero suggest no systematic over- or under-prediction.
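The residual check can be sketched as follows, assuming a residual is defined as actual minus predicted revenue. The helper name `residuals` and the sample numbers are hypothetical and serve only to show the calculation.

```python
from statistics import mean, stdev


def residuals(actual, predicted):
    """Residual = actual - predicted, one value per film."""
    return [a - p for a, p in zip(actual, predicted)]


# Hypothetical revenues in millions (illustrative values only).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

res = residuals(actual, predicted)
bias = mean(res)     # near zero: no systematic over- or under-prediction
spread = stdev(res)  # typical size of an individual prediction error

print(f"mean residual = {bias:.2f}, spread = {spread:.2f}")
```

A mean residual far from zero would point to consistent over- or under-prediction, while the spread indicates how large individual errors tend to be.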

Revenue by month:

Predictions capture seasonal trends, with higher revenues in peak release months.

How to use:

Provide the film's budget, runtime, release timing, popularity, vote count, and vote average as inputs.

Classification Model

Flop:

Negative ROI (movie loses money)

Moderate:

Break even to low profit (ROI between 0 and 1)

Hit:

Strong profit (ROI between 1 and 3)

Blockbuster:

Exceptional profit (ROI 3 or higher)
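The class boundaries above can be expressed as a small lookup. This sketch assumes ROI is defined as (revenue - budget) / budget and that each lower bound is inclusive (an ROI of exactly 1 counts as Hit, exactly 3 as Blockbuster); neither detail is stated in the source, and the function names are hypothetical.

```python
def roi(revenue, budget):
    """Assumed definition: net profit divided by budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return (revenue - budget) / budget


def roi_class(value):
    """Map an ROI value to one of the four classes listed above."""
    if value < 0:
        return "Flop"          # movie loses money
    if value < 1:
        return "Moderate"      # break even to low profit
    if value < 3:
        return "Hit"           # strong profit
    return "Blockbuster"       # exceptional profit


# Hypothetical (revenue, budget) pairs in millions.
examples = [(50, 100), (100, 100), (250, 100), (400, 100)]
labels = [roi_class(roi(revenue, budget)) for revenue, budget in examples]
print(labels)
```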

Recall

Of all movies that truly fall into this class, the fraction the model correctly identified.

(High recall = fewer missed flops or missed blockbusters)

Precision

Of all movies the model labeled as this class, the fraction that truly belong there.

(High precision = fewer false alarms)
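To make the two definitions concrete, here is a minimal per-class sketch in plain Python. The function name `precision_recall` and the sample labels are hypothetical, and returning 0.0 when a class has no predictions or no true members is a convention chosen for this example.

```python
def precision_recall(actual, predicted, label):
    """Return (precision, recall) for one class label."""
    # True positives: films that are this class and were labeled as it.
    tp = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
    predicted_count = sum(1 for p in predicted if p == label)
    actual_count = sum(1 for a in actual if a == label)
    precision = tp / predicted_count if predicted_count else 0.0
    recall = tp / actual_count if actual_count else 0.0
    return precision, recall


# Hypothetical true and predicted classes for six films.
actual = ["Flop", "Flop", "Hit", "Blockbuster", "Flop", "Moderate"]
predicted = ["Flop", "Hit", "Hit", "Blockbuster", "Flop", "Flop"]

flop_precision, flop_recall = precision_recall(actual, predicted, "Flop")
hit_precision, hit_recall = precision_recall(actual, predicted, "Hit")
print(f"Flop: precision={flop_precision:.2f}, recall={flop_recall:.2f}")
print(f"Hit:  precision={hit_precision:.2f}, recall={hit_recall:.2f}")
```

In this sample the model finds the only true Hit (recall 1.0) but also labels one Flop as a Hit, so Hit precision drops to 0.5.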

Model timing: