03 Transform Data by Removing Duplicates and Outliers

Transform Data: Remove Duplicates and Outliers

This workflow demonstrates how to prepare raw data for analysis by removing duplicate rows, filtering out unwanted values, and calculating a new metric to sort the results by.

URL: KNIME Learning Center https://www.knime.com/learning
URL: KNIME Cheat Sheet: Building a KNIME workflow for beginners https://www.knime.com/cheat-sheets/building-knime-workflow-beginners
URL: KNIME Cheat Sheet: Data wrangling with KNIME Analytics Platform https://www.knime.com/files/data-wrangling-with-knime.pdf

Clean data using an Expression Row Filter node

Step 1: Click on the "Expression Row Filter" node to open the configuration window.

Step 2: Enter the expression below to remove "outliers" or unwanted data.
$["Score"] <= 200

Step 3: Click "Apply and Execute". This keeps only the rows that meet your specified criteria.
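For readers who think in code, the filter condition above can be sketched in plain Python. This is only an illustration of the row-keeping logic, not how the node is implemented; the sample rows and the "Name" column are hypothetical, since the workflow's input table is not shown here.

```python
# Hypothetical sample rows; the workflow's actual table is not shown here.
rows = [
    {"Name": "A", "Score": 150},
    {"Name": "B", "Score": 950},  # unusually high value, treated as an outlier
    {"Name": "C", "Score": 200},
]

# Keep a row only when the condition holds, mirroring $["Score"] <= 200.
kept = [row for row in rows if row["Score"] <= 200]

for row in kept:
    print(row["Name"], row["Score"])
```

Note that the comparison is inclusive, so a row with a Score of exactly 200 is kept.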

Sort results

Step 1: Click on the "Sorter" node to open the configuration window.

Step 2: Define one or more sorting criteria. In the column dropdown, select the column created by the Expression node ("Score_Normalized").

Step 3: Choose the Descending sort order and click "Apply and Execute".
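The same ranking can be sketched in plain Python, assuming a small hypothetical table that already contains the "Score_Normalized" column (the "Name" column and all values are made up for illustration).

```python
# Hypothetical rows that already carry the normalized score.
rows = [
    {"Name": "A", "Score_Normalized": 1.5},
    {"Name": "B", "Score_Normalized": 0.8},
    {"Name": "C", "Score_Normalized": 2.0},
]

# reverse=True gives descending order, so the highest score comes first.
ranked = sorted(rows, key=lambda row: row["Score_Normalized"], reverse=True)

for row in ranked:
    print(row["Name"], row["Score_Normalized"])
```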

Your data is now cleaned, transformed, and ranked!

Workflow complete!

Keep the momentum going by exploring Just KNIME It! on the Hub to challenge yourself and see how these nodes can be integrated into more complex workflows and use cases.

Load raw input data
Table Creator
Remove duplicates
Duplicate Row Filter
Remove scores higher than 200
Expression Row Filter
Divide Score by 100
Expression
Sort by normalized score in descending order
Sorter
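
As a rough end-to-end picture, the five nodes above can be sketched as a short plain-Python pipeline. This is an illustration under stated assumptions, not the workflow itself: the sample data and the "Name" column are hypothetical, and rows are treated as duplicates only when every column matches, which may differ from how the Duplicate Row Filter node is configured.

```python
# Table Creator: hypothetical raw input data.
raw = [
    {"Name": "A", "Score": 150},
    {"Name": "A", "Score": 150},  # exact duplicate of the first row
    {"Name": "B", "Score": 950},  # above the 200 threshold
    {"Name": "C", "Score": 200},
    {"Name": "D", "Score": 80},
]


def remove_duplicates(rows):
    """Keep the first occurrence of each row (assumes all columns are compared)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def transform(rows):
    unique = remove_duplicates(rows)                      # Duplicate Row Filter
    filtered = [r for r in unique if r["Score"] <= 200]   # Expression Row Filter
    normalized = [                                        # Expression: Score / 100
        {**r, "Score_Normalized": r["Score"] / 100} for r in filtered
    ]
    # Sorter: descending by the normalized score.
    return sorted(normalized, key=lambda r: r["Score_Normalized"], reverse=True)


result = transform(raw)
for row in result:
    print(row["Name"], row["Score"], row["Score_Normalized"])
```

The order of the steps follows the workflow: duplicates are dropped before filtering, and the new column is computed before sorting on it.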
