
EDA Rework

MASTER 842

EDA 6

Prepare and Enrich the Analysis Dataset

This section loads the source files, converts a text date field into a proper date/time format, then builds an extra lookup table from another dataset by finding each unique item and tagging it with a constant flag. That lookup is joined back to the main table to enrich each record, and missing values created by the join are filled in so the result is a cleaner, analysis-ready dataset. Finally, the data is filtered down to the subset of records used for the next EDA steps.
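For readers who want to follow the logic outside the workflow, the preparation steps can be sketched roughly in plain Python. This is only an illustration under assumptions: the column names (`item`, `date`, `category`), the date format, the left-join behaviour, the fill value of 0, and the filter condition are all hypothetical, since the description does not state them.

```python
from datetime import datetime

# Hypothetical sample rows standing in for the CSV Reader outputs.
main_rows = [
    {"item": "A", "date": "2024-01-05", "category": "x"},
    {"item": "B", "date": "2024-02-10", "category": "y"},
    {"item": "C", "date": "2024-03-15", "category": "x"},
]
other_rows = [{"item": "A"}, {"item": "A"}, {"item": "C"}]


def prepare(main_rows, other_rows, date_format="%Y-%m-%d"):
    """Parse dates, build a flag lookup, join it, and fill missing flags."""
    # String to Date&Time: convert the text date field to a datetime.
    parsed = [
        {**row, "date": datetime.strptime(row["date"], date_format)}
        for row in main_rows
    ]
    # GroupBy + Constant Value Column Appender: one entry per unique item,
    # each tagged with the same constant flag (1 is an assumed value).
    lookup = {row["item"]: 1 for row in other_rows}
    # Joiner + Missing Value: assumed left join; unmatched rows get 0.
    return [{**row, "flag": lookup.get(row["item"], 0)} for row in parsed]


def filter_rows(rows, predicate):
    """Row Filter: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


enriched = prepare(main_rows, other_rows)
# Assumed filter condition: keep only flagged records.
subset = filter_rows(enriched, lambda row: row["flag"] == 1)
```

Using a dictionary as the lookup mirrors the "unique item plus constant flag" table: duplicate items in the second dataset collapse to a single key, and `dict.get` with a default plays the role of the missing-value fill after the join.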

Summarize and Score Category Frequencies

Starting from the filtered dataset, this section creates two frequency summaries, one for each of two categorical fields. In each branch, the data is grouped to count how often each category appears, then each count is converted to a percentage share of the total so you can compare categories by relative size, not just raw counts. One summary is then sorted to rank categories from most to least common, while the other adds a simple rule-based label that classifies each category before visualization.
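The count, share, rank, and label steps can be approximated in a few lines of Python. This is a sketch, not the workflow's actual configuration: the `share_pct` column name, the 25% threshold, and the "major"/"minor" labels are hypothetical stand-ins for whatever the Math Formula and Rule Engine nodes are set up to do.

```python
from collections import Counter


def frequency_share(values):
    """GroupBy + Math Formula: count each category and compute its share."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        {"category": cat, "count": n, "share_pct": 100.0 * n / total}
        for cat, n in counts.items()
    ]


def ranked(summary):
    """Sorter: order categories from most to least common."""
    return sorted(summary, key=lambda row: row["count"], reverse=True)


def labelled(summary, threshold_pct=25.0):
    """Rule Engine: attach a label using an assumed share threshold."""
    return [
        {**row, "label": "major" if row["share_pct"] >= threshold_pct else "minor"}
        for row in summary
    ]


# Hypothetical categorical column taken from the filtered dataset.
categories = ["x", "y", "x", "z", "x", "y", "x", "x"]

summary = frequency_share(categories)
top_first = ranked(summary)
with_labels = labelled(summary)
```

Dividing by the overall total rather than by the largest count keeps the shares summing to 100, which is what makes categories comparable across fields with different row counts.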

Visualize Category Distributions

These two nodes display the results of the earlier summaries as bar charts. One chart shows the ranked frequency share of categories, while the other shows category frequencies after adding a simple rule-based classification. This helps you quickly compare which groups are most common and see how the categories are distributed visually.
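As a rough stand-in for the Bar Chart nodes, a summary table can be rendered as a text bar chart to check its shape before opening the interactive view. The function name, the row keys (`category`, `share_pct`), and the sample values are assumptions made for this sketch.

```python
def text_bar_chart(rows, value_key="share_pct", label_key="category", width=40):
    """Return one text line per row, with bars scaled to the largest value."""
    if not rows:
        return []
    peak = max(row[value_key] for row in rows)
    lines = []
    for row in rows:
        # Scale each bar relative to the largest value; avoid dividing by zero.
        bar_len = round(width * row[value_key] / peak) if peak > 0 else 0
        lines.append(f"{row[label_key]:<10} {'#' * bar_len} {row[value_key]:.1f}%")
    return lines


# Hypothetical ranked summary with shares in percent.
sample = [
    {"category": "x", "share_pct": 50.0},
    {"category": "y", "share_pct": 25.0},
]
for line in text_bar_chart(sample):
    print(line)
```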

CSV Reader
CSV Reader
Rule Engine
CSV Reader
CSV Reader
String to Date&Time
GroupBy
Constant Value Column Appender
Bar Chart
Row Filter
Joiner
Math Formula
Missing Value
Sorter
GroupBy
Column Renamer
Column Renamer
Math Formula
Bar Chart
GroupBy
