EDA _ Statistic Supported Binning By Groups


Workflow created for the KNIME forum ...

https://forum.knime.com/t/contar-valores/36860

DESCRIPTION:
The challenge is to bin the results of a questionnaire using statistically supported bounds for each question. The numeric rating answers are initially arranged in columns.

A discrete category is assigned to each answer as its bin, and the counts of binned answers for each question are also added to the data set. Several output summaries are provided to help explore the data.


For any comment or bug related to the workflow, please do not hesitate to leave a comment.

DISCLAIMER:
The examples and models shared on KNIME Hub are for demonstration purposes only, for the benefit of the KNIME community.
They are meant to clarify the theoretical background of the subjects mentioned in the caption.
I will not be held responsible for any damages arising from the use of these models in your investment/valuation related work without taking formal advice.

WORKFLOW ANNOTATIONS:
DATA GENERATION: Generate Random Data With Answers Rating 1 - 5 ("Your Data": 12 Column RATING Questions 1-5, 20 User ID Rows).
Estimate Statistic Bounds (by Question) and Create Categorical Ranking Class and Counts: Data to LONG format; Column Statistics ... to Variable; Mid-Lower bound = mean-sd; Mid-Upper bound = mean+sd; Loop (12 steps); SELECT 'Qn [Binned]'; Count Qn [Binned]; Add Qn [Binned].Count column; Collect Data; Sort by Qn Column (*Q == Question column).
Transform Collected Data to Generated Outputs: Duplicate Binned Column; Clean Up Data Results; Group by User ID (ROWS), Data to wide; Group by Qn (COLUMNS), Data to wide; Column Values [Binned] to WIDE; [Binned] counts to WIDE; Add [Binned] chunk to WIDE format; Add [Binned].counts chunk to WIDE format.
Outputs: Count User Scores Analysis by 'bins' (Binned CLASS); Count Qn Scores Analysis by 'bins' (Binned CLASS); Overall Statistics; All PROCESSED data, WIDE format (alphabetically sorted).

Bin all the data in 3 categories:
Low Bin: [P1, mean-sd)
Middle Bin: [mean-sd, mean+sd)
High Bin: [mean+sd, P99]

Nodes used: Unpivoting, Java Edit Variable (simple), Group Loop Start, Column Filter, Value Counter, Joiner (deprecated), Loop End, Sorter, Rule Engine, GroupBy, Pivoting, Column Resorter, Statistics, Table Row to Variable.
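For readers who want to check the binning rule outside KNIME, here is a minimal Python sketch of the same idea: each question's answers are split into Low, Middle and High bins at mean-sd and mean+sd, and the answers in each bin are counted. It is not the workflow's implementation. The function names and the sample ratings are hypothetical, the sample standard deviation is an assumption, and the outer P1 and P99 limits are treated as open-ended.

```python
from collections import Counter
from statistics import mean, stdev


def bin_question(scores):
    """Label each score Low, Middle or High using mean-sd and mean+sd bounds."""
    m = mean(scores)
    sd = stdev(scores)  # assumption: sample standard deviation
    lower, upper = m - sd, m + sd
    labels = []
    for s in scores:
        if s < lower:        # below mean-sd
            labels.append("Low")
        elif s < upper:      # in [mean-sd, mean+sd)
            labels.append("Middle")
        else:                # at or above mean+sd
            labels.append("High")
    return labels


def bin_by_question(table):
    """Bin every question column on its own bounds and count the bins."""
    result = {}
    for question, scores in table.items():
        labels = bin_question(scores)
        result[question] = {"binned": labels, "counts": Counter(labels)}
    return result


# Hypothetical ratings (1-5) for two questions; not data from the workflow.
ratings = {
    "Q1": [1, 2, 3, 3, 3, 4, 5],
    "Q2": [2, 2, 2, 3, 5, 5, 5],
}

for question, info in bin_by_question(ratings).items():
    print(question, info["binned"], dict(info["counts"]))
```

Because the bounds are computed per question, the same rating can land in different bins for different questions, which is the point of grouping before binning.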
