Icon

ETL_​curve_​analysis

Change automatic column names from aggregation node DO NOT RUN: This is for the entire historical data but we switched to monthly update so please complete only steps below.Fixing database problem with many country names connected to one iso code. Fixing all input id curves attached to one mscreen vendor (in other words getting only effective frequency in the output table) This part of procedures calculates mean and sd for touchpoints, freqencies and their combinations Add 0-1 indicator of curves which are outliers (e.g. they have rmsebigger than mean+5sd) and filter them to not make our thresholdstoo big Recalculate mean and sd for rmse with filtered data for touchpoint, frequency and their combinations Recalculate RMSE and mean+5SD from new data todouble clean outliers RUN IT FOR MONTHLY BASISVERY IMPORTANT: YOU MUST CHANGE FILE PATHS IN INPUT AND OUTPUT NODESFixing database problem with many country names connected to one iso code Fixing all input id curves attached to one mscreen vendor (in other words getting only effective frequency in the output table) CHANGE CHANGE CHANGE Calculate new columns for countring vendor in scenario and per touchpoint. CHANGE Calculate new columns for countring vendor in scenario and per touchpoint. mscreen_curves_all.csvfirstfirstcountcountmin)Node 100by count rowsgroup by tp and freqcalculate rmse for every recordadd mean and sd for freq and tpgroup by freqgroup by tpNode 112add mean and sd for freqadd mean and sd for tpmscreen_curves_all_distinct1 if after 2022-04-05 0 if olderroundingto deduplicate same curvesnow is 1 decimalsmscreen_curves_allmarketmscreen_curves_allfreq ==eff freqmscreen_curves_eff_freq_allrmse 5sdcalculate rmsetime intervalrmse 5sdby rmseNode 1371 if after 2022-04-05 0 if olderNode 139group by tp and freqNode 141rmse 5sdby rmsegroup by tpNode 147calculate rmseadd mean and sd for freq and tpadd mean and sd for freqrmse 5sdgroup by tp and freqcalculate rmsegroup by tp and freqtime intervalrmse 5sdadd mean and sd for tpNode 160group by freqmscreen_curves_all_distincttime interval1 yearsse roundingmintime interval2022mscreen_curves_distinct.csvRMSE 14.76RMSE 10.11time2022Node 171Node 172minNode 174mscreen_curvesNode 176mscreen_curves_eff_freq_allNode 191getcountry_codermcountry codeNode 195marketNode 198rmcountry codeNode 203firstfirstminmincountcount)maxsince firstNode 213since firstbetweenfirst lastmaxmeanmscreen_users_allNode 219Node 220Node 221last monthNode 223getcountry_codemarketlastNode 230marketrmcountry codedatemscreen_curves_allmscreen_curves_allNode 237last monthmarketNode 244firstfirstminminmaxmaxmeancountcount)last monthNode 256Node 257Node 258Node 259datedateNode 262Node 263dateNode 265Node 276rmsefreq & dateNode 284Node 285by count rowsNode 288Node 289roundingto deduplicate same curvesa, b, c,d = 0 decimalsse rounding0 decimalNode 292Node 293Node 294> 3 pointsNode 296has reachpointsmscreen_curves_eff_freq_allby scenario_idNode 301Node 302mscreen_curves_eff_freq_allby scenario_idNode 305by count rowsNode 307mscreen_curves_eff_freq_allNode 309mscreen_plans_allNode 311Node 312Node 313maxNode 315mscreen_curves_eff_freq_allNode 317Node 320Node 321Node 322Node 324Node 325Node 326Node 328Node 329Node 331Node 332Node 333Node 334Node 335mscreen_scenarios_optimize_vendors_all.csvNode 337 CSV Reader GroupBy Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename Sorter GroupBy Math Formula Joiner Column Rename GroupBy GroupBy Column Rename Column Rename Joiner Joiner CSV Writer Rule Engine Math Formula(Multi Column) CSV Writer GroupBy Column Rename CSV Reader Rule-basedRow Filter CSV Writer Math Formula Rule Engine Math Formula Rule-basedRow Filter Sorter Column Filter Rule Engine Sorter GroupBy Sorter Rule-basedRow Filter Sorter GroupBy Column Rename Sorter Math Formula Joiner Joiner Math Formula GroupBy Math Formula Column Rename GroupBy Rule Engine Rule-basedRow Filter Joiner Column Rename Column Filter GroupBy CSV Writer Rule Engine Math Formula(Multi Column) Column Rename(Regex) Rule Engine CSV Reader Rule-basedRow Filter Rule-basedRow Filter Rule-basedRow Filter Column Filter Column Resorter Column Rename(Regex) Sorter CSV Writer Sorter CSV Reader GroupBy Excel Reader Joiner String Manipulation Column Filter Column Filter GroupBy Column Rename Rule-basedRow Filter Column Filter CSV Reader Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Date&TimeDifference String to Date&Time Date&TimeDifference Date&TimeDifference Column Rename(Regex) Column Rename(Regex) CSV Writer Rule Engine Rule Engine Rule Engine CSV Reader Excel Reader Joiner String Manipulation GroupBy Column Rename CSV Writer GroupBy Rule-basedRow Filter Column Rename Column Filter Rule-basedRow Filter CSV Writer CSV Reader Concatenate CSV Reader GroupBy Column Rename GroupBy String To Number Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) CSV Reader GroupBy Sorter GroupBy Sorter Rule-basedRow Filter Rule-basedRow Filter Sorter GroupBy Rule-basedRow Filter String to Date&Time String to Date&Time Rule-basedRow Filter Rule-basedRow Filter String to Date&Time Date&Time-basedRow Filter GroupBy Sorter Column Rename String Manipulation Math Formula(Multi Column) Math Formula(Multi Column) Row Sampling Numeric Binner Auto-Binner Rule-basedRow Filter Histogram Rule-basedRow Filter CSV Reader GroupBy Column Rename Joiner CSV Writer GroupBy Column Rename Sorter Joiner CSV Reader Number To String CSV Writer GroupBy Sorter Column Rename Column Rename(Regex) String to Date&Time CSV Reader String To Number GroupBy GroupBy Column Rename Column Rename Joiner GroupBy Column Rename Rule Engine Joiner GroupBy Rule Engine Column Rename GroupBy Joiner Column Rename Rule-basedRow Filter Column Rename Rule Engine CSV Writer GroupBy Change automatic column names from aggregation node DO NOT RUN: This is for the entire historical data but we switched to monthly update so please complete only steps below.Fixing database problem with many country names connected to one iso code. Fixing all input id curves attached to one mscreen vendor (in other words getting only effective frequency in the output table) This part of procedures calculates mean and sd for touchpoints, freqencies and their combinations Add 0-1 indicator of curves which are outliers (e.g. they have rmsebigger than mean+5sd) and filter them to not make our thresholdstoo big Recalculate mean and sd for rmse with filtered data for touchpoint, frequency and their combinations Recalculate RMSE and mean+5SD from new data todouble clean outliers RUN IT FOR MONTHLY BASISVERY IMPORTANT: YOU MUST CHANGE FILE PATHS IN INPUT AND OUTPUT NODESFixing database problem with many country names connected to one iso code Fixing all input id curves attached to one mscreen vendor (in other words getting only effective frequency in the output table) CHANGE CHANGE CHANGE Calculate new columns for countring vendor in scenario and per touchpoint. CHANGE Calculate new columns for countring vendor in scenario and per touchpoint. mscreen_curves_all.csvfirstfirstcountcountmin)Node 100by count rowsgroup by tp and freqcalculate rmse for every recordadd mean and sd for freq and tpgroup by freqgroup by tpNode 112add mean and sd for freqadd mean and sd for tpmscreen_curves_all_distinct1 if after 2022-04-05 0 if olderroundingto deduplicate same curvesnow is 1 decimalsmscreen_curves_allmarketmscreen_curves_allfreq ==eff freqmscreen_curves_eff_freq_allrmse 5sdcalculate rmsetime intervalrmse 5sdby rmseNode 1371 if after 2022-04-05 0 if olderNode 139group by tp and freqNode 141rmse 5sdby rmsegroup by tpNode 147calculate rmseadd mean and sd for freq and tpadd mean and sd for freqrmse 5sdgroup by tp and freqcalculate rmsegroup by tp and freqtime intervalrmse 5sdadd mean and sd for tpNode 160group by freqmscreen_curves_all_distincttime interval1 yearsse roundingmintime interval2022mscreen_curves_distinct.csvRMSE 14.76RMSE 10.11time2022Node 171Node 172minNode 174mscreen_curvesNode 176mscreen_curves_eff_freq_allNode 191getcountry_codermcountry codeNode 195marketNode 198rmcountry codeNode 203firstfirstminmincountcount)maxsince firstNode 213since firstbetweenfirst lastmaxmeanmscreen_users_allNode 219Node 220Node 221last monthNode 223getcountry_codemarketlastNode 230marketrmcountry codedatemscreen_curves_allmscreen_curves_allNode 237last monthmarketNode 244firstfirstminminmaxmaxmeancountcount)last monthNode 256Node 257Node 258Node 259datedateNode 262Node 263dateNode 265Node 276rmsefreq & dateNode 284Node 285by count rowsNode 288Node 289roundingto deduplicate same curvesa, b, c,d = 0 decimalsse rounding0 decimalNode 292Node 293Node 294> 3 pointsNode 296has reachpointsmscreen_curves_eff_freq_allby scenario_idNode 301Node 302mscreen_curves_eff_freq_allby scenario_idNode 305by count rowsNode 307mscreen_curves_eff_freq_allNode 309mscreen_plans_allNode 311Node 312Node 313maxNode 315mscreen_curves_eff_freq_allNode 317Node 320Node 321Node 322Node 324Node 325Node 326Node 328Node 329Node 331Node 332Node 333Node 334Node 335mscreen_scenarios_optimize_vendors_all.csvNode 337 CSV Reader GroupBy Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename Sorter GroupBy Math Formula Joiner Column Rename GroupBy GroupBy Column Rename Column Rename Joiner Joiner CSV Writer Rule Engine Math Formula(Multi Column) CSV Writer GroupBy Column Rename CSV Reader Rule-basedRow Filter CSV Writer Math Formula Rule Engine Math Formula Rule-basedRow Filter Sorter Column Filter Rule Engine Sorter GroupBy Sorter Rule-basedRow Filter Sorter GroupBy Column Rename Sorter Math Formula Joiner Joiner Math Formula GroupBy Math Formula Column Rename GroupBy Rule Engine Rule-basedRow Filter Joiner Column Rename Column Filter GroupBy CSV Writer Rule Engine Math Formula(Multi Column) Column Rename(Regex) Rule Engine CSV Reader Rule-basedRow Filter Rule-basedRow Filter Rule-basedRow Filter Column Filter Column Resorter Column Rename(Regex) Sorter CSV Writer Sorter CSV Reader GroupBy Excel Reader Joiner String Manipulation Column Filter Column Filter GroupBy Column Rename Rule-basedRow Filter Column Filter CSV Reader Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Date&TimeDifference String to Date&Time Date&TimeDifference Date&TimeDifference Column Rename(Regex) Column Rename(Regex) CSV Writer Rule Engine Rule Engine Rule Engine CSV Reader Excel Reader Joiner String Manipulation GroupBy Column Rename CSV Writer GroupBy Rule-basedRow Filter Column Rename Column Filter Rule-basedRow Filter CSV Writer CSV Reader Concatenate CSV Reader GroupBy Column Rename GroupBy String To Number Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) Column Rename(Regex) CSV Reader GroupBy Sorter GroupBy Sorter Rule-basedRow Filter Rule-basedRow Filter Sorter GroupBy Rule-basedRow Filter String to Date&Time String to Date&Time Rule-basedRow Filter Rule-basedRow Filter String to Date&Time Date&Time-basedRow Filter GroupBy Sorter Column Rename String Manipulation Math Formula(Multi Column) Math Formula(Multi Column) Row Sampling Numeric Binner Auto-Binner Rule-basedRow Filter Histogram Rule-basedRow Filter CSV Reader GroupBy Column Rename Joiner CSV Writer GroupBy Column Rename Sorter Joiner CSV Reader Number To String CSV Writer GroupBy Sorter Column Rename Column Rename(Regex) String to Date&Time CSV Reader String To Number GroupBy GroupBy Column Rename Column Rename Joiner GroupBy Column Rename Rule Engine Joiner GroupBy Rule Engine Column Rename GroupBy Joiner Column Rename Rule-basedRow Filter Column Rename Rule Engine CSV Writer GroupBy

Nodes

Extensions

Links