The company tracks website usage and stores information about each session.
- Various data are collected for each session, e.g., session start, duration, and number of clicks, as well as an optional session satisfaction score.
- The company calculates aggregate statistics for each customer, e.g., total number of visits and average satisfaction, and updates the "statistics" table in the database.
- The session satisfaction score column has missing values, which need to be imputed, e.g., with machine learning predictions.
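The imputation and per-customer aggregation described above can be sketched in plain Python. This is only an illustration of the logic, not the workflow's actual implementation: the field names (`customer_id`, `duration_s`, `clicks`, `satisfaction`), the sample records, and the choice of a one-feature least-squares line as the "machine learning prediction" are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical session records; field names and values are illustrative only.
sessions = [
    {"customer_id": "c1", "duration_s": 100, "clicks": 5, "satisfaction": 2.0},
    {"customer_id": "c1", "duration_s": 300, "clicks": 15, "satisfaction": 4.0},
    {"customer_id": "c2", "duration_s": 200, "clicks": 10, "satisfaction": None},
    {"customer_id": "c2", "duration_s": 400, "clicks": 20, "satisfaction": 5.0},
]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # constant feature: fall back to predicting the mean
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b * x_bar, b


def impute_satisfaction(rows, feature="duration_s"):
    """Fill missing scores with predictions fitted on the rows that have one.

    Assumes at least one row has a score; otherwise mean() raises an error.
    """
    known = [r for r in rows if r["satisfaction"] is not None]
    a, b = fit_line([r[feature] for r in known],
                    [r["satisfaction"] for r in known])
    return [
        r if r["satisfaction"] is not None
        else {**r, "satisfaction": a + b * r[feature]}
        for r in rows
    ]


def customer_statistics(rows):
    """Aggregate per customer: total visits and average satisfaction."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["customer_id"]].append(r)
    return {
        cid: {
            "total_visits": len(rs),
            "avg_satisfaction": mean(r["satisfaction"] for r in rs),
        }
        for cid, rs in grouped.items()
    }


imputed = impute_satisfaction(sessions)
stats = customer_statistics(imputed)
```

Imputing before aggregating means every session contributes to the customer's average satisfaction. In practice a richer model with more features would likely replace the single-feature line used here.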
We access the usage data from Hive, and the personal data (anonymized and updated in sessions 1 and 2) and contract data from the PostgreSQL database. We perform in-database processing, read the data into Spark, enrich the usage data with the personal and contract data to better predict the missing values, and continue working with the relatively large usage data on Spark. Finally, we export the final status of the workflow. If any process fails, we notify the responsible people via an automated email.
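The workflow performs these steps with KNIME nodes against Hive, PostgreSQL, and Spark. As a rough, tool-independent sketch of two of the ideas (enriching usage rows with customer attributes, and turning a failed step into an email notice), the following plain-Python example may help. All names, addresses, and sample values are hypothetical, and the notice is only constructed here, not sent.

```python
from email.message import EmailMessage


def enrich(usage_rows, personal_by_id, contracts_by_id):
    """Left join: keep every usage row, add customer attributes when present."""
    enriched = []
    for row in usage_rows:
        cid = row["customer_id"]
        enriched.append({
            **row,
            **personal_by_id.get(cid, {}),
            **contracts_by_id.get(cid, {}),
        })
    return enriched


def run_steps(steps):
    """Run (name, callable) pairs in order.

    Returns (status, notice): notice is an EmailMessage describing the first
    failure, or None if every step succeeded.
    """
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            notice = EmailMessage()
            notice["Subject"] = f"Workflow step failed: {name}"
            notice["From"] = "workflow@example.com"  # placeholder address
            notice["To"] = "owner@example.com"       # placeholder address
            notice.set_content(
                f"Step '{name}' raised {type(exc).__name__}: {exc}"
            )
            return "failed", notice
    return "succeeded", None


# Hypothetical sample data: customer c9 has no personal or contract record.
usage = [{"customer_id": "c1", "clicks": 5}, {"customer_id": "c9", "clicks": 1}]
personal = {"c1": {"age_group": "30-39"}}
contracts = {"c1": {"plan": "basic"}}
rows = enrich(usage, personal, contracts)


def failing_export():
    raise RuntimeError("database unreachable")


status, notice = run_steps([("enrich", lambda: None), ("export", failing_export)])
```

A left join is used so that usage rows without matching customer records are kept rather than silently dropped. To actually deliver the notice, the message could be passed to `smtplib.SMTP.send_message` with a mail server of your choice.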