Challenge description:
You work as a data scientist for an online job listing portal. Job seekers mostly use the portal to obtain salary predictions based on a few key pieces of information they provide (e.g., experience level, employment type). You have already trained a Simple Regression Tree that predicts the salary of a job from features such as title, experience, location, and company size. You have also exported the trained model as ds_salaries_predictor.model, along with a table called feature_values.table that contains the features (and all the possible feature values) seen by the model during training.
Your task is to build an interactive and responsive data app that collects user input and outputs the predicted salary. After providing a salary prediction, also collect some sensitive user data (e.g., name, surname, email), which could be used, for example, to send personalized newsletters with job offers. Lastly, anonymize the sensitive user data and store it, together with the user input used to predict the salary, in an SQLite database.
Key requirement: you must use the nodes of the Redfield Privacy Nodes extension (https://hub.knime.com/redfield/extensions/se.redfield.arx.feature/latest/) to anonymize the sensitive user data you decided to collect.
Outcome:
An interactive dashboard that predicts salaries from user input and, in the back-end, stores that input together with the anonymized sensitive user data in an SQLite database.
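The challenge itself has to be solved with KNIME nodes, and the anonymization must be done with the Redfield Privacy Nodes. Purely to illustrate the back-end data flow described above, here is a minimal Python sketch. It is not the required solution: salted SHA-256 hashing is used only as a stand-in for the anonymization step (strictly speaking it is pseudonymization), and the table name, column names, salt, and the example salary value are all hypothetical placeholders.

```python
import hashlib
import sqlite3


def pseudonymize(value: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest of a sensitive value.

    Stand-in for the anonymization step; the challenge requires the
    Redfield Privacy Nodes in KNIME instead.
    """
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def store_submission(conn, job_inputs, predicted_salary, sensitive, salt):
    """Store prediction inputs plus hashed sensitive fields; return the row id."""
    # Hypothetical schema: column names are assumptions for illustration.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS submissions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               experience_level TEXT,
               employment_type TEXT,
               job_title TEXT,
               predicted_salary REAL,
               name_hash TEXT,
               email_hash TEXT
           )"""
    )
    cur = conn.execute(
        "INSERT INTO submissions (experience_level, employment_type, job_title,"
        " predicted_salary, name_hash, email_hash) VALUES (?, ?, ?, ?, ?, ?)",
        (
            job_inputs["experience_level"],
            job_inputs["employment_type"],
            job_inputs["job_title"],
            predicted_salary,
            pseudonymize(sensitive["name"], salt),
            pseudonymize(sensitive["email"], salt),
        ),
    )
    conn.commit()
    return cur.lastrowid


# Example run against an in-memory database with made-up values.
conn = sqlite3.connect(":memory:")
row_id = store_submission(
    conn,
    job_inputs={
        "experience_level": "SE",
        "employment_type": "FT",
        "job_title": "Data Scientist",
    },
    predicted_salary=120000.0,  # placeholder, not a real model output
    sensitive={"name": "Jane Doe", "email": "jane.doe@example.com"},
    salt="demo-salt",  # placeholder; a real salt should be kept secret
)
stored = conn.execute(
    "SELECT * FROM submissions WHERE id = ?", (row_id,)
).fetchone()
```

Only the hashed name and email reach the database, while the job-related inputs and the predicted salary are stored in clear text, mirroring the split between sensitive and non-sensitive data in the task.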
Deliver your solution as a separate workflow and name it: Solution_Round_16_
Teams are strongly encouraged to submit high-quality work in order to improve their chances of getting maximum points. Don't be afraid to go the extra mile! :)
Dataset:
The files ds_salaries_predictor.model and feature_values.table are provided in the challenge folder.
Training dataset: Data Science Salaries 2023 dataset from Kaggle: https://www.kaggle.com/datasets/arnabchaki/data-science-salaries-2023. Note that not all features have been used for training.
Deadline:
March 10, 2024 (submission by 11:59 PM CET)**. Check the calendar of the tournament: https://info.knime.com/game-of-nodes
** We will verify the date and time of the latest edits.
KNIME Game of Nodes:
Rules, Assessment Criteria & FAQs: https://info.knime.com/game-of-nodes