Challenge description:
You work as a data scientist for a recruiting agency that casts actors for movie production companies. Your employer, though, is not like any other recruiting agency: the candidates you work with are superheroes with diverse skills, powers and strengths. A client of the agency, Fantastic Movies Inc., is starting production of a new superhero movie and needs to cast actors. However, they are still unsure what kind of superhero profile they need.
As the data scientist at the recruiting agency, your manager asks you to build a clustering analysis pipeline to group all available superheroes into different clusters based on their powers. Take all the preprocessing steps that you deem necessary and pick a clustering algorithm of your choice. Clearly, your manager is interested in obtaining clusters that are as distinctly separated as possible, so the way they are formed must be optimized.
Additionally, you’re required to present the results of your clustering analysis visually, both in an interactive dashboard and in a static PDF report. Consider including additional information, visualizations and metrics that help Fantastic Movies Inc. make their choice (e.g., the top five strengths available across all superheroes, statistics on biometric characteristics, etc.).
Key requirement: your clustering pipeline must include a clustering optimization technique, and the dashboard/PDF must contain at least five different visual insights.
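The challenge itself is meant to be solved as a KNIME workflow, but the idea of a "clustering optimization technique" can be illustrated outside KNIME. The following is a minimal Python sketch, assuming scikit-learn is available, that picks the number of clusters by maximizing the silhouette score. The choice of k-means, the silhouette criterion, the helper name pick_k_by_silhouette and the synthetic boolean matrix standing in for the power matrix are all assumptions made for illustration, not part of the challenge.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def pick_k_by_silhouette(X, k_values, random_state=0):
    """Fit k-means for each candidate k; return (best_k, {k: silhouette})."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        # Silhouette is in [-1, 1]; higher means better separated clusters.
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic stand-in for the power matrix (an assumption, not the real data):
# three hero archetypes, each owning a distinct block of 4 out of 12 powers.
rng = np.random.default_rng(42)
profiles = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
], dtype=float)
X = np.repeat(profiles, 40, axis=0)   # 40 heroes per archetype
flip = rng.random(X.shape) < 0.05     # flip roughly 5% of entries as noise
X = np.where(flip, 1.0 - X, X)

best_k, scores = pick_k_by_silhouette(X, range(2, 7))
print("silhouette per k:", {k: round(s, 3) for k, s in scores.items()})
print("selected k:", best_k)
```

On real data the power columns are sparse and high-dimensional, so a distance measure suited to binary data or a dimensionality reduction step before clustering may be worth considering; the sketch keeps plain Euclidean k-means only for brevity.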
Outcome:
A clustering analysis pipeline to group superheroes by their superpowers and a report (via an interactive dashboard and a static PDF) to display identified clusters.
Deliver your solution as a separate workflow and name it: Solution_Round_8_
Teams are strongly encouraged to submit high-quality work in order to improve their chances of getting maximum points. Don't be afraid to go the extra mile! :)
Dataset:
Marvel Superheroes dataset from Kaggle: https://www.kaggle.com/datasets/dannielr/marvel-superheroes?select=superheroes_power_matrix.csv (in the Kaggle space other datasets on superheroes' characteristics are available)
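One of the suggested visual insights, the top five strengths across all superheroes, reduces to a column-wise count over the power matrix. The sketch below uses pandas and assumes the file has one name column followed by one boolean column per power; the column name "Name", the helper top_powers and the small demo table are hypothetical and should be checked against the actual CSV.

```python
import pandas as pd


def top_powers(power_matrix, name_col="Name", n=5):
    """Count how many heroes have each power and return the n most common."""
    # Assumption: every column except name_col is a boolean power flag.
    powers = power_matrix.drop(columns=[name_col]).astype(bool)
    return powers.sum().sort_values(ascending=False).head(n)


# Tiny made-up table in the assumed layout (not taken from the dataset).
demo = pd.DataFrame({
    "Name": ["Hero A", "Hero B", "Hero C", "Hero D"],
    "Flight": [True, True, False, True],
    "Super Strength": [True, True, True, True],
    "Telepathy": [False, False, True, False],
})

top = top_powers(demo, n=2)
print(top)
```

With the real file, the same function could be applied to the result of pd.read_csv on superheroes_power_matrix.csv, adjusting name_col if the header differs.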
Deadline:
March 24, 2024 (submission by 11:59 PM CET)**. Check the tournament calendar: https://info.knime.com/game-of-nodes
** We will verify the date and time of the latest edits.
KNIME Game of Nodes:
Rules, Assessment Criteria & FAQs: https://info.knime.com/game-of-nodes