The Golden Era of Movies
Description:
You’ve joined a movie analytics team investigating audience preferences, genre dynamics, and the true value of highly-rated films on Letterboxd. Using the Letterboxd Movie Ratings dataset, your goal is to clean and transform movie data, uncover hidden audience trends, and identify which films punch above their popularity level.
Here are four questions your team lead wants you to answer:
Which genre is among the top 5 most popular genres as well as among the top 5 best rated genres?
Categorize movies into three groups based on their runtime: "Short Film" (runtime < 60 minutes), "Standard" (60 <= runtime < 150 minutes), and "Epic" (runtime >= 150 minutes). Choose your favorite three movie genres and compare how their movies are distributed across the runtime categories.
For each genre, calculate the average rating in each decade. Compare how the ratings changed over time for the genres Action, Crime, Romance, and Music.
Identify the 'Hidden Gems' in our dataset: Movies with a high rating by a reliable group of viewers but typically overlooked by the mainstream audience. You want to reward high ratings and penalize high popularity. Calculate a custom Hidden Gem Score (HGS) for each movie. Which 25 movies rank highest on this custom index?
Note: If you struggle to come up with an HGS, take inspiration from document analysis! TF-IDF uses a logarithmic penalty to prevent incredibly common words from drowning out the unique ones.
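The runtime grouping and the TF-IDF hint can be sketched outside KNIME in a few lines of Python. This is only one possible reading of the task, not the workflow's actual implementation: the field names (title, rating, votes), the min_votes cutoff used to stand in for a "reliable group of viewers", and the formula rating * log(1 + max_votes / votes) are all assumptions made for illustration.

```python
import math


def runtime_category(runtime_min):
    """Bin a runtime in minutes into the three groups from the second question."""
    if runtime_min < 60:
        return "Short Film"
    if runtime_min < 150:
        return "Standard"
    return "Epic"


def decade(year):
    """Map a release year to the start of its decade, e.g. 1994 -> 1990."""
    return (year // 10) * 10


def hidden_gem_score(rating, votes, max_votes):
    """Assumed HGS: the rating scaled by an IDF-style logarithmic factor.

    The factor log(1 + max_votes / votes) shrinks as a movie gets more
    votes, so popularity is penalized but only logarithmically.
    """
    return rating * math.log(1 + max_votes / votes)


def top_hidden_gems(movies, min_votes=1000, n=25):
    """Return the n highest-scoring (title, score) pairs.

    movies is a list of dicts with the hypothetical keys
    'title', 'rating', and 'votes'. Movies with fewer than min_votes
    votes are dropped so that a handful of ratings cannot win.
    """
    reliable = [m for m in movies if m["votes"] >= min_votes]
    if not reliable:
        return []
    max_votes = max(m["votes"] for m in reliable)
    scored = [
        (m["title"], hidden_gem_score(m["rating"], m["votes"], max_votes))
        for m in reliable
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


if __name__ == "__main__":
    # Toy rows invented for this sketch; they are not from the dataset.
    toy = [
        {"title": "Film A", "rating": 4.5, "votes": 2000},
        {"title": "Film B", "rating": 4.5, "votes": 200000},
        {"title": "Film C", "rating": 4.9, "votes": 50},
    ]
    for title, score in top_hidden_gems(toy, min_votes=1000, n=2):
        print(title, round(score, 3))
```

In this sketch two films with the same rating are separated only by the popularity factor, and a film below the vote cutoff never enters the ranking. Other choices, such as dividing the rating by log(1 + votes), would also fit the hint.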
Author: Nur Sena Alici