The Golden Era of Movies

<p><strong>The Golden Era of Movies</strong></p><p><strong>Description:</strong></p><p>You’ve joined a movie analytics team investigating audience preferences, genre dynamics, and the true value of highly rated films on Letterboxd. Using the Letterboxd Movie Ratings dataset, your goal is to clean and transform the movie data, uncover hidden audience trends, and identify which films punch above their popularity level.</p><p>Here are four questions your team lead wants you to answer:</p><ol><li><p>Which genre is among both the <strong>top 5 most popular genres</strong> and the <strong>top 5 best-rated genres</strong>?</p></li><li><p>Categorize movies into three groups based on their runtime: "Short Film" (runtime &lt; 60 minutes), "Standard" (60 minutes &lt;= runtime &lt; 150 minutes), and "Epic" (runtime &gt;= 150 minutes). Choose your three favorite movie genres and <strong>compare how their movies are distributed across the runtime categories</strong>.</p></li><li><p>For each genre, calculate the <strong>average rating in each decade</strong>. Compare how the ratings changed over time for the genres <em>Action</em>, <em>Crime</em>, <em>Romance</em>, and <em>Music</em>.</p></li><li><p>Identify the '<strong>Hidden Gems</strong>' in our dataset: movies rated highly by a reliable group of viewers but typically overlooked by the mainstream audience. You want to <strong>reward high ratings</strong> and <strong>penalize high popularity</strong>. Calculate a custom <strong>Hidden Gem Score (HGS)</strong> for each movie. Which 25 movies rank highest on this custom index?<br><strong><em>Note:</em></strong> If you struggle to come up with an HGS, take inspiration from document analysis! TF-IDF uses a logarithmic penalty to prevent incredibly common words from drowning out the unique ones.</p></li></ol><p><strong>Author:</strong> Nur Sena Alici</p>
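
The runtime categories in question 2 and the Hidden Gem Score in question 4 could be sketched in Python as follows. This is only one possible reading of the task, not the workflow's actual solution. It assumes that "Standard" starts at 60 minutes. The function names, the `min_ratings` threshold of 50, and the formula `rating / log10(num_ratings + 10)` are illustrative choices loosely inspired by the logarithmic damping in TF-IDF, and the sample rows are made up.

```python
import math


def runtime_category(minutes: float) -> str:
    """Map a runtime in minutes to one of the three categories.

    Assumption: "Standard" covers 60 <= runtime < 150.
    """
    if minutes < 60:
        return "Short Film"
    if minutes < 150:
        return "Standard"
    return "Epic"


def hidden_gem_score(avg_rating: float, num_ratings: int,
                     min_ratings: int = 50) -> float:
    """Hypothetical Hidden Gem Score: reward rating, penalize popularity.

    Movies with fewer than `min_ratings` ratings score 0.0, because a
    handful of votes is not a "reliable group of viewers". Otherwise the
    average rating is divided by log10(num_ratings + 10), so popularity
    is penalized, but only logarithmically.
    """
    if num_ratings < min_ratings:
        return 0.0
    return avg_rating / math.log10(num_ratings + 10)


if __name__ == "__main__":
    # Made-up rows for illustration: (title, average rating, number of ratings)
    sample = [
        ("Film A", 4.5, 90),
        ("Film B", 4.5, 999_990),
        ("Film C", 5.0, 10),
    ]
    ranked = sorted(sample, key=lambda row: hidden_gem_score(row[1], row[2]),
                    reverse=True)
    for title, rating, count in ranked:
        print(title, round(hidden_gem_score(rating, count), 3))
```

With the same average rating, the less-rated film ranks higher, which is the intended effect. Other penalties (for example a different logarithm base or a weighted rating) would be equally defensible.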

URL: Dataset https://hub.knime.com/s/7ts8B21h2luVoL8C

Letterboxd Movie Ratings dataset
CSV Reader
Data Prep
1
Component
