Challenge 22 - Analyzing Top Streaming Artists

Level: Medium

Description: You’re part of a music analytics team exploring what drives global listening trends. Using a dataset of the most streamed songs up to 2024, your goal is to uncover which artists dominate the charts. Clean and transform the data, then answer these key questions: (1) Which artists release the most music? (2) Who ranks highest in average track score? (3) Who leads in total streams? Finally, discover which artists make the cut on both popularity and consistency.

  • Beginner-friendly objective(s): 1. Load the dataset and keep only these columns: Track, Artist, Release Date, Track Score, and Spotify Streams. 2. Convert data types where needed for accurate analysis, such as turning string-based numbers into numeric formats. 3. Handle missing values and perform data transformations, such as converting date strings to date formats and extracting specific date parts. 4. Aggregate the data to answer the question: How many tracks were released per year? Visualize the results using a bar chart.

  • Intermediate-friendly objective(s): 1. Implement filtering operations to identify top-performing artists: the top 10 artists with the highest mean track score (minimum of 10 tracks), and the top 10 artists with the highest total streams. 2. Find the artists who appear in both of these lists.

How many artists make the final list?

Solution Summary: The solution is a workflow that processes and analyzes a dataset of Spotify's most streamed songs. It reads the dataset and filters it down to the key columns, converts data types for accurate analysis, handles missing values, and performs data transformations. Aggregation is then applied to calculate metrics such as track count and total streams per artist. Filtering and sorting operations identify the top artists, and the workflow ends with data visualization and a join of the resulting tables.

Solution Details: The workflow starts with a CSV Reader node configured to read the "Most Streamed Spotify Songs 2024.csv" file, with headers, delimiters, and encoding handled correctly. A Column Filter node then retains only the "Track", "Artist", "Release Date", "Track Score", and "Spotify Streams" columns. A String to Number node converts the "Spotify Streams" column to a Long integer, and a Missing Value node removes rows with missing data. Next, a String to Date&Time node converts the "Release Date" column to a date format, and a Date&Time Part Extractor node extracts the year from the release date.

The workflow uses two GroupBy nodes. One counts the tracks released per year; a Number to String node converts the "Year" column to a string, and a Bar Chart node visualizes the track count per year. The other aggregates the data by artist, calculating track count, mean track score, and total streams. A Row Filter node retains artists with at least 10 tracks, and a Top k Row Filter node selects the top 10 of these by mean track score. A second Top k Row Filter node selects the top 10 artists by total streams. A Sorter node arranges data by mean track score in descending order, a Joiner node merges the two top-10 tables based on row keys, and a Table View node displays the sorted and filtered data for further exploration.
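For readers who want to check the data-preparation steps outside KNIME, the sequence described above (select columns, convert types, drop incomplete rows, extract the year, count tracks per year) can be sketched in plain Python. This is only an illustration, not the workflow itself. The sample rows, the extra "Album Name" column, the thousands separators in the stream counts, and the month/day/year date format are all assumptions made for the example; check them against the actual CSV file.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the real file has many more rows and columns.
SAMPLE_CSV = """Track,Artist,Release Date,Track Score,Spotify Streams,Album Name
Song A,Artist X,4/26/2024,545.9,"390,470,936",Album 1
Song B,Artist Y,5/4/2024,538.4,"323,703,884",Album 2
Song C,Artist X,3/19/2023,423.3,,Album 3
Song D,Artist Z,1/12/2023,444.9,"601,309,283",Album 4
"""

# Columns named in the challenge description.
KEEP = ["Track", "Artist", "Release Date", "Track Score", "Spotify Streams"]


def load_clean(text):
    """Select columns, drop incomplete rows, convert types, and add a Year field."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {col: (raw.get(col) or "").strip() for col in KEEP}
        if any(value == "" for value in row.values()):
            continue  # skip rows with a missing value in any kept column
        # Assumption: stream counts are strings with thousands separators.
        row["Spotify Streams"] = int(row["Spotify Streams"].replace(",", ""))
        row["Track Score"] = float(row["Track Score"])
        # Assumption: dates are written as month/day/year.
        row["Release Date"] = datetime.strptime(row["Release Date"], "%m/%d/%Y").date()
        row["Year"] = row["Release Date"].year
        rows.append(row)
    return rows


def tracks_per_year(rows):
    """Count tracks per release year, sorted by year."""
    return dict(sorted(Counter(row["Year"] for row in rows).items()))


rows = load_clean(SAMPLE_CSV)
per_year = tracks_per_year(rows)
for year, count in per_year.items():
    print(year, "#" * count)  # crude text stand-in for the bar chart
```

The row for "Song C" is dropped because its stream count is empty, so the sample yields one track for 2023 and two for 2024.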

Three artists from the top 10 highest-streamed on Spotify also appear in the list of the top 10 highest mean track scores (minimum of 10 tracks).
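The comparison behind this result can be sketched in plain Python as a group-by followed by two top-k selections and a set intersection. The sketch is not the KNIME workflow. It assumes rows have already been cleaned into dictionaries with "Artist", "Track Score", and "Spotify Streams" keys, and it follows one reading of the objective: the minimum-track threshold applies only to the mean-score list. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def artist_stats(rows):
    """Group by artist: track count, mean track score, total streams."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Artist"]].append(row)
    return {
        artist: {
            "tracks": len(items),
            "mean_score": mean(r["Track Score"] for r in items),
            "total_streams": sum(r["Spotify Streams"] for r in items),
        }
        for artist, items in grouped.items()
    }


def top_k(stats, key, k=10, min_tracks=0):
    """Return the k artists with the highest value of `key`."""
    eligible = [a for a, s in stats.items() if s["tracks"] >= min_tracks]
    return sorted(eligible, key=lambda a: stats[a][key], reverse=True)[:k]


def in_both_lists(rows, k=10, min_tracks=10):
    """Artists in the top k by mean score (with min_tracks) and by total streams."""
    stats = artist_stats(rows)
    by_score = top_k(stats, "mean_score", k, min_tracks)
    by_streams = top_k(stats, "total_streams", k)
    return sorted(set(by_score) & set(by_streams))


# Hypothetical toy data, with small thresholds so the effect is visible.
rows = [
    {"Artist": "A", "Track Score": 90.0, "Spotify Streams": 500},
    {"Artist": "A", "Track Score": 80.0, "Spotify Streams": 400},
    {"Artist": "B", "Track Score": 95.0, "Spotify Streams": 100},
    {"Artist": "C", "Track Score": 60.0, "Spotify Streams": 300},
    {"Artist": "C", "Track Score": 70.0, "Spotify Streams": 350},
]
print(in_both_lists(rows, k=1, min_tracks=2))
```

In the toy data, artist B has the highest mean score but only one track, so the threshold excludes B from the score list and artist A ends up in both lists. On the real data the call would use the defaults `k=10` and `min_tracks=10`.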

Workflow nodes and their annotations:

  • CSV Reader: Read Spotify dataset
  • String to Number: Convert Spotify Streams column to number (Long)
  • Top k Row Filter: Top 10 artists with highest mean track score
  • Column Filter: Filter only required columns (Track, Artist, Release Date, Track Score, and Spotify Streams)
  • Missing Value: Remove rows with missing values
  • Sorter
  • Table View
  • Bar Chart
  • Joiner
  • Number to String: Convert year to string
  • Color Manager
  • GroupBy: Get number of tracks released per year
  • Top k Row Filter: Top 10 artists with highest total Spotify streams
  • Date&Time Part Extractor: Extract year from Release Date
  • Row Filter: Filter artists with a minimum of 10 tracks
  • GroupBy: Get number of tracks, mean track score, and total Spotify streams per artist
  • Number Rounder
  • String to Date&Time: Convert Release Date to date format
