Jan 22, 2022

Comparing my 2021 Spotify Wrapped to my Raw Streaming Data

How I ran basic analytics on my own Spotify streaming data and compared it to my 2021 "wrapped"

While scrolling through my Spotify wrapped for 2021, some things struck me as being a little off. I wanted to get my hands on the raw data to check things for myself. Unfortunately, the Spotify API (at the time of writing this) does not have the ability to pull a user's entire streaming history. You can, however, request an export of your data here. So that's what I did!

A few days after requesting the data from Spotify I received an email with a link to download the exported data. It contained a number of JSON data files, including every stream from the past year. That data looks like this:

[
  {
    "endTime" : "2020-12-14 19:55",
    "artistName" : "Alex Chilton",
    "trackName" : "Oogum Boogum",
    "msPlayed" : 206840
  },  
  {
    "endTime" : "2020-12-14 20:04",
    "artistName" : "The Pastels",
    "trackName" : "Nothing To Be Done",
    "msPlayed" : 232133
  }
]
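
For anyone who wants to follow along, here is a rough sketch of how the export could be loaded in Python. It assumes the history files match the pattern `StreamingHistory*.json`; the function name `load_streams` is made up for this example, and this is not necessarily how my own code does it.

```python
import json
from pathlib import Path


def load_streams(export_dir):
    """Read every streaming-history file in export_dir into one list of dicts.

    Assumption: the history files match "StreamingHistory*.json".
    Adjust the pattern if your export is laid out differently.
    """
    streams = []
    for path in sorted(Path(export_dir).glob("StreamingHistory*.json")):
        with open(path, encoding="utf-8") as f:
            # Each file holds a JSON array of stream records
            streams.extend(json.load(f))
    return streams
```

Calling `load_streams` with the folder that holds the unzipped export should return one flat list with an entry for each stream, shaped like the records above.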

So I started to play with the data a bit to see how it compares to what Spotify included in my "wrapped".

Top 10 Artists by Stream Count

Below is a table showing my calculations and a screenshot from my "wrapped".

Artist Name | # of Streams
The Murlocs | 126
Shannon & The Clams | 114
Devendra Banhart | 82
Tennis | 70
Daft Punk | 69
Alan Jackson | 67
The Beatles | 61
BRONCHO | 58
Khruangbin | 58
King Gizzard & The Lizard Wizard | 56

[Wrapped screenshot: Spotify Wrapped Top Artists]
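
A stream count per artist like the one in the table can be sketched with `collections.Counter`. This version assumes every record in the export counts as one stream, no matter how short `msPlayed` is. Spotify may well apply a minimum play time, so treat that as an assumption. The function name and the sample records are made up for illustration.

```python
from collections import Counter


def top_artists_by_streams(streams, n=10):
    """Return the n artists with the most records as (artist, count) pairs."""
    counts = Counter(s["artistName"] for s in streams)
    return counts.most_common(n)


# Tiny made-up sample in the same shape as the export
sample = [
    {"artistName": "The Murlocs", "trackName": "Francesca", "msPlayed": 200000},
    {"artistName": "The Murlocs", "trackName": "Rolling On", "msPlayed": 210000},
    {"artistName": "Tennis", "trackName": "Example Track", "msPlayed": 180000},
]
print(top_artists_by_streams(sample, n=2))
```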

If you compare the table data to the screenshot, you can see that Shannon & The Clams and Daft Punk are missing from the official "wrapped". What could be the reason for this? Maybe Top Artists for "wrapped" are calculated by listen time rather than by number of streams.

Top 10 Artists by Play Time

Artist Name | Minutes Streamed
Last Podcast On The Left | 753
The Murlocs | 398
Shannon & The Clams | 295
Devendra Banhart | 253
Tennis | 236
Khruangbin | 222
Alan Jackson | 214
King Gizzard & The Lizard Wizard | 194
The Beatles | 175
Daft Punk | 174

Even ignoring the podcast in the first row of the table, my raw streaming data still yields different results from "wrapped". If the wrapped rankings were based on time spent streaming an artist, then Khruangbin would be in my top 5 artists, but they are not.
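
Ranking by play time comes down to summing `msPlayed` per artist and converting to minutes. Here is a minimal sketch. The `exclude` parameter for dropping podcasts by name is a made-up convenience, and rounding down to whole minutes is an assumption.

```python
from collections import defaultdict


def top_artists_by_minutes(streams, n=10, exclude=()):
    """Return the n artists with the most play time as (artist, minutes) pairs."""
    ms_by_artist = defaultdict(int)
    for s in streams:
        if s["artistName"] in exclude:
            continue  # e.g. skip podcasts, which share the artistName field
        ms_by_artist[s["artistName"]] += s["msPlayed"]
    # Rank on milliseconds so rounding cannot reorder close results
    ranked = sorted(ms_by_artist.items(), key=lambda kv: kv[1], reverse=True)
    # 60,000 ms per minute; whole minutes, rounded down (an assumption)
    return [(artist, ms // 60_000) for artist, ms in ranked[:n]]


# The two records from the export sample shown earlier
sample = [
    {"artistName": "Alex Chilton", "trackName": "Oogum Boogum", "msPlayed": 206840},
    {"artistName": "The Pastels", "trackName": "Nothing To Be Done", "msPlayed": 232133},
]
print(top_artists_by_minutes(sample))
```

Passing something like `exclude={"Last Podcast On The Left"}` would leave only music in the ranking.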

Top 10 Songs by Stream Count

Here's the data breakdown for top songs, with the corresponding screenshot from my Spotify Wrapped:

Track Name | Artist Name | Stream Count
Brother Father Mother Sister | Tim Maia | 19
Francesca | The Murlocs | 19
Rolling On | The Murlocs | 19
The Boy | Shannon & The Clams | 17
Get in My Car | BRONCHO | 16
My Lady's On Fire | Ty Segall | 15
I Want To See The Bright Lights Tonight | Richard & Linda Thompson | 15
Nighttime in the Switching Yard - 2007 Remaster | Warren Zevon | 15
Chnam oun Dop-Pram Muy (I'm 16) | Ros Serey Sothea | 15
Turtles Have Short Legs | CAN | 14

[Wrapped screenshot: Spotify Wrapped Top Songs]

Here we can see that "The Boy" by Shannon & The Clams is missing from the screenshot, even though according to the raw data I streamed it 17 times. Also, there appears to be a tie between the top 3 songs (at 19 streams each), which does match up well with the official "wrapped".

There also appears to be a tie (at 15 streams) for Chnam oun Dop-Pram Muy (I'm 16), My Lady's On Fire and Nighttime in the Switching Yard. Apparently, Spotify chose Nighttime in the Switching Yard to win that tie.
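
Ties like these are easy to surface from the raw data. The sketch below counts streams per (track, artist) pair and then groups songs that share a count. Keying on both fields is my assumption, to keep same-titled tracks by different artists apart. The function names and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def song_counts(streams):
    """Count records per (trackName, artistName) pair."""
    return Counter((s["trackName"], s["artistName"]) for s in streams)


def tied_songs(counts):
    """Map each stream count shared by two or more songs to those songs."""
    by_count = defaultdict(list)
    for song, count in counts.items():
        by_count[count].append(song)
    return {c: sorted(songs) for c, songs in by_count.items() if len(songs) > 1}


# Made-up sample: two songs tied at 2 streams, one song with 1 stream
sample = [
    {"trackName": "Francesca", "artistName": "The Murlocs"},
    {"trackName": "Rolling On", "artistName": "The Murlocs"},
    {"trackName": "Francesca", "artistName": "The Murlocs"},
    {"trackName": "Rolling On", "artistName": "The Murlocs"},
    {"trackName": "The Boy", "artistName": "Shannon & The Clams"},
]
print(tied_songs(song_counts(sample)))
```

How Spotify breaks a tie is not something the export reveals, so this only shows where the ties are.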

Top Song

The raw data shows that I listened to Brother Father Mother Sister by Tim Maia 19 times, while the wrapped screenshot shows 18 times. My best guess is that I listened to the song once between Spotify creating my "wrapped" and the raw data export.
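
One way to test that guess is to count only the streams of a song that ended before some cutoff date. The sketch below parses `endTime` in the `YYYY-MM-DD HH:MM` format shown in the export sample. The cutoff used here is purely hypothetical, since I don't know when Spotify actually stops counting for "wrapped".

```python
from datetime import datetime


def streams_before(streams, track, artist, cutoff):
    """Count records for one song whose endTime is earlier than cutoff."""
    return sum(
        1
        for s in streams
        if s["trackName"] == track
        and s["artistName"] == artist
        and datetime.strptime(s["endTime"], "%Y-%m-%d %H:%M") < cutoff
    )


# Made-up records: one stream before and one after the hypothetical cutoff
sample = [
    {"endTime": "2021-06-01 10:00", "artistName": "Tim Maia",
     "trackName": "Brother Father Mother Sister", "msPlayed": 200000},
    {"endTime": "2021-12-20 18:30", "artistName": "Tim Maia",
     "trackName": "Brother Father Mother Sister", "msPlayed": 200000},
]
cutoff = datetime(2021, 11, 1)  # hypothetical date, not a confirmed Spotify cutoff
print(streams_before(sample, "Brother Father Mother Sister", "Tim Maia", cutoff))
```

If the count before a plausible cutoff comes out one lower than the full-year count, that would support the explanation above.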

"Wrapping Up"

Overall - I think that Spotify Wrapped was fairly accurate when compared to the raw data, but I don't understand why certain artists and songs are just not represented in my "wrapped" rollup. I clearly listened to Shannon & The Clams and Daft Punk a lot last year, so what's the deal? Why does Spotify hate Shannon & The Clams and Daft Punk so much?

Possible explanations:

If you're interested, you can view the source code and Spotify data on GitHub