This study uses R and statistical techniques to analyze and visualize data collected across more than 100,000 Spotify songs and their attributes, with a focus on the danceability index.
Although music is an art of emotional and poetic expression, it is also inherently mathematical: a sequence of tones and/or words that follows certain patterns and is accompanied by a certain harmony and rhythm is what people call music. The dataset, publicly available on Kaggle, includes many attributes of songs, many of which are indexes created by Spotify for its research and analytics. The following study first uses hypothesis tests and confidence intervals to understand the song data and how different categorical variables relate to each other or affect quantitative variables.
The focus then shifts to correlating song attributes, with the ultimate goal of making predictions about song popularity and danceability. This study makes four important findings: F# songs are probably more danceable than C songs, decade and key seem to be related, explicit songs are probably more danceable than clean songs, and the average danceability for a given year can be predicted.
1. Using tests and confidence intervals to understand the data
Before embarking on tests and confidence intervals to explore the song data, it is important to identify and describe the key variables on which this study focuses. The study uses one binary variable: explicitness (songs can be “explicit” or “clean”). Categorical variables of interest include decade (1920s to 2010s, with 2020–21 included) and key (0 = C, 1 = C#, … 11 = B). Binary and categorical variable tables are shown below:
Most of the analyses in this study include some of the variables above, but among the quantitative variables, danceability (a variable of particular interest) remains the focus throughout. Apart from a left skew and spikes at low values in loudness, the distributions above appear relatively symmetric with no spikes.
1.1 Key and danceability
At first, these two variables may seem unrelated. Key describes the group of pitches that a song follows, while danceability describes how easy it is to dance to a given song.
1.2 Decade and key
This general upward trend motivates a closer look at the distribution of song keys over past decades. Below is a diagram of the distribution of song keys over the past ten decades.
Some change in the distribution is evident from the stacked histogram, which suggests running a chi-square test of independence before drawing conclusions about any possible relationship between the two categorical variables of interest. In this chi-square test of independence, a very low p-value gives strong evidence to reject the null hypothesis of independence, which means that there is a relationship between key and decade. This probably means that the keys artists and producers use for songs have changed over the years.
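The chi-square test of independence described above can be sketched in R as follows. This is a minimal illustration on synthetic data, not the study's actual code: the data frame `songs`, its columns `decade` and `key`, and the helper `draw_key` are assumptions made for the example, and the simulated key distribution is deliberately made to depend on the decade so that there is something to detect.

```r
# Synthetic stand-in data (an assumption); the real analysis would use the
# Kaggle columns instead of simulated values.
set.seed(42)
n <- 2000
decade <- sample(seq(1920, 2010, by = 10), n, replace = TRUE)

# Hypothetical helper: make the key depend on the decade so the example has a
# signal. Later decades favour key 1 (C#), earlier decades favour key 0 (C).
draw_key <- function(d) {
  w <- rep(1, 12)
  if (d >= 1980) w[2] <- 4 else w[1] <- 4
  sample(0:11, 1, prob = w)
}
key <- vapply(decade, draw_key, integer(1))

songs <- data.frame(
  decade = factor(decade),
  key    = factor(key, levels = 0:11)
)

# Contingency table of counts (decades in rows, keys in columns),
# followed by the chi-square test of independence.
counts <- table(songs$decade, songs$key)
result <- chisq.test(counts)

print(result$p.value)        # a small value is evidence against independence
print(min(result$expected))  # expected counts should not be too small
```

Checking `result$expected` is a common sanity step, since the chi-square approximation is usually considered unreliable when expected cell counts are very small.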
1.3 Explicitness and danceability
Explicitness is a binary variable that can potentially affect danceability. The graph below compares the danceability of songs depending on whether they are clean or explicit.
A low p-value indicates that the null hypothesis can be rejected, which means that the true difference in average danceability between explicit songs and clean songs is greater than 0. One can also be 99% confident that the true difference in average danceability of explicit songs minus clean songs is greater than 0.1377. The higher danceability of explicit songs may be attributed to the fact that hip-hop and rap songs, which tend to be explicit more often than other genres, contain beats and rhythms that, according to the Spotify index, can contribute to a more danceable song.
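A one-sided comparison of this kind can be sketched in R as shown below. The source does not state which variant of the test was used, so this example assumes a Welch two-sample t-test (the default in R's `t.test`). The vectors `explicit_dance` and `clean_dance` are synthetic stand-ins, so the numbers they produce are not the study's results.

```r
# Synthetic danceability scores (an assumption); real values would come from
# the dataset, split by the explicit/clean flag.
set.seed(1)
explicit_dance <- rnorm(300, mean = 0.68, sd = 0.15)
clean_dance    <- rnorm(700, mean = 0.53, sd = 0.17)

# One-sided Welch t-test of H0: mean(explicit) - mean(clean) <= 0
# against H1: the difference is greater than 0, at the 99% level.
test <- t.test(explicit_dance, clean_dance,
               alternative = "greater", conf.level = 0.99)

# With alternative = "greater" the interval is one-sided: (lower_bound, Inf).
lower_bound <- test$conf.int[1]

print(test$p.value)
print(lower_bound)
```

The lower end of the one-sided interval plays the role of the bound quoted in the text: one can be 99% confident that the true difference exceeds it.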
2. Correlating song attributes and predicting danceability
As mentioned above, artists and producers are probably becoming more knowledgeable about the industry and about the kinds of songs, beats, rhythms, melodies, and so on that are required to create a song people like. Assuming that danceability is one of the traits producers want to improve in their music, the following analysis is appropriate.
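One way to predict average danceability for a given year is a simple linear regression on yearly means. The source does not say which model was used, so the sketch below is an assumption, and the data frame `songs` with columns `year` and `danceability` is simulated for illustration only.

```r
# Simulated songs with a mild upward trend in danceability (an assumption).
set.seed(7)
songs <- data.frame(year = sample(1921:2020, 5000, replace = TRUE))
songs$danceability <- 0.45 + 0.001 * (songs$year - 1921) +
  rnorm(5000, sd = 0.15)

# Average danceability per year.
yearly <- aggregate(danceability ~ year, data = songs, FUN = mean)

# Simple linear regression of yearly mean danceability on year.
fit <- lm(danceability ~ year, data = yearly)

# Predicted mean danceability for a new year, with a 95% prediction interval.
pred <- predict(fit, newdata = data.frame(year = 2021),
                interval = "prediction")

print(coef(fit))
print(pred)
```

Fitting on yearly means rather than on individual songs matches the stated goal of predicting the average for a year, although it discards how many songs each year contributes.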
Aside from building an understanding of the underlying data about Spotify songs, this study made four important findings: F# songs are probably more danceable than C songs, decade and key seem to be related, explicit songs are probably more danceable than clean songs, and the average danceability for a given year can be predicted. These findings and trends can be generalized to all Spotify songs and thus apply to current artists and producers in the industry.