library(tidyverse)
spotify <- read_csv("~/Downloads/spotify.csv")
Spotify Song Characteristics
Introduction
In this project, I will explore a dataset of song characteristics from Spotify.
Section: Set-up
- I will be using the “Spotify” data set. The data set has 200 observations covering 155 unique songs, collected at four points during 2021, along with the characteristics of each song. The original data was collected by two St. Olaf students for a project, using data from Spotify, in order to understand why certain songs are popular.
spotify |>
  distinct(title) |>
  count()
# A tibble: 1 × 1
      n
  <int>
1   155
Section A: Two-means
- Explore a quantitative response variable and binary categorical explanatory variable.
I chose valence as the numeric response variable and instrumentalness as the binary categorical explanatory variable. Valence describes the musical positiveness of a track. The more positive a track is, the closer the value is to 1.0. Instrumentalness classifies whether a song is instrumental or not.
Research question: Is there a difference in the valence score of songs classified as instrumental and songs not classified as instrumental? In other words, does instrumentalness affect a song’s valence score?
- Explore and describe the relationship between the two variables with appropriate summary statistics.
mosaic::favstats(valence ~ instrumentalness, data = spotify)
Registered S3 method overwritten by 'mosaic':
  method                           from
  fortify.SpatialPolygonsDataFrame ggplot2
  instrumentalness    min      Q1 median      Q3   max      mean        sd   n missing
1     instrumental 0.1000 0.28425 0.4225 0.57175 0.942 0.4495270 0.2273844  74       0
2 not instrumental 0.0628 0.35700 0.4965 0.69800 0.934 0.5194762 0.2282755 126       0
fitlineA <- lm(valence ~ instrumentalness, data = spotify)
# resid_panel() comes from the ggResidpanel package, which is not loaded above
ggResidpanel::resid_panel(fitlineA)
spotify |>
ggplot(aes(y = valence, x = instrumentalness, fill = instrumentalness)) +
geom_boxplot(width = 0.25) +
geom_jitter(width = 0.05, alpha = 0.5) +
theme(legend.position = "none") +
labs(title = "Valence Score - Instrumental vs. Not Instrumental",
x = "Instrumentalness",
y = "Valence Score") + coord_flip()
- Instrumental songs have a mean valence score of 0.450, and non-instrumental songs have a mean valence score of 0.519. Non-instrumental songs have a slightly higher mean valence score than instrumental songs.
- Perform the appropriate hypothesis test
Hypothesis Test
Null hypothesis: \(H_0: \mu_{i} - \mu_{ni} = 0\). There is no difference in the mean valence score of instrumental and not instrumental songs. The instrumentalness doesn’t affect the songs’ valence score.
Alternative hypothesis: \(H_A: \mu_{i} - \mu_{ni} \ne 0\). There is a difference in the mean valence score of instrumental and not instrumental songs. The instrumentalness does affect the songs’ valence score.
t.test(valence ~ instrumentalness, data = spotify)
Welch Two Sample t-test
data: valence by instrumentalness
t = -2.0974, df = 153.57, p-value = 0.0376
alternative hypothesis: true difference in means between group instrumental and group not instrumental is not equal to 0
95 percent confidence interval:
-0.13583447 -0.00406386
sample estimates:
mean in group instrumental mean in group not instrumental
0.4495270 0.5194762
From the t-test, we have a test statistic of -2.097 and a p-value of 0.0376.
The 95% confidence interval runs from -0.136 to -0.004. We are 95% confident that the mean valence score for instrumental songs is between 0.004 and 0.136 points lower than for non-instrumental songs. The interval does not contain 0, so the difference is statistically significant at the 5% level.
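The reported statistic, degrees of freedom, p-value, and interval can be reproduced by hand from the group summaries. The following is a minimal sketch, assuming the means, standard deviations, and sample sizes printed by favstats() above; the variable names are illustrative and not part of the original analysis.

```r
# Summary statistics copied from the favstats() output (assumed inputs)
mean_i  <- 0.4495270; sd_i  <- 0.2273844; n_i  <- 74    # instrumental
mean_ni <- 0.5194762; sd_ni <- 0.2282755; n_ni <- 126   # not instrumental

# Variance of each sample mean
v_i  <- sd_i^2 / n_i
v_ni <- sd_ni^2 / n_ni

# Welch t statistic: difference in means over its standard error
diff_means <- mean_i - mean_ni
se_diff    <- sqrt(v_i + v_ni)
t_stat     <- diff_means / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_i + v_ni)^2 / (v_i^2 / (n_i - 1) + v_ni^2 / (n_ni - 1))

# Two-sided p-value and 95% confidence interval
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci      <- diff_means + c(-1, 1) * qt(0.975, df = df_welch) * se_diff

round(c(t = t_stat, df = df_welch, p = p_value), 4)
round(ci, 4)
```

Because the inputs are rounded summary values, the results should agree with the t.test() output only up to rounding.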
- Check assumptions/conditions for the test.
There are two conditions for the test: independence and normality.
Independent: Although the method of collecting the sample is not mentioned, we can safely assume that the observations are independent both within and between groups. Knowing one song’s valence score should not impact another song’s valence score.
Normality: Both groups have sample sizes greater than 30 (74 and 126), and there appear to be no extreme outliers, so the normality condition is met.
- Statistical conclusion in context
- From the t-test, we have a test statistic of -2.097 and a p-value of 0.0376. Because the p-value is less than 0.05, we reject the null hypothesis in favor of the alternative and conclude that there is statistically significant evidence of a difference in the mean valence score of instrumental and not instrumental songs. In this sample, instrumentalness is associated with a song’s valence score. If the null hypothesis were true, a difference in sample means as or more extreme than the one we observed would occur only about 3.76% of the time. The test statistic agrees: it is more than two standard errors away from the null value of 0.
- Provide an interpretation of the confidence interval in context
- The 95% confidence interval runs from -0.136 to -0.004. We are 95% confident that the mean valence score for instrumental songs is between 0.004 and 0.136 points lower than for non-instrumental songs. The interval does not contain 0, so the difference is statistically significant at the 5% level. Because valence describes the musical positiveness of a track, we can also say that instrumental songs are, on average, less positive than non-instrumental songs by 0.004 to 0.136 points.
Section C: Two Proportions
- Explore two binary categorical variables in the dataset.
I chose top10 as the binary categorical response variable and mode as the binary categorical explanatory variable. top10 tells whether a song is ranked in the top 10 or not. mode indicates whether a track is in a major or minor key.
Research question: Is there a real difference in the proportion of top10 songs with major key compared to those with minor key?
- Explore and describe the relationship between the two variables with appropriate summary statistics.
# Table of counts
table(spotify$mode, spotify$top10) |>
addmargins()
no yes Sum
major 105 22 127
minor 55 18 73
Sum 160 40 200
# Table of proportions
table(spotify$mode, spotify$top10) |>
proportions(margin = 1) |>
round(3)
no yes
major 0.827 0.173
minor 0.753 0.247
# Bar graph
spotify |>
ggplot(aes(x = mode, fill = top10)) +
geom_bar(position = "fill") +
labs(title = "Ranked in top 10 - Major vs. Minor",
x = "Mode",
y = "Proportion",
fill = "top10")
- From the proportion table, we see that the proportion of songs ranked in the top 10 is lower among major-key songs (0.173) than among minor-key songs (0.247).
- Perform the appropriate hypothesis test
Hypothesis Test
Null hypothesis: \(H_0: p_{major} - p_{minor} = 0\). There is no difference in the proportion of top10 songs with major key compared to those with minor key.
Alternative: \(H_A: p_{major} - p_{minor} \ne 0\). There is a difference in the proportion of top10 songs with major key compared to those with minor key.
prop.test(x = c(22, 18), n = c(127, 73), conf.level = 0.95,
alternative = "two.sided")
2-sample test for equality of proportions with continuity correction
data: c(22, 18) out of c(127, 73)
X-squared = 1.1339, df = 1, p-value = 0.2869
alternative hypothesis: two.sided
95 percent confidence interval:
-0.20291093 0.05621694
sample estimates:
prop 1 prop 2
0.1732283 0.2465753
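sqrt(1.1339)
[1] 1.064847
- The test gives an X-squared of 1.1339. Taking the square root gives a z-score of about 1.065 in absolute value; its sign is negative because the sample proportion is lower for major than for minor. Because the test was run with a continuity correction, this is the continuity-corrected z-score. The p-value is 0.2869. The 95% confidence interval runs from -0.203 to 0.056.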
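To see where these numbers come from, the two-proportion test can be rebuilt from the counts alone. This is a sketch under stated assumptions: it uses the counts passed to prop.test() above, applies a continuity correction of half the sum of the reciprocal sample sizes (which reproduces the output here because the observed difference is larger than the correction), and uses illustrative variable names.

```r
# Counts taken from the prop.test() call (top-10 songs out of group totals)
x <- c(major = 22, minor = 18)
n <- c(major = 127, minor = 73)

p_hat  <- x / n
diff_p <- p_hat[["major"]] - p_hat[["minor"]]

# Hypothesis test: pooled proportion under H0
p_pool  <- sum(x) / sum(n)
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# z without and with the continuity correction
cc          <- 0.5 * sum(1 / n)
z_plain     <- diff_p / se_pool
z_corrected <- sign(diff_p) * (abs(diff_p) - cc) / se_pool
p_value     <- 2 * pnorm(-abs(z_corrected))

# Confidence interval: unpooled standard error, widened by the correction
se_unpooled <- sqrt(sum(p_hat * (1 - p_hat) / n))
ci <- diff_p + c(-1, 1) * (qnorm(0.975) * se_unpooled + cc)

round(c(z_plain = z_plain, z_corrected = z_corrected,
        x_squared = z_corrected^2, p = p_value), 4)
round(ci, 4)
```

The uncorrected z-score is about -1.25, somewhat larger in magnitude than the corrected value; either way the conclusion at the 5% level is the same.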
- Check assumptions/conditions for the test
There are two conditions for the test: independence and normality (success/failure).
Independent: Although the method of collecting the sample is not mentioned, we can safely assume that the observations are independent both within and between groups. Knowing one song’s rank should not impact another song’s rank.
Normal: We check for success (ranked in top 10) and failure (not in top 10) in each explanatory group. In the major group, there are 22 successes and 105 failures, both greater than 10. In the minor group, there are 18 successes and 55 failures, also greater than 10. Because there are at least 10 successes and failures in major and minor groups, the condition is met.
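The success/failure check can also be written as a short computation. The sketch below assumes the counts from the table of counts above. In addition to the observed counts, it checks the expected counts under the pooled proportion, which some textbooks use for the hypothesis test; that second check is an addition here, not part of the original analysis.

```r
# Observed successes (top 10) and failures (not top 10) by mode
counts <- matrix(c(22, 105,
                   18,  55),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(mode = c("major", "minor"),
                                 top10 = c("yes", "no")))

# Condition on observed counts: every cell at least 10
observed_ok <- all(counts >= 10)

# Alternative check: expected counts under the pooled proportion
p_pool   <- sum(counts[, "yes"]) / sum(counts)
expected <- outer(rowSums(counts), c(yes = p_pool, no = 1 - p_pool))
pooled_ok <- all(expected >= 10)

expected
c(observed_ok = observed_ok, pooled_ok = pooled_ok)
```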
- Statistical conclusion in context
- We have a z-score of about 1.065 in absolute value and a p-value of 0.2869. Because the z-score is less than two standard errors from the null value and the p-value is greater than 0.05, we fail to reject the null hypothesis: there is not significant evidence of a difference in the proportion of top10 songs between major-key and minor-key songs. If the null hypothesis were true, there would be a 28.69% chance of observing a difference as or more extreme than ours.
- Interpretation of confidence interval
- The 95% confidence interval is between -0.203 and 0.056. We are 95% confident that the proportion of top 10 songs with major keys is between 0.203 lower and 0.056 higher than the proportion of top 10 songs with minor keys. Because 0 is included in the interval, it indicates 0 is a plausible value for the difference. As with the hypothesis test, we conclude the difference is not significant.
Section D: Categorical Variables
- Identify a question that can be answered with two categorical variables in the dataset. At least one of these variables will have more than two groups. Clearly state this question for a general audience, and identify the explanatory and response variable.
I chose trend as the categorical response variable and mode as the categorical explanatory variable. The variable trend describes how a song moved in the rankings since the previous week (down, up, same, or new entry). The variable mode indicates whether a track is in a major or minor key.
Research question: Is there an association between mode and trend?
- Explore and describe the relationship between the two variables with appropriate summary statistics. Provide one plot and one sentence about the relationship (supported by summary stats).
#Table of counts
table(spotify$mode, spotify$trend) |>
addmargins()
MOVE_DOWN MOVE_UP NEW_ENTRY SAME_POSITION Sum
major 56 43 4 24 127
minor 32 22 1 18 73
Sum 88 65 5 42 200
#Table of proportions
table(spotify$mode, spotify$trend) %>%
proportions(margin = 1) %>%
round(3)
MOVE_DOWN MOVE_UP NEW_ENTRY SAME_POSITION
major 0.441 0.339 0.031 0.189
minor 0.438 0.301 0.014 0.247
spotify |>
ggplot(aes(x = mode, fill = trend)) +
geom_bar(position = "fill")+
labs(x = "Mode",
y = "Proportion",
fill = "Trend",
title = "Trend in rankings from previous week - Major vs. Minor ")
- From the EDA, it seems that a smaller share of major-key songs (0.189) than minor-key songs (0.247) stayed in the same ranking position as the previous week.
** The table of counts has a column for “NEW_ENTRY” songs. These 5 songs have no ranking from the previous week to compare against, so I decided to remove them. Below are the new table of counts, table of proportions, and bar graph.
spotify_new <- spotify |>
  filter(trend != "NEW_ENTRY")
#Table of counts
table(spotify_new$mode, spotify_new$trend) |>
addmargins()
MOVE_DOWN MOVE_UP SAME_POSITION Sum
major 56 43 24 123
minor 32 22 18 72
Sum 88 65 42 195
#Table of proportions
table(spotify_new$mode, spotify_new$trend) %>%
proportions(margin = 1) %>%
round(3)
MOVE_DOWN MOVE_UP SAME_POSITION
major 0.455 0.350 0.195
minor 0.444 0.306 0.250
spotify_new |>
ggplot(aes(x = mode, fill = trend)) +
geom_bar(position = "fill")+
labs(x = "Mode",
y = "Proportion",
fill = "Trend",
title = "Trend in rankings from previous week - Major vs. Minor ")
- Perform the appropriate hypothesis test
Hypothesis Test
Null hypothesis: \(H_0:\) There is no association between mode and trend.
Alternative: \(H_A:\) There is an association between mode and trend.
table(spotify_new$mode, spotify_new$trend) %>%
chisq.test()
Pearson's Chi-squared test
data: .
X-squared = 0.91107, df = 2, p-value = 0.6341
- Our chi-square statistic from our data is 0.911. The p-value from the chi-square distribution with 2 degrees of freedom is 0.6341.
- Check assumptions/conditions for the test.
There are two conditions for the test: independence and expected counts greater than 5.
Independent: Although the method of collecting the sample is not mentioned, we can safely assume that the observations are independent both within and between groups.
Expected Counts: The condition is met since the expected count is over 5 in every cell; the smallest is about 15.5.
table(spotify_new$mode, spotify_new$trend) %>%
chisq.test() %>%
.$expected
MOVE_DOWN MOVE_UP SAME_POSITION
major 55.50769 41 26.49231
minor 32.49231 24 15.50769
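The expected counts and the chi-square statistic can be recomputed directly from the table of counts. This is a minimal sketch, assuming the counts shown above with the NEW_ENTRY column removed; the object names are illustrative.

```r
# Observed counts by mode and trend (NEW_ENTRY removed)
observed <- matrix(c(56, 43, 24,
                     32, 22, 18),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(mode = c("major", "minor"),
                                   trend = c("MOVE_DOWN", "MOVE_UP",
                                             "SAME_POSITION")))

# Expected count in each cell: row total * column total / grand total
n_total  <- sum(observed)
expected <- outer(rowSums(observed), colSums(observed)) / n_total

# Pearson chi-square statistic, degrees of freedom, and p-value
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(expected, 5)
round(c(chi_sq = chi_sq, df = df, p = p_value), 4)
```

Most of the statistic comes from the SAME_POSITION column, where the observed and expected counts differ by about 2.5 songs in each row.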
- Conclusion
- Our p-value is 0.6341, which is greater than 0.05. We therefore fail to reject the null hypothesis: we do not have significant evidence of an association between mode and trend, so the data are consistent with the two variables being independent. If the null hypothesis were true, there would be a 63.41% chance of observing a chi-square statistic as or more extreme than ours. We have no evidence that the mode of a song is related to how it moves in the rankings.
Section E: Conclusion and Figure Caption
Summary:
- Based on the analyses, we explored the relationships between song characteristics on Spotify such as mode, instrumentalness and valence score.
- First, we observed a statistically significant difference in the mean valence scores between instrumental and non-instrumental songs. Instrumental songs were found to be slightly less positive, with their mean valence score estimated to be between 0.004 and 0.136 points lower, based on a 95% confidence interval. This finding suggests that instrumentalness is measurably associated with a song’s positiveness.
- Second, when comparing the proportion of top 10 songs in major keys versus minor keys, we did not find significant evidence of a difference. The results indicated that any variation could be due to chance, and the 95% confidence interval (-0.203 to 0.056) includes 0, reinforcing that the difference is not statistically significant.
- Finally, we found no significant association between a song’s mode (major or minor) and its ranking trend, so the data are consistent with mode and trend being independent. The second and third analyses both use mode as the explanatory variable, and in both cases we fail to reject the null hypothesis, so we found no evidence that a song’s mode is related to its top 10 status or its ranking trend.
- While these conclusions are based on statistical evidence, it’s important to recognize potential limitations, such as how the songs in the dataset were sampled. Instead of taking the top 50 songs, one could randomly pick 50 songs across the four points in the year, which would better support the independence condition. We cannot generalize the conclusions to a population (all Spotify songs) or draw causal conclusions, because this is not a random sample and there is no random assignment of the explanatory variable.
spotify |>
ggplot(aes(y = valence, x = instrumentalness, fill = instrumentalness)) +
geom_boxplot(width = 0.25) +
geom_jitter(width = 0.05, alpha = 0.5) +
theme(legend.position = "none") +
labs(title = "Valence Score - Instrumental vs. Not Instrumental",
x = "Instrumentalness",
y = "Valence Score") + coord_flip()