Nhi Luong

Cool States Maps!

USA States Description

  • For the map assignment, I chose the states dataset from the usa package. It describes each US state with variables such as name, region, division, and area. I used area as the numerical variable for the first map and division as the categorical variable for the second map.

Load in the dataset

  • Let’s take a quick look at the first six states of the dataset.
# A tibble: 6 × 8
  name       abb   fips  region division              area   lat   long
  <chr>      <chr> <chr> <fct>  <fct>                <dbl> <dbl>  <dbl>
1 Alabama    AL    01    South  East South Central  50647.  32.7  -86.8
2 Alaska     AK    02    West   Pacific            571017.  63.4 -153. 
3 Arizona    AZ    04    West   Mountain           113653.  34.3 -112. 
4 Arkansas   AR    05    South  West South Central  52038.  34.9  -92.4
5 California CA    06    West   Pacific            155854.  37.2 -120. 
6 Colorado   CO    08    West   Mountain           103638.  39.0 -106. 
  • I loaded an sf mapping dataset of state geometries and performed a left join with the states dataset. I decided to exclude Alaska, Hawaii, and Puerto Rico.
Joining with `by = join_by(name)`
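  • The filter-then-join step described above can be sketched on toy data. This is only an illustration under assumptions: `shapes` is a hypothetical stand-in for the sf geometry table (the real one carries a geometry column), `shape_id` is an invented column, and the attribute rows are copied, rounded as printed, from the tibble above. Passing `by` explicitly also avoids the "Joining with" message.

```r
library(dplyr)

# Hypothetical stand-in for the geometry table (assumption: one row per shape)
shapes <- tibble(
  name = c("Alabama", "Alaska", "Arizona", "Hawaii", "Puerto Rico"),
  shape_id = 1:5
)

# Attribute rows copied from the printed tibble (areas rounded as shown)
attrs <- tibble(
  name = c("Alabama", "Alaska", "Arizona"),
  division = c("East South Central", "Pacific", "Mountain"),
  area = c(50647, 571017, 113653)
)

# Shapes that should not appear on the map
excluded <- c("Alaska", "Hawaii", "Puerto Rico")

# Drop the excluded shapes, then attach the attributes by state name
joined <- shapes |>
  filter(!name %in% excluded) |>
  left_join(attrs, by = join_by(name))

print(joined)
```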

Area of Each State (Static)

  • The following map shows the area of each state in square miles. For this static map, I used the sf mapping dataset.
states_sf |>
  ggplot() +
  geom_sf(aes(fill = area), colour = "white", linetype = 1) +
  theme_void() +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(title = "Area of US States",
       fill = "Square miles") +
  theme(plot.title = element_text(hjust = 0.5))

  • Insights: Among the mapped states, Texas has the largest area, at more than 250,000 square miles. California and Montana follow at roughly 150,000 square miles each (the table above lists California at about 155,854). States in the East, both north and south, are much smaller than states in the West.
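  • The ranking can be cross-checked with a few lines of base R. This sketch assumes base R's built-in `state.name` and `state.area` vectors rather than the usa package, so the exact figures will not match the map's legend (`state.area` is an older, separately sourced measurement); only the ordering is of interest here.

```r
# Base R ships state.name and state.area (area in square miles).
# Assumption: these are NOT the usa::states values, so exact numbers differ.
areas <- data.frame(name = state.name, area = state.area)

# Drop the states that are not drawn on the map
contiguous <- areas[!areas$name %in% c("Alaska", "Hawaii"), ]

# Three largest remaining states, biggest first
top3 <- head(contiguous[order(-contiguous$area), ], 3)
print(top3)
```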

Area of Each State (Interactive)

  • Here, I turned the static map into an interactive map where users can hover over each state to view the state name and its corresponding area.
area_sf <- states_sf |>
  mutate(labels = str_c(name, ": ", area, " sq mile"))

labels <- lapply(area_sf$labels, HTML)

pal <- colorNumeric(
  palette = c("darkseagreen1", "darkseagreen4"),
  domain = area_sf$area
)

leaflet(area_sf) |>
  setView(-96, 37.8, 4) |>
  addTiles() |>
  addPolygons(
    fillColor = ~pal(area),
    weight = 2,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7,
    highlightOptions = highlightOptions(
      weight = 5,
      color = "#666",
      dashArray = "",
      fillOpacity = 0.7,
      bringToFront = TRUE),
    label = labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto")) |>
  addLegend(pal = pal, values = ~area, opacity = 0.7, title = NULL,
    position = "bottomright")
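  • The hover labels above paste the raw area value into the text. One optional refinement, sketched here in base R, is to round the area and add thousands separators before building the label. The vectors `name`, `area`, `area_txt`, and `hover_labels` are illustrative names, and the values are copied from the printed tibble.

```r
# Areas as printed in the tibble above (the real column carries decimals)
name <- c("Alabama", "Alaska", "California")
area <- c(50647, 571017, 155854)

# Round to whole square miles and add thousands separators;
# trim = TRUE stops format() from padding values to a common width
area_txt <- format(round(area), big.mark = ",", trim = TRUE)

# Build one label per state
hover_labels <- paste0(name, ": ", area_txt, " sq mi")
print(hover_labels)
```

    The same expression could be used inside the `mutate()` call in place of the bare `area` column.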

State Division (Static)

  • The following map shows the Census Bureau division of each state. For this static map, I used the polygon mapping dataset.
states_polygon <- as_tibble(map_data("state")) |>
  select(region, group, order, lat, long)
  • I defined my own color palette for each division.
distinct_colors <- c(
  "#e41a1c",  
  "#377eb8",
  "#4daf4a",
  "#984ea3",  
  "#ff7f00", 
  "#ffff33", 
  "#a65628", 
  "#f781bf",
  "#66c2a5" 
)
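  • An unnamed palette is matched to divisions by position, so the pairing depends on the order of the factor levels. A possible alternative, sketched below, is to name each color with its division. The pairing of colors to divisions here is an arbitrary assumption for illustration, and `division_names` and `division_colors` are hypothetical names; the nine division names are the Census Bureau's.

```r
# The nine colors from the palette above
distinct_colors <- c(
  "#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00",
  "#ffff33", "#a65628", "#f781bf", "#66c2a5"
)

# Census Bureau division names (this color pairing is an arbitrary assumption)
division_names <- c(
  "New England", "Middle Atlantic", "East North Central",
  "West North Central", "South Atlantic", "East South Central",
  "West South Central", "Mountain", "Pacific"
)

# Named vector: each division now carries its own color
division_colors <- setNames(distinct_colors, division_names)
print(division_colors)
```

    A named vector like this can be passed to `scale_fill_manual(values = ...)`, which matches colors to factor levels by name.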
states |>
  mutate(name = str_to_lower(name)) |>
  select(-lat, -long) |>
  right_join(states_polygon, join_by(name == region)) |>
  ggplot(mapping = aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = division), color = "black", linewidth = 0.2) +
  theme_void() +
  coord_map() +
  scale_fill_manual(values = distinct_colors) +
  labs(title = "Census Bureau Division of the US",
       fill = NULL) +
  theme(plot.title = element_text(hjust = 0.5))

  • Insights: On this map, the Pacific division contains only three states (Alaska and Hawaii also belong to it but are excluded here), which ties with the Middle Atlantic division for the fewest states. The Mountain and South Atlantic divisions each contain eight states. One possible explanation for the differing number of states per division is the population of each state.
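  • The counts behind these observations can be tallied directly. This sketch assumes base R's built-in `state.name` and `state.division` vectors instead of the usa package; they cover the 50 states only, so the District of Columbia and Puerto Rico do not appear.

```r
# TRUE for states drawn on the map (Alaska and Hawaii are excluded)
mapped <- !state.name %in% c("Alaska", "Hawaii")

# Number of mapped states in each Census Bureau division, smallest first
counts <- sort(table(state.division[mapped]))
print(counts)
```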

State Division (Interactive)

  • Here, I turned the static State Division map into an interactive map so users can hover over each state to see its name and division.
division_sf <- states_sf |>
  mutate(labels = str_c("State: ", name,"<br>Division: ", division))

division_labels <- lapply(division_sf$labels, HTML)

pal_division <- colorFactor(
  palette = distinct_colors,
  domain = division_sf$division
)

leaflet(division_sf) |>
  setView(-96, 37.8, 4) |>
  addTiles() |>
  addPolygons(
    fillColor = ~pal_division(division),
    weight = 2,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7,
    highlightOptions = highlightOptions(
      weight = 5,
      color = "#666",
      dashArray = "",
      fillOpacity = 0.7,
      bringToFront = TRUE),
    label = division_labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto"))