Tracking the Flow of Bird Scooters Across DC

Bird Scooters

Background

Unless you’ve been living under a rock, you’ve seen the scooters flying around - taking businessmen to work, tourists to monuments, and local kids around the neighborhood. These dockless electric scooters can be unlocked from any smartphone, ridden for a low fee that depends on ride duration, and parked anywhere in the city. They are intended to be used as a last-mile transit tool, helping commuters get to and from their homes, offices, and local Metro/subway stops.

Bird, Lime, and Skip have emerged as the dominant forces in the DC market, and have ridden a roller coaster of hype and publicity, with decidedly mixed reactions.

Personally, I’m a huge fan. Scooters are fun, easy to ride, fast (they can hit ~15 mph on a good road), and cheap. The common standard on pricing seems to be a $1 flat fee, plus an additional $0.15/min of ridership. With those rates, a 10-minute ride, which might go one or two miles in the city, costs around $2.50.
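
As a quick sanity check on that figure, the pricing rule can be written as a tiny function. This is only a sketch using the rates quoted above; the name `ride_cost` is made up for illustration, and actual pricing may differ by company or change over time.

```python
def ride_cost(minutes, unlock_fee=1.00, per_minute=0.15):
    """Estimated ride cost in dollars: flat unlock fee plus a per-minute charge."""
    return round(unlock_fee + per_minute * minutes, 2)


# A 10-minute ride at the rates quoted above
print(ride_cost(10))  # 2.5
```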

My Idea

Something that I noticed after using the scooters for a few weeks was the flow of scooters around the city - migrating towards and concentrating in the downtown area in the morning, and then fanning back out to the more residential areas in the afternoon as people head home from work.

Having used the Capital Bikeshare API in the past to get counts and locations of bikes throughout the city, I began wondering about scooters - would it be possible to get access to the locations of scooters across the city? If so, could I map those scooters? If mappable, could I trace trends and changes throughout the day, to see if my hypothesis was correct?

Obtaining API Access

Unfortunately, none of the major scooter providers (Bird, Lime, Skip) have publicly accessible APIs. However… it seems that some people much smarter than me have been able to reverse-engineer the private Bird API used to populate the maps in their iOS and Android applications, and have captured the essential elements of their REST API.

Using the details found here, I was able to write a script using Python and the Requests library to take the following actions:

  1. “Log in” to the Bird API and receive an authentication token
  2. Retrieve a listing of all Bird scooters in the DC area with their location, unique ID, and battery level
  3. Write the listing to a CSV
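
For readers who want a feel for the shape of such a script, here is a rough sketch of the three steps. It is not my actual script and it does not reproduce the real Bird API: the URLs, the request payloads, the `token` and `birds` response fields, the auth header, and the CSV file naming are all hypothetical placeholders. The `session` argument is assumed to be a `requests.Session()` (or anything with compatible `post` and `get` methods).

```python
import csv
from datetime import datetime
from pathlib import Path

# Hypothetical placeholders: the real endpoints, headers, and payloads come
# from the reverse-engineered API notes and are not reproduced here.
LOGIN_URL = "https://example.invalid/login"
NEARBY_URL = "https://example.invalid/birds/nearby"
DC_CENTER = {"latitude": 38.9012, "longitude": -77.0322}


def get_token(session, email):
    """Step 1: 'log in' and return an authentication token."""
    response = session.post(LOGIN_URL, json={"email": email})
    response.raise_for_status()
    return response.json()["token"]  # assumed response field


def get_birds(session, token):
    """Step 2: fetch the scooters near DC as a list of dicts."""
    response = session.get(
        NEARBY_URL,
        params=DC_CENTER,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
    response.raise_for_status()
    return response.json()["birds"]  # assumed response field


def write_snapshot(birds, out_dir, now=None):
    """Step 3: write one timestamped CSV per run and return its path."""
    now = now or datetime.now()
    # Hypothetical naming scheme: birds_YYYY-MM-DD_HH-MM.csv
    out_path = Path(out_dir) / f"birds_{now:%Y-%m-%d_%H-%M}.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=["id", "latitude", "longitude", "battery_level"],
            extrasaction="ignore",  # drop any fields we are not tracking
        )
        writer.writeheader()
        writer.writerows(birds)
    return out_path
```

Writing one file per run, with the timestamp in the file name, keeps each point-in-time record separate, which makes it easy to pick out individual snapshots later.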

Once I got the code working and the output formatted properly, I set up a cron job to launch the script every five minutes.

Visualizing Bird Location Data

Now armed with oodles of data about the city’s Birds, I set about finding a way to map these scooters.

Leaflet Visualization

The first thing I tried was to visualize the data on top of a Leaflet map. To do so, I created a custom map centered around the heart of DC, passed through a day’s worth of Bird location information, selected a simple and uncluttered background, and added a heatmap on top.

Leaflet Map

Leaflet Code

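# Load leaflet and the leaflet.extras add-on, which provides addHeatmap()
library(leaflet)
library(leaflet.extras)

# Create the heatmap from the Bird locations in `tbl`
# (leaflet guesses the coordinate columns from names such as latitude/longitude)
m <- leaflet(data = tbl) %>%
  setView(lng = -77.0322, lat = 38.9012, zoom = 12) %>%
  addProviderTiles(providers$CartoDB.PositronNoLabels) %>%
  addHeatmap(blur = 20, max = 0.05, radius = 15)

# View the map
m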

Switching to ggmap

While I really liked the Leaflet map’s appearance and interactivity, I realized that I wasn’t going to be able to animate it across a time series. With this in mind, I went looking for other ways of mapping my data, and soon stumbled upon the ggmap library. ggmap integrates directly with other ggplot2 tools, so I knew I would be able to create heatmaps and should be able to find a way to animate the plots.

Workflow

  • Import a specific CSV (one of the point-in-time records being captured every five minutes)
  • Read that CSV into a dataframe
  • Create a Google Maps basemap, with minimal styling
  • Add a binned layer on top of the map, counting the frequency of items within those latitude and longitude bins
  • Apply a color gradient from yellow to red to visualize the heat of each bin
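
To make the binning step concrete, here is a small, dependency-free Python sketch of the idea: lay an evenly spaced grid over the data and count the points in each cell. It illustrates the concept only and is not ggplot2's implementation (which may handle bin edges differently); the function name `bin_counts` is made up for this example.

```python
from collections import Counter


def bin_counts(points, bins=40):
    """Count points per cell of a bins x bins grid spanning the data's extent.

    points: iterable of (longitude, latitude) pairs.
    Returns a Counter mapping (column, row) -> number of points in that cell.
    """
    points = list(points)
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    lon_min, lat_min = min(lons), min(lats)
    # Fall back to a step of 1.0 if every point shares the same coordinate
    lon_step = (max(lons) - lon_min) / bins or 1.0
    lat_step = (max(lats) - lat_min) / bins or 1.0

    counts = Counter()
    for lon, lat in points:
        # Clamp so points on the upper edge land in the last bin
        col = min(int((lon - lon_min) / lon_step), bins - 1)
        row = min(int((lat - lat_min) / lat_step), bins - 1)
        counts[(col, row)] += 1
    return counts
```

The count in each cell is what then gets mapped onto the yellow-to-red gradient: sparsely populated cells are drawn in yellow, the densest ones in red.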

Point-in-Time Map

Heatmap of Birds in DC using ggmap

ggmap Code

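library(ggmap)  # also attaches ggplot2
# Note: recent ggmap versions need a Google Maps API key, registered via register_google()

# Get streamlined Google map of DC in black and white
g <- ggmap(get_googlemap(
  center = c(-77.04136, 38.90573),  # c(longitude, latitude)
  zoom = 13,
  maptype = "roadmap",
  color = "bw",
  style = c(feature = "all", element = "labels", visibility = "off")
))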

# Creates a map plot for a given set of Bird locations
# Input: dataframe of location data, and a time string whose characters 1-2
#        are the hour and 4-5 the minute (used to build the subtitle)
# Returns: the base ggmap with a binned heatmap layer on top
createPlot <- function(data, title) {
  time <- paste("Time: ", substring(title, 1, 2), ":", substring(title, 4, 5), sep = "")
  g +
    stat_bin2d(
      aes(x = longitude, y = latitude),
      size = 1,
      bins = 40,
      alpha = 0.5,
      data = data
    ) +
    scale_fill_continuous(low="yellow", high="red") +
    labs(
      title = "Tracking the flow of Bird scooters across DC",
      subtitle = time,
      x = "",
      y = "",
      caption = "conormclaughlin.net"
    ) +
    theme_minimal() +
    theme(
      legend.position = "none",
      axis.title.x = element_blank(),
      axis.title.y = element_blank(),
      axis.ticks = element_blank(),
      axis.text.y = element_blank(),
      axis.text.x = element_blank(),
      plot.title = element_text(size = 16, face = "bold"),
      plot.subtitle = element_text(size = 12)
    )
}

Animating Plots

Initially, I had hoped to use the gganimate package to animate my maps along the time axis. However, I had issues installing and configuring the package using devtools, and decided to look for alternatives.

Luckily, I was able to find a helpful guide on using the ImageMagick program and its bindings for R (magick) to create GIFs from a collection of image files. I figured that if I could create a plot for each time period I was interested in, I could use magick to stitch the images together into an animated GIF.

Workflow

  • For each point in time within the desired time window, make a heatmap
    • In my case, I chose to make a map in half-hour increments from 5 AM to 11 PM
  • Save all the heatmaps to a folder
  • Use magick to stitch together all of the heatmaps in that folder into an animated GIF
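
Picking out the half-hour snapshots from files captured every five minutes is the fiddly part of this workflow. Here is one way to sketch that selection in Python. The file naming scheme (`birds_YYYY-MM-DD_HH-MM.csv`) and the function name `half_hour_snapshots` are hypothetical and would need adjusting to match however the snapshots are actually named.

```python
import re
from datetime import time

# Hypothetical file naming: "birds_YYYY-MM-DD_HH-MM.csv"
NAME_PATTERN = re.compile(r"_(\d{2})-(\d{2})\.csv$")


def half_hour_snapshots(filenames, start=time(5, 0), end=time(23, 0)):
    """Keep files stamped on the hour or half hour within [start, end]."""
    selected = []
    for name in sorted(filenames):
        match = NAME_PATTERN.search(name)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        stamp = time(int(match.group(1)), int(match.group(2)))
        if stamp.minute in (0, 30) and start <= stamp <= end:
            selected.append(name)
    return selected
```

Parsing the hour and minute out of the name, rather than searching for "00" or "30" anywhere in it, avoids accidentally selecting files that merely contain those digits in the date or the hour.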

Animated Heatmap of Birds

Animated Map of DC with Heatmap of Birds throughout the Day

Animation Code

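# Packages used below: readr (read_csv), ggplot2 (ggsave), purrr (map), magick (image_*)
library(readr)
library(ggplot2)
library(purrr)
library(magick)

# Step 1: Create a map plot for each data file
setwd("data")
for (i in seq_along(files)) {
  # Read file to dataframe
  file <- files[i]
  temp <- read_csv(file)
  # Create plot; characters 21-25 of the file name hold the snapshot time
  p <- createPlot(temp, substring(file, 21, 25))
  # Save the plot to a .png file
  ggsave(
    filename = paste(file, ".png", sep = ""),
    plot = p,
    path = PLOT_PATH,
    width = 6,
    height = 6,
    units = "in"
  )
}
# Set the working directory back to the root of the project
setwd(PROJECT_ROOT_PATH)

# Step 2: List those plots, read them in, and then make the animation
# Note: "00|30" matches anywhere in the file name, so it can also pick up
# files that have "00" or "30" in the date or hour
list.files(path = "plots", pattern = "00|30", full.names = TRUE) %>%
  map(image_read) %>% # reads each image file
  image_join() %>% # joins the images into one stack
  image_animate(fps = 2) %>% # animates at 2 frames per second
  image_write("birds.gif") # write to current dir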
Conclusions

  • Just finding out how to get the Bird API working was a battle - I’m not sure how legitimate the method I adopted to scrape the Bird locations was, but it was very fulfilling to get the data extract working properly. Kudos to those who did the tough work on this
  • It looks like my hypothesis about traffic patterns was correct - the scooters definitely congregate in the middle of Downtown DC starting at 8:30 AM or so before beginning to spread out in the afternoon
  • It seems like Birds really do provide a viable transit option for urban commuters - their density is very high along the U St/Logan Circle/14th St/Dupont neighborhoods, which don’t have great Metro connectivity to Downtown
  • Density seems highest in the neighborhoods listed above and along the Wilson/Clarendon corridor of Arlington, and lower in Georgetown, Chevy Chase, Capitol Hill, and Navy Yard. Perhaps this is because car ownership is higher in those neighborhoods, and fewer residents have the type of walking/transit-centered commute that scooters fit well into