26 Tracking data: Plotting tracks and reviewing tabular data

Analyses outlined in this chapter were performed in R version 4.3.2 (2023-10-31 ucrt)

This chapter was last updated on 2024-02-23


26.1 What this chapter covers:

  • Tabular data review: Review basic details about your data, i.e. what information each column holds

  • Spatial data review: Plot your data and perform some basic checks to make sure it is formatted correctly for further analyses

  • Consideration: When to remove or salvage data from a tracked individual(s)


26.2 Where you can get example data for the chapter:

This tutorial uses example data from a project led by the BirdLife International partner in Croatia: BIOM

  • The citation for this data is: Zec et al. 2023

  • Example data is available upon request

  • A description of the example data is given in a separate chapter



26.3 Load packages

Load the R packages required for the code in this chapter.

If a package fails to load, you will need to install it first.
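One way to see which packages still need installing before you call library() is sketched below. It is only a sketch: the vector `pkgs` repeats the package names loaded in this chapter, and `find_missing()` is a hypothetical helper name, not a function from any package.

```r
## Packages used in this chapter
pkgs <- c("sf", "tidyverse", "ggplot2", "rnaturalearth", "leaflet",
          "lubridate", "track2KBA", "trip", "adehabitatLT", "raster",
          "viridis", "readxl", "xlsx")

## Hypothetical helper: return the names of packages that are not installed.
## system.file() returns an empty string when a package cannot be found.
find_missing <- function(packages) {
  installed <- vapply(packages,
                      function(p) nzchar(system.file(package = p)),
                      logical(1))
  packages[!installed]
}

## Packages you would still need to install
missing_pkgs <- find_missing(pkgs)
print(missing_pkgs)

## Install them only if you are happy to do so, e.g.:
## install.packages(missing_pkgs)
```

The install step is left commented out on purpose, so that nothing is installed without you choosing to do so.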

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Load libraries --------------------------------------------------------------
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## sf package for spatial data analyses (i.e. vector files such as points, lines, polygons)
library(sf)
## Tidyverse for data manipulation
library(tidyverse)
## ggplot2 for plotting options
library(ggplot2)
## rnaturalearth package for geographic basemaps in R
library(rnaturalearth)
## leaflet package for interactive maps in R
library(leaflet)
## lubridate for date time
library(lubridate)
## track2KBA for identifying important sites from tracking data
library(track2KBA)
## speed filter
library(trip)
## linear interpolation
library(adehabitatLT)
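## raster for working with gridded (raster) spatial data
library(raster)
## viridis for colour-blind-friendly colour palettes
library(viridis)
## readxl for reading Excel files
library(readxl)
## xlsx for reading and writing Excel files
library(xlsx)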


26.4 Define object names for chapter

Typically, if your data follows the same format as the examples in the chapter (and previous chapters), then below should be the only thing(s) you need to change.

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Specify projections / store needed CRS definitions as variables ----
## SEE: https://epsg.io/
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## world - unprojected coordinates
wgs84 <- st_crs("EPSG:4326")


## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Source a relevant basemap (download / or load your own)
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Source a world map from the rnaturalearth R package
## see details of function to change the type of map you get
## If you can't download this map - you may need to load a separate shapefile
## depicting a suitable basemap
worldmap <- rnaturalearth::ne_download(scale = "large",
                                       type = "countries",
                                       category = "cultural",
                                       destdir = tempdir(),
                                       load = TRUE,
                                       returnclass = "sf")
## Reading layer `ne_10m_admin_0_countries' from data source 
##   `C:\Users\jonathan.handley\AppData\Local\Temp\Rtmp6L566o\ne_10m_admin_0_countries.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 258 features and 168 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -180 ymin: -90 xmax: 180 ymax: 83.63410065
## Geodetic CRS:  WGS 84

26.5 Load file (or file created from previous chapter)

## Read the csv file of merged tracking data that follows the output / download format
## of the Seabird Tracking Database
df_stdb_output <- read.csv("./data-testing/tracking-data/Puffinus-yelkouan-Z-tracking-STDB-output.csv")

26.5.1 Simplify object names if need be

Sometimes, you may have created different object names in previous scripts (R codes) and you may wish to simplify the names to a shorter name for the purpose of a new script.

## Copy the object and give it a new name
df.stdb <- df_stdb_output

## Remove the old object from your R environment
rm(df_stdb_output)


26.6 Explore the tabular data

This step is particularly useful when you have combined many datasets from several species and colonies, but it is also worthwhile for data from a single species and colony.

Before you plot any data, it can be a good idea to broadly explore the data.

While you might know which species you tracked, and from which colonies, and from which years, it can often be worth checking over these (and other) aspects of your data.

Checking the data helps refresh your view on what data you have, and also helps you pick up any errors that may have arisen when inputting data.

## Reminder on what the data looks like so far
head(data.frame(df.stdb),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17600_Z-9 19_Tag17600_Z-9
## 2           Z  42.774893  16.875649    GPS 19_Tag17600_Z-9 19_Tag17600_Z-9
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt  latitude longitude
## 1 populated-upon-upload-STDB 2019-05-24 00:49:09 42.811528 16.885531
## 2 populated-upon-upload-STDB 2019-05-24 01:09:03 42.812029 16.886907
##   argos_quality equinox
## 1            NA      NA
## 2            NA      NA
## Review the main columns of data separately. This helps check for errors associated 
## with data entry. E.g. perhaps you typed chick-rearing and CHICK-rearing. Because
## of the difference in lower-case vs. upper-case text, you might accidentally consider
## these as separate components of your dataset.
## the table function counts the number of entries for each unique value in a column
table(df.stdb$scientific_name)
## 
## Puffinus yelkouan 
##             11354
table(df.stdb$site_name)
## 
## Lastovo SPA 
##       11354
table(df.stdb$colony_name)
## 
##     Z 
## 11354
table(df.stdb$breed_status)
## 
## populated-upon-upload-STDB 
##                      11354
table(df.stdb$breed_stage)
## 
## chick-rearing 
##         11354
table(df.stdb$age)
## 
## adult 
## 11354
table(df.stdb$sex)
## 
## unknown 
##   11354
## Summarise the data by species, site_name, colony_name, year, breed_status (if you have this), breed_stage, age, sex.
## First we add a new year column by splitting the date column so we can get information about years
df_overview <- df.stdb %>% mutate(year = year(date_gmt)) %>% 
  ## then we group the data by relevant columns
  group_by(scientific_name, 
           site_name, 
           colony_name, 
           year,
           #breed_status, # if you downloaded data from the STDB, you should have this info.
           breed_stage,
           age, 
           sex) %>% 
  ## then we continue to summarise by the distinct number of entries per group
  summarise(n_birds = n_distinct(bird_id),
            n_tracks = n_distinct(track_id))

## review the summary output
df_overview
## # A tibble: 2 × 9
## # Groups:   scientific_name, site_name, colony_name, year,
## #   breed_stage, age [2]
##   scientific_name   site_name  colony_name  year breed_stage age   sex   n_birds
##   <chr>             <chr>      <chr>       <dbl> <chr>       <chr> <chr>   <int>
## 1 Puffinus yelkouan Lastovo S… Z            2019 chick-rear… adult unkn…      19
## 2 Puffinus yelkouan Lastovo S… Z            2020 chick-rear… adult unkn…      15
## # ℹ 1 more variable: n_tracks <int>
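
If the tables above do reveal inconsistent spellings (such as chick-rearing and CHICK-rearing), one possible way to standardise them is sketched here. The vector below is made up for illustration; with your own data you would apply the same functions to the relevant column.

```r
## Toy vector with the kind of inconsistency described above (made-up values).
## Note the upper-case entry and the entry with a trailing space.
breed_stage <- c("chick-rearing", "CHICK-rearing", "chick-rearing ", "incubation")

## Raw table: three spellings of the same stage appear as separate entries
table(breed_stage)

## Standardise: trim surrounding white space and convert to lower case
breed_stage_clean <- tolower(trimws(breed_stage))

## Cleaned table: the three spellings are now counted together
table(breed_stage_clean)
```

Check the cleaned table before overwriting the original column, in case two entries that look similar are genuinely different categories.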

26.7 Review of summary output

From the summary output above we can see the following:

  • scientific_name: we have tracking data from one species
  • site_name: we have tracking data from one general site
  • colony_name: we have tracking data from one colony
  • year: data come from 2019 and 2020
  • breed_stage: all data relates to birds during the chick-rearing life-cycle stage.
  • age and sex: data is from adult birds of unknown sex
  • n_birds, n_tracks: because n_birds = n_tracks, it indicates that:
    • either the tracking data from each individual bird has not been separated into unique trips (in which case it needs to be), or
    • the tracking data from each individual bird is only representative of a single trip to sea (during the breeding period when birds may be exhibiting central place foraging behaviour)
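
To check the n_birds and n_tracks point for each animal, you could count the distinct track IDs recorded per bird. The sketch below uses a small made-up data frame with the same column names (bird_id, track_id) as the example data.

```r
## Toy data (made-up IDs): bird "A" has two trips, bird "B" has one
toy <- data.frame(bird_id  = c("A", "A", "A", "B", "B"),
                  track_id = c("A_1", "A_1", "A_2", "B_1", "B_1"))

## Number of distinct track_ids recorded for each bird
tracks_per_bird <- tapply(toy$track_id, toy$bird_id,
                          function(x) length(unique(x)))
tracks_per_bird

## TRUE would mean every bird has exactly one track_id, which suggests the
## tracks may not yet have been split into trips
all(tracks_per_bird == 1)
```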


26.8 Arrange data and remove duplicate entries

Once you have formatted your data into a standardised format and checked that the entries are correct, it is also worth ensuring your data is ordered (arranged) chronologically. Manipulating spatial data can sometimes leave records out of order with respect to time, and, given the way various devices interact with satellites, you can also end up with duplicated timestamps.

Either problem can make your track represent unrealistic movement patterns of the animal.

We need to ensure our data is ordered correctly and also remove any duplicate timestamps.
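
To see why ordering matters, the sketch below compares the path length of the same four locations when they are out of order and when they are sorted by time. The coordinates and times are made up, and `haversine_km()` and `path_length_km()` are hypothetical helper names written for this illustration (great-circle distance on a sphere of radius 6371 km).

```r
## Great-circle distance (km) between pairs of lon / lat points
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

## Total distance travelled when visiting the rows of df in their current order
path_length_km <- function(df) {
  n <- nrow(df)
  sum(haversine_km(df$lon[-n], df$lat[-n], df$lon[-1], df$lat[-1]))
}

## Toy track (made-up values): a bird moving steadily east
toy <- data.frame(
  dttm = as.POSIXct(c("2019-05-24 00:00:00", "2019-05-24 00:20:00",
                      "2019-05-24 00:40:00", "2019-05-24 01:00:00"), tz = "UTC"),
  lon = c(16.88, 16.90, 16.92, 16.94),
  lat = c(42.80, 42.80, 42.80, 42.80))

## Same locations, but with rows 2 and 3 swapped
toy_shuffled <- toy[c(1, 3, 2, 4), ]

## Re-order by timestamp
toy_sorted <- toy_shuffled[order(toy_shuffled$dttm), ]

## The unordered track is longer: the bird appears to double back on itself
path_length_km(toy_shuffled)
path_length_km(toy_sorted)
```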

## review your OVERALL data again
head(data.frame(df.stdb),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17600_Z-9 19_Tag17600_Z-9
## 2           Z  42.774893  16.875649    GPS 19_Tag17600_Z-9 19_Tag17600_Z-9
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt  latitude longitude
## 1 populated-upon-upload-STDB 2019-05-24 00:49:09 42.811528 16.885531
## 2 populated-upon-upload-STDB 2019-05-24 01:09:03 42.812029 16.886907
##   argos_quality equinox
## 1            NA      NA
## 2            NA      NA
(str(df.stdb))
## 'data.frame':    11354 obs. of  21 variables:
##  $ dataset_id       : chr  "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" ...
##  $ scientific_name  : chr  "Puffinus yelkouan" "Puffinus yelkouan" "Puffinus yelkouan" "Puffinus yelkouan" ...
##  $ common_name      : chr  "Yelkouan Shearwater" "Yelkouan Shearwater" "Yelkouan Shearwater" "Yelkouan Shearwater" ...
##  $ site_name        : chr  "Lastovo SPA" "Lastovo SPA" "Lastovo SPA" "Lastovo SPA" ...
##  $ colony_name      : chr  "Z" "Z" "Z" "Z" ...
##  $ lat_colony       : num  42.8 42.8 42.8 42.8 42.8 ...
##  $ lon_colony       : num  16.9 16.9 16.9 16.9 16.9 ...
##  $ device           : chr  "GPS" "GPS" "GPS" "GPS" ...
##  $ bird_id          : chr  "19_Tag17600_Z-9" "19_Tag17600_Z-9" "19_Tag17600_Z-9" "19_Tag17600_Z-9" ...
##  $ track_id         : chr  "19_Tag17600_Z-9" "19_Tag17600_Z-9" "19_Tag17600_Z-9" "19_Tag17600_Z-9" ...
##  $ original_track_id: chr  "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" ...
##  $ age              : chr  "adult" "adult" "adult" "adult" ...
##  $ sex              : chr  "unknown" "unknown" "unknown" "unknown" ...
##  $ breed_stage      : chr  "chick-rearing" "chick-rearing" "chick-rearing" "chick-rearing" ...
##  $ breed_status     : chr  "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" "populated-upon-upload-STDB" ...
##  $ date_gmt         : chr  "2019-05-24" "2019-05-24" "2019-05-24" "2019-05-24" ...
##  $ time_gmt         : chr  "00:49:09" "01:09:03" "01:29:03" "01:49:09" ...
##  $ latitude         : num  42.8 42.8 42.8 42.8 42.8 ...
##  $ longitude        : num  16.9 16.9 16.9 16.9 16.9 ...
##  $ argos_quality    : logi  NA NA NA NA NA NA ...
##  $ equinox          : logi  NA NA NA NA NA NA ...
## NULL
## merge the date and time columns
df.stdb$dttm <- with(df.stdb, ymd(date_gmt) + hms(time_gmt))

## first check how many duplicate entries you may have. If there are many, it
## is worth exploring your data further to understand why.
n_duplicates <- df.stdb %>% 
  group_by(bird_id, track_id) %>% 
  arrange(dttm) %>% 
  dplyr::filter(duplicated(dttm) == TRUE)

## review how many duplicate entries you may have. Print the message:
print(paste("you have ",nrow(n_duplicates), " duplicate records in a dataset of ",
            nrow(df.stdb), " records.", sep =""))
## [1] "you have 11 duplicate records in a dataset of 11354 records."
## remove duplicate entries if no further exploration is deemed necessary
df.stdb <- df.stdb %>% 
  ## first group data by individual animals and unique track_ids
  group_by(bird_id, track_id) %>% 
  ## then arrange by timestamp
  arrange(dttm) %>% 
  ## then if a timestamp is duplicated (TRUE), then don't select this data entry.
  ## only select entries where timestamps are not duplicated (i.e. FALSE)
  dplyr::filter(duplicated(dttm) == FALSE)
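
After arranging and removing duplicates, you may want to confirm that the timestamps for each animal are strictly increasing. A minimal sketch of such a check, using made-up data and a hypothetical helper called `is_strictly_increasing()`, could look like this:

```r
## Toy data (made-up): bird "A" has a duplicated timestamp and is out of order
toy <- data.frame(
  bird_id = c("A", "A", "A", "B", "B"),
  dttm = as.POSIXct(c("2019-05-24 01:00:00", "2019-05-24 00:40:00",
                      "2019-05-24 01:00:00", "2019-05-24 00:00:00",
                      "2019-05-24 00:20:00"), tz = "UTC"))

## Hypothetical helper: for each bird, TRUE if every timestamp is later than
## the one before it (i.e. ordered in time and free of duplicates)
is_strictly_increasing <- function(df) {
  vapply(split(as.numeric(df$dttm), df$bird_id),
         function(x) all(diff(x) > 0), logical(1))
}

## Before cleaning: bird "A" fails the check
is_strictly_increasing(toy)

## Arrange by bird and time, then drop repeated bird / timestamp combinations
toy_clean <- toy[order(toy$bird_id, toy$dttm), ]
toy_clean <- toy_clean[!duplicated(toy_clean[, c("bird_id", "dttm")]), ]

## After cleaning: both birds pass
is_strictly_increasing(toy_clean)
```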


26.9 Visualise all the location data

Using the leaflet package in R, you can easily visualise your tracking data interactively within RStudio. Note though, if you have very large datasets, this option may not always work smoothly.

What should you look for when visualising the raw data?

  • Are your locations in realistic places?
  • Have you perhaps mixed up the latitude and longitude columns?
  • Does your data cross the international date line? Do you know how to deal with this?
  • Will you need to remove sections of the data that do not represent a time when the animal was tagged? (e.g. perhaps you set the device to start recording locations before deploying on the animal. So the tag might have recorded while you were travelling to the deployment location. Therefore, removing these sections of the track will facilitate your overall analysis.)
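
Some of these questions can also be given a rough numerical check before plotting. The sketch below is one possible approach; `check_coords()` is a hypothetical helper name and the coordinates are made up. Note that the checks are crude: swapped columns are only flagged when a latitude falls outside -90 to 90, and a wide longitude span can also come from a genuinely wide-ranging species.

```r
## Hypothetical helper: basic sanity checks on longitude / latitude vectors
check_coords <- function(lon, lat) {
  list(
    ## valid latitudes lie between -90 and 90; values outside this range can
    ## indicate that the latitude and longitude columns were swapped
    lat_out_of_range = any(abs(lat) > 90, na.rm = TRUE),
    ## valid longitudes lie between -180 and 180
    lon_out_of_range = any(abs(lon) > 180, na.rm = TRUE),
    ## a longitude span wider than 180 degrees can indicate that the data
    ## cross the international date line
    wide_lon_span = diff(range(lon, na.rm = TRUE)) > 180
  )
}

## Locations in the Adriatic Sea: all checks should be FALSE
str(check_coords(lon = c(16.89, 16.90, 16.54), lat = c(42.81, 42.84, 43.24)))

## Made-up locations either side of the date line: wide_lon_span is TRUE
str(check_coords(lon = c(179.5, -179.5), lat = c(-45, -45)))

## The same locations with columns swapped: lat_out_of_range is TRUE
str(check_coords(lon = c(-45, -45), lat = c(179.5, -179.5)))
```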

## review your OVERALL data again
#head(data.frame(df.stdb),2)

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## visualise all data ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## number of datapoints
nrow(df.stdb)
## [1] 11343
## interactive plot
map.alldata <- leaflet() %>% ## start leaflet plot
  ## select background imagery
  addProviderTiles(providers$Esri.WorldImagery, group = "World Imagery") %>% 
  ## plot the points. Note: leaflet automatically finds the lon / lat columns
  addCircleMarkers(data = df.stdb,
                   ## size of points
                   radius = 3,
                   ## colour of points
                   fillColor = "cyan",
                   ## transparency of points
                   fillOpacity = 0.5, 
                   ## set stroke = FALSE to remove borders around points
                   stroke = FALSE) 

## generate the plot
map.alldata

26.10 Review of overall plot for all data points

Based on the interactive plot, the data generally look good. All the locations are in the Adriatic Sea area (something we would anticipate based on what we know about Yelkouan Shearwaters breeding in Croatia). We can conclude the following:


  • Locations appear to be in realistic places.
  • It’s unlikely that we have mixed up the latitude and longitude columns.
  • The data does not cross the international date line.


Regarding removing sections of the data that do not represent a time when the animal was tagged: later filtering steps may remove these parts of the track if the locations are in the vicinity of the colony (see details of the tripSplit() function). However, if such locations extend further away from the colony, you will need to remove these sections of the track yourself.

  • For example, if the device remained on and you continued to receive location data while you were travelling away from a site, you might need to remove this data with some more manual data cleaning steps.
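
One possible manual cleaning step is to keep only the locations recorded between the deployment and retrieval times of each device. The sketch below assumes you have a table of such times; the `deployments` data frame and its columns deploy_start and deploy_end are hypothetical and are not part of the example data.

```r
## Toy tracking data (made-up): the first two fixes were recorded before the
## device was deployed on the bird
toy <- data.frame(
  bird_id = "A",
  dttm = as.POSIXct(c("2019-05-01 18:00:00", "2019-05-01 20:00:00",
                      "2019-05-01 21:40:41", "2019-05-01 22:00:41"), tz = "UTC"))

## Hypothetical deployment table: one row per bird (made-up times)
deployments <- data.frame(
  bird_id = "A",
  deploy_start = as.POSIXct("2019-05-01 21:00:00", tz = "UTC"),
  deploy_end   = as.POSIXct("2019-05-10 00:00:00", tz = "UTC"))

## Attach the deployment window to every location of the matching bird
toy_dep <- merge(toy, deployments, by = "bird_id")

## Keep only locations recorded while the device was on the bird
toy_trimmed <- toy_dep[toy_dep$dttm >= toy_dep$deploy_start &
                         toy_dep$dttm <= toy_dep$deploy_end, ]

## Number of locations before and after trimming
nrow(toy)
nrow(toy_trimmed)
```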


26.11 Save all the location data as a shapefile

Visualising all the location data in R can be a simpler starting point. You may also want to save this data as a shapefile (.shp) for viewing in GIS software such as QGIS or ArcGIS.

Note: saving all data as a single shapefile can be a memory intensive task (i.e. if you have a lot of data, then your computer might take a long time to save the file, or the file will be big and slow to work with)

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## First add a simplified unique id and create the sf spatial object ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Review data
head(data.frame(df.stdb),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
## 2           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt  latitude longitude
## 1 populated-upon-upload-STDB 2019-05-01 21:40:41 42.815077 16.890582
## 2 populated-upon-upload-STDB 2019-05-01 22:00:41 42.837505 16.897495
##   argos_quality equinox                dttm
## 1            NA      NA 2019-05-01 21:40:41
## 2            NA      NA 2019-05-01 22:00:41
## add a simplified animal ID column - a simple number for each unique animal tracked
df.stdb$bird_id_num <- as.numeric(factor(df.stdb$bird_id, levels = unique(df.stdb$bird_id)))

## Review data again. Note: the data are arranged by time, so the last row printed
## by tail() need not hold the highest ID. max(df.stdb$bird_id_num) gives the
## number of unique animals.
head(data.frame(df.stdb),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
## 2           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt  latitude longitude
## 1 populated-upon-upload-STDB 2019-05-01 21:40:41 42.815077 16.890582
## 2 populated-upon-upload-STDB 2019-05-01 22:00:41 42.837505 16.897495
##   argos_quality equinox                dttm bird_id_num
## 1            NA      NA 2019-05-01 21:40:41           1
## 2            NA      NA 2019-05-01 22:00:41           1
tail(data.frame(df.stdb),2)
##                       dataset_id   scientific_name         common_name
## 11342 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater
## 11343 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater
##         site_name colony_name lat_colony lon_colony device           bird_id
## 11342 Lastovo SPA           Z  42.774893  16.875649    GPS 20_Tag40118_Z-175
## 11343 Lastovo SPA           Z  42.774893  16.875649    GPS 20_Tag40118_Z-175
##                track_id          original_track_id   age     sex   breed_stage
## 11342 20_Tag40118_Z-175 populated-upon-upload-STDB adult unknown chick-rearing
## 11343 20_Tag40118_Z-175 populated-upon-upload-STDB adult unknown chick-rearing
##                     breed_status   date_gmt time_gmt  latitude longitude
## 11342 populated-upon-upload-STDB 2020-06-10 16:32:49 43.236809 16.543461
## 11343 populated-upon-upload-STDB 2020-06-10 16:52:51 43.190585 16.343547
##       argos_quality equinox                dttm bird_id_num
## 11342            NA      NA 2020-06-10 16:32:49          27
## 11343            NA      NA 2020-06-10 16:52:51          27
## create the sf spatial object
df.stdb_sf <- df.stdb %>% 
  ## first create new columns of lon and lat again so you keep this location 
  ## information in tabular format.
  mutate(lon_device = longitude,
         lat_device = latitude) %>% 
  ## then convert object to sf spatial object
  st_as_sf(coords = c("longitude", "latitude"), crs = wgs84)
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Save raw tracking data as shapefile for viewing in GIS software ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Option allows for multispecies data
## Or the loop will only run once if you have single species data

for(i in unique(df.stdb$scientific_name)){
  
  ## subset the data taking the track information for each unique species
  temp_species <- df.stdb_sf %>% dplyr::filter(scientific_name == i)
  
  ## create new folder within current working directory where you will save data
  ## first create the name of the species and the file path you need
  ## also use gsub to replace spaces within character strings (words) with a "-"
  species_name <- gsub(" ", "-", temp_species$scientific_name[1]) 
  
  ## print the name for checking
  print(species_name)
  
  ## then create the new folder within current working directory
  path_to_folder <- paste("./data-testing/tracking-data/",
                          species_name,"-shapefiles-all-tracks",
                          sep="")
  
  ## print the file path name for checking
  print(path_to_folder)
  
  ## Check if folder exists, and if it does not, then make a new folder
    if (!file.exists(path_to_folder)) {
    # If it does not exist, create a new folder
    dir.create(path_to_folder)
    print(paste("Created folder:", path_to_folder))
    } else {
    # do nothing, but let us know the folder exists already
    print(paste("Folder already exists:", path_to_folder))
    }
  
  ## write the spatial data as a shapefile
  ## NOTE: For some GIS software, column names will be abbreviated upon saving
  ## NOTE: If you have very long file paths, this operation may fail. One solution
  ## is to save the shapefile elsewhere. Another solution is to instead save the file
  ## as a geopackage (.gpkg): simply replace the .shp text below with .gpkg
  ## write only the data for the current species (temp_species)
  st_write(temp_species, paste(path_to_folder,"/",
                               species_name,
                               "_AllTracks.shp", 
                               sep = ""),
           delete_layer = TRUE)
  
  ## remove the temporary object at the end of each loop
  rm(temp_species)
}
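
The note about abbreviated column names arises because attribute field names in a shapefile are limited to 10 characters. Before saving, you can list which of your columns exceed that limit. The sketch below types out the column names of the example data by hand; with your own data you would use names() on your object instead.

```r
## Column names of the example data at this stage of the chapter
col_names <- c("dataset_id", "scientific_name", "common_name", "site_name",
               "colony_name", "lat_colony", "lon_colony", "device", "bird_id",
               "track_id", "original_track_id", "age", "sex", "breed_stage",
               "breed_status", "date_gmt", "time_gmt", "argos_quality",
               "equinox", "dttm", "bird_id_num", "lon_device", "lat_device")

## Names longer than 10 characters will be shortened in a shapefile
too_long <- col_names[nchar(col_names) > 10]
too_long
```

If keeping the full names matters to you, renaming these columns yourself before saving, or saving as a geopackage (.gpkg) as mentioned in the code comments, are two options to consider.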

26.12 Save all the location data as a plot

This is a simple plot to look at all the point location data.

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Save raw tracking data as simple plot ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## Option allows for multispecies data
## Or the loop will only run once if you have single species data

for(i in unique(df.stdb$scientific_name)){
  
  ## subset the data taking the track information for each unique species
  temp_species <- df.stdb_sf %>% dplyr::filter(scientific_name == i)
  
  ## create new folder within current working directory where you will save data
  ## first create the name of the species and the file path you need
  ## also use gsub to replace spaces within character strings (words) with a "-"
  species_name <- gsub(" ", "-", temp_species$scientific_name[1]) 
  
  ## print the name for checking
  print(species_name)
  
  ## then create the new folder within current working directory
  path_to_folder <- paste("./data-testing/tracking-data/",
                          species_name,"-plots-all-tracks",
                          sep="")
  
  ## print the file path name for checking
  print(path_to_folder)
  
  ## Check if folder exists, and if it does not, then make a new folder
    if (!file.exists(path_to_folder)) {
    # If it does not exist, create a new folder
    dir.create(path_to_folder)
    print(paste("Created folder:", path_to_folder))
    } else {
    # do nothing, but let us know the folder exists already
    print(paste("Folder already exists:", path_to_folder))
    }
  
  
  ## plot track information for each unique species
  plot_alltracks <- ggplot() +
  ## Use the world map data as the underlying basemap
  geom_sf(data = worldmap, fill = "grey") +
  ## Add the point data for the current species as transparent cyan circles
  geom_point(data = temp_species, aes(x = lon_device, y = lat_device), color = "cyan", alpha = 0.5) +
  ## plot the basemap again, but this time superimpose only the country borders over the point data
  ## this is to help you see better which points might obviously be over land.
  geom_sf(data = worldmap, fill = NA, color = "black") +
  ## Set the bounding box to only include the point locations of the current species
  coord_sf(xlim = range(temp_species$lon_device), ylim = range(temp_species$lat_device)) +
  ## Customize the x and y axis labels
  labs(x = "Longitude", y = "Latitude") +
  ## add a title to the plot
  ggtitle(paste(species_name, "\n",
                "points-all-animals",sep="")) +
  theme(plot.title = element_text(hjust = 0.5))
  
  ## print the plot (inside a for loop, a plot is only drawn if you call print())
  print(plot_alltracks)
  
  ## save the plot
  ggsave(paste(path_to_folder, "/",
               species_name,
               "_all-points.png", 
               sep = ""), 
         plot_alltracks, 
         ## width and height are given in mm (see the units argument)
         width = 160, height = 160, dpi = 300, units = "mm")
  
  ## remove the temporary object at the end of each loop
  rm(temp_species)
}
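
The folder check used in the loops above could also be wrapped in a small function so it is written only once. The sketch below is one way to do this; `ensure_folder()` is a hypothetical helper name. It uses dir.exists() and sets recursive = TRUE so that any missing parent folders are created too. The example writes to a temporary directory so nothing is added to your project.

```r
## Hypothetical helper: create a folder (and any missing parent folders) if it
## does not exist yet, and report what was done
ensure_folder <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
    message("Created folder: ", path)
  } else {
    message("Folder already exists: ", path)
  }
  invisible(path)
}

## Build a folder name in the same way as the loops above
species_name <- gsub(" ", "-", "Puffinus yelkouan")
demo_folder <- file.path(tempdir(), "tracking-data",
                         paste0(species_name, "-plots-all-tracks"))

## First call creates the folder, second call reports that it already exists
ensure_folder(demo_folder)
ensure_folder(demo_folder)
```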


26.13 Visualise individual animal tracks

Once you have reviewed the overall status of the tracking data you collected, it can be worth assessing the tracks of individual animals.

This can give you a better idea of the quality of the data for each individual.

Visualising tracking data from individual animals can help you understand which data you might remove, or which data you might try and salvage.

Depending on the amount of data you have, you can often initially perform a static exploration of tracks from each individual (i.e. a simple plot of tracks from each individual), followed by an interactive exploration of tracks from all individuals, or only data from those individuals where interactive exploration is deemed necessary.

The sections below outline options for visualising individual animal tracks.


26.13.1 Denote the beginning and end of each individual animal's entire track

## reminder on data structure
head(data.frame(df.stdb_sf),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
## 2           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt argos_quality equinox
## 1 populated-upon-upload-STDB 2019-05-01 21:40:41            NA      NA
## 2 populated-upon-upload-STDB 2019-05-01 22:00:41            NA      NA
##                  dttm bird_id_num lon_device lat_device
## 1 2019-05-01 21:40:41           1  16.890582  42.815077
## 2 2019-05-01 22:00:41           1  16.897495  42.837505
##                      geometry
## 1 POINT (16.890582 42.815077)
## 2 POINT (16.897495 42.837505)
head(data.frame(df.stdb),2)
##                   dataset_id   scientific_name         common_name   site_name
## 1 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
## 2 populated-upon-upload-STDB Puffinus yelkouan Yelkouan Shearwater Lastovo SPA
##   colony_name lat_colony lon_colony device         bird_id        track_id
## 1           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
## 2           Z  42.774893  16.875649    GPS 19_Tag17652_Z-2 19_Tag17652_Z-2
##            original_track_id   age     sex   breed_stage
## 1 populated-upon-upload-STDB adult unknown chick-rearing
## 2 populated-upon-upload-STDB adult unknown chick-rearing
##                 breed_status   date_gmt time_gmt  latitude longitude
## 1 populated-upon-upload-STDB 2019-05-01 21:40:41 42.815077 16.890582
## 2 populated-upon-upload-STDB 2019-05-01 22:00:41 42.837505 16.897495
##   argos_quality equinox                dttm bird_id_num
## 1            NA      NA 2019-05-01 21:40:41           1
## 2            NA      NA 2019-05-01 22:00:41           1
#head(data.frame(df.stdb2),2)

## add a column indicating start and end of tracks for each individual animal
df.stdb_sf <- df.stdb_sf %>% 
  group_by(bird_id_num) %>% 
  mutate(nlocs = 1:length(bird_id_num)) %>% 
  mutate(track_segment = if_else(nlocs <= 10, "track.start","track.journey")) %>% 
  ## note: if a track has fewer than 20 points, the track.end labels will
  ## overwrite some of the track.start labels.
  mutate(track_segment = if_else(nlocs %in% (length(bird_id_num)-9):(length(bird_id_num)),"track.end",track_segment)) %>%
  ## add a column indicating colour for start and end of tracks
  ## colours from: https://colorbrewer2.org/#type=qualitative&scheme=Set2&n=3
  mutate(track_colour = if_else(nlocs <= 10, "#66c2a5","#8da0cb")) %>% 
  mutate(track_colour = if_else(nlocs %in% (length(bird_id_num)-9):(length(bird_id_num)),"#fc8d62",track_colour))
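
The labelling logic above, including what happens for short tracks, can be explored with the stand-alone sketch below. `label_segments()` is a hypothetical helper that applies the same rules to a track of a given length: the first 10 locations are labelled as the start, the last 10 as the end, and the end label is applied last.

```r
## Hypothetical helper: label the first and last n_edge locations of one track
label_segments <- function(n_locs, n_edge = 10) {
  idx <- seq_len(n_locs)
  segment <- rep("track.journey", n_locs)
  segment[idx <= n_edge] <- "track.start"
  ## applied last, so the end label wins wherever start and end overlap
  segment[idx > n_locs - n_edge] <- "track.end"
  segment
}

## A track of 50 locations: 10 start, 30 journey, 10 end
table(label_segments(50))

## A track of 15 locations: the end label overwrites part of the start,
## leaving 5 start and 10 end locations and no journey locations
table(label_segments(15))
```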

26.13.2 Save individual tracks as static plots

A simple plot to look at all the point location data for each individual tracked.

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Save raw tracking data for each individual as a static plot ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## reminder on data structure
head(data.frame(df.stdb_sf),2)

for(i in 1:max(df.stdb_sf$bird_id_num)){
  
  ## subset the data taking the track information for each unique bird tagged
  temp_individual <- df.stdb_sf %>% dplyr::filter(bird_id_num == i)
  
  ## create new folder (if needed) within current working directory where you will save data
  ## first create the name of the species and the file path you need
  ## also use gsub to replace spaces within character strings (words) with a "-"
  species_name <- gsub(" ", "-", temp_individual$scientific_name[1]) 
  
  ## print the name for checking
  print(species_name)
  
  ## then create the new folder within current working directory
  path_to_folder <- paste("./data-testing/tracking-data/",
                          species_name, "-plots-individual-tracks",
                          sep="")
  
  ## print the file path name for checking
  print(path_to_folder)
  
  ## Check if folder exists, and if it does not, then make a new folder
  if (!dir.exists(path_to_folder)) {
    ## If it does not exist, create a new folder
    ## (recursive = TRUE also creates any missing parent folders)
    dir.create(path_to_folder, recursive = TRUE)
    print(paste("Created folder:", path_to_folder))
  } else {
    ## do nothing, but let us know the folder exists already
    print(paste("Folder already exists:", path_to_folder))
  }
  
  ## get animal id for naming plots
  animal_id <- gsub(" ", "-", temp_individual$bird_id[1]) 
  
  
  ## plot track information for each unique individual
  plot_individual_tracks <- ggplot() +
  ## Use the world map data as the underlying basemap
  geom_sf(data = worldmap, fill = "grey") +
  ## Add the point data as transparent cyan circles
  #geom_point(data = temp_individual, aes(x = lon_device, y = lat_device), color = "cyan", alpha = 0.5) +
    
  ## Add the point data - get colours from object
  #geom_point(data = temp_individual, aes(x = lon_device, y = lat_device, color = track_colour), alpha = 0.5) +  
  
  
  ## Add the journey locations
  geom_point(data = subset(temp_individual, track_segment == "track.journey"), 
             aes(x = lon_device, y = lat_device, color = track_colour), alpha = 0.5) +
  ## Add the start locations
  geom_point(data = subset(temp_individual, track_segment == "track.start"), 
             aes(x = lon_device, y = lat_device, color = track_colour), alpha = 0.5) +
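  ## Add the end locations
  geom_point(data = subset(temp_individual, track_segment == "track.end"), 
             aes(x = lon_device, y = lat_device, color = track_colour), alpha = 0.5) +
  ## track_colour already holds hex codes, so draw them as-is rather than as categories
  scale_color_identity() +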
  
  ## plot the basemap again, but this time superimpose only the country borders over the point data
  ## this is to help you see better which points might obviously be over land.
  geom_sf(data = worldmap, fill = NA, color = "black") +
  ## Set the bounding box to only include the point locations
  coord_sf(xlim = range(temp_individual$lon_device), ylim = range(temp_individual$lat_device)) +
  ## Customize the x and y axis labels
  labs(x = "Longitude", y = "Latitude") +
  ## add a title to the plot
  ggtitle(paste("points-individual:","\n",
                animal_id, 
                sep="")) +
  theme(plot.title = element_text(hjust = 0.5)) +
  ## remove legend
  theme(legend.position = "none")
  
  ## display the plot (inside a for loop a plot must be printed explicitly)
  print(plot_individual_tracks)
  
  ## save the plot
  ggsave(paste(path_to_folder, "/",
               animal_id,
               "_points.png", 
               sep = ""), 
         plot_individual_tracks, 
         ## width and height are given in mm (see the units argument)
         width = 160, height = 160, dpi = 300, units = "mm")
  
  ## print a loop progress message
  print(paste("Loop ", i, " of ", max(df.stdb_sf$bird_id_num), sep = ""))
    
  ## remove the temporary object at the end of each loop
  rm(temp_individual)
}
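
The track_colour column stores hex colour codes rather than category labels. The following is a minimal sketch, on a hypothetical toy data frame (toy_points is not part of the example data), of one way to have ggplot2 draw such codes literally using scale_color_identity(). It also shows that a plot object needs an explicit print() call to be displayed from inside a loop.

```r
library(ggplot2)

## toy data (hypothetical): three locations with hex colours stored in a column
toy_points <- data.frame(
  lon_device = c(16.1, 16.2, 16.3),
  lat_device = c(43.0, 43.1, 43.2),
  track_colour = c("#66c2a5", "#8da0cb", "#fc8d62")
)

for (k in 1:1) {
  toy_plot <- ggplot(toy_points) +
    geom_point(aes(x = lon_device, y = lat_device, color = track_colour), size = 3) +
    ## use the values in track_colour directly as colours
    scale_color_identity() +
    theme(legend.position = "none")

  ## typing the object name alone would not draw anything inside a loop
  print(toy_plot)
}

## colours actually used by the point layer
colours_drawn <- layer_data(toy_plot, 1)$colour
```

Without an identity scale, ggplot2 treats the hex codes as discrete categories and assigns colours from its default palette instead.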


26.13.3 Save individual tracks as shapefiles

## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Save raw tracking data for each individual as shapefile for viewing in GIS software ----
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## reminder on data structure
head(data.frame(df.stdb_sf),2)

for(i in 1:max(df.stdb_sf$bird_id_num)){
  
  ## subset the data taking the track information for each unique bird tagged
  temp_individual <- df.stdb_sf %>% dplyr::filter(bird_id_num == i)
  
  ## create new folder (if needed) within current working directory where you will save data
  ## first create the name of the species and the file path you need
  ## also use gsub to replace spaces within character strings (words) with a "-"
  species_name <- gsub(" ", "-", temp_individual$scientific_name[1]) 
  
  ## print the name for checking
  print(species_name)
  
  ## then create the new folder within current working directory
  path_to_folder <- paste("./data-testing/tracking-data/",
                          species_name,"-shapefiles-individual-tracks",
                          sep="")
  
  ## print the file path name for checking
  print(path_to_folder)
  
  ## Check if folder exists, and if it does not, then make a new folder
  if (!dir.exists(path_to_folder)) {
    ## If it does not exist, create a new folder
    ## (recursive = TRUE also creates any missing parent folders)
    dir.create(path_to_folder, recursive = TRUE)
    print(paste("Created folder:", path_to_folder))
  } else {
    ## do nothing, but let us know the folder exists already
    print(paste("Folder already exists:", path_to_folder))
  }
  
  ## write the spatial data. Label it by species and bird_id  
  st_write(temp_individual, 
           paste(path_to_folder, "/tracks-individual-animals",
                 species_name, "_",
                 temp_individual$bird_id[1],
                 ".shp", 
                 sep = ""), 
           delete_layer = TRUE)
  
  ## print a loop progress message
  print(paste("Loop ", i, " of ", max(df.stdb_sf$bird_id_num), sep = ""))
    
  ## remove the temporary object at the end of each loop
  rm(temp_individual)
}
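
After writing shapefiles, it can be worth reading one back in to confirm that it contains what you expect. The sketch below does this for a hypothetical toy object written to a temporary folder (toy_points and toy_file are illustrative names, not part of the example data). It also checks column names, because the shapefile format limits field names to 10 characters, so longer names such as scientific_name are shortened when written.

```r
library(sf)

## toy data (hypothetical): two locations for one bird
toy_points <- st_as_sf(
  data.frame(bird_id = c("toy-1", "toy-1"),
             scientific_name = c("Toy species", "Toy species"),
             lon_device = c(16.1, 16.2),
             lat_device = c(43.0, 43.1)),
  coords = c("lon_device", "lat_device"),
  crs = 4326,
  remove = FALSE
)

## write to a temporary file so nothing in your project folder is touched
toy_file <- tempfile(fileext = ".shp")
st_write(toy_points, toy_file, delete_layer = TRUE, quiet = TRUE)

## read the file back in and compare against what was written
toy_check <- st_read(toy_file, quiet = TRUE)

## same number of locations?
same_n_rows <- nrow(toy_check) == nrow(toy_points)

## which column names survived unchanged?
names_kept <- intersect(names(toy_points), names(toy_check))
print(names(toy_check))
```

If the original column names matter for later steps, a format without this limit (for example GeoPackage, using a ".gpkg" file extension) may be an option to consider.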

26.14 When to remove or salvage data from a tracked individual(s)

We would welcome further examples and guidance on this topic.

In some cases, an entire track may need to be discarded or salvaged. More often, only certain trips within the period an animal was tracked will need removing.

Ultimately, deciding which data to keep or remove for a given analysis can be somewhat subjective.

Critically, the data you keep for an analysis must suit the question you are investigating.

For example:

  • If you are only interested in any area an animal might visit, you may wish to keep all your data and simply provide a descriptive summary, so long as the data have been cleaned of erroneous locations.

  • However, if you are investigating detailed movement patterns, you may wish to remove poor-quality tracks (e.g. tracks with big gaps in timestamps between consecutive data points, or tracks that are likely incomplete).

  • Ultimately, further data collection may be required, depending on the question you wish to answer with your data.
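
One way to flag tracks with big gaps in timestamps is to summarise the time difference between consecutive locations for each individual. The following is a minimal sketch on hypothetical toy data. The column names (bird_id, dttm) and the 3 hour threshold (max_gap_hours) are assumptions for illustration only; a suitable threshold depends on how your devices were programmed.

```r
library(dplyr)

## toy data (hypothetical): two birds with hourly fixes, bird A has one long gap
toy_tracks <- data.frame(
  bird_id = c("A", "A", "A", "A", "B", "B", "B"),
  dttm = as.POSIXct(c("2021-05-01 00:00:00", "2021-05-01 01:00:00",
                      "2021-05-01 02:00:00", "2021-05-01 09:00:00",
                      "2021-05-01 00:00:00", "2021-05-01 01:00:00",
                      "2021-05-01 02:00:00"), tz = "UTC")
)

## assumed threshold (hours) above which a gap is considered "big"
max_gap_hours <- 3

gap_summary <- toy_tracks %>%
  group_by(bird_id) %>%
  ## make sure locations are in time order within each bird
  arrange(dttm, .by_group = TRUE) %>%
  ## time since the previous location (NA for the first location of each bird)
  mutate(gap_hours = as.numeric(difftime(dttm, lag(dttm), units = "hours"))) %>%
  summarise(n_locs = n(),
            max_gap = max(gap_hours, na.rm = TRUE),
            n_big_gaps = sum(gap_hours > max_gap_hours, na.rm = TRUE),
            .groups = "drop")

print(gap_summary)
```

A summary like this does not decide which tracks to remove. It only helps you see which individuals may need a closer look before you make that decision.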

A suitably designed sampling strategy, and appropriate programming of tracking devices prior to deployment, will help ensure that your data can answer the question(s) you are interested in.