Invasive range calls were taken through a pre-processing workflow similar to previously published native range calls[1]. Previously published invasive range calls for 2004 were provided by Susannah Buhrman-Deever[2], and were pre-processed in the same way. See the appendix of the associated article in Animal Behavior for details on pre-processing steps. The package warbleR[3] was used for pre-processing, making extended selection tables and spectrograms, and calculating acoustic measurements for downstream analyses. Some function names and other code in this script (and subsequent scripts) may need to be updated in order to reproduce results with more recent versions of the packages used. Here, we generated summary statistics of the combined native and invasive range datasets, then analyzed differences in nest and social group size estimates between ranges.
Install packages needed to run analyses in this script and the subsequent 3 scripts. Packages MASS, parallel, and grid are provided with the base R installation.
# Linux users may need to run the following terminal commands to install devtools and CRAN packages:
# sudo apt-get install libcurl4-openssl-dev
# sudo apt-get install libssl-dev
# sudo apt-get install libxml2-dev
# sudo apt install libgeos-dev
# sudo apt install libgdal-dev
# sudo apt install libudunits2-dev
# Install packages from CRAN
# Note that until orddom is back on CRAN, you may have to download an earlier version of the package from source
X <- c("ggplot2", "pbapply", "data.table", "tidyverse", "scales", "egg", "ggplotify", "grid", "rgdal", "ranger", "caret", "rgeos", "Rmisc", "Rraven", "lubridate", "knitr", "coin", "cowplot", "dplyr", "corrplot", "MLmetrics", "e1071", "gbm", "pdp", "pracma", "gridExtra", "facetscales", "grDevices", "gtable", "rstatix", "orddom", "IDmeasurer", "utils")
is_installed <- function(p) is.element(p, installed.packages()[,1])
lapply(1:length(X), function(x){
if(!is_installed(X[x])){
install.packages(X[x], repos = "http://lib.stat.cmu.edu/R/CRAN")
}
})
# Install the most recent version of the warbleR package from GitHub
# Install devtools if not installed
if (!"devtools" %in% installed.packages()[,"Package"]) install.packages("devtools")
# Linux users may have to install fftw:
# sudo apt-get install libfftw3-dev libfftw3-doc
# as well as cmake for bioacoustics to be installed:
# sudo apt-get install cmake
library(devtools)
install_github("maRce10/warbleR")
install_github("zmjones/edarf", subdir = "pkg")
# facetscales has a bug when making boxplots, we did not use this package to make boxplots
install_github("zeehio/facetscales")
# To install a previous version of orddom from source:
# May need to install package psych from source depending on R version
pkgurl <- "https://cran.r-project.org/src/contrib/Archive/psych/psych_2.0.12.tar.gz"
install.packages(pkgurl, repos = NULL, type="source")
pkgurl <- "https://cran.r-project.org/src/contrib/Archive/orddom/orddom_3.1.tar.gz"
install.packages(pkgurl, repos = NULL, type="source")
Load packages and external functions.
rm(list = ls())
X <- c("warbleR", "pbapply", "Rraven", "tidyverse", "ggplot2", "data.table", "lubridate", "knitr", "coin", "rgdal", "rgeos", "cowplot", "egg", "ggplotify")
invisible(lapply(X, library, character.only = TRUE))
path <- "/media/gsvidaurre/MYIOPSITTA/R/VocalLearning_PostInvasion/Data"
gpath <- "/home/gsvidaurre/Desktop/MANUSCRIPTS/SimplerSignatures_PostInvasion/FIGURES"
seed <- 401
cores <- parallel::detectCores() - 2
# Call R code to make raincloud plots, downloaded from https://github.com/RainCloudPlots/RainCloudPlots
# Loading these scripts will install packages if these are not already on your machine
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/R_rainclouds.R")
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/summarySE.R")
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/simulateData.R")
Read in the extended selection table that contains metadata and wave objects for pre-processed native and invasive range calls. Throughout this script, you can reproduce analyses by reading in the file “nat_inv_indiv_site_seltbl.csv”, which is a selection table that contains the metadata employed here.
nat_inv_est <- readRDS(file.path(path, "nat_inv_indiv_site_EST.RDS"))
glimpse(nat_inv_est)
## Rows: 1,596
## Columns: 26
## $ sound.files <chr> "2017_07_17_EMBR_INDIV-RAW_MZ000199_1.WAV", "2017…
## $ selec <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ start <dbl> 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0…
## $ end <dbl> 0.2646363, 0.2310324, 0.2509257, 0.2378781, 0.220…
## $ date <date> 2017-07-17, 2017-07-17, 2017-07-17, 2017-07-17, …
## $ range <fct> Native, Native, Native, Native, Native, Native, N…
## $ social_scale <chr> "Individual", "Individual", "Individual", "Indivi…
## $ site <chr> "EMBR", "EMBR", "EMBR", "EMBR", "EMBR", "EMBR", "…
## $ Bird_ID <chr> "NAT-RAW", "NAT-RAW", "NAT-RAW", "NAT-RAW", "NAT-…
## $ lat <dbl> -34.44437, -34.44437, -34.44437, -34.44437, -34.4…
## $ lon <dbl> -57.72845, -57.72845, -57.72845, -57.72845, -57.7…
## $ Visual_quality_score <fct> H, M, H, H, M, M, M, H, M, H, H, H, M, M, H, H, H…
## $ Overlapping_signal <fct> N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N…
## $ Truncated <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N",…
## $ tailored <fct> y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y…
## $ prev_SNR <dbl> 20.10350, 14.90409, 18.55063, 18.49727, 18.74672,…
## $ SNR_margin_works <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y",…
## $ SNR_before_only <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ old.sound.file.name <chr> "2017_07_17_EMBR_INDIV-RAW_MZ000199_1.WAV", "2017…
## $ SNR <dbl> 23.45415, 22.14815, 15.36660, 19.25200, 21.86770,…
## $ region <chr> "West", "West", "West", "West", "West", "West", "…
## $ country <chr> "Uruguay", "Uruguay", "Uruguay", "Uruguay", "Urug…
## $ site_year <chr> "EMBR_2017", "EMBR_2017", "EMBR_2017", "EMBR_2017…
## $ year <chr> "2017", "2017", "2017", "2017", "2017", "2017", "…
## $ dept_state <chr> "Colonia", "Colonia", "Colonia", "Colonia", "Colo…
## $ invasive_city <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
# Write out selection table for sharing data (not in extended format, just a .csv)
# nat_inv_est %>%
# as_tibble %>%
# write.csv(., file.path(path, "nat_inv_indiv_site_seltbl.csv"), row.names = FALSE)
Total number of repeatedly sampled individual calls between ranges and by range.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
dplyr::count() %>%
kable(align = rep("c", ncol(.)))
| n |
|---|
| 229 |
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range) %>%
dplyr::count() %>%
kable(align = rep("c", ncol(.)))
| range | n |
|---|---|
| Native | 97 |
| Invasive | 132 |
Number of known repeatedly sampled individuals across ranges.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range) %>%
dplyr::summarise(count = n_distinct(Bird_ID)) %>%
kable(align = rep("c", ncol(.)))
| range | count |
|---|---|
| Native | 8 |
| Invasive | 9 |
Number of sites at which we sampled known repeatedly sampled individuals across ranges.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range) %>%
dplyr::summarise(count = n_distinct(site)) %>%
kable(align = rep("c", ncol(.)))
| range | count |
|---|---|
| Native | 3 |
| Invasive | 7 |
Number of calls per known repeatedly sampled individual across ranges, and recording dates. Used for a table in the appendix.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range, Bird_ID, lat, lon, site, dept_state) %>%
dplyr::count() %>%
arrange(range, site) %>%
inner_join(
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range, Bird_ID) %>%
distinct(date),
by = c("range", "Bird_ID")
) %>%
ungroup() %>%
dplyr::mutate(
lat = round(lat, 3),
lon = round(lon, 3)
) %>%
dplyr::select(Bird_ID, site, dept_state, lat, lon, n, date) %>%
kable(align = rep("c", ncol(.)))
| Bird_ID | site | dept_state | lat | lon | n | date |
|---|---|---|---|---|---|---|
| NAT-AAT | 1145 | Colonia | -34.376 | -57.500 | 12 | 2017-07-29 |
| NAT-UM1 | 1145 | Colonia | -34.375 | -57.502 | 25 | 2017-07-28 |
| NAT-UM2 | 1145 | Colonia | -34.375 | -57.502 | 23 | 2017-07-24 |
| NAT-UM3 | 1145 | Colonia | -34.375 | -57.502 | 5 | 2017-07-24 |
| NAT-UM4 | 1145 | Colonia | -34.376 | -57.500 | 13 | 2017-07-26 |
| NAT-UM5 | CHAC | Colonia | -34.413 | -57.843 | 7 | 2017-08-21 |
| NAT-RAW | EMBR | Colonia | -34.444 | -57.728 | 4 | 2017-07-17 |
| NAT-ZW8 | EMBR | Colonia | -34.444 | -57.728 | 8 | 2017-07-21 |
| INV-UM6 | ASCA | Texas | 31.754 | -106.405 | 25 | 2019-03-10 |
| INV-UM1 | BART | Texas | 30.305 | -97.695 | 23 | 2011-02-15 |
| INV-UM19 | CAME | Louisiana | 30.022 | -90.065 | 12 | 2004-03-30 |
| INV-UM7 | ELEM | Texas | 30.260 | -97.718 | 28 | 2019-08-06 |
| INV-UM9 | ELEM | Texas | 30.260 | -97.718 | 5 | 2019-08-06 |
| INV-UM10 | INTR | Texas | 30.317 | -97.728 | 6 | 2019-08-08 |
| INV-UM5 | ROBE | Louisiana | 30.021 | -90.069 | 20 | 2011-02-18 |
| INV-UM16 | SOCC | Texas | 30.270 | -97.761 | 8 | 2019-08-09 |
| INV-UM17 | SOCC | Texas | 30.270 | -97.761 | 5 | 2019-08-09 |
Median and range of the number of calls per known repeatedly sampled individual across ranges.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range, Bird_ID) %>%
dplyr::summarise(num_calls = length(Bird_ID)) %>%
dplyr::summarise(
median_calls = round(median(num_calls), 2),
range_calls = paste(round(range(num_calls), 2), collapse = " - ")
) %>%
kable(align = rep("c", ncol(.)))
## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.
| range | median_calls | range_calls |
|---|---|---|
| Native | 10 | 4 - 25 |
| Invasive | 12 | 5 - 28 |
Median and range of signal to noise ratio (SNR) across ranges.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Individual") %>%
group_by(range) %>%
filter(!is.na(SNR)) %>%
dplyr::summarise(
median_SNR = round(median(SNR), 2),
range_SNR = paste(round(range(SNR), 2), collapse = " - ")
) %>%
kable(align = rep("c", ncol(.)))
| range | median_SNR | range_SNR |
|---|---|---|
| Native | 16.80 | 9.69 - 26.6 |
| Invasive | 17.23 | 7.5 - 29.11 |
As described in the main text of the associated publication, this larger dataset contained one call per unique individual (most birds recorded for this dataset were unmarked). Get the total number of sites sampled across ranges for this dataset.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
dplyr::summarise(count = n_distinct(site_year)) %>%
kable(align = rep("c", ncol(.)))
| count |
|---|
| 63 |
Number of site-years across ranges (e.g. unique sites).
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range) %>%
dplyr::summarise(count = n_distinct(site_year)) %>%
kable(align = rep("c", ncol(.)))
| range | count |
|---|---|
| Native | 37 |
| Invasive | 26 |
Number of sites across ranges and years.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range, year) %>%
dplyr::summarise(count = n_distinct(site_year)) %>%
kable(align = rep("c", ncol(.)))
## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.
| range | year | count |
|---|---|---|
| Native | 2017 | 37 |
| Invasive | 2004 | 11 |
| Invasive | 2011 | 8 |
| Invasive | 2018 | 1 |
| Invasive | 2019 | 6 |
Total number of calls per site-year across ranges.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range) %>%
dplyr::summarise(num_calls = length(range)) %>%
kable(align = rep("c", ncol(.)))
| range | num_calls |
|---|---|
| Native | 610 |
| Invasive | 757 |
Median and range of the number of calls per site-year.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range, site_year) %>%
dplyr::summarise(num_calls = length(site_year)) %>%
dplyr::summarise(
median_calls = round(median(num_calls), 2),
range_calls = paste(round(range(num_calls), 2), collapse = " - ")
) %>%
kable(align = rep("c", ncol(.)))
## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.
| range | median_calls | range_calls |
|---|---|---|
| Native | 15.0 | 5 - 53 |
| Invasive | 15.5 | 5 - 93 |
Median and range of SNR of site-years across ranges, by range.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range) %>%
filter(!is.na(SNR)) %>%
dplyr::summarise(
median_SNR = round(median(SNR), 2),
range_SNR = paste(round(range(SNR), 2), collapse = " - ")
) %>%
kable(align = rep("c", ncol(.)))
| range | median_SNR | range_SNR |
|---|---|---|
| Native | 18.77 | 7.98 - 37.19 |
| Invasive | 13.48 | 7.02 - 33.96 |
Number of calls, sampling dates and geographic coordinates per site-year across ranges. Used for appendix tables as well.
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range, site, year, dept_state, invasive_city, lat, lon) %>%
distinct(date) %>%
arrange(range, site, dept_state, year) %>%
ungroup() %>%
inner_join(
nat_inv_est %>%
as_tibble() %>%
filter(social_scale == "Site") %>%
group_by(range, site, date, lat, lon) %>%
dplyr::summarise(
n_calls = length(sound.files)
),
by = c("range", "site", "date", "lat", "lon")
) %>%
dplyr::mutate(
lat = round(lat, 3),
lon = round(lon, 3)
) %>%
dplyr::select(range, site, dept_state, invasive_city, lat, lon, n_calls, date) %>%
kable(align = rep("c", ncol(.)))
## `summarise()` has grouped output by 'range', 'site', 'date', 'lat'. You can override using the `.groups` argument.
| range | site | dept_state | invasive_city | lat | lon | n_calls | date |
|---|---|---|---|---|---|---|---|
| Native | 1145 | Colonia | NA | -34.375 | -57.502 | 6 | 2017-07-26 |
| Native | 1145 | Colonia | NA | -34.376 | -57.500 | 7 | 2017-07-26 |
| Native | 1145 | Colonia | NA | -34.375 | -57.502 | 2 | 2017-07-24 |
| Native | 1145 | Colonia | NA | -34.375 | -57.502 | 1 | 2017-07-28 |
| Native | 1145 | Colonia | NA | -34.376 | -57.500 | 1 | 2017-07-29 |
| Native | ARAP | Salto | NA | -30.946 | -57.520 | 12 | 2017-05-07 |
| Native | ARAZ | San Jose | NA | -34.535 | -56.812 | 15 | 2017-11-03 |
| Native | BAGU | Montevideo | NA | -34.848 | -56.384 | 20 | 2017-10-09 |
| Native | BCAR | Montevideo | NA | -34.788 | -56.223 | 13 | 2017-10-20 |
| Native | CEME | Montevideo | NA | -34.913 | -56.187 | 6 | 2017-10-18 |
| Native | CHAC | Colonia | NA | -34.413 | -57.843 | 12 | 2017-08-21 |
| Native | CISN | Maldonado | NA | -34.861 | -55.150 | 28 | 2017-09-13 |
| Native | ECIL | San Jose | NA | -34.361 | -57.060 | 4 | 2017-07-28 |
| Native | ECIL | San Jose | NA | -34.360 | -57.060 | 13 | 2017-07-28 |
| Native | ELTE | Maldonado | NA | -34.889 | -54.863 | 23 | 2017-09-13 |
| Native | EMBR | Colonia | NA | -34.444 | -57.728 | 22 | 2017-07-21 |
| Native | EMBR | Colonia | NA | -34.444 | -57.728 | 1 | 2017-07-17 |
| Native | FAGR | Montevideo | NA | -34.838 | -56.219 | 7 | 2017-09-05 |
| Native | GOLF | Montevideo | NA | -34.923 | -56.164 | 22 | 2017-11-20 |
| Native | HIPE | Maldonado | NA | -34.825 | -55.010 | 5 | 2017-09-12 |
| Native | INBR | Canelones | NA | -34.668 | -56.330 | 19 | 2017-09-03 |
| Native | INES-01 | Colonia | NA | -34.349 | -57.727 | 12 | 2017-07-03 |
| Native | INES-03 | Colonia | NA | -34.336 | -57.668 | 15 | 2017-07-11 |
| Native | INES-04 | Colonia | NA | -34.335 | -57.668 | 9 | 2017-07-11 |
| Native | INES-05 | Colonia | NA | -34.340 | -57.690 | 6 | 2017-07-15 |
| Native | INES-06 | Colonia | NA | -34.344 | -57.708 | 6 | 2017-07-13 |
| Native | INES-07 | Colonia | NA | -34.346 | -57.710 | 9 | 2017-07-13 |
| Native | INES-08 | Colonia | NA | -34.345 | -57.733 | 27 | 2017-07-13 |
| Native | KIYU | San Jose | NA | -34.607 | -56.715 | 8 | 2017-11-03 |
| Native | LENA | Colonia | NA | -34.411 | -57.838 | 19 | 2017-10-23 |
| Native | OJOS | Rocha | NA | -33.804 | -53.506 | 23 | 2017-11-16 |
| Native | PAVO | San Jose | NA | -34.442 | -56.967 | 25 | 2017-10-17 |
| Native | PEIX | Montevideo | NA | -34.765 | -56.279 | 19 | 2017-10-06 |
| Native | PFER | Colonia | NA | -34.468 | -57.831 | 34 | 2017-06-19 |
| Native | PFER | Colonia | NA | -34.465 | -57.827 | 19 | 2017-06-21 |
| Native | PIED | Colonia | NA | -34.413 | -57.849 | 21 | 2017-10-25 |
| Native | PLVE | Maldonado | NA | -34.870 | -55.264 | 11 | 2017-05-21 |
| Native | PROO | Canelones | NA | -34.855 | -56.022 | 12 | 2017-09-14 |
| Native | QUEB | Maldonado | NA | -34.834 | -55.260 | 16 | 2017-09-13 |
| Native | RIAC | Colonia | NA | -34.437 | -57.706 | 17 | 2017-06-28 |
| Native | RIAC | Colonia | NA | -34.436 | -57.706 | 8 | 2017-06-28 |
| Native | ROSA | Colonia | NA | -34.338 | -57.336 | 15 | 2017-07-27 |
| Native | SAUC | Maldonado | NA | -34.857 | -55.041 | 6 | 2017-09-12 |
| Native | SEMI | Colonia | NA | -34.326 | -57.680 | 11 | 2017-07-25 |
| Native | VALI | Rocha | NA | -34.334 | -53.803 | 23 | 2017-11-16 |
| Invasive | AIRP | Texas | Austin | 30.285 | -97.705 | 9 | 2019-08-07 |
| Invasive | AUDU | Connecticut | Milford | 41.176 | -73.102 | 17 | 2004-03-30 |
| Invasive | BALL | Louisiana | New Orleans | 30.013 | -90.132 | 26 | 2004-03-30 |
| Invasive | BAPT | Florida | Miami | 25.688 | -80.338 | 40 | 2004-03-30 |
| Invasive | BUCK | Connecticut | Milford | 41.217 | -73.038 | 60 | 2004-03-30 |
| Invasive | CANA | Louisiana | New Orleans | 29.981 | -90.110 | 13 | 2004-03-30 |
| Invasive | COMM | Texas | Austin | 30.404 | -97.705 | 11 | 2004-03-30 |
| Invasive | ELEM | Texas | Austin | 30.260 | -97.718 | 12 | 2011-02-15 |
| Invasive | ELEM | Texas | Austin | 30.260 | -97.718 | 61 | 2019-08-06 |
| Invasive | FOLS | Louisiana | New Orleans | 30.027 | -90.205 | 10 | 2004-03-30 |
| Invasive | GILB | Arizona | Gilbert | 33.331 | -111.791 | 16 | 2018-04-09 |
| Invasive | INTR | Texas | Austin | 30.316 | -97.719 | 15 | 2011-02-15 |
| Invasive | INTR | Texas | Austin | 30.317 | -97.728 | 82 | 2019-08-08 |
| Invasive | LAKE | Louisiana | New Orleans | 30.029 | -90.077 | 6 | 2011-02-18 |
| Invasive | LAWT | Texas | Dallas | 32.820 | -96.730 | 9 | 2011-02-20 |
| Invasive | MANO | Texas | Austin | 30.299 | -97.686 | 5 | 2019-08-09 |
| Invasive | MART | Texas | Austin | 30.253 | -97.731 | 14 | 2011-02-15 |
| Invasive | MART | Texas | Austin | 30.251 | -97.731 | 50 | 2019-08-10 |
| Invasive | MEAD | Connecticut | Milford | 41.210 | -73.071 | 28 | 2004-03-30 |
| Invasive | ROBE | Louisiana | New Orleans | 30.021 | -90.069 | 24 | 2011-02-18 |
| Invasive | SHAK | Connecticut | Stratford | 41.184 | -73.126 | 50 | 2004-03-30 |
| Invasive | SOCC | Texas | Austin | 30.272 | -97.767 | 77 | 2004-03-30 |
| Invasive | SOCC | Texas | Austin | 30.270 | -97.761 | 93 | 2019-08-09 |
| Invasive | SOFT | Texas | Austin | 30.281 | -97.725 | 14 | 2011-02-15 |
| Invasive | VALL | Texas | Austin | 30.261 | -97.711 | 5 | 2004-03-30 |
| Invasive | VALL | Texas | Austin | 30.261 | -97.711 | 10 | 2011-02-15 |
Read in nest estimates to get nest density estimates at recording sites per range. Used for a table in the appendix.
nests <- read.csv(file.path(path, "NestEstimates.csv")) %>%
dplyr::select(c(Range, Year, Department_orCityState, Site_Code, Estimated_Nests))
Get median and interquartile range of number of estimated nests per range.
nests %>%
group_by(Range) %>%
dplyr::mutate(
Range = factor(Range, levels = c("Native", "Invasive"))
) %>%
dplyr::summarise(
median_nests = round(median(Estimated_Nests), 2),
IQR_nests = round(IQR(Estimated_Nests), 2)
) %>%
kable(align = rep("c", ncol(.)))
| Range | median_nests | IQR_nests |
|---|---|---|
| Native | 19.5 | 24.25 |
| Invasive | 4.0 | 6.00 |
Were nest number estimates significantly different between ranges? First check normality of nest data using histograms and a Shapiro-Wilks test.
hist(nests %>%
pull(Estimated_Nests))
shapiro.test(nests %>%
pull(Estimated_Nests))
##
## Shapiro-Wilk normality test
##
## data: nests %>% pull(Estimated_Nests)
## W = 0.45358, p-value = 1.39e-10
The nest data was skewed, so I asked whether the population distributions were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test). This test does not assume the data are normally distributed, but does assume the samples are independent. In order to meet the assumption that samples are independent of each other, I dropped some invasive range site-years for which we had nest estimates in more than one year.
# 4 invasive range sites had counts for 2 years
n2drop <- nests %>%
group_by(Site_Code) %>%
dplyr::summarise(n = n()) %>%
filter(n > 1) %>%
pull(Site_Code) %>%
as.character()
n2drop
## [1] "AIRP" "ELEM" "INTR" "MART"
# Drop the 2011 estimates for these sites
nests2 <- nests %>%
dplyr::mutate(site_year = paste(Site_Code, Year, sep = "-")) %>%
filter(!site_year %in% paste(n2drop, "2011", sep = "-")) %>%
droplevels() %>%
dplyr::mutate(
Range = factor(Range, levels = c("Native", "Invasive"))
)
nrow(nests) - nrow(nests2)
## [1] 4
# Checking, looks good
# nests2 %>%
# group_by(Site_Code) %>%
# dplyr::summarise(n = n()) %>%
# pull(n) %>%
# unique()
# Still non-normal after samples above were dropped
shapiro.test(nests2 %>%
pull(Estimated_Nests))
##
## Shapiro-Wilk normality test
##
## data: nests2 %>% pull(Estimated_Nests)
## W = 0.46941, p-value = 8.462e-10
The median and interquartile range of these filtered data per range were the same as the full dataset.
nests2 %>%
group_by(Range) %>%
dplyr::summarise(
median_nests = round(median(Estimated_Nests), 2),
IQR_nests = round(IQR(Estimated_Nests), 2)
) %>%
kable(align = rep("c", ncol(.)))
| Range | median_nests | IQR_nests |
|---|---|---|
| Native | 19.5 | 24.25 |
| Invasive | 4.0 | 6.00 |
Mann-Whitney test with independent nest estimates. Computed exact p-values and 95% CIs to avoid using normal approximations. stats::wilcox.test uses the normal approximation for data with ties, as is the case here. Used coin::wilcox_test instead, which handles data with ties and can either get the conditional null distribution by Monte Carlo resampling, or the exact distribution. The confidence intervals are for the difference in location reported below.
levels(nests2$Range)
## [1] "Native" "Invasive"
mw_res <- coin::wilcox_test(Estimated_Nests ~ Range, data = nests2, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")
mw_res
##
## Exact Wilcoxon-Mann-Whitney Test
##
## data: Estimated_Nests by Range (Native, Invasive)
## Z = 4.2055, p-value = 2.872e-06
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
## 7 26
## sample estimates:
## difference in location
## 14
Read in shapefiles for nesting sites obtained by tracing site areas on Google My Maps.
# Read in polygons, no z-dimension saved but that doesn't matter here
nest_site_polys <- readOGR(file.path(path, "URY_US_Polygons.kml"), "URY_US_Polygons")
## OGR data source with driver: LIBKML
## Source: "/media/gsvidaurre/MYIOPSITTA/R/VocalLearning_PostInvasion/Data/URY_US_Polygons.kml", layer: "URY_US_Polygons"
## with 25 features
## It has 11 fields
## Warning in readOGR(file.path(path, "URY_US_Polygons.kml"), "URY_US_Polygons"):
## Z-dimension discarded
# str(nest_site_polys)
length(nest_site_polys)
## [1] 25
# Repoject to EPSG 3857, a projected coordinate system used by Google Maps that has units in meters
nest_site_polys <- spTransform(nest_site_polys, CRSobj = CRS("+init=epsg:3857"))
# Looks good
proj4string(nest_site_polys)
## [1] "+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs"
bbox(nest_site_polys)
## min max
## x -12444548 -6123569
## y -4146383 3939279
# Get area in meters squared per polyon
ns_areas <- gArea(nest_site_polys, byid = TRUE)
ns_areas
## 1 2 3 4 5 6
## 320484.6326 4799.3819 7724.8855 31521.7993 4357.3265 8459.8991
## 7 8 9 10 11 12
## 5254.0449 13691.6860 11709.7432 45114.1232 7733.4930 3597.5193
## 13 14 15 16 17 18
## 214495.9551 12704.6108 11368.1638 8720.9933 3764.1841 701.7397
## 19 20 21 22 23 24
## 10549.7286 2819.3939 21274.8657 670.4065 28646.5755 6449.2127
## 25
## 2228.0292
# Convert to km squared (1 meter squared = 1e-6 km squared)
# ns_km2 <- ns_areas/1e6
# ns_km2
# Convert to hectares (1 square meter = 0.0001 hectares)
ns_hect <- ns_areas/1e4
# Check out nest site names
# nest_site_polys$Name
# Convert to character and replace the new line symbol in PLVE
# Doesn't disappear after correction on Google My Maps
nest_site_polys$Name <- as.character(nest_site_polys$Name)
nest_site_polys$Name
## [1] "INTR_2011" "CHAC_2017" "PIED_2017" "RIAC_2017" "SEMI_2017"
## [6] "INES-08_2017" "INES-07_2017" "INES-06_2017" "INES-03_2017" "1145_2017"
## [11] "INBR_2017" "BCAR_2017" "ECIL_2017" "PLVE_2017" "QUEB_2017"
## [16] "CISN_2017" "HIPE_2017" "GILB_2018" "INTR_2019" "AIRP_2019"
## [21] "SOCC_2019" "MANO_2019" "MART_2019" "SOFT_2011" "LAWT_2011"
# Add nest areas to the data frame with nest numbers
# Calculate nest density at each site (total nests / total area)
nests2 <- nests %>%
dplyr::mutate(
Site_Year = paste(Site_Code, Year, sep = "_")
) %>%
left_join(
data.frame(
Site_Year = nest_site_polys$Name,
Nest_Site_Area = ns_hect
) %>%
# Use the same nest site polygon for MART 2011 and 2019
bind_rows(
data.frame(
Site_Year = "MART_2011",
Nest_Site_Area = ns_hect[grep("MART", nest_site_polys$Name)]
)
),
by = "Site_Year"
) %>%
dplyr::mutate(
Nest_Density = Estimated_Nests/Nest_Site_Area
)
nests2 %>%
dplyr::mutate(
Nest_Site_Area = round(Nest_Site_Area, 2),
Nest_Density = round(Nest_Density, 2)
) %>%
dplyr::select(c(Range, Year, Department_orCityState, Site_Code, Estimated_Nests, Nest_Site_Area, Nest_Density)) %>%
kable(align = rep("c", ncol(.)))
| Range | Year | Department_orCityState | Site_Code | Estimated_Nests | Nest_Site_Area | Nest_Density |
|---|---|---|---|---|---|---|
| Native | 2017 | Maldonado | PLVE | 10 | 1.27 | 7.87 |
| Native | 2017 | Colonia | RIAC | 109 | 3.15 | 34.58 |
| Native | 2017 | San Jose | ECIL | 247 | 21.45 | 11.52 |
| Native | 2017 | Colonia | INES-01 | 10 | NA | NA |
| Native | 2017 | Colonia | SEMI | 29 | 0.44 | 66.55 |
| Native | 2017 | Colonia | INES-03 | 50 | 1.17 | 42.70 |
| Native | 2017 | Colonia | INES-07 | 15 | 0.53 | 28.55 |
| Native | 2017 | Colonia | INES-06 | 20 | 1.37 | 14.61 |
| Native | 2017 | Colonia | INES-08 | 25 | 0.85 | 29.55 |
| Native | 2017 | Colonia | INES-05 | 6 | NA | NA |
| Native | 2017 | Colonia | 1145 | 8 | 4.51 | 1.77 |
| Native | 2017 | Colonia | ROSA | 41 | NA | NA |
| Native | 2017 | Colonia | CHAC | 19 | 0.48 | 39.59 |
| Native | 2017 | Canelones | INBR | 20 | 0.77 | 25.86 |
| Native | 2017 | Montevideo | BCAR | 33 | 0.36 | 91.73 |
| Native | 2017 | Maldonado | HIPE | 15 | 0.38 | 39.85 |
| Native | 2017 | Maldonado | QUEB | 10 | 1.14 | 8.80 |
| Native | 2017 | Maldonado | CISN | 9 | 0.87 | 10.32 |
| Native | 2017 | Colonia | PIED | 38 | 0.77 | 49.19 |
| Native | 2017 | Rocha | VALI | 13 | NA | NA |
| Invasive | 2018 | Gilbert AZ | GILB | 3 | 0.07 | 42.75 |
| Invasive | 2019 | Austin TX | INTR | 13 | 1.05 | 12.32 |
| Invasive | 2019 | Austin TX | ELEM | 1 | NA | NA |
| Invasive | 2019 | Austin TX | AIRP | 5 | 0.28 | 17.73 |
| Invasive | 2019 | Austin TX | SOCC | 12 | 2.13 | 5.64 |
| Invasive | 2019 | Austin TX | MANO | 4 | 0.07 | 59.67 |
| Invasive | 2019 | Austin TX | MART | 8 | 2.86 | 2.79 |
| Invasive | 2011 | Austin TX | MART | 6 | 2.86 | 2.09 |
| Invasive | 2011 | Austin TX | VALL | 2 | NA | NA |
| Invasive | 2011 | Austin TX | ELEM | 4 | NA | NA |
| Invasive | 2011 | Austin TX | SOFT | 6 | 0.64 | 9.30 |
| Invasive | 2011 | Austin TX | AIRP | 2 | NA | NA |
| Invasive | 2011 | Austin TX | BART | 1 | NA | NA |
| Invasive | 2011 | Austin TX | INTR | 20 | 32.05 | 0.62 |
| Invasive | 2011 | New Orleans LA | ROBE | 8 | NA | NA |
| Invasive | 2011 | New Orleans LA | LAKE | 2 | NA | NA |
| Invasive | 2011 | Dallas TX | LAWT | 4 | 0.22 | 17.95 |
# Summary statistics
nests2 %>%
dplyr::mutate(
Nest_Site_Area = round(Nest_Site_Area, 2),
Nest_Density = round(Nest_Density, 2)
) %>%
dplyr::select(c(Range, Estimated_Nests, Nest_Site_Area, Nest_Density)) %>%
pivot_longer(
cols = c("Estimated_Nests", "Nest_Site_Area", "Nest_Density"),
names_to = "statistic",
values_to = "values"
) %>%
dplyr::mutate(
Range = factor(Range, levels = c("Native", "Invasive")),
statistic = factor(statistic, levels = c("Estimated_Nests", "Nest_Site_Area", "Nest_Density"))
) %>%
group_by(Range, statistic) %>%
dplyr::summarise(
median_nests = round(median(values, na.rm = TRUE), 2),
IQR_nests = round(IQR(values, na.rm = TRUE), 2)
) %>%
kable(align = rep("c", ncol(.)))
## `summarise()` has grouped output by 'Range'. You can override using the `.groups` argument.
| Range | statistic | median_nests | IQR_nests |
|---|---|---|---|
| Native | Estimated_Nests | 19.50 | 24.25 |
| Native | Nest_Site_Area | 0.86 | 0.78 |
| Native | Nest_Density | 29.05 | 29.34 |
| Invasive | Estimated_Nests | 4.00 | 6.00 |
| Invasive | Nest_Site_Area | 0.84 | 2.44 |
| Invasive | Nest_Density | 10.81 | 14.39 |
Statistical tests of the difference in nest site area and density per range.
# Only 2 invasive range sites (INTR, MART) had counts for 2 years, as expected
n2drop <- nests2 %>%
# Remove sites for which area was not obtained
filter(!is.na(Nest_Site_Area)) %>%
group_by(Site_Code) %>%
dplyr::summarise(n = n()) %>%
filter(n > 1) %>%
pull(Site_Code) %>%
as.character()
n2drop
## [1] "INTR" "MART"
# Drop the 2011 estimates for these sites
nests3 <- nests2 %>%
# Remove sites for which area was not obtained
filter(!is.na(Nest_Site_Area)) %>%
dplyr::mutate(site_year = paste(Site_Code, Year, sep = "-")) %>%
filter(!site_year %in% paste(n2drop, "2011", sep = "-")) %>%
droplevels() %>%
dplyr::mutate(
Range = factor(Range, levels = c("Native", "Invasive"))
)
nrow(nests2) - nrow(nests3)
## [1] 13
# Checking, looks good
# nests3 %>%
# group_by(Site_Code) %>%
# dplyr::summarise(n = n()) %>%
# pull(n) %>%
# unique()
16 native and 8 invasive range sites remain for statistical tests of differences in nest site area and density. Sample sizes were not matched between ranges.
nests3 %>%
group_by(Range) %>%
dplyr::summarise(n = n())
## # A tibble: 2 x 2
## Range n
## <fct> <int>
## 1 Native 16
## 2 Invasive 8
Check normality of nest site areas using histograms and Shapiro-Wilks test.
hist(nests3 %>%
pull(Nest_Site_Area))
shapiro.test(nests3 %>%
pull(Nest_Site_Area))
##
## Shapiro-Wilk normality test
##
## data: nests3 %>% pull(Nest_Site_Area)
## W = 0.39782, p-value = 6.579e-09
Repeated for nest densities.
hist(nests3 %>%
pull(Nest_Density))
shapiro.test(nests3 %>%
pull(Nest_Density))
##
## Shapiro-Wilk normality test
##
## data: nests3 %>% pull(Nest_Density)
## W = 0.89756, p-value = 0.01912
Both nest variables were skewed (not normally distributed). As for total nests above, I asked whether the population distributions per nest variable were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test) as above.
levels(nests3$Range)
## [1] "Native" "Invasive"
mw_nestArea_res <- coin::wilcox_test(Nest_Site_Area ~ Range, data = nests3, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")
mw_nestArea_res
##
## Exact Wilcoxon-Mann-Whitney Test
##
## data: Nest_Site_Area by Range (Native, Invasive)
## Z = 1.4697, p-value = 0.1531
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
## -0.2851693 1.0666424
## sample estimates:
## difference in location
## 0.3671253
mw_nestDensity_res <- coin::wilcox_test(Nest_Density ~ Range, data = nests3, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")
mw_nestDensity_res
##
## Exact Wilcoxon-Mann-Whitney Test
##
## data: Nest_Density by Range (Native, Invasive)
## Z = 0.91856, p-value = 0.3826
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
## -7.633161 28.938786
## sample estimates:
## difference in location
## 8.42497
1. Smith-Vidaurre, G., Araya-Salas, M., and T.F. Wright. 2020. Individual signatures outweigh social group identity in contact calls of a communally nesting parrot. Behavioral Ecology 31(2), 448-458.
2. Buhrman-Deever, S.C., Rappaport, A.R. and J.W. Bradbury. 2007. Geographic variation in contact calls of feral North American populations of the monk parakeet. The Condor 109(2), 389-398.
3. Araya‐Salas, M., and G. Smith‐Vidaurre. 2017. warbleR: An R package to streamline analysis of animal acoustic signals. Methods in Ecology and Evolution 8(2), 184-191.
The session info printed here represents the environment used for the final RMarkdown knitting. The software and package versions employed for main results are reported in the appendix of the associated article in Animal Behavior.
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] plyr_1.8.6 ggplotify_0.0.7 egg_0.4.5 gridExtra_2.3
## [5] cowplot_1.1.1 rgeos_0.5-5 rgdal_1.5-23 sp_1.4-5
## [9] coin_1.4-1 survival_3.2-11 lubridate_1.7.10 data.table_1.14.0
## [13] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.6 purrr_0.3.4
## [17] readr_1.4.0 tidyr_1.1.3 tibble_3.1.2 ggplot2_3.3.3
## [21] tidyverse_1.3.1 Rraven_1.0.13 pbapply_1.4-3 warbleR_1.1.27
## [25] NatureSounds_1.0.4 knitr_1.33 seewave_2.1.6 tuneR_1.3.3
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 matrixStats_0.59.0 fs_1.5.0
## [4] httr_1.4.2 tools_4.1.0 backports_1.2.1
## [7] utf8_1.2.1 R6_2.5.0 DBI_1.1.1
## [10] colorspace_2.0-1 withr_2.4.2 tidyselect_1.1.1
## [13] compiler_4.1.0 cli_2.5.0 rvest_1.0.0
## [16] xml2_1.3.2 sandwich_3.0-1 scales_1.1.1
## [19] mvtnorm_1.1-1 proxy_0.4-25 dtw_1.22-3
## [22] digest_0.6.27 rmarkdown_2.8 pkgconfig_2.0.3
## [25] htmltools_0.5.1.1 highr_0.9 dbplyr_2.1.1
## [28] rlang_0.4.11 readxl_1.3.1 rstudioapi_0.13
## [31] gridGraphics_0.5-1 generics_0.1.0 zoo_1.8-9
## [34] jsonlite_1.7.2 RCurl_1.98-1.3 magrittr_2.0.1
## [37] modeltools_0.2-23 Matrix_1.3-4 Rcpp_1.0.6
## [40] munsell_0.5.0 fansi_0.5.0 lifecycle_1.0.0
## [43] stringi_1.6.2 multcomp_1.4-17 yaml_2.2.1
## [46] MASS_7.3-54 grid_4.1.0 parallel_4.1.0
## [49] crayon_1.4.1 lattice_0.20-44 haven_2.4.1
## [52] splines_4.1.0 hms_1.1.0 pillar_1.6.1
## [55] rjson_0.2.20 fftw_1.0-6 codetools_0.2-18
## [58] stats4_4.1.0 reprex_2.0.0 glue_1.4.2
## [61] evaluate_0.14 BiocManager_1.30.15 modelr_0.1.8
## [64] vctrs_0.3.8 cellranger_1.1.0 gtable_0.3.0
## [67] assertthat_0.2.1 xfun_0.23 libcoin_1.0-8
## [70] broom_0.7.6 signal_0.7-7 rvcheck_0.1.8
## [73] TH.data_1.0-10 ellipsis_0.3.2
Social group sizes
We obtained the largest social group sizes observed during fieldwork in the native and invasive ranges. Some native range fieldwork observations were made during trapping for an independent study of genetic variation. All group sizes greater than or equal to 100 were set as estimates of 100 birds. This was an underestimate in most cases and was therefore more conservative. Group sizes were observed during staging, foraging, and flying by in each range.
Calculate median and interquartile range of the largest social groups estimated in each range.
Were estimated flock sizes equivalent between ranges? Checked normality of social group size data from our fieldwork in each range.
The social group size data was not normally distributed, and some flock sizes were obtained at the same sites in the same or different years. I retained one unique flock size estimate per site, then asked whether the population distributions were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test).
The distributions of estimated flock sizes per range were not identical. Native range estimated flock sizes were significantly larger. Note that the 95% CI could not be computed, likely due to the small sample size per range.
Figure 1
Made raincloud plots and saved these to an image file for Figure 1. Added asterisks for variables that were significantly different between ranges.