Invasive range calls were taken through a pre-processing workflow similar to previously published native range calls^[1]. Previously published invasive range calls for 2004 were provided by Susannah Buhrman-Deever^[2], and were pre-processed in the same way. See the appendix of the associated article in Animal Behavior for details on pre-processing steps. The package warbleR^[3] was used for pre-processing, making extended selection tables and spectrograms, and calculating acoustic measurements for downstream analyses. Some function names and other code in this script (and subsequent scripts) may need to be updated in order to reproduce results with more recent versions of the packages used. Here, we generated summary statistics of the combined native and invasive range datasets, then analyzed differences in nest and social group size estimates between ranges.

Install packages needed to run analyses in this script and the subsequent 3 scripts. Packages MASS, parallel, and grid are provided with the base R installation.

# Linux users may need to run the following terminal commands to install devtools and CRAN packages: 
# sudo apt-get install libcurl4-openssl-dev 
# sudo apt-get install libssl-dev
# sudo apt-get install libxml2-dev
# sudo apt install libgeos-dev
# sudo apt install libgdal-dev
# sudo apt install libudunits2-dev

# Install packages from CRAN
# Note that until orddom is back on CRAN, you may have to download an earlier version of the package from source
X <- c("ggplot2", "pbapply", "data.table", "tidyverse", "scales", "egg", "ggplotify", "grid", "rgdal", "ranger", "caret", "rgeos", "Rmisc", "Rraven", "lubridate", "knitr", "coin", "cowplot", "dplyr", "corrplot", "MLmetrics", "e1071", "gbm", "pdp", "pracma", "gridExtra", "facetscales", "grDevices", "gtable", "rstatix", "orddom", "IDmeasurer", "utils")

is_installed <- function(p) is.element(p, installed.packages()[,1])

lapply(1:length(X), function(x){
  if(!is_installed(X[x])){
    install.packages(X[x], repos = "http://lib.stat.cmu.edu/R/CRAN")
  }
})

# Install the most recent version of the warbleR package from GitHub
# Install devtools if not installed
if (!"devtools" %in% installed.packages()[,"Package"]) install.packages("devtools")

# Linux users may have to install fftw:
# sudo apt-get install libfftw3-dev libfftw3-doc
# as well as cmake for bioacoustics to be installed:
# sudo apt-get install cmake
library(devtools)
install_github("maRce10/warbleR")
install_github("zmjones/edarf", subdir = "pkg")

# facetscales has a bug when making boxplots, we did not use this package to make boxplots
install_github("zeehio/facetscales")

# To install a previous version of orddom from source:

# May need to install package psych from source depending on R version
pkgurl <- "https://cran.r-project.org/src/contrib/Archive/psych/psych_2.0.12.tar.gz"
install.packages(pkgurl, repos = NULL, type="source")

pkgurl <- "https://cran.r-project.org/src/contrib/Archive/orddom/orddom_3.1.tar.gz"
install.packages(pkgurl, repos = NULL, type="source")

Load packages and external functions.

rm(list = ls())

X <- c("warbleR", "pbapply", "Rraven", "tidyverse", "ggplot2", "data.table", "lubridate", "knitr", "coin", "rgdal", "rgeos", "cowplot", "egg", "ggplotify")

invisible(lapply(X, library, character.only = TRUE))

path <- "/media/gsvidaurre/MYIOPSITTA/R/VocalLearning_PostInvasion/Data"
gpath <- "/home/gsvidaurre/Desktop/MANUSCRIPTS/SimplerSignatures_PostInvasion/FIGURES"

seed <- 401

cores <- parallel::detectCores() - 2

# Call R code to make raincloud plots, downloaded from https://github.com/RainCloudPlots/RainCloudPlots
# Loading these scripts will install packages if these are not already on your machine
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/R_rainclouds.R")
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/summarySE.R")
source("/home/gsvidaurre/Desktop/Software/RainCloudPlots-master/tutorial_R/simulateData.R")

Read in the extended selection table that contains metadata and wave objects for pre-processed native and invasive range calls. Throughout this script, you can reproduce analyses by reading in the file “nat_inv_indiv_site_seltbl.csv”, which is a selection table that contains the metadata employed here.

nat_inv_est <- readRDS(file.path(path, "nat_inv_indiv_site_EST.RDS"))
glimpse(nat_inv_est)

## Rows: 1,596
## Columns: 26
## $ sound.files          <chr> "2017_07_17_EMBR_INDIV-RAW_MZ000199_1.WAV", "2017…
## $ selec                <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ start                <dbl> 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0…
## $ end                  <dbl> 0.2646363, 0.2310324, 0.2509257, 0.2378781, 0.220…
## $ date                 <date> 2017-07-17, 2017-07-17, 2017-07-17, 2017-07-17, …
## $ range                <fct> Native, Native, Native, Native, Native, Native, N…
## $ social_scale         <chr> "Individual", "Individual", "Individual", "Indivi…
## $ site                 <chr> "EMBR", "EMBR", "EMBR", "EMBR", "EMBR", "EMBR", "…
## $ Bird_ID              <chr> "NAT-RAW", "NAT-RAW", "NAT-RAW", "NAT-RAW", "NAT-…
## $ lat                  <dbl> -34.44437, -34.44437, -34.44437, -34.44437, -34.4…
## $ lon                  <dbl> -57.72845, -57.72845, -57.72845, -57.72845, -57.7…
## $ Visual_quality_score <fct> H, M, H, H, M, M, M, H, M, H, H, H, M, M, H, H, H…
## $ Overlapping_signal   <fct> N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N…
## $ Truncated            <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N",…
## $ tailored             <fct> y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y…
## $ prev_SNR             <dbl> 20.10350, 14.90409, 18.55063, 18.49727, 18.74672,…
## $ SNR_margin_works     <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y",…
## $ SNR_before_only      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ old.sound.file.name  <chr> "2017_07_17_EMBR_INDIV-RAW_MZ000199_1.WAV", "2017…
## $ SNR                  <dbl> 23.45415, 22.14815, 15.36660, 19.25200, 21.86770,…
## $ region               <chr> "West", "West", "West", "West", "West", "West", "…
## $ country              <chr> "Uruguay", "Uruguay", "Uruguay", "Uruguay", "Urug…
## $ site_year            <chr> "EMBR_2017", "EMBR_2017", "EMBR_2017", "EMBR_2017…
## $ year                 <chr> "2017", "2017", "2017", "2017", "2017", "2017", "…
## $ dept_state           <chr> "Colonia", "Colonia", "Colonia", "Colonia", "Colo…
## $ invasive_city        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…

# Write out selection table for sharing data (not in extended format, just a .csv)
# nat_inv_est %>%
#   as_tibble %>%
#   write.csv(., file.path(path, "nat_inv_indiv_site_seltbl.csv"), row.names = FALSE)

Repeatedly sampled individuals

Total number of repeatedly sampled individual calls between ranges and by range.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  dplyr::count() %>%
  kable(align = rep("c", ncol(.)))

n
229

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range) %>%
  dplyr::count() %>%
  kable(align = rep("c", ncol(.)))

range	n
Native	97
Invasive	132

Number of known repeatedly sampled individuals across ranges.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range) %>%
  dplyr::summarise(count = n_distinct(Bird_ID)) %>%
  kable(align = rep("c", ncol(.)))

range	count
Native	8
Invasive	9

Number of sites at which we sampled known repeatedly sampled individuals across ranges.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range) %>%
  dplyr::summarise(count = n_distinct(site)) %>%
  kable(align = rep("c", ncol(.)))

range	count
Native	3
Invasive	7

Number of calls per known repeatedly sampled individual across ranges, and recording dates. Used for a table in the appendix.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range, Bird_ID, lat, lon, site, dept_state) %>%
  dplyr::count() %>%
  arrange(range, site) %>%
  inner_join(
    nat_inv_est %>%
      as_tibble() %>%
      filter(social_scale == "Individual") %>%
      group_by(range, Bird_ID) %>%
      distinct(date),
    by = c("range", "Bird_ID")
  ) %>%
  ungroup() %>%
  dplyr::mutate(
    lat = round(lat, 3),
    lon = round(lon, 3)
  ) %>%
  dplyr::select(Bird_ID, site, dept_state, lat, lon, n, date) %>%
  kable(align = rep("c", ncol(.)))

Bird_ID	site	dept_state	lat	lon	n	date
NAT-AAT	1145	Colonia	-34.376	-57.500	12	2017-07-29
NAT-UM1	1145	Colonia	-34.375	-57.502	25	2017-07-28
NAT-UM2	1145	Colonia	-34.375	-57.502	23	2017-07-24
NAT-UM3	1145	Colonia	-34.375	-57.502	5	2017-07-24
NAT-UM4	1145	Colonia	-34.376	-57.500	13	2017-07-26
NAT-UM5	CHAC	Colonia	-34.413	-57.843	7	2017-08-21
NAT-RAW	EMBR	Colonia	-34.444	-57.728	4	2017-07-17
NAT-ZW8	EMBR	Colonia	-34.444	-57.728	8	2017-07-21
INV-UM6	ASCA	Texas	31.754	-106.405	25	2019-03-10
INV-UM1	BART	Texas	30.305	-97.695	23	2011-02-15
INV-UM19	CAME	Louisiana	30.022	-90.065	12	2004-03-30
INV-UM7	ELEM	Texas	30.260	-97.718	28	2019-08-06
INV-UM9	ELEM	Texas	30.260	-97.718	5	2019-08-06
INV-UM10	INTR	Texas	30.317	-97.728	6	2019-08-08
INV-UM5	ROBE	Louisiana	30.021	-90.069	20	2011-02-18
INV-UM16	SOCC	Texas	30.270	-97.761	8	2019-08-09
INV-UM17	SOCC	Texas	30.270	-97.761	5	2019-08-09

Median and range of the number of calls per known repeatedly sampled individual across ranges.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range, Bird_ID) %>%
  dplyr::summarise(num_calls = length(Bird_ID)) %>%
  dplyr::summarise(
    median_calls = round(median(num_calls), 2),
    range_calls = paste(round(range(num_calls), 2), collapse = " - ")
  ) %>%
  kable(align = rep("c", ncol(.)))

## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.

range	median_calls	range_calls
Native	10	4 - 25
Invasive	12	5 - 28

Median and range of signal to noise ratio (SNR) across ranges.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Individual") %>%
  group_by(range) %>%
  filter(!is.na(SNR)) %>%
  dplyr::summarise(
    median_SNR = round(median(SNR), 2),
    range_SNR = paste(round(range(SNR), 2), collapse = " - ")
  ) %>%
  kable(align = rep("c", ncol(.)))

range	median_SNR	range_SNR
Native	16.80	9.69 - 26.6
Invasive	17.23	7.5 - 29.11

Call dataset of broader geographic resolution

As described in the main text of the associated publication, this larger dataset contained one call per unique individual (most birds recorded for this dataset were unmarked). Get the total number of sites sampled across ranges for this dataset.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  dplyr::summarise(count = n_distinct(site_year)) %>%
  kable(align = rep("c", ncol(.)))

count
63

Number of site-years across ranges (e.g. unique sites).

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range) %>%
  dplyr::summarise(count = n_distinct(site_year)) %>%
  kable(align = rep("c", ncol(.)))

range	count
Native	37
Invasive	26

Number of sites across ranges and years.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range, year) %>%
  dplyr::summarise(count = n_distinct(site_year)) %>%
  kable(align = rep("c", ncol(.)))

## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.

range	year	count
Native	2017	37
Invasive	2004	11
Invasive	2011	8
Invasive	2018	1
Invasive	2019	6

Total number of calls per site-year across ranges.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range) %>%
  dplyr::summarise(num_calls = length(range)) %>%
  kable(align = rep("c", ncol(.)))

range	num_calls
Native	610
Invasive	757

Median and range of the number of calls per site-year.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range, site_year) %>%
  dplyr::summarise(num_calls = length(site_year)) %>%
  dplyr::summarise(
    median_calls = round(median(num_calls), 2),
    range_calls = paste(round(range(num_calls), 2), collapse = " - ")
  ) %>%
  kable(align = rep("c", ncol(.)))

## `summarise()` has grouped output by 'range'. You can override using the `.groups` argument.

range	median_calls	range_calls
Native	15.0	5 - 53
Invasive	15.5	5 - 93

Median and range of SNR of site-years across ranges, by range.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range) %>%
  filter(!is.na(SNR)) %>%
  dplyr::summarise(
    median_SNR = round(median(SNR), 2),
    range_SNR = paste(round(range(SNR), 2), collapse = " - ")
  ) %>%
  kable(align = rep("c", ncol(.)))

range	median_SNR	range_SNR
Native	18.77	7.98 - 37.19
Invasive	13.48	7.02 - 33.96

Number of calls, sampling dates and geographic coordinates per site-year across ranges. Used for appendix tables as well.

nat_inv_est %>%
  as_tibble() %>%
  filter(social_scale == "Site") %>%
  group_by(range, site, year, dept_state, invasive_city, lat, lon) %>%
  distinct(date) %>%
  arrange(range, site, dept_state, year) %>%
  ungroup() %>%
  inner_join(
    nat_inv_est %>%
      as_tibble() %>%
      filter(social_scale == "Site") %>%
      group_by(range, site, date, lat, lon) %>%
      dplyr::summarise(
        n_calls = length(sound.files)
      ),
  by = c("range", "site", "date", "lat", "lon")
  ) %>%
  dplyr::mutate(
    lat = round(lat, 3),
    lon = round(lon, 3)
  ) %>%
  dplyr::select(range, site, dept_state, invasive_city, lat, lon, n_calls, date) %>%
  kable(align = rep("c", ncol(.)))

## `summarise()` has grouped output by 'range', 'site', 'date', 'lat'. You can override using the `.groups` argument.

range	site	dept_state	invasive_city	lat	lon	n_calls	date
Native	1145	Colonia	NA	-34.375	-57.502	6	2017-07-26
Native	1145	Colonia	NA	-34.376	-57.500	7	2017-07-26
Native	1145	Colonia	NA	-34.375	-57.502	2	2017-07-24
Native	1145	Colonia	NA	-34.375	-57.502	1	2017-07-28
Native	1145	Colonia	NA	-34.376	-57.500	1	2017-07-29
Native	ARAP	Salto	NA	-30.946	-57.520	12	2017-05-07
Native	ARAZ	San Jose	NA	-34.535	-56.812	15	2017-11-03
Native	BAGU	Montevideo	NA	-34.848	-56.384	20	2017-10-09
Native	BCAR	Montevideo	NA	-34.788	-56.223	13	2017-10-20
Native	CEME	Montevideo	NA	-34.913	-56.187	6	2017-10-18
Native	CHAC	Colonia	NA	-34.413	-57.843	12	2017-08-21
Native	CISN	Maldonado	NA	-34.861	-55.150	28	2017-09-13
Native	ECIL	San Jose	NA	-34.361	-57.060	4	2017-07-28
Native	ECIL	San Jose	NA	-34.360	-57.060	13	2017-07-28
Native	ELTE	Maldonado	NA	-34.889	-54.863	23	2017-09-13
Native	EMBR	Colonia	NA	-34.444	-57.728	22	2017-07-21
Native	EMBR	Colonia	NA	-34.444	-57.728	1	2017-07-17
Native	FAGR	Montevideo	NA	-34.838	-56.219	7	2017-09-05
Native	GOLF	Montevideo	NA	-34.923	-56.164	22	2017-11-20
Native	HIPE	Maldonado	NA	-34.825	-55.010	5	2017-09-12
Native	INBR	Canelones	NA	-34.668	-56.330	19	2017-09-03
Native	INES-01	Colonia	NA	-34.349	-57.727	12	2017-07-03
Native	INES-03	Colonia	NA	-34.336	-57.668	15	2017-07-11
Native	INES-04	Colonia	NA	-34.335	-57.668	9	2017-07-11
Native	INES-05	Colonia	NA	-34.340	-57.690	6	2017-07-15
Native	INES-06	Colonia	NA	-34.344	-57.708	6	2017-07-13
Native	INES-07	Colonia	NA	-34.346	-57.710	9	2017-07-13
Native	INES-08	Colonia	NA	-34.345	-57.733	27	2017-07-13
Native	KIYU	San Jose	NA	-34.607	-56.715	8	2017-11-03
Native	LENA	Colonia	NA	-34.411	-57.838	19	2017-10-23
Native	OJOS	Rocha	NA	-33.804	-53.506	23	2017-11-16
Native	PAVO	San Jose	NA	-34.442	-56.967	25	2017-10-17
Native	PEIX	Montevideo	NA	-34.765	-56.279	19	2017-10-06
Native	PFER	Colonia	NA	-34.468	-57.831	34	2017-06-19
Native	PFER	Colonia	NA	-34.465	-57.827	19	2017-06-21
Native	PIED	Colonia	NA	-34.413	-57.849	21	2017-10-25
Native	PLVE	Maldonado	NA	-34.870	-55.264	11	2017-05-21
Native	PROO	Canelones	NA	-34.855	-56.022	12	2017-09-14
Native	QUEB	Maldonado	NA	-34.834	-55.260	16	2017-09-13
Native	RIAC	Colonia	NA	-34.437	-57.706	17	2017-06-28
Native	RIAC	Colonia	NA	-34.436	-57.706	8	2017-06-28
Native	ROSA	Colonia	NA	-34.338	-57.336	15	2017-07-27
Native	SAUC	Maldonado	NA	-34.857	-55.041	6	2017-09-12
Native	SEMI	Colonia	NA	-34.326	-57.680	11	2017-07-25
Native	VALI	Rocha	NA	-34.334	-53.803	23	2017-11-16
Invasive	AIRP	Texas	Austin	30.285	-97.705	9	2019-08-07
Invasive	AUDU	Connecticut	Milford	41.176	-73.102	17	2004-03-30
Invasive	BALL	Louisiana	New Orleans	30.013	-90.132	26	2004-03-30
Invasive	BAPT	Florida	Miami	25.688	-80.338	40	2004-03-30
Invasive	BUCK	Connecticut	Milford	41.217	-73.038	60	2004-03-30
Invasive	CANA	Louisiana	New Orleans	29.981	-90.110	13	2004-03-30
Invasive	COMM	Texas	Austin	30.404	-97.705	11	2004-03-30
Invasive	ELEM	Texas	Austin	30.260	-97.718	12	2011-02-15
Invasive	ELEM	Texas	Austin	30.260	-97.718	61	2019-08-06
Invasive	FOLS	Louisiana	New Orleans	30.027	-90.205	10	2004-03-30
Invasive	GILB	Arizona	Gilbert	33.331	-111.791	16	2018-04-09
Invasive	INTR	Texas	Austin	30.316	-97.719	15	2011-02-15
Invasive	INTR	Texas	Austin	30.317	-97.728	82	2019-08-08
Invasive	LAKE	Louisiana	New Orleans	30.029	-90.077	6	2011-02-18
Invasive	LAWT	Texas	Dallas	32.820	-96.730	9	2011-02-20
Invasive	MANO	Texas	Austin	30.299	-97.686	5	2019-08-09
Invasive	MART	Texas	Austin	30.253	-97.731	14	2011-02-15
Invasive	MART	Texas	Austin	30.251	-97.731	50	2019-08-10
Invasive	MEAD	Connecticut	Milford	41.210	-73.071	28	2004-03-30
Invasive	ROBE	Louisiana	New Orleans	30.021	-90.069	24	2011-02-18
Invasive	SHAK	Connecticut	Stratford	41.184	-73.126	50	2004-03-30
Invasive	SOCC	Texas	Austin	30.272	-97.767	77	2004-03-30
Invasive	SOCC	Texas	Austin	30.270	-97.761	93	2019-08-09
Invasive	SOFT	Texas	Austin	30.281	-97.725	14	2011-02-15
Invasive	VALL	Texas	Austin	30.261	-97.711	5	2004-03-30
Invasive	VALL	Texas	Austin	30.261	-97.711	10	2011-02-15

Nest estimates

Read in nest estimates to get nest density estimates at recording sites per range. Used for a table in the appendix.

nests <- read.csv(file.path(path, "NestEstimates.csv")) %>%
  dplyr::select(c(Range, Year, Department_orCityState, Site_Code, Estimated_Nests))

Get median and interquartile range of number of estimated nests per range.

nests %>%
  group_by(Range) %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive"))
  ) %>%
  dplyr::summarise(
    median_nests = round(median(Estimated_Nests), 2),
    IQR_nests = round(IQR(Estimated_Nests), 2)
  ) %>%
  kable(align = rep("c", ncol(.)))

Range	median_nests	IQR_nests
Native	19.5	24.25
Invasive	4.0	6.00

Were nest number estimates significantly different between ranges? First check normality of nest data using histograms and a Shapiro-Wilks test.

hist(nests %>%
  pull(Estimated_Nests))

shapiro.test(nests %>%
  pull(Estimated_Nests))

## 
##  Shapiro-Wilk normality test
## 
## data:  nests %>% pull(Estimated_Nests)
## W = 0.45358, p-value = 1.39e-10

The nest data was skewed, so I asked whether the population distributions were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test). This test does not assume the data are normally distributed, but does assume the samples are independent. In order to meet the assumption that samples are independent of each other, I dropped some invasive range site-years for which we had nest estimates in more than one year.

# 4 invasive range sites had counts for 2 years
n2drop <- nests %>%
  group_by(Site_Code) %>%
  dplyr::summarise(n = n()) %>%
  filter(n > 1) %>%
  pull(Site_Code) %>%
  as.character()

n2drop

## [1] "AIRP" "ELEM" "INTR" "MART"

# Drop the 2011 estimates for these sites
nests2 <- nests %>%
  dplyr::mutate(site_year = paste(Site_Code, Year, sep = "-")) %>%
  filter(!site_year %in% paste(n2drop, "2011", sep = "-")) %>%
  droplevels() %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive"))
  )
nrow(nests) - nrow(nests2)

## [1] 4

# Checking, looks good
# nests2 %>%
#   group_by(Site_Code) %>%
#   dplyr::summarise(n = n()) %>%
#   pull(n) %>%
#   unique()

# Still non-normal after samples above were dropped
shapiro.test(nests2 %>%
  pull(Estimated_Nests))

## 
##  Shapiro-Wilk normality test
## 
## data:  nests2 %>% pull(Estimated_Nests)
## W = 0.46941, p-value = 8.462e-10

The median and interquartile range of these filtered data per range were the same as the full dataset.

nests2 %>%
  group_by(Range) %>%
  dplyr::summarise(
    median_nests = round(median(Estimated_Nests), 2),
    IQR_nests = round(IQR(Estimated_Nests), 2)
  ) %>%
  kable(align = rep("c", ncol(.)))

Range	median_nests	IQR_nests
Native	19.5	24.25
Invasive	4.0	6.00

Mann-Whitney test with independent nest estimates. Computed exact p-values and 95% CIs to avoid using normal approximations. stats::wilcox.test uses the normal approximation for data with ties, as is the case here. Used coin::wilcox_test instead, which handles data with ties and can either get the conditional null distribution by Monte Carlo resampling, or the exact distribution. The confidence intervals are for the difference in location reported below.

levels(nests2$Range)

## [1] "Native"   "Invasive"

mw_res <- coin::wilcox_test(Estimated_Nests ~ Range, data = nests2, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")

mw_res

## 
##  Exact Wilcoxon-Mann-Whitney Test
## 
## data:  Estimated_Nests by Range (Native, Invasive)
## Z = 4.2055, p-value = 2.872e-06
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
##   7 26
## sample estimates:
## difference in location 
##                     14

Nest densities

Read in shapefiles for nesting sites obtained by tracing site areas on Google My Maps.

# Read in polygons, no z-dimension saved but that doesn't matter here
nest_site_polys <- readOGR(file.path(path, "URY_US_Polygons.kml"), "URY_US_Polygons")

## OGR data source with driver: LIBKML 
## Source: "/media/gsvidaurre/MYIOPSITTA/R/VocalLearning_PostInvasion/Data/URY_US_Polygons.kml", layer: "URY_US_Polygons"
## with 25 features
## It has 11 fields

## Warning in readOGR(file.path(path, "URY_US_Polygons.kml"), "URY_US_Polygons"):
## Z-dimension discarded

# str(nest_site_polys)
length(nest_site_polys)

## [1] 25

# Repoject to EPSG 3857, a projected coordinate system used by Google Maps that has units in meters
nest_site_polys <- spTransform(nest_site_polys, CRSobj = CRS("+init=epsg:3857"))

# Looks good
proj4string(nest_site_polys)

## [1] "+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs"

bbox(nest_site_polys)

##         min      max
## x -12444548 -6123569
## y  -4146383  3939279

# Get area in meters squared per polyon
ns_areas <- gArea(nest_site_polys, byid = TRUE)
ns_areas

##           1           2           3           4           5           6 
## 320484.6326   4799.3819   7724.8855  31521.7993   4357.3265   8459.8991 
##           7           8           9          10          11          12 
##   5254.0449  13691.6860  11709.7432  45114.1232   7733.4930   3597.5193 
##          13          14          15          16          17          18 
## 214495.9551  12704.6108  11368.1638   8720.9933   3764.1841    701.7397 
##          19          20          21          22          23          24 
##  10549.7286   2819.3939  21274.8657    670.4065  28646.5755   6449.2127 
##          25 
##   2228.0292

# Convert to km squared (1 meter squared = 1e-6 km squared)
# ns_km2 <- ns_areas/1e6
# ns_km2

# Convert to hectares (1 square meter = 0.0001 hectares)
ns_hect <- ns_areas/1e4

# Check out nest site names
# nest_site_polys$Name

# Convert to character and replace the new line symbol in PLVE
# Doesn't disappear after correction on Google My Maps
nest_site_polys$Name <- as.character(nest_site_polys$Name)
nest_site_polys$Name

##  [1] "INTR_2011"    "CHAC_2017"    "PIED_2017"    "RIAC_2017"    "SEMI_2017"   
##  [6] "INES-08_2017" "INES-07_2017" "INES-06_2017" "INES-03_2017" "1145_2017"   
## [11] "INBR_2017"    "BCAR_2017"    "ECIL_2017"    "PLVE_2017"    "QUEB_2017"   
## [16] "CISN_2017"    "HIPE_2017"    "GILB_2018"    "INTR_2019"    "AIRP_2019"   
## [21] "SOCC_2019"    "MANO_2019"    "MART_2019"    "SOFT_2011"    "LAWT_2011"

# Add nest areas to the data frame with nest numbers
# Calculate nest density at each site (total nests / total area)
nests2 <- nests %>%
  dplyr::mutate(
    Site_Year = paste(Site_Code, Year, sep = "_")
  ) %>%
  left_join(
    data.frame(
      Site_Year = nest_site_polys$Name,
      Nest_Site_Area = ns_hect
    ) %>%
    # Use the same nest site polygon for MART 2011 and 2019
    bind_rows(
      data.frame(
        Site_Year = "MART_2011",
        Nest_Site_Area = ns_hect[grep("MART", nest_site_polys$Name)]
      )
    ),
    by = "Site_Year"
  ) %>%
  dplyr::mutate(
    Nest_Density = Estimated_Nests/Nest_Site_Area
  ) 

nests2 %>%
  dplyr::mutate(
    Nest_Site_Area = round(Nest_Site_Area, 2),
    Nest_Density = round(Nest_Density, 2)
  ) %>%
  dplyr::select(c(Range, Year, Department_orCityState, Site_Code, Estimated_Nests, Nest_Site_Area, Nest_Density)) %>%
  kable(align = rep("c", ncol(.)))

Range	Year	Department_orCityState	Site_Code	Estimated_Nests	Nest_Site_Area	Nest_Density
Native	2017	Maldonado	PLVE	10	1.27	7.87
Native	2017	Colonia	RIAC	109	3.15	34.58
Native	2017	San Jose	ECIL	247	21.45	11.52
Native	2017	Colonia	INES-01	10	NA	NA
Native	2017	Colonia	SEMI	29	0.44	66.55
Native	2017	Colonia	INES-03	50	1.17	42.70
Native	2017	Colonia	INES-07	15	0.53	28.55
Native	2017	Colonia	INES-06	20	1.37	14.61
Native	2017	Colonia	INES-08	25	0.85	29.55
Native	2017	Colonia	INES-05	6	NA	NA
Native	2017	Colonia	1145	8	4.51	1.77
Native	2017	Colonia	ROSA	41	NA	NA
Native	2017	Colonia	CHAC	19	0.48	39.59
Native	2017	Canelones	INBR	20	0.77	25.86
Native	2017	Montevideo	BCAR	33	0.36	91.73
Native	2017	Maldonado	HIPE	15	0.38	39.85
Native	2017	Maldonado	QUEB	10	1.14	8.80
Native	2017	Maldonado	CISN	9	0.87	10.32
Native	2017	Colonia	PIED	38	0.77	49.19
Native	2017	Rocha	VALI	13	NA	NA
Invasive	2018	Gilbert AZ	GILB	3	0.07	42.75
Invasive	2019	Austin TX	INTR	13	1.05	12.32
Invasive	2019	Austin TX	ELEM	1	NA	NA
Invasive	2019	Austin TX	AIRP	5	0.28	17.73
Invasive	2019	Austin TX	SOCC	12	2.13	5.64
Invasive	2019	Austin TX	MANO	4	0.07	59.67
Invasive	2019	Austin TX	MART	8	2.86	2.79
Invasive	2011	Austin TX	MART	6	2.86	2.09
Invasive	2011	Austin TX	VALL	2	NA	NA
Invasive	2011	Austin TX	ELEM	4	NA	NA
Invasive	2011	Austin TX	SOFT	6	0.64	9.30
Invasive	2011	Austin TX	AIRP	2	NA	NA
Invasive	2011	Austin TX	BART	1	NA	NA
Invasive	2011	Austin TX	INTR	20	32.05	0.62
Invasive	2011	New Orleans LA	ROBE	8	NA	NA
Invasive	2011	New Orleans LA	LAKE	2	NA	NA
Invasive	2011	Dallas TX	LAWT	4	0.22	17.95

# Summary statistics
nests2 %>%
  dplyr::mutate(
    Nest_Site_Area = round(Nest_Site_Area, 2),
    Nest_Density = round(Nest_Density, 2)
  ) %>%
  dplyr::select(c(Range, Estimated_Nests, Nest_Site_Area, Nest_Density)) %>%
  pivot_longer(
    cols = c("Estimated_Nests", "Nest_Site_Area", "Nest_Density"),
    names_to = "statistic",
    values_to = "values"
  ) %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive")),
    statistic = factor(statistic, levels = c("Estimated_Nests", "Nest_Site_Area", "Nest_Density"))
  ) %>%
  group_by(Range, statistic) %>%
  dplyr::summarise(
    median_nests = round(median(values, na.rm = TRUE), 2),
    IQR_nests = round(IQR(values, na.rm = TRUE), 2)
  ) %>%
  kable(align = rep("c", ncol(.)))

## `summarise()` has grouped output by 'Range'. You can override using the `.groups` argument.

Range	statistic	median_nests	IQR_nests
Native	Estimated_Nests	19.50	24.25
Native	Nest_Site_Area	0.86	0.78
Native	Nest_Density	29.05	29.34
Invasive	Estimated_Nests	4.00	6.00
Invasive	Nest_Site_Area	0.84	2.44
Invasive	Nest_Density	10.81	14.39

Statistical tests of the difference in nest site area and density per range.

# Only 2 invasive range sites (INTR, MART) had counts for 2 years, as expected
n2drop <- nests2 %>%
  # Remove sites for which area was not obtained
  filter(!is.na(Nest_Site_Area)) %>%
  group_by(Site_Code) %>%
  dplyr::summarise(n = n()) %>%
  filter(n > 1) %>%
  pull(Site_Code) %>%
  as.character()

n2drop

## [1] "INTR" "MART"

# Drop the 2011 estimates for these sites
nests3 <- nests2 %>%
  # Remove sites for which area was not obtained
  filter(!is.na(Nest_Site_Area)) %>%
  dplyr::mutate(site_year = paste(Site_Code, Year, sep = "-")) %>%
  filter(!site_year %in% paste(n2drop, "2011", sep = "-")) %>%
  droplevels() %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive"))
  )
nrow(nests2) - nrow(nests3)

## [1] 13

# Checking, looks good
# nests3 %>%
#   group_by(Site_Code) %>%
#   dplyr::summarise(n = n()) %>%
#   pull(n) %>%
#   unique()

16 native and 8 invasive range sites remain for statistical tests of differences in nest site area and density. Sample sizes were not matched between ranges.

nests3 %>%
  group_by(Range) %>%
  dplyr::summarise(n = n())

## # A tibble: 2 x 2
##   Range        n
##   <fct>    <int>
## 1 Native      16
## 2 Invasive     8

Check normality of nest site areas using histograms and Shapiro-Wilks test.

hist(nests3 %>%
  pull(Nest_Site_Area))

shapiro.test(nests3 %>%
  pull(Nest_Site_Area))

## 
##  Shapiro-Wilk normality test
## 
## data:  nests3 %>% pull(Nest_Site_Area)
## W = 0.39782, p-value = 6.579e-09

Repeated for nest densities.

hist(nests3 %>%
  pull(Nest_Density))

shapiro.test(nests3 %>%
  pull(Nest_Density))

## 
##  Shapiro-Wilk normality test
## 
## data:  nests3 %>% pull(Nest_Density)
## W = 0.89756, p-value = 0.01912

Both nest variables were skewed (not normally distributed). As for total nests above, I asked whether the population distributions per nest variable were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test) as above.

levels(nests3$Range)

## [1] "Native"   "Invasive"

mw_nestArea_res <- coin::wilcox_test(Nest_Site_Area ~ Range, data = nests3, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")

mw_nestArea_res

## 
##  Exact Wilcoxon-Mann-Whitney Test
## 
## data:  Nest_Site_Area by Range (Native, Invasive)
## Z = 1.4697, p-value = 0.1531
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
##  -0.2851693  1.0666424
## sample estimates:
## difference in location 
##              0.3671253

mw_nestDensity_res <- coin::wilcox_test(Nest_Density ~ Range, data = nests3, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")

mw_nestDensity_res

## 
##  Exact Wilcoxon-Mann-Whitney Test
## 
## data:  Nest_Density by Range (Native, Invasive)
## Z = 0.91856, p-value = 0.3826
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
##  -7.633161 28.938786
## sample estimates:
## difference in location 
##                8.42497

Social group sizes

We obtained the largest social group sizes observed during fieldwork in the native and invasive ranges. Some native range fieldwork observations were made during trapping for an independent study of genetic variation. All group sizes greater than or equal to 100 were set as estimates of 100 birds. This was an underestimate in most cases and was therefore more conservative. Group sizes were observed during staging, foraging, and flying by in each range.

sgs <- read.csv(file.path(path, "SocialGroupSize_Observations.csv"))
# glimpse(sgs)

# unique(sgs$Estimated_Flock_Size)
unique(sgs$Behavioral_Context)

## [1] "Staging; flying"             "Foraging"                   
## [3] "Foraging; staging; flying"   "Foraging; flying"           
## [5] "Perched"                     "Flying"                     
## [7] "Foraging in flock on ground" "Flock in stadium lights"

unique(sgs$Observer)

## [1] "GSV" "VP"  "TW"

# Set all estimates of 100 birds or more to a cap of 100
sgs2 <- sgs %>%
  dplyr::mutate(
    Estimated_Flock_Size = gsub("Several hundred|Over 100", "100", Estimated_Flock_Size), 
    Estimated_Flock_Size = gsub("Over |More than |About ", "", Estimated_Flock_Size),
    Estimated_Flock_Size = as.numeric(Estimated_Flock_Size),
    Range = factor(Range, levels = c("Native", "Invasive"))
  )

glimpse(sgs2)

## Rows: 14
## Columns: 14
## $ Row_ID               <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
## $ Range                <fct> Native, Native, Native, Native, Native, Native, N…
## $ Country              <chr> "Uruguay", "Uruguay", "Uruguay", "Uruguay", "Urug…
## $ Dept_State           <chr> "Colonia", "Colonia", "Colonia", "Colonia", "Colo…
## $ Estimated_Flock_Size <dbl> 48, 100, 25, 50, 50, 100, 100, 100, 15, 30, 30, 3…
## $ Behavioral_Context   <chr> "Staging; flying", "Foraging", "Foraging; staging…
## $ Observer             <chr> "GSV", "GSV", "GSV", "GSV", "GSV", "GSV", "GSV", …
## $ Year                 <int> 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2…
## $ Date                 <chr> "04-June-2017", "04-June-2017", "05-June-2017", "…
## $ Site_Code            <chr> "CHAC", "DOMA", "DOMA", "DOMA", "DOMA", "ELGE", "…
## $ Site_Name            <chr> "La Chacra de los Olivos", "Estancia Dona Manuela…
## $ Notebook             <chr> "GSV Notebook 1", "GSV Notebook 1", "GSV Notebook…
## $ Notebook_page        <int> 8, 9, 11, 16, 21, 23, 42, 74, 8, 20, 25, 21, 25, …
## $ Notes                <chr> "", "", "", "", "", "", "", "", "", "", "", "Scan…

sgs2 %>%
  dplyr::select(Range, Country, Dept_State, Estimated_Flock_Size, Behavioral_Context, Date, Site_Code) %>%
  kable(align = rep("c", ncol(.)))

Range	Country	Dept_State	Estimated_Flock_Size	Behavioral_Context	Date	Site_Code
Native	Uruguay	Colonia	48	Staging; flying	04-June-2017	CHAC
Native	Uruguay	Colonia	100	Foraging	04-June-2017	DOMA
Native	Uruguay	Colonia	25	Foraging; staging; flying	05-June-2017	DOMA
Native	Uruguay	Colonia	50	Foraging; staging; flying	07-June-2017	DOMA
Native	Uruguay	Colonia	50	Foraging	08-June-2017	DOMA
Native	Uruguay	Colonia	100	Foraging; flying	08-June-2017	ELGE
Native	Uruguay	Colonia	100	Foraging; flying	12-June-2017	ELGE
Native	Uruguay	Colonia	100	Foraging	19-June-2017	SOLE
Invasive	U.S.	TX	15	Perched	06-August-2019	INTR
Invasive	U.S.	TX	30	Flying	09-August-2019	SOCC
Invasive	U.S.	TX	30	Flying	10-August-2019	MART
Invasive	U.S.	TX	37	Foraging in flock on ground	15-February-2011	MART
Invasive	U.S.	TX	30	Foraging in flock on ground	15-February-2011	INTR
Invasive	U.S.	LA	14	Flock in stadium lights	17-February-2011	JEFF

Calculate median and interquartile range of the largest social groups estimated in each range.

sgs2 %>%
  group_by(Range) %>%
  dplyr::summarise(
    median_vals = median(Estimated_Flock_Size),
    IQR_vals = IQR(Estimated_Flock_Size)
  )

## # A tibble: 2 x 3
##   Range    median_vals IQR_vals
##   <fct>          <dbl>    <dbl>
## 1 Native            75     50.5
## 2 Invasive          30     11.2

Were estimated flock sizes equivalent between ranges? Checked normality of social group size data from our fieldwork in each range.

hist(sgs2 %>%
  pull(Estimated_Flock_Size))

shapiro.test(sgs2 %>%
  pull(Estimated_Flock_Size))

## 
##  Shapiro-Wilk normality test
## 
## data:  sgs2 %>% pull(Estimated_Flock_Size)
## W = 0.81595, p-value = 0.007915

The social group size data was not normally distributed, and some flock sizes were obtained at the same sites in the same or different years. I retained one unique flock size estimate per site, then asked whether the population distributions were identical using a Mann-Whitney-Wilcoxon test (Wilcoxon rank sum test).

# 4 observations should remain per range
unique(as.character(sgs2$Site_Code))

# Randomly select one observation per site
set.seed(seed)
sgs3 <- sgs2 %>%
  dplyr::select(-c(Notebook, Notebook_page, Notes), Site_Name) %>%
  group_by(Site_Code) %>%
  nest() %>%
  ungroup() %>%
  # Randomly sample 1 observation (if just 1, then this single observation will be returned) 
  dplyr::mutate(
    rsfs = purrr::map2(data, 1, sample_n, replace = FALSE)
  ) %>%
  dplyr::select(-data) %>%
  unnest(rsfs)

glimpse(sgs3)

# Results I compute here on my local machine are different to those computed when knitting RMarkdown output. I think it has to do with setting the seed because the randomly sampled data points by RMarkdown are also different. I wrote out a new .csv with the randomly sampled social group size data to account for this
write.csv(sgs3, file.path(path, "SocialGroupSizes_randomSampling.csv"), row.names = FALSE)

sgs3 <- read.csv(file.path(path, "SocialGroupSizes_randomSampling.csv")) %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive"))
  )

# Print the data frame
sgs3 %>%
  dplyr::select(Range, Country, Dept_State, Estimated_Flock_Size, Behavioral_Context, Date, Site_Code) %>%
  kable(align = rep("c", ncol(.)))

Range	Country	Dept_State	Estimated_Flock_Size	Behavioral_Context	Date	Site_Code
Native	Uruguay	Colonia	48	Staging; flying	04-June-2017	CHAC
Native	Uruguay	Colonia	50	Foraging	08-June-2017	DOMA
Native	Uruguay	Colonia	100	Foraging; flying	12-June-2017	ELGE
Native	Uruguay	Colonia	100	Foraging	19-June-2017	SOLE
Invasive	U.S.	TX	30	Foraging in flock on ground	15-February-2011	INTR
Invasive	U.S.	TX	30	Flying	09-August-2019	SOCC
Invasive	U.S.	TX	30	Flying	10-August-2019	MART
Invasive	U.S.	LA	14	Flock in stadium lights	17-February-2011	JEFF

# Perform the Mann-Whitney test with this reduced dataset to meet non-independence assumption
levels(sgs3$Range)

## [1] "Native"   "Invasive"

mw_flockSize_res <- coin::wilcox_test(Estimated_Flock_Size ~ Range, data = sgs3, conf.int = TRUE, conf.level = 0.95, ties.method = "mid-ranks", alternative = "two.sided", distribution = "exact")

mw_flockSize_res

## Warning in cci(alpha): cannot compute confidence interval

## 
##  Exact Wilcoxon-Mann-Whitney Test
## 
## data:  Estimated_Flock_Size by Range (Native, Invasive)
## Z = 2.3814, p-value = 0.02857
## alternative hypothesis: true mu is not equal to 0
## 95 percent confidence interval:
##  NA NA
## sample estimates:
## difference in location 
##                     53

The distributions of estimated flock sizes per range were not identical. Native range estimated flock sizes were significantly larger. Note that the 95% CI could not be computed, likely due to the small sample size per range.

Figure 1

Made raincloud plots and saved these to an image file for Figure 1. Added asterisks for variables that were significantly different between ranges.

# Combine total nests estimated, areas of nest sites, nest densities, estimated flock sizes per range
social_sumstats <- nests2 %>%
  dplyr::select(Range, Year, Site_Code, Estimated_Nests, Nest_Site_Area, Nest_Density) %>%
  full_join(
    sgs2 %>%
      dplyr::select(Range, Year, Site_Code, Estimated_Flock_Size),
    by = c("Range", "Site_Code", "Year")
  ) %>%
  pivot_longer(
    cols = c("Estimated_Nests", "Nest_Site_Area", "Nest_Density", "Estimated_Flock_Size"),
    names_to = "statistic",
    values_to = "values"
  ) %>%
  dplyr::mutate(
    Range = factor(Range, levels = c("Native", "Invasive")),
    # Rename summary statistics for plotting
    statistic = recode(
      statistic,
      `Estimated_Nests` = "Total nests",
      `Nest_Site_Area` = "Nest site area\n (hectares)",
      `Nest_Density` = "Nest density\n (nests/hectare)",
      `Estimated_Flock_Size` = "Maximum\n flock sizes" 
    ),
    statistic = factor(statistic, levels = c("Total nests", "Nest site area\n (hectares)", "Nest density\n (nests/hectare)", "Maximum\n flock sizes"))
  ) 
    
social_sumstats  


cols <- scales::alpha(c("navy", "orange"), 0.65)

# for superscript: `Nest_Site_Area` = "Nest~density~(nests/km^{2})
# with labeller = label_parsed inside facet_wrap

topms <- levels(social_sumstats$statistic)
topms

# Make a list of plots
# Doing this to customize y-axes of each plot, which isn't possible for plots in a single row with facetscales
gg_list <- list()

# i <- 3
invisible(pblapply(1:length(topms), function(i){
  
  tmp_df <- social_sumstats %>%
    # Drop NAs
    filter(!is.na(values)) %>%
    filter(statistic == topms[i]) %>%
    dplyr::mutate(
      statistic = as.character(statistic),
      statistic = factor(statistic, levels = unique(statistic))
    ) 
  
  # Get ymin and ymax values, add a buffer of 2 evenly spaced values before and after 
  ymin <- min(tmp_df$values)
  ymax <- max(tmp_df$values)
  
  # If the maximum value is over or equal to 90, set the max o 100 and use 25 as breaks for y-axis. If less than 90, use 10 as the breaks. Use this rule to set buffers too
  if(ymax >= 90 & ymax <= 110){
    ymax <- 100
    y_brks <- 25
    buf <- 10
  } else if(ymax > 110){
    y_brks <- 25
    buf <- 10
  } else if(ymax < 90){
    y_brks <- 10
    buf <- 10
  }
  
  # If on the first plot, add the y-axis label
  if(i == 1){
    yal <- "Values"
  } else{
    yal <- ""
  }
  
  # Initialize plot margins (top, right, bottom, left)
  tm <- 0 # -0.5
  rm <- 0
  bm <- 1
  lm <- 0
  
  # If on the first plot, make the left margin larger
  if(i == 1){
    lm <- 0.5
  # If on last plot, make the right margin larger
  } else if(i == length(topms)){
    rm <- 0.5
  }
  
  # Raincloud plot: Kernel smoothed with boxplot
  gg <- ggplot(tmp_df, aes(x = Range, y = values, fill = Range, color = Range)) + 
    geom_flat_violin(position = position_nudge(x = 0.1, y = 0), adjust = 1, size = 0.25)+
    geom_point(position = position_jitter(width = 0.05), size = 1, stroke = 0.5) +
    geom_boxplot(aes(x = as.numeric(Range) - 0.22, y = values), outlier.shape = NA, alpha = 0.55, width = 0.2, colour = "black", size = 0.25) +
    facet_wrap(~ statistic) +
    scale_fill_manual(values = cols) +
    scale_color_manual(values = cols) +
    guides(fill = FALSE, color = FALSE) +
    theme_bw() +
    xlab("") + ylab(yal) +
    scale_y_continuous(limits = round(c(0, ymax + buf)), breaks = round(seq(0, (ymax + buf), y_brks))) +
    theme(
      axis.title = element_text(size = 14),
      axis.text.y = element_text(size = 11),
      axis.text.x = element_text(size = 11),
      strip.text = element_text(size = 12, margin = margin(0.5, 1, 0.5, 1, "lines")),
      panel.grid.major = element_blank(),
      panel.grid.minor = element_blank(),
      axis.ticks = element_line(size = 0.25),
      plot.margin = unit(c(tm, rm, bm, lm), "lines")
    )
  
  # If on total nests or social group size estimates, add an asterisk to denote statistical significance
  
  if(any(grepl("Total|flock", topms[i]))){
    
    gg <- gg +
      geom_text(aes(label = "*", x = 1.5, y = ymax - (ymax/10)), size = 10, color = "black")
  }
  
  gg
  
  # Return the given plot
  gg_list[[i]] <<- gg
  
}))

gg_list

cols <- scales::alpha(c("navy", "orange"), 0.75)

# Make a legend
gg_leg <- gtable::gtable_filter(ggplot_gtable(ggplot_build(
  social_sumstats %>%
    ggplot(aes(x = Range, y = values)) +
    geom_errorbar(aes(ymin = values, ymax = values, color = Range), size = 2, width = 0.25) +
    scale_color_manual(values = cols) +
    guides(color = guide_legend(title = "", override.aes = list(size = 2))) +
    theme_bw() +
    theme(
      legend.text = element_text(size = 12),
      legend.position = "top",
      legend.margin = margin(0, 0, 0, 0),
      legend.box.margin = margin(-10, -10, -10, -10),
      legend.key.width = unit(3, "lines")
    )
)), "guide-box")

dev.off()

# Arrange the legend and plots into a single image file
# Animal Behavior double column figure = 190 mm or 7.48 in
jpeg(file.path(gpath, "Figure01_NestFlockSizeEstimates_final.jpeg"), units = "in", width = 7.48, height = 4, res = 1000)

ggarrange(
  as.ggplot(gg_leg),
  as.ggplot(ggarrange(
    as.ggplot(gg_list[[1]]),
    as.ggplot(gg_list[[2]]),
    as.ggplot(gg_list[[3]]),
    as.ggplot(gg_list[[4]]),
    nrow = 1
  )),
  nrow = 2,
  heights = c(1, 9)
)

dev.off()

References

1. Smith-Vidaurre, G., Araya-Salas, M., and T.F. Wright. 2020. Individual signatures outweigh social group identity in contact calls of a communally nesting parrot. Behavioral Ecology 31(2), 448-458.

2. Buhrman-Deever, S.C., Rappaport, A.R. and J.W. Bradbury. 2007. Geographic variation in contact calls of feral North American populations of the monk parakeet. The Condor 109(2), 389-398.

3. Araya‐Salas, M., and G. Smith‐Vidaurre. 2017. warbleR: An R package to streamline analysis of animal acoustic signals. Methods in Ecology and Evolution 8(2), 184-191.

The session info printed here represents the environment used for the final RMarkdown knitting. The software and package versions employed for main results are reported in the appendix of the associated article in Animal Behavior.

sessionInfo()

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] plyr_1.8.6         ggplotify_0.0.7    egg_0.4.5          gridExtra_2.3     
##  [5] cowplot_1.1.1      rgeos_0.5-5        rgdal_1.5-23       sp_1.4-5          
##  [9] coin_1.4-1         survival_3.2-11    lubridate_1.7.10   data.table_1.14.0 
## [13] forcats_0.5.1      stringr_1.4.0      dplyr_1.0.6        purrr_0.3.4       
## [17] readr_1.4.0        tidyr_1.1.3        tibble_3.1.2       ggplot2_3.3.3     
## [21] tidyverse_1.3.1    Rraven_1.0.13      pbapply_1.4-3      warbleR_1.1.27    
## [25] NatureSounds_1.0.4 knitr_1.33         seewave_2.1.6      tuneR_1.3.3       
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7        matrixStats_0.59.0  fs_1.5.0           
##  [4] httr_1.4.2          tools_4.1.0         backports_1.2.1    
##  [7] utf8_1.2.1          R6_2.5.0            DBI_1.1.1          
## [10] colorspace_2.0-1    withr_2.4.2         tidyselect_1.1.1   
## [13] compiler_4.1.0      cli_2.5.0           rvest_1.0.0        
## [16] xml2_1.3.2          sandwich_3.0-1      scales_1.1.1       
## [19] mvtnorm_1.1-1       proxy_0.4-25        dtw_1.22-3         
## [22] digest_0.6.27       rmarkdown_2.8       pkgconfig_2.0.3    
## [25] htmltools_0.5.1.1   highr_0.9           dbplyr_2.1.1       
## [28] rlang_0.4.11        readxl_1.3.1        rstudioapi_0.13    
## [31] gridGraphics_0.5-1  generics_0.1.0      zoo_1.8-9          
## [34] jsonlite_1.7.2      RCurl_1.98-1.3      magrittr_2.0.1     
## [37] modeltools_0.2-23   Matrix_1.3-4        Rcpp_1.0.6         
## [40] munsell_0.5.0       fansi_0.5.0         lifecycle_1.0.0    
## [43] stringi_1.6.2       multcomp_1.4-17     yaml_2.2.1         
## [46] MASS_7.3-54         grid_4.1.0          parallel_4.1.0     
## [49] crayon_1.4.1        lattice_0.20-44     haven_2.4.1        
## [52] splines_4.1.0       hms_1.1.0           pillar_1.6.1       
## [55] rjson_0.2.20        fftw_1.0-6          codetools_0.2-18   
## [58] stats4_4.1.0        reprex_2.0.0        glue_1.4.2         
## [61] evaluate_0.14       BiocManager_1.30.15 modelr_0.1.8       
## [64] vctrs_0.3.8         cellranger_1.1.0    gtable_0.3.0       
## [67] assertthat_0.2.1    xfun_0.23           libcoin_1.0-8      
## [70] broom_0.7.6         signal_0.7-7        rvcheck_0.1.8      
## [73] TH.data_1.0-10      ellipsis_0.3.2

Additional Materials 01 :

Summary Statistics, Nest Variables, and Social Group Sizes

19 June 2021

Repeatedly sampled individuals

Call dataset of broader geographic resolution

Nest estimates

Nest densities

References