Habitat associations
We used the Percent Perfect Indicator (PPI; Dufrêne and Legendre 1997) to quantify species habitat associations. The PPI score for a given species and habitat is the relative abundance of the species in the habitat multiplied by the detection frequency for the species in that habitat. The combination of relative abundance and the consistency of a species' use of a habitat (frequency) is a strong indicator of the affinity of the species for that habitat type. PPI was calculated using the Minnesota Breeding Bird Atlas point count data. We constructed bar charts of the scores to help visualize the relative use of habitats by a species. We used 13 habitat types derived from LANDFIRE Existing Vegetation Types (Rollins et al. 2006) and National Wetlands Inventory (http://www.fws.gov/wetlands/) land cover datasets. When calculating the PPI, it is important not to include habitats with a small number of records because their relative importance to the species is overemphasized. To this end, we excluded habitats that were dominant at fewer than 1% of the point counts. As a result, many less common though ecologically important habitats were removed or lumped for this analysis (e.g., lowland hardwoods were removed, and marsh and wet meadow were combined). In these cases, we often report less common but important habitat types in the Breeding Habitat description of each species account.
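The score described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the atlas: the function name, the input layout (one habitat/abundance pair per point count), and the use of mean abundance per habitat to compute relative abundance (the usual Dufrêne and Legendre formulation) are assumptions made for the example.

```python
from collections import defaultdict

def indicator_scores(counts, min_share=0.01):
    """Return {habitat: score on a 0-100 scale} for one species.

    counts: list of (habitat, abundance) pairs, one per point count
            (hypothetical input layout).
    min_share: habitats dominant at fewer than this share of all
               point counts are excluded before scoring.
    """
    by_habitat = defaultdict(list)
    for habitat, abundance in counts:
        by_habitat[habitat].append(abundance)

    total_points = len(counts)
    # Drop habitats with too few point counts to score reliably.
    kept = {h: v for h, v in by_habitat.items()
            if len(v) / total_points >= min_share}

    # Assumption: relative abundance is based on mean abundance per habitat.
    mean_abund = {h: sum(v) / len(v) for h, v in kept.items()}
    total_mean = sum(mean_abund.values())

    scores = {}
    for habitat, values in kept.items():
        rel_abund = mean_abund[habitat] / total_mean if total_mean > 0 else 0.0
        # Detection frequency: share of point counts where the species occurred.
        frequency = sum(1 for a in values if a > 0) / len(values)
        scores[habitat] = 100.0 * rel_abund * frequency
    return scores
```

A species that is both abundant in a habitat and found at most point counts in that habitat scores near 100; a species that is abundant at only a few points, or widespread but equally common elsewhere, scores lower.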
Species distribution models and population estimates
One of our goals for the Minnesota Breeding Bird Atlas was to produce useful species distribution models for as many of Minnesota’s breeding birds as possible. Ideally, these models could also be used to estimate statewide species populations. However, population estimates require large sample sizes, and good estimates are not possible for species that are not typically detected by territorial behavior. We therefore chose three modeling strategies to maximize the number of useful models we could produce. In total, we created models for 115 species.
Modeling strategy 1
Our primary modeling strategy closely follows Ball et al. (2016). We used Poisson Generalized Linear Models (McCullagh and Nelder 1989) with a log link. Detection probability and detection distance were accounted for using the QPAD approach (Sólymos et al. 2013). This allowed us to model density (i.e., pairs per 40 ha [100 acres]) by estimating a species- and count-specific offset based on the species’ effective detection radius (Sólymos et al. 2013). This approach was applied to 66 species that were reliably sampled by point counts, were detected as singing birds on at least 75 point counts, and had more than half of their records as singing birds. Modeling was conducted in R (R Core Team 2016).
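To illustrate how an offset turns a count model into a density model, the sketch below computes a log offset from an effective detection radius and an availability probability, then converts an expected count to density per 40 ha. It is a simplified illustration only: it assumes an unlimited-distance count, a constant singing rate, and hypothetical function and parameter names, and it is not the QPAD implementation used for the atlas.

```python
import math

def qpad_log_offset(edr_m, singing_rate_per_min, duration_min,
                    unit_area_m2=400_000.0):
    """Log offset for a Poisson GLM with a log link (simplified sketch).

    edr_m: effective detection radius in meters.
    singing_rate_per_min: assumed constant rate at which a bird gives a cue.
    duration_min: length of the point count in minutes.
    unit_area_m2: area of one density unit; 40 ha = 400,000 square meters.
    """
    # Effective area sampled, expressed in 40-ha units.
    area_units = math.pi * edr_m ** 2 / unit_area_m2
    # Probability that a bird present gives at least one cue during the count.
    p_available = 1.0 - math.exp(-singing_rate_per_min * duration_min)
    return math.log(area_units * p_available)

def density_per_unit(expected_count, log_offset):
    """Convert an expected count to density per unit area (e.g., per 40 ha)."""
    return expected_count / math.exp(log_offset)
```

Because the offset enters the linear predictor with a fixed coefficient of 1, the remaining coefficients describe log density rather than log count.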
We used forward “branching” variable selection (Ball et al. 2016) based on the Bayesian information criterion (BIC; Schwarz 1978) to construct models from a suite of 44 candidate covariates. That is, variables were grouped into similar types (land use/land cover, disturbance, land cover structure, landscape metrics, and climate), and forward selection was applied to one group at a time. After applying forward selection to the first group of candidate covariates, all variables found to improve the model from that group became the starting point (i.e., null model) for the selection process applied to the second group of candidate covariates. This process continued until all variable groups were evaluated.
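The group-by-group selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the model-fitting step is abstracted into a caller-supplied `score` function that returns a criterion value (lower is better, such as BIC) for any set of variables.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def forward_select_by_group(groups, score):
    """Forward selection applied to one covariate group at a time.

    groups: ordered list of lists of variable names.
    score: callable taking a tuple of variable names and returning a
           criterion value where lower is better (e.g., BIC of a fitted model).
    Returns (selected variable names, final criterion value).
    """
    selected = []
    best = score(tuple(selected))  # criterion for the intercept-only model
    for group in groups:
        remaining = [v for v in group if v not in selected]
        while remaining:
            # Score each candidate when added to the current model.
            trial = {v: score(tuple(selected + [v])) for v in remaining}
            candidate = min(trial, key=trial.get)
            if trial[candidate] >= best:
                break  # nothing left in this group improves the model
            selected.append(candidate)
            best = trial[candidate]
            remaining.remove(candidate)
        # Variables kept from this group carry forward to the next group.
    return selected, best
```

In practice `score` would fit a GLM with the given covariates and return its BIC; separating the search from the fitting keeps the selection logic easy to test.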
We used bootstrap aggregation (bagging; Breiman 1996, Efron 2014) to estimate variability in the selection process and in the population estimates. We iterated this selection process 240 times to produce 240 models and predictions per species. The first run used the full dataset (minus a 10% holdout dataset), and all subsequent iterations used a random selection of the data with replacement. The random selection was stratified by year and region to address spatial and temporal issues.
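A stratified resample of this kind might look like the sketch below. The details are assumptions made for illustration: records are represented as dictionaries, each stratum (year and region combination) is resampled with replacement to its original size, and the function name is hypothetical.

```python
import random
from collections import defaultdict

def stratified_bootstrap(records, strata_key, rng):
    """Resample records with replacement within each stratum.

    records: list of point count records (any objects).
    strata_key: callable returning the stratum of a record,
                e.g., lambda r: (r["year"], r["region"]).
    rng: a random.Random instance, so that runs are reproducible.
    """
    strata = defaultdict(list)
    for rec in records:
        strata[strata_key(rec)].append(rec)

    sample = []
    for members in strata.values():
        # Draw as many records as the stratum holds, with replacement,
        # so each year and region keeps its original sample size.
        sample.extend(rng.choices(members, k=len(members)))
    return sample
```

Repeating this with a different seed for each iteration, then rerunning the variable selection on every resample, yields one model per iteration.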
The 240 models per species were used to produce statewide population estimates and predicted species distribution maps. The median population estimate (i.e., a single model) was used as our estimate for the Minnesota state population, with the 2.5% and 97.5% quantiles as our 95% confidence interval (CI). For the predictive species distribution maps, the median prediction for each cell was used. Consequently, the model that produced the median prediction for a given cell was not necessarily the model that produced the median prediction for an adjacent cell. This bootstrap method has a smoothing effect on the predicted distribution and reduces the probability of extreme predictions.
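The two summaries described above, a median with percentile bounds for the population and a cell-by-cell median for the map, can be sketched as follows. The function names, the list-based map layout, and the linear-interpolation quantile rule are assumptions for the example; the source does not state which quantile definition was used.

```python
import statistics

def percentile(values, q):
    """Quantile with linear interpolation between order statistics, q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def summarize_population(estimates):
    """Return (median, 2.5% quantile, 97.5% quantile) of bootstrap estimates."""
    return (statistics.median(estimates),
            percentile(estimates, 0.025),
            percentile(estimates, 0.975))

def cellwise_median(prediction_maps):
    """Median prediction per cell across bootstrap models.

    prediction_maps: list of equal-length lists, one list of cell
                     predictions per bootstrap model.
    """
    return [statistics.median(cell) for cell in zip(*prediction_maps)]
```

Because the median is taken independently for each cell, neighboring cells can draw their values from different bootstrap models, which is the source of the smoothing effect.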
We acknowledge the Minnesota Supercomputing Institute (MSI; www.msi.umn.edu) at the University of Minnesota for providing access to their computer cluster, which significantly increased the efficiency of running these analyses.
Modeling strategy 2
For species recorded on at least 75 point counts that did not meet the requirements for strategy 1 (e.g., territorial behavior), we used the same method but without the QPAD offset. This resulted in predictions in units of expected birds per 10-minute point count. Species in this group were reliably sampled by point counts and included non-passerines and passerines that were primarily detected by means other than singing. Because we did not account for detectability in this case, counts for some species in this group were too variable (overdispersed) to model with a Poisson Generalized Linear Model (GLM) as described above. For these species, we used a negative binomial GLM. Otherwise, the method was very similar to that described in strategy 1 above. We applied strategy 2 to 28 species.
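One common way to judge whether counts are too variable for a Poisson model is the Pearson dispersion statistic, sketched below. The source does not say which criterion was used to switch to a negative binomial GLM, so both the statistic and the threshold of 1.5 are assumptions chosen purely for illustration, as are the function names.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    observed: observed counts per point count.
    fitted: fitted Poisson means for the same point counts (all > 0).
    n_params: number of estimated model coefficients.
    Values near 1 are consistent with a Poisson model.
    """
    df = len(observed) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / df

def choose_family(observed, fitted, n_params, threshold=1.5):
    """Pick a count distribution; the threshold is an illustrative assumption."""
    dispersion = pearson_dispersion(observed, fitted, n_params)
    return "negative_binomial" if dispersion > threshold else "poisson"
```

Counts concentrated in a few large flocks, with many zeros elsewhere, produce a dispersion well above 1 and would be routed to the negative binomial model under this rule.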
Modeling strategy 3
We used MaxEnt (Phillips et al. 2006, Elith et al. 2011) to model the distributions of species that did not meet the criteria for strategy 1 or 2 but had at least 20 georeferenced records (point count and/or volunteer data). This strategy resulted in predictions on a relative scale from worst to best habitat. While this does not allow us to give numeric population estimates, it does allow us to estimate where many rare or elusive species are more or less likely to be breeding. Our MaxEnt analysis was similar to that used by Zlonis et al. (2017). We used a suite of 36 potential covariates derived from the same set of variables referenced in strategy 1, above. Model development and selection were implemented using the ENMeval package (Muscarella et al. 2014) in R (R Core Team 2016). To simplify interpretation, we limited MaxEnt analyses to linear and quadratic transformations of covariates (Merow et al. 2013). We also applied five-fold cross-validation (using 80% of the data) and averaged models from these five partitions. To account for bias in volunteer reporting locations (e.g., more likely to be located near cities and roads), we used a targeted background approach in which the environmental characteristics available to each species were characterized from the locations of volunteer observations rather than from random locations. The top model was determined using AICc (Burnham and Anderson 2002), and raw MaxEnt output was interpreted as an index of habitat suitability (Merow et al. 2013, Merow and Silander 2014). This method was applied to 21 species.
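For reference, AICc adds a small-sample correction to AIC, which matters here because some species had as few as 20 records. The sketch below shows the standard formula and a simple ranking of candidate models; the function names and the dictionary layout of candidates are assumptions for the example, and ENMeval computes this criterion internally in its own way.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC: AIC + 2k(k + 1) / (n - k - 1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_obs <= n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)

def best_by_aicc(candidates):
    """Return the name of the candidate model with the lowest AICc.

    candidates: dict mapping a model name to a tuple of
                (log_likelihood, n_params, n_obs).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name]))
```

With few records, the correction term penalizes extra parameters heavily, so a quadratic model must fit substantially better than a linear one to be preferred.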
Literature Cited
Ball, Jeffrey R., Péter Sólymos, Fiona K. A. Schmiegelow, Samuel Hache, Jim Schieck, and Erin Bayne. 2016. “Regional Habitat Needs of a Nationally Listed Species, Canada Warbler (Cardellina canadensis), in Alberta, Canada.” Avian Conservation and Ecology 11: 10.
Breiman, Leo. 1996. “Bagging Predictors.” Machine Learning 24: 123–140.
Burnham, Kenneth P., and David R. Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd edition. New York: Springer-Verlag.
Dufrêne, Marc, and Pierre Legendre. 1997. “Species Assemblages and Indicator Species: The Need for a Flexible Asymmetrical Approach.” Ecological Monographs 67: 345–366.
Efron, Bradley. 2014. “Estimation and Accuracy after Model Selection.” Journal of the American Statistical Association 109: 991–1007.
Elith, Jane, Steven J. Phillips, Trevor Hastie, Miroslav Dudík, Yung En Chee, and Colin J. Yates. 2011. “A Statistical Explanation of MaxEnt for Ecologists.” Diversity and Distributions 17: 43–57.
McCullagh, P., and John A. Nelder. 1989. Generalized Linear Models. 2nd edition. New York: CRC Press.
Merow, Cory, and John A. Silander Jr. 2014. “A Comparison of Maxlike and Maxent for Modelling Species Distributions.” Methods in Ecology and Evolution 5: 215–225.
Merow, Cory, Matthew J. Smith, and John A. Silander Jr. 2013. “A Practical Guide to MaxEnt for Modeling Species’ Distributions: What it Does, and Why Inputs and Settings Matter.” Ecography 36: 1058–1069.
Muscarella, Robert, Peter J. Galante, Mariano Soley‐Guardia, Robert A. Boria, Jamie M. Kass, María Uriarte, and Robert P. Anderson. 2014. “ENMeval: An R Package for Conducting Spatially Independent Evaluations and Estimating Optimal Model Complexity for Maxent Ecological Niche Models.” Methods in Ecology and Evolution 5: 1198–1205.
Phillips, Steven J., Robert P. Anderson, and Robert E. Schapire. 2006. “Maximum Entropy Modeling of Species Geographic Distributions.” Ecological Modelling 190: 231–259.
R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
Rollins, Matthew G., Robert E. Keane, Zhiliang Zhu, and James P. Menakis. 2006. “An Overview of the LANDFIRE Prototype Project.” In The LANDFIRE Prototype Project: Nationally Consistent and Locally Relevant Geospatial Data for Wildland Fire Management, edited by Matthew G. Rollins and Christine K. Frame, 5–43. U.S. Department of Agriculture Forest Service General Technical Report RMRS-GTR-175. Fort Collins, CO: USDA Forest Service, Rocky Mountain Research Station.
Schwarz, Gideon. 1978. “Estimating the Dimension of a Model.” Annals of Statistics 6: 461–464.
Sólymos, Péter, Steven M. Matsuoka, Erin M. Bayne, Subhash R. Lele, Patricia Fontaine, Steve G. Cumming, Diana Stralberg, Fiona K. A. Schmiegelow, and Samantha J. Song. 2013. “Calibrating Indices of Avian Density from Non-Standardized Survey Data: Making the Most of a Messy Situation.” Methods in Ecology and Evolution 4: 1047–1058.
Zlonis, Edmund J., Hannah G. Panci, Josh D. Bednar, Maya Hamady, and Gerald J. Niemi. 2017. “Habitats and Landscapes Associated with Bird Species in a Lowland Conifer-Dominated Ecosystem.” Avian Conservation and Ecology 12: 7. doi: 10.5751/ACE-00954-120107