Habitat scale model output of Port Fourchon, LA derived from drone and satellite imagery taken in fall and spring 2023.

Website: https://www.bco-dmo.org/dataset/948167
Data Type: model results
Version: 1
Version Date: 2025-01-08

Project
» CAREER: Integrating Seascapes and Energy Flow: learning and teaching about energy, biodiversity, and ecosystem function on the frontlines of climate change (Louisiana E-scapes)
Contributors | Affiliation | Role
Nelson, James | University of Louisiana at Lafayette | Principal Investigator, Contact
Leavitt, Herbert | University of Louisiana at Lafayette | Student
Thomas, Alexander | University of Louisiana at Lafayette | Student
Soenen, Karen | Woods Hole Oceanographic Institution (WHOI BCO-DMO) | BCO-DMO Data Manager

Abstract
This dataset and code form part of a broader analysis evaluating the relationship between habitat structure and species abundance across multiple spatial scales in a rapidly changing estuarine environment near Port Fourchon, Louisiana. Specifically, the code implements a Generalized Additive Modeling (GAM) approach to identify the optimal spatial scale at which habitat features derived from satellite imagery best explain the abundance of common estuarine species observed during the Fall 2022 drop sampling season. The data processing pipeline begins by merging species count data and environmental variables (salinity, temperature, site coordinates) with spatial habitat metrics, including percent edge habitat, mangrove edge length, and land-water ratio. These metrics are calculated at varying spatial scales, defined by buffer radii (20–600 m) and edge distances (1, 3, 5 m). The GAMs iteratively test combinations of predictors while excluding highly correlated variables to reduce multicollinearity. Models are ranked by Akaike Information Criterion (AIC), and the best models are selected based on performance across scales. The outputs include AIC scores for all tested models across scales, identification of the top model explaining white shrimp abundance, and an evaluation of individual predictor significance and spatial autocorrelation in residuals. The results indicate that the relationship between habitat structure and estuarine species is often scale-dependent, with percent edge habitat and mangrove edge length emerging as significant predictors at specific scales. Outputs are saved in CSV files for model summaries and GAM diagnostics, while visualizations illustrate R² values across spatial scales, predictor significance, and observed vs. predicted species abundance.
This pipeline provides a quantitative framework for identifying ecologically relevant spatial scales and assessing the effects of habitat change, such as mangrove encroachment, on species distributions. The findings contribute to a broader effort to model species-habitat relationships in coastal systems and inform management strategies in the face of climate-driven habitat change.


Coverage

Location: Marshes surrounding Port Fourchon, Louisiana.
Spatial Extent: N:29.164671 E:-90.149744 S:29.092646 W:-90.269831
Temporal Extent: 2022-09-23 - 2022-09-29

Methods & Sampling

No raw data is included in this dataset. For collection methods of data used in this analysis, please refer to methods outlined in the linked datasets.


Data Processing Description

The data processing pipeline for this analysis begins with the preparation of input datasets, which include habitat data derived from satellite imagery and community composition data. The habitat data files, such as google2022_edge[edge]_buf[buffer].csv, contain spatial metrics including mangrove edge length, land-water ratios, and percent edge cover. The community composition data, PtFouSept2022count_ns.csv, provides site-specific counts for the target species (e.g., Litopenaeus setiferus, abbreviated as PENSET) along with environmental variables such as salinity, temperature, and geographic coordinates. These datasets are merged using a common identifier (site_date_key) to align spatial and ecological information.
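The merge step described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code shipped with the dataset. The site keys, the values, and the habitat column names (edge_perc, edge_l.mangrove, land_water_ratio, inferred from the p-value columns in the output tables) are assumptions, as is the choice of an inner join.

```python
import csv
import io

# Hypothetical miniature inputs; the real files hold more columns and rows.
HABITAT_CSV = """site_date_key,edge_perc,edge_l.mangrove,land_water_ratio
A_2022-09-23,0.12,35.0,0.40
B_2022-09-24,0.30,80.5,0.65
"""

COUNTS_CSV = """site_date_key,PENSET,salinity,temperature
A_2022-09-23,14,22.1,28.3
B_2022-09-24,3,25.4,27.9
C_2022-09-25,7,24.0,28.0
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_key(counts, habitat, key="site_date_key"):
    """Inner join (an assumption): keep only sites present in both tables."""
    habitat_by_key = {row[key]: row for row in habitat}
    merged = []
    for row in counts:
        match = habitat_by_key.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


merged = merge_on_key(read_rows(COUNTS_CSV), read_rows(HABITAT_CSV))
for row in merged:
    print(row["site_date_key"], row["PENSET"], row["edge_perc"])
```

Site C has no habitat row in this toy input, so it drops out of the merged table. Whether the original script drops or keeps unmatched sites is not stated in the description.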

Once merged, all predictor variables, such as % mangrove edge and % land-water ratio, are standardized by subtracting the mean and dividing by the standard deviation to ensure consistent scaling. A correlation matrix is computed to identify and exclude highly correlated predictor pairs (correlation > 0.8) to reduce multicollinearity in subsequent models. The core modeling process is conducted by the model_compile function, which iterates over combinations of buffer radii (100–600 m for satscale, 20–150 m for smallscale) and edge distances (1, 3, 5 m) to assess species-habitat relationships. For each combination, spatial metrics are calculated, and generalized additive models (GAMs) are fitted using subsets of predictors. The basis dimension of every GAM smooth was limited to k = 4 to reduce overfitting. Models with correlated variables are excluded, and the remaining models are ranked by Akaike Information Criterion (AIC). The best models within two AIC points of the top model are recorded, capturing their formula, AIC, buffer, and edge values. This process outputs several files.
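The standardization and correlation screen described above can be sketched as follows, again in standard-library Python rather than the original R. The predictor values are invented, and comparing the absolute value of Pearson's r with 0.8 is an assumption; the description states only "correlation > 0.8".

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical predictor values for six sites at one buffer/edge combination.
predictors = {
    "edge_perc":        [0.10, 0.25, 0.40, 0.15, 0.30, 0.55],
    "edge_l_mangrove":  [12.0, 30.0, 45.0, 20.0, 33.0, 60.0],
    "land_water_ratio": [0.70, 0.35, 0.60, 0.20, 0.80, 0.45],
}


def standardize(values):
    """Z-score: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


scaled = {name: standardize(vals) for name, vals in predictors.items()}

# Flag predictor pairs whose absolute correlation exceeds the threshold.
THRESHOLD = 0.8
correlated = {
    frozenset((a, b))
    for a, b in combinations(scaled, 2)
    if abs(pearson(scaled[a], scaled[b])) > THRESHOLD
}

# Enumerate predictor subsets, skipping any that contain a flagged pair.
candidates = []
for size in range(1, len(scaled) + 1):
    for subset in combinations(scaled, size):
        pairs = {frozenset(p) for p in combinations(subset, 2)}
        if not pairs & correlated:
            candidates.append(subset)

print(candidates)
```

In this toy data the two edge metrics are nearly collinear, so no candidate subset contains both of them. Each surviving subset would correspond to one GAM formula to be fitted and scored by AIC.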

The top model is then identified based on its frequency of selection across all scales and is saved for further analysis. Using this top model, the univariate_edge function evaluates individual predictor performance across scales. Residuals are examined for spatial autocorrelation using Moran’s I test, and the significance of predictors, including their interactions (e.g., mangrove edge × land-water ratio), is assessed. These results are consolidated and saved in output_over_scales.csv.
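For readers unfamiliar with the residual check mentioned above, the Moran's I statistic can be sketched as follows. This illustrates the statistic only, not the significance test, and it is not the spdep-based R code used in the analysis. The binary distance-band weights, the 150 m band, the coordinates, and the residual values are all assumptions made for the example.

```python
from math import hypot


def morans_i(values, coords, band):
    """Moran's I with binary weights: w_ij = 1 if sites i and j lie within
    `band` of each other (and i != j), otherwise 0."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    cross = 0.0   # sum of w_ij * dev_i * dev_j
    w_sum = 0.0   # total weight W
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = hypot(coords[i][0] - coords[j][0],
                         coords[i][1] - coords[j][1])
            if dist <= band:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    total_ss = sum(x * x for x in dev)
    return (n / w_sum) * (cross / total_ss)


# Hypothetical residuals at four sites spaced 100 m apart on a line.
coords = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (300.0, 0.0)]
clustered = [1.0, 0.8, -0.9, -1.1]     # neighbouring residuals share a sign
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbouring residuals flip sign

print(morans_i(clustered, coords, band=150.0))
print(morans_i(alternating, coords, band=150.0))
```

Positive values indicate that nearby residuals resemble each other, which would suggest spatial structure that the model has not captured. Negative values indicate that neighbouring residuals tend to differ.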

Visualization and diagnostics follow, with R² values plotted across buffer sizes and edge distances to identify the habitat scale with the strongest explanatory power. Key predictors, such as mangrove edge and percent edge cover, are visualized in relation to species abundance, with significance (p < 0.05) indicated by custom point shapes. At the best scale, relationships between key variables and predicted species abundance are visualized, while residual and diagnostic plots (e.g., Cook’s distance) are used to assess model fit. Predictions from the top model are compared to observed counts to validate performance.
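The scale comparison described above reduces to a small amount of bookkeeping, sketched here in standard-library Python. The rows are invented but shaped like the documented output_over_scales columns (edge, buffer, r_sq, p.edge_perc). The plain R² function is an illustrative stand-in for comparing observed and predicted counts; the GAM output itself reports an adjusted R².

```python
# Hypothetical rows shaped like output_over_scales.csv (values invented).
rows = [
    {"edge": 1, "buffer": 100, "r_sq": 0.21, "p.edge_perc": 0.210},
    {"edge": 3, "buffer": 200, "r_sq": 0.38, "p.edge_perc": 0.031},
    {"edge": 5, "buffer": 400, "r_sq": 0.29, "p.edge_perc": 0.048},
]

ALPHA = 0.05  # significance threshold stated in the description

# Flag the scales at which percent edge habitat is a significant predictor.
for row in rows:
    row["edge_perc_significant"] = row["p.edge_perc"] < ALPHA

# The "best" scale is taken here as the one with the highest R-squared.
best = max(rows, key=lambda r: r["r_sq"])


def r_squared(observed, predicted):
    """Plain coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


print(best["edge"], best["buffer"])
print(r_squared([2, 4, 6, 8], [2.5, 3.5, 6.5, 7.5]))
```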

The final outputs of this pipeline include several CSV files and visualizations. The species_aic files show the AIC scores for all the models considered for each species at each scale. The model_list files list all models that were within 2 points of the lowest AIC score for each set. The model_compare file shows how many times a particular model occurs within model_list, along with its mean R-squared value. The topmodel.txt file contains the formula for the GAM that was ultimately selected for further analysis. Finally, the output_over_scales.csv file summarizes model results across scales, including R² values, p-values, and significant predictors. Visualizations include plots of R² across scales and relationships between predictors and species abundance, as well as diagnostic plots for residuals and influential observations. Together, these steps evaluate the influence of spatial metrics on species abundance and identify the optimal habitat scale for common estuarine species in Port Fourchon, LA.
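The relationship between the species_aic, model_list, model_compare, and topmodel outputs can be sketched as follows. This is a standard-library Python illustration with invented AIC values and illustrative formula strings, not the original R code. Treating "within 2 points" as an inclusive difference of at most 2 is an assumption.

```python
from collections import Counter

# Hypothetical AIC scores: (edge, buffer) -> {model formula: AIC}.
aic_by_scale = {
    (1, 100): {"PENSET ~ s(edge_perc, k = 4)": 210.4,
               "PENSET ~ s(land_water_ratio, k = 4)": 211.9,
               "PENSET ~ s(edge_l.mangrove, k = 4)": 215.0},
    (3, 200): {"PENSET ~ s(edge_perc, k = 4)": 198.2,
               "PENSET ~ s(land_water_ratio, k = 4)": 203.5,
               "PENSET ~ s(edge_l.mangrove, k = 4)": 199.0},
    (5, 400): {"PENSET ~ s(edge_perc, k = 4)": 205.1,
               "PENSET ~ s(land_water_ratio, k = 4)": 204.0,
               "PENSET ~ s(edge_l.mangrove, k = 4)": 209.3},
}


def within_delta(aics, delta=2.0):
    """Formulas whose AIC is at most `delta` above the lowest AIC, best first."""
    best = min(aics.values())
    keep = [f for f, a in aics.items() if a - best <= delta]
    return sorted(keep, key=aics.get)


# model_list: competitive models at each edge/buffer permutation.
model_list = {scale: within_delta(aics) for scale, aics in aic_by_scale.items()}

# model_compare: how often each formula appears among the competitive models.
counts = Counter(f for formulas in model_list.values() for f in formulas)

# topmodel: the most frequently selected formula.
top_model, n_selected = counts.most_common(1)[0]

print(top_model, n_selected)
```

Counting appearances across scales favours a model that is consistently competitive over one that wins at a single scale, which matches the selection rule given in the file descriptions.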

This analysis was performed in R version 4.3.2. Relevant packages in this analysis are as follows: 
mgcv_1.9-0, nlme_3.1-163, gam_1.22-4, foreach_1.5.2, spdep_1.3-5, sf_1.0-16, spData_2.3.1, lubridate_1.9.3, forcats_1.0.0, stringr_1.5.1, dplyr_1.1.4, purrr_1.0.2, readr_2.1.4, tidyr_1.3.0, tibble_3.2.1, ggplot2_3.5.1, tidyverse_2.0.0, scales_1.3.0, statmod_1.5.0, pscl_1.5.9     

Loaded via a namespace (and not attached): gtable_0.3.4, lattice_0.21-9, tzdb_0.4.0, vctrs_0.6.5, tools_4.3.2, generics_0.1.3, proxy_0.4-27, fansi_1.0.5, pkgconfig_2.0.3, Matrix_1.6-1.1, KernSmooth_2.23-22, lifecycle_1.0.4, compiler_4.3.2, farver_2.1.1, deldir_2.0-4, munsell_0.5.0, codetools_0.2-19, class_7.3-22, pillar_1.9.0, crayon_1.5.2, MASS_7.3-60, classInt_0.4-10, wk_0.9.1, iterators_1.0.14, boot_1.3-28.1, tidyselect_1.2.1, stringi_1.8.2, labeling_0.4.3, grid_4.3.2, colorspace_2.1-0, cli_3.6.1, magrittr_2.0.3, utf8_1.2.4, e1071_1.7-14, withr_3.0.2, sp_2.1-3, timechange_0.2.0, hms_1.1.3, viridisLite_0.4.2, s2_1.1.6, rlang_1.1.2, Rcpp_1.0.11, glue_1.6.2, DBI_1.1.3, rstudioapi_0.15.0, R6_2.5.1, units_0.8-5


BCO-DMO Processing Description

* Created zip files for submitted datasets



Data Files

File
model_comparisons.zip
(ZIP Archive (ZIP), 4.88 KB)
MD5:353a6d2dc2f41442de80fba0135f1e98
Statistical intermediate that compares the number of times a given model was within 2 AIC points of the lowest AIC score for that scale. The model that falls within this range most often is ultimately selected as the 'best' model and used to study variable effects over scale. Naming scheme is: scale of analysis, species code, "model_comparison".

Parameters in the .csv files of the folder:
Column Name,Column Description [Include meaning of any codes or flags used in data column as well as detection limits.],Units of measurement,missing data/no data value
best_model ,GAM formula for the model being compared,no units,no missing values
count ,the number of instances over all permutations of buffer and edge at this analysis scale in which this model was within 2 points of the lowest AIC score ,count ,no missing values
r.sq ,the average adjusted GAM r.sq value over all scales. ,no units,no missing values
mean_buffer,the mean buffer value for all instances where this model was selected. ,meters ,no missing values
model_lists.zip
(ZIP Archive (ZIP), 17.97 KB)
MD5:5e3d3309b313fd827be42865f0da1b10
List of models that were determined to be within 2 AIC points of the best model for each permutation of edge and buffer.

Parameters of the .csv files:
Column Name,Column Description [Include meaning of any codes or flags used in data column as well as detection limits.],Units of measurement,missing data/no data value
edge,"distance from the marsh edge in meters used to define edge habitat (1,3,or 5)",meters ,no missing values
buffer,radius of habitat region around the sample site used to aggregate habitat data. ,meters ,no missing values
sp,species code for the current analysis,no units,no missing values
best_model,model formula ,no units,no missing values
aic,aic score for the model ,no units,no missing values
r.sq ,the average adjusted GAM r.sq value over all scales. ,no units,no missing values
output_over_scale.zip
(ZIP Archive (ZIP), 14.37 KB)
MD5:e26ca2d2e1e59f39d1fc7fad8229dbc3
Tables that show outputs of the selected model (as seen in the top_model.txt file corresponding to the species and analysis) over all scales in the analysis. Naming convention is: analysis scale, "_", species code, "output_over_scales.csv".

Parameters of the .csv files:
Column Name,Column Description [Include meaning of any codes or flags used in data column as well as detection limits.],Units of measurement,missing data/no data value
edge,"distance from the marsh edge in meters used to define edge habitat (1,3,or 5)",meters ,no missing values
buffer,radius of habitat region around the sample site used to aggregate habitat data. ,meters ,no missing values
sp,species code for the current analysis,no units,no missing values
best_model,model formula ,no units,no missing values
r_sq ,the average adjusted GAM r.sq value over all scales. ,no units,no missing values
dev_ex,deviance in response variable explained by the model,no units,no missing values
moran_P,P value for Moran's I test ,no units,no missing values
cov,column indicating presence of covariance between variables. -Inf indicates no covariance,no units,no missing values
p.edge_perc,p value for % edge habitat ,no units,no missing values
p.edge_l.mangrove,p value for % mangrove habitat,no units,no missing values
p.land_water_ratio,p value for %land ,no units,no missing values
p.man_lwr,p value for interaction of % mangrove and % land ,no units,no missing values
p.man_edge,p value for interaction of % mangrove and % edge,no units,no missing values
spatial_analysis_across_scales.R
(R Script, 135.27 KB)
MD5:ce04bd064c0a697fe804cc339e864d4f
Main code file; takes inputs from the linked datasets and produces the outputs shown here.
species_aic.zip
(ZIP Archive (ZIP), 119.04 KB)
MD5:36ae9c9aaea60b28ec25d651a4a99aad
AIC scores for all models tested at every permutation for every species. Naming convention is: species code, "_", scale of analysis, "edge", edge distance (m), "buf", buffer distance (m).
topmodel.zip
(ZIP Archive (ZIP), 2.48 KB)
MD5:7c8d4c8b1ce1373821fe734619a375b5
Text files with the formula of the top model chosen for the analysis for each species at each analysis scale. Naming convention is: analysis scale, species code, "_topmodel".


Related Datasets

IsDerivedFrom
Leavitt, H., Thomas, A., Nelson, J. (2025) Habitat variables (mangrove, marsh, water) of Port Fourchon, LA dervied from drone imagery taken in spring 2023. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 1) Version Date 2025-01-08 doi:10.26008/1912/bco-dmo.948112.1 [view at BCO-DMO]
Relationship Description: The dataset "2023 Drone-derived habitat scales from Port Fourchon, LA" contains the habitat variable tables for each site included in the analysis at scales where we used drone-derived data to calculate habitat metrics. Each table corresponds to a habitat scale that was tested in this model. In the code, this data is imported as the object hab and merged with species data to generate the pf.env table used in the GAM models. The scripts necessary to generate this data from raw shapefiles are also included.
Leavitt, H., Thomas, A., Nelson, J. (2025) Habitat variables (mangrove, marsh, water) of Port Fourchon, LA dervied from satellite imagery taken in fall 2022. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 1) Version Date 2025-01-06 doi:10.26008/1912/bco-dmo.947975.1 [view at BCO-DMO]
Relationship Description: The dataset "2022 Satellite-derived habitat scales from Port Fourchon, Louisiana" contains the habitat variable tables for each site included in the analysis at scales where we used satellite-derived data to calculate habitat metrics. Each table corresponds to a habitat scale that was tested in this model. In the code, this data is imported as the object hab and merged with species data to generate the pf.env table used in the GAM models. The scripts necessary to generate this data from raw shapefiles are also included.
Leavitt, H., Thomas, A., Nelson, J. (2025) Species counts, site-level information and environmental context sampled near Port Fourchon, Louisiana from September 23 - 29, 2022. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 1) Version Date 2025-04-07 doi:10.26008/1912/bco-dmo.947784.1 [view at BCO-DMO]
Relationship Description: The dataset "Drop-sampling species data collected from Port Fourchon, LA during Fall 2022 sampling season" contains the species abundance data used in this analysis. Abundance data is linked to the habitat data in this analysis using the identifier site_date_key. In the code, this data is imported as the object comtab and merged with habitat data to produce a dataset pf.env including species abundance and the habitat variables at a given scale for each site in the analysis.


Parameters

Parameters for this dataset have not yet been identified


Project Information

CAREER: Integrating Seascapes and Energy Flow: learning and teaching about energy, biodiversity, and ecosystem function on the frontlines of climate change (Louisiana E-scapes)


Coverage: Saltmarsh ecosystem near Port Fourchon, LA


NSF Award Abstract:
Coastal marshes provide a suite of vital functions that support natural and human communities. Humans frequently take for granted and exploit these ecosystem services without fully understanding the ecological feedbacks, linkages, and interdependencies of these processes to the wider ecosystem. As demands on coastal ecosystem services have risen, marshes have experienced substantial loss due to direct and indirect impacts from human activity. The rapidly changing coastal ecosystems of Louisiana provide a natural experiment for understanding how coastal change alters ecosystem function. This project is developing new metrics and tools to assess food web variability and test hypotheses on biodiversity and ecosystem function in coastal Louisiana. The research is determining how changing habitat configuration alters the distribution of energy across the seascape in a multitrophic system. This work is engaging students from the University of Louisiana Lafayette and Dillard University in place-based learning by immersing them in the research and local restoration efforts to address land loss and preserve critical ecosystem services. Students are developing a deeper understanding of the complex issues facing coastal regions through formal course work, directed field work, and outreach. Students are interacting with stakeholders and managers who are currently battling coastal change. Their directed research projects are documenting changes in coastal habitat and coupling this knowledge with the consequences to ecosystems and the people who depend on them. By participating in the project students are emerging with knowledge and training that is making them into informed citizens and capable stewards of the future of our coastal ecosystems, while also preparing them for careers in STEM. The project is supporting two graduate students and a post-doc.

The transformation and movement of energy through a food web are key links between biodiversity and ecosystem function. A major hurdle to testing biodiversity ecosystem function theory is a limited ability to assess food web variability in space and time. This research is quantifying changing seascape structure, species diversity, and food web structure to better understand the relationship between biodiversity and energy flow through ecosystems. The project uses cutting edge tools and metrics to test hypotheses on how the distribution, abundance, and diversity of key species are altered by ecosystem change and how this affects function. The hypotheses driving the research are: 1) habitat is a more important indirect driver of trophic structure than a direct change to primary trophic pathways; and 2) horizontal and vertical diversity increases with habitat resource index. Stable isotope analysis is characterizing energy flow through the food web. Changes in horizontal and vertical diversity in a multitrophic system are being quantified using aerial surveys and field sampling. To assess the spatial and temporal change in food web resources, the project is combining results from stable isotope analysis and drone-based remote sensing technology to generate consumer specific energetic seascape maps (E-scapes) and trophic niche metrics. In combination these new metrics are providing insight into species’ responses to changing food web function across the seascape and through time.

This project is jointly funded by Biological Oceanography and the Established Program to Stimulate Competitive Research (EPSCoR).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.




Funding

Funding Source | Award
NSF Division of Ocean Sciences (NSF OCE)
