Knowledge of predator diets provides numerous insights into their ecology. Diet estimation therefore remains an active area of research in quantitative ecology. Quantitative Fatty Acid Signature Analysis (QFASA) is a method of estimating the diet composition of predators that was introduced by Iverson et al. (2004).
The fundamental unit of information in QFASA is a fatty acid signature (signature), which is a vector of proportions describing the fatty acid mass composition of lipids. For diet estimation, signatures from one or more predators and from samples of all types of prey potentially consumed by the predators are required. Calibration coefficients, which adjust for the differential metabolism of individual fatty acids by predators, are also required. Given those data inputs, a predator signature is modeled as a linear mixture of the mean prey-type signatures and diet composition is estimated as the specific mixture that minimizes a measure of distance between the observed and modeled predator signatures.
The package qfasar originated as a byproduct of exploration into the performance of QFASA, investigation of alternative modeling and estimation techniques, and the desire to share computational capacity among collaborators participating in that work. qfasar implements a variety of options for the estimation of predator diets in typical QFASA applications, including some new diagnostic measures, as well as methods to explore and potentially improve the performance of a library of prey signatures. In addition, qfasar is designed to facilitate additional research into improved estimation methods through implementation of various bootstrapping and simulation techniques. Several methods that have not previously been published have been included to facilitate investigation of their performance.
Because qfasar has been intentionally designed to support future research into QFASA methodology, the workflow that may be necessary is impossible to predict. Consequently, the user may need to do a little more work and assume greater responsibility for code development than is the case with some R packages that implement analyses with a more predictable workflow. Specific tasks and computations have been encapsulated into functions, but the user needs to know which functions to call and in what order. In other words, the user needs to be familiar with the breadth of functionality available. However, general guidance is provided for users whose only need is to estimate predator diet composition (see Getting started).
Users whose objectives involve basic diet estimation need to know how to get their data organized for an analysis, understand the options for diet estimation that are available in qfasar, and ultimately estimate diet composition. That information is summarized in the following sections:
Users that wish to evaluate the performance of a prey library, explore a prey library for hidden structure to potentially improve diet estimates, or use diagnostic techniques that might provide insights into the suitability of diet estimates or potential estimation problems should also read the section Diagnostic functionality.
Users whose objectives involve simulation-based research should read the section Simulation functionality for an overview of the tools implemented in qfasar.
Once qfasar has been installed, it can be attached and made
available for use by typing the command
library(qfasar)
.
To view a list of the contents of qfasar, type
library(help=qfasar)
.
To view the documentation of a function, which may contain more
technical information than is included in this vignette, type
?function_name
.
To be notified when new versions of qfasar are available, e-mail a request to be added to the qfasar distribution list to [email protected]. Please include your name and affiliation in the request.
Three types of data are potentially necessary when using
qfasar. The user will need to use an appropriate base R
function, such as read.csv()
, to read data into data
frames, possibly modify the data frames to get them in the required
formats, and pass the data frames as arguments to qfasar
functions. Note that qfasar has strict formatting requirements
for the data frames, which are described in the following
subsections.
Users may find it most convenient to store data in comma separated variable (csv) ASCII files in formats similar or identical to the data frames.
The prey signature data frame contains information on prey types, unique sample identifiers, and fatty acid signature proportions or percentages. The formatting requirements for this file are:
header=TRUE
option when reading the data into R. Fatty
acid names in the prey data frame must exactly match the names in the
predator and fatty acid suite data frames (described below).As an example, the first few rows and columns of a prey signature data file used by Bromaghin et al. (2015), in comma separated variable (csv) format, follow. Note that extra spaces have been added between commas and the subsequent values to enhance readability here, but extra spaces should not be included in the data file.
Species, ID, c12.0, c13.0, Iso14, c14.0, c14.1w9, c14.1w7, c14.1w5
Atlantic argentine, E68, 0.17, 0.05, 0.02, 6.17, 0.21, 0.03, 0.07
Atlantic argentine, E69, 0.11, 0.05, 0.03, 6.37, 0.27, 0.03, 0.06
Atlantic argentine, E70, 0.14, 0.05, 0.03, 6.64, 0.26, 0.03, 0.06
Atlantic argentine, E71, 0.15, 0.05, 0.03, 7.16, 0.28, 0.02, 0.06
Atlantic argentine, E72, 0.13, 0.05, 0.02, 6.65, 0.29, 0.02, 0.05
Atlantic argentine, E73, 0.14, 0.04, 0.02, 5.68, 0.20, 0.02, 0.06
Atlantic argentine, E74, 0.16, 0.05, 0.02, 6.70, 0.25, 0.02, 0.06
Atlantic argentine, E75, 0.17, 0.04, 0.02, 6.32, 0.23, 0.03, 0.06
Atlantic argentine, E76, 0.11, 0.04, 0.02, 6.94, 0.27, 0.02, 0.06
Atlantic argentine, E77, 0.14, 0.05, 0.02, 6.97, 0.26, 0.03, 0.06
Butterfish, E141, 0.30, 0.11, 0.03, 5.87, 0.09, 0.05, 0.05
Butterfish, E142, 0.10, 0.10, 0.02, 4.93, 0.07, 0.03, 0.04
NOTE: the prey data file can contain columns with additional “tombstone” information such as sex, age, sampling date, sampling location, etc. If so, read the data file into a data frame and from that data frame create a second data frame in the required format. Including the sample ID in both data frames will allow the subsequent merging of qfasar results with the tombstone information.
The predator signature data frame must be formatted exactly like the prey signature data frame (see Prey signature data), with each record being comprised of a predator type, a unique sample ID, and signature proportions.
The only difference between how prey and predator data are handled by qfasar is that the first column in the predator data frame is used to designate types of predators for which separate estimates of mean diet composition are desired. For example, Rode et al. (2014) estimated mean diet composition for four sex-age classes of polar bears, so the first column might contain the strings adult-male, adult-female, subadult-male, and subadult-female. If separate mean estimates are not desired for predator types, each row of the data frame should have an identical string in the first column, such as polar-bear.
NOTE: the predator data file can contain columns with additional “tombstone” information such as sex, age, sampling date, sampling location, etc. If so, read the data file into a data frame and from that data frame create a second data frame in the required format. Including the sample ID in both data frames will allow the subsequent merging of qfasar results with the tombstone information.
The fatty acid suite data frame contains fatty acid names, one or more sets of calibration coefficients, and definitions for one or more suites of fatty acids. The data frame must strictly meet the following formatting requirements.
header=TRUE
option when
reading the data into R.Use of the first row in the data frame to denote the set of calibration coefficients and fatty acid suite to be used in an analysis is intended to allow all available sets of calibration coefficients and all potential suites of fatty acids to be conveniently stored in the same file, while permitting fast and easy switching between them. It also allows all fatty acid data to be stored in a single file, which is more convenient and less confusing than having separate data files for each suite of fatty acids that one might want to use.
The first few rows of a csv fatty acid suite data file follow below as an example. This example has a single set of calibration coefficients (harbor_seal), definitions for two suites of fatty acids (dietary and ext_dietary), and comments in the last column. Note that extra spaces have been added between commas and the subsequent values to enhance readability here, but extra spaces should not be included in the data file.
fatty_acid, harbor_seal, dietary, ext_dietary, comments
use_me, 1, 0, 1, Values from Rosen and Tollit (2012) unless noted
c12.0, 0.863, 0, 0,
c13.0, 0.516, 0, 0,
Iso14, 0.806, 0, 0,
c14.0, 0.588, 0, 1,
c14.1w9, 0.823, 0, 0,
c14.1w7, 2.853, 0, 0,
c14.1w5, 12.137, 0, 0,
Iso15, 0.538, 0, 0,
Anti15, 0.785, 0, 0,
c15.0, 0.673, 0, 0,
c15.1w8, 0.942, 0, 0,
c15.1w6, 6.372, 0, 0,
Iso16, 0.812, 0, 0,
c16.0, 0.479, 0, 1,
c16.1w11, 1.898, 0, 0,
c16.1w9, 3.506, 0, 0,
c16.1w7, 2.377, 0, 1,
c7Me16.0, 0.976, 0, 0,
c16.1w5, 1.55, 0, 0,
c16.2w6, 0.745, 1, 1,
Iso17, 0.864, 0, 0,
c16.2w4, 0.954, 1, 1,
c16.3w6, 0.759, 1, 1,
c17.0, 0.1, 0, 1,
c16.3w4, 0.552, 1, 1,
c17.1, 1.618, 0, 0,
c16.3w1, 1, 1, 1, From Nordstrom et al. (2008), BDL in Rosen & Tollit
The first steps in an analysis performed in qfasar involve
reading the fatty acid suite, prey signature data, and possibly predator
signature data into data frames (see Data frames). This will most
likely be accomplished using the base R function
read.csv()
, or a related function, using code similar to
that in the following code snippet.
# Specify directories and files.
my_data_dir <- "C:/Research/QFASA/My_Project"
my_fa_suites <- "my_fa_file"
my_prey_sigs <- "my_prey_data"
my_pred_sigs <- "my_pred_data"
# Set the working directory.
setwd(my_data_dir)
# Read the fatty acid suite information and signatures into data frames.
df_fa <- read.csv(file = my_fa_suites, header = TRUE)
df_prey <- read.csv(file = my_prey_sigs, header = TRUE)
df_pred <- read.csv(file = my_pred_sigs, header = TRUE)
The initial data frames read from files may need to be modified to get them in the format required by qfasar (see Data frames). Once the data frames have been correctly formatted, the following function calls are, or may be, required to prepare the data for analysis:
prep_fa()
with the
fatty acid suites data frame.prep_sig()
with the
prey data frame.prep_sig()
with the predator
data frame is required if the analysis will involve predator signature
data.cc_aug()
is required if
signatures are augmented (see Scaling signatures and
Calibration
coefficient for an augmented proportion.The tasks performed by each of the functions mentioned in the above list are described in the follow subsections.
The function prep_fa()
takes the fatty acid suite data
frame as input and returns a list containing the following objects.
The following code snippet contains an example call to
prep_fa()
.
# Process the fatty acid suite information.
fa_info <- prep_fa(df_fa = df_fa)
str(fa_info)
The function prep_sig()
modifies raw fatty acid
signatures in two important ways to prepare them for analysis, replacing
invalid signature proportions and scaling signatures, each of which is
described below.
prep_sig()
returns a list containing the following
objects. Note that many of these objects are required inputs to other
functions in qfasar.
Any signature proportions that are negative, missing values (value NA
in R), or equal to zero are generally invalid. prep_sig()
replaces these invalid values, usually with a small positive constant
(with one exception described below). This is required because the
definitions of two popular distance measures, the Aitchison (Stewart et
al. 2014) and the Kullback-Leibler (Iverson et al. 2004) involve
logarithms and log(0) is not defined. prep_sig()
implements
two slightly different methods of replacing invalid proportions, the
choice of method being controlled by the argument zero_rep
in the call to prep_sig()
.
If 0 <= zero_rep
<= 0.01, the specified value is
used to replace invalid proportions. Note that if either the Aitchison
or Kullback-Leibler distance measure will be used in an analysis, the
value of zero_rep
must be strictly greater than 0. The
chi-square distance measure (Stewart et al. 2014) is defined for
proportions of zero, so the value of zero_rep
may equal
zero if that distance measure will be used in the analysis. Information
on the distance measures implemented in qfasar is available in
the section Distance
measures below.
If 10 <= zero_rep
<= 100, the value is interpreted
as a percentage, the smallest non-zero proportion in the signature data
is multiplied by the percentage and divided by 100, and the result is
used to replace invalid proportions. The default value for
zero_rep
is 75, a rather uninformed choice.
No matter which of these two methods is used to replace invalid proportions, the modified signatures are then scaled to sum to 1.0 using the multiplicative method (Martin-Fernandez et al. 2011).
Note that for analyses involving both prey and predator signatures,
such as estimation of diet composition, it may be advisable to use a
common value to replace invalid proportions in both prey and predator
signatures. One way to accomplish this is to call
prep_sig()
with the either the prey or predator data frame,
and then use the value it returns as zero_rep_val
in the
call with the other data frame. The following code snippet presents an
example of this.
# Prepare the signatures for analysis.
prey_info <- prep_sig(df_sig = df_prey,
fa_names = fa_info$fa_names,
use_fa = fa_info$use,
zero_rep = 90)
str(prey_info)
pred_info <- prep_sig(df_sig = df_pred,
fa_names = fa_info$fa_names,
use_fa = fa_info$use,
zero_rep = prey_info$zero_rep_val)
str(pred_info)
Most QFASA applications use a subset of all available fatty acids
(e.g., Iverson et al. 2004; Wang et al. 2010; Bromaghin et al. 2013).
The traditional analytical method is to censor the fatty acids that are
not wanted and scale the proportions of the remaining fatty acids so
they sum to 1.0. However, Bromaghin et al. (2016b) found that the
traditional scaling approach can meaningfully bias diet estimates under
some conditions. They explored the performance of two alternative
scaling options and found that both avoid the bias caused by traditional
scaling and have comparable performance. prep_sig()
implements all three scaling options evaluated by Bromaghin et
al. (2016b).
The choice of the scaling option is controled by the
prep_sig()
argument scale
.
scale = 1
Traditional scaling; the proportions within
each censored signature are scaled to sum to 1.0. This option is not
recommended for routine use in QFASA applications, as Bromaghin et
al. (2016b) found that it can meaningfully bias diet estimates under
some conditions. It is implemented here to provide compatibility with
traditional methods and to potentially facilitate future research.scale = 2
The proportions within each censored
signature are not scaled, so each signature will have a different
“partial sum” (see Bromaghin et al. (2016b) for a definition of that
term).scale = 3
Each censored signature is augmented with an
additional proportion whose value equals the sum of the censored
proportions, so that the proportions in each augmented signature sum to
1.0. This is the default option.The following code snippet illustrates a call to the function
prep_sig()
that overrides the default value of the scale
argument.
pred_info <- prep_sig(df_sig = df_pred,
fa_names = fa_info$fa_names,
use_fa = fa_info$use,
zero_rep = prey_info$zero_rep_val,
scale = 1)
Calibration coefficients provide a one-to-one mapping between the
prey and predator spaces (Bromaghin et al. 2015). However, when using
signature augmentation (see Scaling signatures),
no calibration coefficient is available for the augmented proportion. In
this circumstance, and if an analysis will involve diet estimation or
otherwise require use of calibration coefficients, the call to
prep_sig()
with the prey library must be followed by a call
to cc_aug()
before the analysis can proceed.
The following code snippet illustrates a call to the function
cc_aug()
.
# Compute the calibration coefficient for the augmented proportion.
new_cc <- cc_aug(sig_rep = prey_info$sig_rep,
sig_scale = prey_info$sig_scale,
cc_all = fa_info$cc,
use_fa = fa_info$use,
dist_meas = 1)
The algorithm implemented in cc_aug()
is straightforward
(Bromaghin In press). The calibration coefficients cc
are
used to transform complete prey signatures in sig_rep
to
the predator space, and the transformed signatures are censored to the
suite of fatty acids defined by the argument use_fa
and
then augmented (Bromaghin et al. 2016b). The subset of calibration
coefficients in the suite definition use_fa
are combined
with an initial calibration coefficient for the augmented proportion
(initial guess of 1), the censored signatures in sig_scale
are also transformed to the predator space, and the total distance
between the two sets of censored signatures is computed. The calibration
coefficient for the augmented proportion is taken as the value that
minimizes the total distance. The function Rsolnp::solnp()
(Ghalanos & Theussl 2014) is used to minimize the distance between
the two sets of censored signatures.
cc_aug()
returns a list containing the following
objects.
qfasar implements three distance measures that have been used by QFASA practitioners and researchers:
A form of the Kullback-Leibler distance was recommended by Iverson et al. (2004). The Kullback-Leibler distance represents a balance between the absolute difference between two proportions and the relative difference or ratio between the proportions. Iverson et al. (2004) did some work to compare the Kullback-Leibler distance with several other distance measures, such as least squares, but the extent and results of those comparisons were not well-documented.
The Aitchison distance (Aitchison et al. 2000) has been used in some recent methodology work (e.g., Stewart and Field 2011; Bromaghin et al. 2015, 2016a, 2016b). It is based on differences between the logarithms of the ratio of a fatty acid proportion and the geometric mean of all proportions in a signature, and is therefore a relative measure.
Stewart et al. (2014) used the “chi-square” distance measure. This measure is based on the ratio of the squared difference between proportions and their sum, after the proportions have been power transformed and reclosed (scaled to sum to 1). With a power parameter of 1, this measure is therefore based on squared differences. However, Stewart et al. (2014) state that this measure asymptotically approaches the Aitchison under some conditions as the value of the power parameter approaches 0.
The following summary comments are collectively based on the simulation work of Bromaghin et al. (2015, 2016a, 2016b) to compare the performance of QFASA estimators. The estimator based on the Aitchison distance tends to have less bias than the Kullback-Leibler estimator. An additional advantage of the Aitchison distance is that it produces identical diet estimates in both the prey and predator spaces (Estimation spaces), subject to slight differences due to numerical optimization. For those reasons, it is the default distance measure in qfasar. However, with some prey libraries and predator diets, estimates based on the Kullback-Leibler distance have had less bias. More limited simulation work has found the chi-square distance to perform similarly to the Kullback-Leibler, which is not surprising as a power parameter of 1 Chi-square distance power parameter was used and so the chi-square distance was essentially based on relative squared differences. The performance of the chi-square distance for different values of the gamma parameter has not yet been tested. Importantly, this body of simulation work found that the interaction of a prey library, a specific diet composition, and the values of the calibration coefficients interact strongly to influence QFASA performance, and this strong interaction makes it difficult to make universal recommendations regarding methodological options, including the choice of a distance measure.
The chi-square distance involves a power transformation of signature
proportions, with the power parameter being denoted gamma
.
The function comp_chi_gamma()
implements the algorithm of
Stewart et al. (2014) to find a suitable value of
gamma
.
The algorithm is initialized with inv_gamma
equal to 1
and gamma
is computed as 1/inv_gamma
. The
distances between all possible pairs of full signatures are computed
(distances). For each pair of full signatures, the distances between all
possible sub-signatures comprised of only two fatty acid proportions are
computed (sub-distances). The proportion of sub-distances that exceed
their corresponding distance is computed across all possible pairs of
signatures. If that proportion is less than a user specified argument
near_zero
, the function returns with gamma
equal to 1; otherwise, the function enters an iterative phase. At each
iteration, inv_gamma
is incremented by 1,
gamma
is computed as 1/inv_gamma
, distances
and sub-distances are recomputed, and the proportion of the
sub-distances that exceed their corresponding distance is recomputed.
The algorithm terminates when the proportion is less than the argument
near_zero
or the value of gamma
is less than
min_gamma
.
The argument space
must equal 1 or 2 (see Estimating
individual diet and variance). If its value is 1, the
calibration coefficients are used to map the signatures to the predator
space prior to initializing the algorithm.
The following code snippet contains an example call to
comp_chi_gamma()
.
# Estimate the chi-square distance gamma parameter.
gamma_est <- comp_chi_gamma(sigs = prey_info$sig_rep,
cc = fa_info$cc,
near_zero = 0.00001,
min_gamma = 0.05,
space = 1)
str(est_gamma)
comp_chi_gamma()
returns a list containing the following
objects:
gamma
.gamma
at each step of the iteration.In the above code block, note that gamma
is estimated
using the full signatures with any values equal to zero or missing
replaced (prey_info$sig_rep
) and all calibration
coefficients. If using augmentation, one might initially think to call
comp_chi_gamma()
with augmented signatures (Scaling signatures).
However, specification of a distance measure is required to obtain a
calibration coefficient for the augmented proportion (Calibration
coefficient for an augmented proportion). Consequently, if
augmentation is used, gamma
must be estimated using the
full signatures. If traditional scaling is to be used
(scale <- 1
), one could estimate gamma
either with the full signatures or with the scaled signatures, though we
have no idea how similar the two estimates of gamma
would
be; with traditionally scaled signatures, one would need to restrict the
calibration coefficients to those associated with fatty acids in the
scaled signatures, i.e.,
my_ccs <- fa_info$cc[fa_info$use]
. If one is not scaling
signatures (scale <- 2
), they will be scaled internally
for the computation of gamma.
As the number of signatures in the library and/or the number of fatty
acids in a signature increases, the number of possible pairs of
signatures and the number of all possible two-proportion sub-signatures
increases rapidly. Consequently, comp_chi_gamma()
is likely
to require long run times. However, it only needs to be run once for any
particular library of signatures.
I have little experience with the chi-square distance and with
different values of the gamma
parameter. Stewart et
al. (2014) state that the chi-square distance approaches the Aitchison
distance as gamma
decreases. However, as gamma
decreases, the power-transformed signature components become
increasingly similar in magnitude, a characteristic which seems
counterintuitive for a distance measure, at least on the surface.
Consequently, I provide no recommendations regarding gamma
and provide this functionality for the sake of completeness and to
facilitate future comparative research into the performance of QFASA
with the chi-square distance and a range of values for
gamma
.
Bromaghin et al. (2015) introduced the terms “prey space” and “predator space”. The prey space is simply the simplex in which the prey signatures reside and, similarly, the predator space is the simplex in which the predator signatures reside. The spaces differ due to predator metabolism of ingested prey tissue and the resulting modification of proportions. As mentioned earlier (Calibration coefficient for an augmented proportion), calibration coefficients provide a one-to-one mapping or transformation between the prey and predator spaces.
Diet estimation and other analyses, such as computer simulation, can be performed in either space. Iverson et al. (2004) used the calibration coefficients to map predator signatures to the prey space for diet estimation. Bromaghin et al. (2013) took the opposite approach, mapping prey signatures to the predator space for diet estimation. As diet estimation involves modeling predator signatures, operating in the predator space may seem more natural, but simulation work has not revealed any strong reason to prefer one space over the other (Bromaghin et al. 2015). However, there is one important issue with respect to estimation space to be aware of involving distance measures (Distance measures).
Estimating diet composition with the Kullback-Leibler distance measure may produce different estimates in the two spaces (Bromaghin et al 2015). This difference is caused by the component of the Kullback-Leibler distance that involves absolute differences between proportions, in combination with the change in the magnitude of proportions induced by application of the calibration coefficients. Unfortunately, I am unaware of a means to determine which set of estimates are better in any particular investigation. Although the sensitivity of the chi-square distance measure to estimation space has not been explicitly explored, its definition is based on absolute differences between power-transformed proportions and I therefore suspect estimates obtained in the two spaces would also differ. The Aitchison distance is scale-invariant and estimates therefore do not differ as a function of the estimation space in which they are obtained, except for small differences attributable to numerical optimization. This stability is one reason the Aitchison distance is the default distance measure in qfasar.
Diet estimation is the central purpose of qfasar and a
variety of estimation options are implemented in the function
est_diet()
. Estimation options include choices for
estimation space, measuring distance between signatures, estimating mean
diet composition, and estimating the variances of individual and mean
diet composition estimates. Please refer to the sections Estimation spaces and
Distance measures
for information on those topics.
The data arguments passed in a call to est_diet()
are
intended to be the objects returned by calls to prep_fa()
with the fatty acid suites data frame, prep_sig()
with the
predator data frame, and prep_sig()
with the prey data
frame (see Preparing
for an analysis). The remaining arguments are integer
indicators of method selection or bootstrap sample sizes, described in
subsequent subsections.
The following block of code presents a complete example from reading data files to diet estimation.
# Specify directories and file names.
my_data_dir <- "C:/Research/QFASA/My_Project"
my_fa_suites <- "my_fa_file"
my_prey_sigs <- "my_prey_data"
my_pred_sigs <- "my_pred_data"
# Specify constants.
my_zero_rep <- 90
my_scale <- 3
my_dist_meas <- 1
my_gamma <- NA
my_space <- 1
my_ind_boot <- 100
my_mean_meth <- 1
my_var_meth <- 1
my_mean_boot <- 250
# Set the working directory.
setwd(my_data_dir)
# Read and prepare fatty acid information.
df_fa <- read.csv(file = my_fa_suites, header = TRUE)
fa_info <- prep_fa(df_fa = df_fa)
str(fa_info)
# Read and prepare prey signatures.
df_prey <- read.csv(file = my_prey_sigs, header = TRUE)
prey_info <- prep_sig(df_sig = df_prey,
fa_names = fa_info$fa_names,
use_fa = fa_info$use,
zero_rep = my_zero_rep,
scale = my_scale)
str(prey_info)
# Read and prepare predator signatures.
df_pred <- read.csv(file = my_pred_sigs, header = TRUE)
pred_info <- prep_sig(df_sig = df_pred,
fa_names = fa_info$fa_names,
use_fa = fa_info$use,
zero_rep = prey_info$zero_rep_val,
scale = my_scale)
str(pred_info)
# I used signature augmentation in the above call to prep_sig(), i.e., scale == 3.
# Consequently need to compute a calibration coefficient for the augmented proportion.
new_cc <- cc_aug(sig_rep = prey_info$sig_rep,
sig_scale = prey_info$sig_scale,
cc_all = fa_info$cc,
use_fa = fa_info$use,
dist_meas = my_dist_meas,
gamma = my_gamma)
# Estimate predator diets.
# Use cc = fa_info$cc[fa_info$use] if signatures are not augmented.
diet_est <- est_diet(pred_sigs = pred_info$sig_scale,
pred_uniq_types = pred_info$uniq_types,
pred_loc = pred_info$loc,
prey_sigs = prey_info$sig_scale,
prey_uniq_types = prey_info$uniq_types,
prey_loc = prey_info$loc,
cc = new_cc$cc,
space = my_space,
dist_meas = my_dist_meas,
gamma = my_gamma,
ind_boot = my_ind_boot,
mean_meth = my_mean_meth,
var_meth = my_var_meth,
mean_boot = my_mean_boot)
str(diet_est)
est_diet()
returns a list containing the following
elements:
space
.space
.space
.NOTE: The numerical optimization and bootstrap sampling performed by
est_diet()
are numerically intensive and can result in long
run times. Patience is advised! The primary factors causing slow
execution are the number of predator signatures, the number of predator
and prey types, and the bootstrap sample sizes.
Options for estimating individual and mean diet composition, the variance of diet composition estimates, and adjusting diet estimates for differential prey fat content are described in the following subsections.
The diet composition of an individual predator is estimating by
minimizing a measure of distance (Distance measures)
between its observed signature and a modeled signature computed from the
diet composition and mean prey-type signatures (Iverson et al. 2004).
est_diet()
uses the function Rsolnp::solnp()
to minimize the distance (Ghalanos & Theussl 2014). The selection of
a distance measure is controlled by the argument
dist_meas
:
Diet estimation can occur in either the predator or prey space
(Bromaghin et al. 2015; Estimation spaces),
controlled by the argument space
.
The variance matrix of an individual diet composition estimate is
estimated through bootstrap sampling. The signatures of each prey type
are indepedently sampled with replacement, with sample sizes equal to
the prey-type sample sizes in the prey library, and amalgamated to
construct a bootstrapped prey library. The bootstrapped prey library is
then used to estimate the predator’s diet. This is repeated a specified
number of times (argument ind_boot
), resulting in
replicated estimates of the diet of each predator. The empirical
variance and covariance of the replicated prey-type estimates are used
to construct a covariance matrix for each predator (Beck et al. 2007,
Bromaghin et al. 2015). Note that the prey library is resampled
independently for each predator.
If a particular analysis does not require the variances of the diet
estimates for individual predators to be estimated, variance estimation
can be disabled by setting the individual bootstrap sample size equal to
zero (ind_boot <- 0
).
If estimated, the variance matrices, one for each predator, are
returned via the three-dimensional array var_ind
. The
predators index the third dimension. The following code snippet
illustrates how to extract the variance information for an individual
predator.
# Specify a predator.
this_pred <- 1
# Extract the variance matrix for this predator.
var_mat <- diet_est$var_ind[,,this_pred]
print(var_mat)
# Extract the standard errors for this predator.
se <- sqrt(diag(var_mat))
print(se)
Mean diet composition can be estimated for each type in the predator
signature data (see Data
frames). est_diet()
provides three options for
estimating mean diets, controlled by the argument
mean_meth
.
The empirical mean estimator is simply the average of the estimated diets for individual predators within each predator type.
The “parameterized mean” estimator is an unpublished estimator that I have experimented with to a limited extent. This estimator reparameterizes the QFASA model with a single vector of diet proportions common to all predators and mean diet is estimated by minimizing the distance between the mean signature modeled from the mean diet proportions and each predator’s observed signature, summed over all predators. The parameterized mean estimator has not yet been thoroughly tested and its inclusion in qfasar is intended to facilitate planned research. Our limited work with the parameterized mean estimator to date suggests that it may perform well when predator signatures are homogeneous, but may be more sensitive to the presence of “outlier” signatures in the predator data than the empirical estimator.
est_diet()
returns the estimated mean diet compositions
in the object est_mean
.
qfasar implements two methods of estimating the variance of
mean diet composition estimates, the variance estimator of Beck et
al. (2007) and a bootstrap estimator, controlled by the argument
var_meth
.
The bootstrap estimator draws independent samples of each prey type
to form a bootstrap prey library, exactly as in Estimating
individual diet and variance, and also a random sample of
each predator type, with replacement and sample sizes equal to the
observed sample sizes. Mean diet is estimated using the bootstrapped
prey and predator signatures and the method indicated by the argument
mean_meth
. The argument mean_boot
controls the
number of times this process is replicated, and the replications are
used to estimate the covariance matrix of the mean diet composition
estimates for each predator type. Unpublished work indicates that the
bootstrap estimator is more reliable than the Beck et al. (2007)
estimator.
Note that the Beck estimator cannot be used in combination with the parameterized-mean estimator of mean diet composition.
est_diet()
returns the estimated variances of the
predator-type mean diet composition estimates in the object
var_mean
.
The following code snippet illustrates one method of extracting the diagonal of the variance matrix for all predator types, computing the standard errors, and combining mean diet estimates and associated standard errors into a single matrix.
# There must be a more elegant way to extract all the diagonals into a matrix,
# but the following works.
# Allocate memory for the standard errors.
stderr <- matrix(data = 0, nrow = prey_info$n_types, ncol = pred_info$n_types)
colnames(stderr) <- colnames(diet_est$est_mean)
rownames(stderr) <- rownames(diet_est$est_mean)
# Extract the diagonal of the variance matrix for each predator type and compute
# the standard errors. Note: I use li# for "loop index number".
for(li1 in 1:pred_info$n_types){
stderr[,li1] <- sqrt(diag(diet_est$var_mean[,,li1]))
}
# Combine estimates of means and standard errors in one matrix.
est_comb <- rbind(diet_est$est_mean, stderr)
rownames(est_comb) <- c(paste("Mean", prey_info$uniq_types, sep = " - "),
paste("SE", prey_info$uniq_types, sep = " - "))
# Transpose the matrix (if desired?).
est_comb <- t(est_comb)
The function est_diet()
estimates diet composition in
terms of the fatty acid mass consumed (units [fatty acid mass]). Such
estimates may be of direct ecological interest, especially for
ecosystems in which lipids are particularly important. In other
situations, one may wish to transform estimates to a different scale
(Iverson et al. 2004). The qfasar function
adj_diet_fat()
performs such transformations.
adj_diet_fat()
takes constants specifying the
differential fat content of prey types, one or more estimates of diet
composition, and potentially one or more variance matrices as input
arguments. The diet composition and variance objects are intended to be
the corresponding objects returned by the function
est_diet()
. Either estimates for individual predators or
mean estimates for predator types may be passed to the function.
Note that the units of the fat-content constants (input argument
prey_fat
) control the units of the resulting transformed
diet estimates. For example, if the units of prey_fat
are
[fatty acid mass]/[animal mass], the adjusted estimates are in terms of
the relative mass of prey consumed (e.g., Bromaghin et al. 2013).
Alternatively, if the units of prey_fat
are [fatty acid
mass]/[animal], the adjusted estimates are in terms of the relative
numbers of prey animals consumed. It is the user’s responsibility to
know the units associated with the fat content values and interpret the
adjusted estimates correctly. In any case, the variance of the
transformed diet estimates is estimated from the input arguments using
the delta method (Seber 1982)
NOTE: values of the fat parameters are treated as known constants without variance.
The following code snippet illustrates an example call to
adj_diet_fat()
.
# Estimate predator diets in terms of fatty acid mass consumed.
mass_est <- est_diet(pred_sigs = pred_info$sig_scale,
pred_uniq_types = pred_info$uniq_types,
pred_loc = pred_info$loc,
prey_sigs = prey_info$sig_scale,
prey_uniq_types = prey_info$uniq_types,
prey_loc = prey_info$loc,
cc = new_cc$cc,
space = my_space,
dist_meas = my_dist_meas,
gamma = my_gamma,
ind_boot = my_ind_boot,
mean_meth = my_mean_meth,
var_meth = my_var_meth,
mean_boot = my_mean_boot)
# Specify fat content, one value for each prey type and in the alphanumeric order of the
# prey-type names. Here, I generate random values purely for illustrative purposes,
# because I do not have any fat data handy. With values, use the concatenate function:
# fat_vals <- c(val1, val2, ...).
fat_vals <- runif(n = length(prey_info$uniq_types), min = 0.25, max = 1.75)
names(fat_vals) <- prey_info$uniq_types
# Transform individual estimates of diet composition in terms of fatty acid mass
# to estimates in terms of some other desired scale. Note that the units of the
# fat constants determine the resulting scale. Ideally, the units would be
# (fatty acid mass)/(prey mass) or (fatty acid mass)/(prey animal).
trans_est_ind <- adj_diet_fat(prey_fat = fat_vals,
diet_est = mass_est$est_ind,
diet_var = mass_est$var_ind)
# Transform mean estimates of diet composition in terms of fatty acid mass
# to estimates in terms of some other desired scale.
trans_est_mean <- adj_diet_fat(prey_fat = fat_vals,
diet_est = mass_est$est_mean,
diet_var = mass_est$var_mean)
str(trans_est_mean)
adj_diet_fat()
returns a list containing the following
objects:
prey_fat
.prey_fat
.qfasar contains diagnostic functionality that may be useful for some applications. Those aspects of qfasar are described in the following subsections.
Prey and, perhaps to a lesser extent, predator signatures come with some pre-defined structure. A sample of signatures is required from each potential prey type, often species, so those a priori prey types represent structure within a prey library. Similarly, predator data may be compiled from some number of recognized classes, for example sex, age, or size classes. Such pre-defined structure helps frame an analysis in important ways, such as defining the prey types to which estimates of diet composition will pertain or classes of predators for which separate estimates of mean diet composition are desired.
However, one may reasonably wonder if additional structure exists within the signatures. For example, within a prey library structured by species, might there be clusters of individuals within one or more species that are more similar within clusters than between clusters? If so, there may be advantages to subdividing the prey library into a larger number of types.
qfasar implements an algorithm referred to as
divisive magnetic
clustering (dimac; Bromaghin et al. Under review) in
the function dimac()
to explore a collection of signatures
for additional structure within defined types. The dimac algorithm was
modified from the diana algorithm of the package cluster
(Maechler et al. 2016). The algorithm is initialized with all signatures
in one cluster. The first two “magnets”” are chosen as the two
signatures having the greatest distance between them, using a distance
measure of the user’s choice (Distance measures),
and each signature is drawn to the nearest magnet to form clusters. The
algorithm then enters an iterative phase. At each iteration, the cluster
with the greatest average distance between its signatures and the mean
signature is identified as the “active” cluster. The two signatures
within the active cluster having the greatest distance between them are
selected as new magnets. One of the two new magnets replaces the
original magnet for the active cluster and the second starts the
formation of an additional cluster. Each signature is drawn to the
nearest magnet to form clusters, without regard for its cluster
membership in the preceding iteration. Consequently, the algorithm is
not simply bifurcating, but rather is much more dynamic and flexible.
The iterations continue until each signature is in its own cluster.
dimac()
is intended to be called after the fatty acid
signatures have been prepared for analysis by calls to the functions
prep_fa()
and prep_sig()
. The following code
snippet illustrates a typical call to dimac()
.
# Explore prey signatures for sub-structure.
my_clust <- dimac(sigs = prey_info$sig_scale,
id = prey_info$id,
type = prey_info$uniq_types,
loc = prey_info$loc,
dist_meas = my_dist_meas)
str(my_clust)
dimac()
returns a list containing the following
elements:
Unfortunately, there is no objective method to determine the most
appropriate number of clusters. It may seem tempting to use a large
number of clusters in subsequent analyses to exploit fine scale
structure in the data. Indeed, Meynier et al. (2010) used individual
prey animals as prey types, so one might argue that having a large
number of clusters is acceptable. One disadvantage of using a large
number of prey types is that the speed of numerical optimization will be
slow with large prey libraries. Also, partitioning a prey library into a
large number of clusters can increase prey confounding between sub-types
(Bromaghin et al. Under review). My recommendation is to use the results
of dimac()
conservatively and only partition prey types
into multiple clusters if there is strong evidence for doing so. Examine
the distance results in the object clust_dist
returned by
make_prey_part()
and identify any large reductions in
distance that are followed by a more gradual decrease in distance as the
number of clusters increases further. An example of this pattern is
provided by the prey type Gaspereau from the Scotian fish prey library
(Budge et al. 2002; Bromaghin et al. 2016a).
Number_of_clusters <- 1:41
Distance <- c(209.8, 190.4, 107.6, 96.3, 88.8, 82.5, 76.5, 72.6, 64.2, 59.8,
55.2, 48.3, 45.2, 42.1, 40.0, 33.9, 31.8, 29.5, 26.9, 25.3,
23.0, 21.2, 19.1, 17.3, 16.2, 14.7, 14.8, 13.6, 11.1, 9.9,
8.5, 7.3, 6.5, 5.6, 5.1, 4.1, 3.2, 2.4, 1.4, 0.6,
0.0)
plot(x = Number_of_clusters, y = Distance, main="Scotian: Gaspereau")
Moving from one to two clusters provides little reduction in the summed distance within clusters, but the increase to three clusters is associated with a substantial drop in distance that is likely attributable to the discovery of some true structure within the signatures. As the number of clusters increases above three, distance decreases rather continuously to zero. I interpret this result as evidence that three clusters exist within the Gaspereau prey type and, in fact, the three clusters correspond well to ancillary information on the year and location samples were collected (Bromaghin et al. Under review).
Note that for diet estimation applications, partitioning a prey library into more clusters than the number of fatty acids used to estimate diet may result in estimates that are not unique. In such a case, estimates of diet composition need to be pooled into a smaller number of “reporting groups” (e.g., Bromaghin 2008; Meynier et al. 2010), which one would hope would largely eliminate any lack of uniqueness. See Bromaghin et al. (Under review) for a more detailed discussion of this issue.
To use the results of dimac()
prey partitioning for diet
estimation or some other analysis, the user needs to determine how many
clusters each prey type should be partitioned into. To do so,
dimac()
and the number of clusters for each prey
type.make_prey_part()
.The following code block continues the above example of Gaspereau and the Scotian fish prey library. Note that this prey library has 28 prey types; Gaspereau are the fifth prey type.
# Specify the number of clusters for each prey type.
n_clust <- c(2, 1, 4, 3, 3, 3, 1, 3, 1, 2,
2, 2, 3, 2, 3, 1, 1, 1, 3, 1,
1, 1, 1, 1, 2, 2, 2, 1)
# Partition the prey library.
prey_part <- make_prey_part(sig = prey_info$sig_scale,
clust = my_clust$clust,
n_clust = n_clust)
make_prey_part()
performs two important functions.
First, it uses the information on the desired number of clusters for
each prey type to construct a new prey signature library with modified
prey-type names that is ready for diet estimation or other analyses.
Second, it constructs two matrices to facilitate pooling diet estimates
made on the basis of the partitioned library to the original prey types
in the un-partitioned library.
make_prey_part()
returns a list containing the following
elements:
If significant structure is found within a prey library, estimating
diet composition on the basis of a partitioned prey library may produce
estimates with less bias and possibly less variation through a reduction
in prey confounding (Bromaghin et al. Under review). However, if a
partitioned library is used to estimate predator diet composition, one
may wish to pool the estimates back to the original, unpartitioned prey
types for reporting purposes. The functionality to do so is implemented
in the function diet_pool()
.
diet_pool()
takes information on the prey partition and
estimates returned by a call to est_diet()
(Diet estimation) as
inputs and returns estimates pooled back to the original, unpartitioned
prey types. The inputs to diet_pool()
are
make_prey_part()
as the object pool_post
.
Alternatively, a user-defined matrix for customized pooling (see below).
Each column defines a potentially pooled prey type.est_diet()
.var_ind
returned by a call to est_diet()
.est_mean
returned by a call to est_diet()
.var_mean
returned by a call to
est_diet()
.The following code block combines example code from Diet estimation and the Scotian fish prey library example from earlier in this section. Note that this prey library has 28 prey types.
# Specify the number of clusters for each prey type.
n_clust <- c(2, 1, 4, 3, 3, 3, 1, 3, 1, 2,
2, 2, 3, 2, 3, 1, 1, 1, 3, 1,
1, 1, 1, 1, 2, 2, 2, 1)
# Partition the prey library.
prey_part <- make_prey_part(sig = prey_info$sig_scale,
clust = my_clust$clust,
n_clust = n_clust)
# Estimate predator diets.
my_est <- est_diet(pred_sigs = pred_info$sig_scale,
pred_uniq_types = pred_info$uniq_types,
pred_loc = pred_info$loc,
prey_sigs = prey_part$sig_part,
prey_uniq_types = prey_part$uniq_types,
prey_loc = prey_part$loc,
cc = new_cc$cc,
space = my_space,
dist_meas = my_dist_meas,
gamma = my_gamma,
ind_boot = my_ind_boot,
mean_meth = my_mean_meth,
var_meth = my_var_meth,
mean_boot = my_mean_boot)
str(my_est)
# Pool estimates to the original prey types.
pool_est <- diet_pool(rep_grp = prey_part$pool_post,
est_ind = my_est$est_ind,
var_ind = my_est$var_ind,
est_mean = my_est$est_mean,
var_mean = my_est$var_mean)
str(pool_est)
diet_pool()
returns the same four estimation objects
that it takes as input, but pooled to the original prey types.
NOTE: diet_pool()
can be used to pool any estimates into
a smaller number of combined prey types for reporting purposes, i.e., it
does not necessarily have to be called with estimates from a partitioned
library. For example, imagine a prey library with a large number of prey
types, subsets of which have similar ecological function and signatures
that are consequently somewhat similar (prey confounding, Bromaghin et
al. 2016b). In such a case, one may wish to estimate diet on the basis
of the full prey library, but subsequently pool the resulting estimates
to a smaller number of combined prey types for reporting purposes
(reporting groups, Bromaghin 2008) to reduce bias from prey confounding.
diet_pool()
can be used for this purpose, though the user
needs to manually construct the reporting group matrix
rep_grp
because the original library was not partitioned by
a call to make_prey_part()
. As an example, a reporting
group matrix to pool estimates for 10 prey types to 7 prey types might
look like the following.
rep_grp = matrix(c(1, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1),
nrow = 10, byrow = TRUE)
In the above example, prey types 2 and 3 would be pooled into a new prey type 2, prey types 5 and 6 would be pooled into a new prey type 4, and prey types 8 and 9 would be pooled into a new prey type 6.
In any type of QFASA analysis, one might reasonably wonder how well the prey library might perform. For example, if the signatures of the individuals within each prey type are similar and quite different from the signatures of other prey types, QFASA might perform very well. Conversely, performance might suffer if some prey types are similar, which Bromaghin et al. (2016b) referred to as “prey confounding”, or if distinct clusters of similar prey occur within some prey types (see Exploring signatures for structure). Although such expectations seem reasonable, it is important to recognize that model performance is largely controlled by the complex interaction of a predator’s diet and characteristics of a prey library (Bromaghin et al. 2015, 2016a, 2016b). For example, performance can be poor if most prey types are distinct, but a predator’s diet is dominated by a subset of prey types that are confounded. The degree to which a predator diet is concentrated on distinct prey types has been referred to as “diet identifiability” (Bromaghin et al. 2016b). Of course, the accuracy of the calibration coefficients is also a critical determinant of model performance.
Despite the above cautions, it seems reasonable to investigate the performance of a prey library as long as one interprets the results with the cautions firmly in mind. There may be several ways this could be done. For example, Haynes et al. (2015) used what is commonly called “prey-on-prey” simulations to evaluate their prey library. The approach implemented in qfasar is a conceptually-similar leave-one-prey-out technique.
A leave-one-prey-out analysis in qfasar is conducted as follows. A single signature is temporarily removed from the prey library, the mean prey-type signatures are computed, the mixture of prey signatures that best approximates the removed signature is estimated as in diet estimation, after which the temporarily removed signature is returned to the library. This is done for all prey signatures in turn. The mean diet estimate is computed for the signatures of each prey type. This analysis performed on an ideal prey library would result in 100% of the diet estimate from each signature being attributed to the prey type to which the signature belongs. The degree of prey confounding within a library is reflected by the extent to which portions of the diet estimates are attributed to incorrect prey types.
In qfasar, a
leave-one-prey-out
analysis is conducted by a call to the function lopo()
. The
inputs lopo()
requires are obtained by calls to the
functions prep_sig()
with the prey signature data, or
possibly make_prey_part()
if a prey library has been
partitioned into clusters (see Exploring
signatures for structure). The following code snippet
contains an example call to lopo()
.
# Perform a leave-one-prey-out analysis with a partitioned library.
lib_perf <- lopo(sigs = prey_part$sig_part,
type = prey_part$type,
uniq_types = prey_part$uniq_types,
type_ss = prey_part$type_ss,
loc = prey_part$loc,
dist_meas = my_dist_meas,
gamma = my_gamma)
str(lib_perf)
lopo()
returns a list containing the following
elements:
Note:
lopo()
are computed based on
the estimates that successfully converge.lopo()
can take a few minutes
to run with a large prey library.If a prey library has been partitioned using
make_prey_part
(see Exploring
signatures for structure), one might want to pool the
results back to the original, unpartitioned prey types to assess the
improvement relative to the unpartitioned prey library. The function
lopo_pool()
performs that task, e.g.,
# Perform a leave-one-prey-out analysis with the original library.
perf_orig <- lopo(sigs = prey_info$sig_scale,
type = prey_info$type,
uniq_types = prey_info$uniq_types,
type_ss = prey_info$type_ss,
loc = prey_info$loc,
dist_meas = my_dist_meas,
gamma = my_gamma)
# Partition the prey library.
prey_part <- make_prey_part(sig = prey_info$sig_scale,
clust = my_clust$clust,
n_clust = n_clust)
# Perform a leave-one-prey-out analysis with the partitioned library.
perf_part <- lopo(sigs = prey_part$sig_part,
type = prey_part$type,
uniq_types = prey_part$uniq_types,
type_ss = prey_part$type_ss,
loc = prey_part$loc,
dist_meas = my_dist_meas,
gamma = my_gamma)
# Pool results back to the original prey types.
pooled_results <- lopo_pool(est = perf_part$est,
n_conv = perf_part$n_conv,
type_ss = prey_part$type_ss,
pre = prey_part$pool_pre,
post = prey_part$pool_post)
# Compare correct allocations.
cbind(diag(perf_orig$est), diag(perf_part$est))
lopo_pool()
returns a list containing the following
elements, all of which are organized on the basis of the original,
unpartitioned prey types:
Please refer to the documentation for lopo_pool()
for
additional information.
Predator signatues are assumed to be a linear mixture of mean prey signatures (Iverson et al. 2004) and are modeled as such during diet estimation. Predator signature proportions should therefore be within the range of prey signature proportions and proportions outside the range of the prey proportions are indicative of a violation of one or both of the primary model assumptions, i.e., the prey library is incomplete or the calibration coefficients are inaccurate (Bromaghin et al. 2015, 2016a). Consequently, checking for predator proportions that are outside the range of mean prey proportions is an important diagnostic aid that may provide insights into the reliability of diet estimates.
qfasar uses the function pred_beyond_prey()
to
identify predator signature proportions that are outside the range of
proportions observed among the individual and mean prey signatures. For
purposes of diet estimation, proportions outside the range of the mean
signatures are most important. However, pred_beyond_prey()
also identifies predator proportions that are outside the range of the
individual prey proportions for exploratory purposes.
pred_beyond_prey()
is designed to be called with inputs
returned by the function est_diet()
(see Diet estimation).
Although it is not conceptually necessary to estimate diets before
performing this diagnostic check, doing so ensures that the predator and
prey signatures have been transformed to the optimization space
(Bromaghin et al. 2015) in which diets have been estimated. An example
call to pred_beyond_prey()
is contained in the following
code snippet.
# Find predator proportions outside range of prey proportions.
beyond_prey <- pred_beyond_prey(pred_sigs = diet_est$pred_sigs,
prey_sigs = diet_est$prey_sigs,
mean_sigs = diet_est$mean_sigs)
str(beyond_prey)
# Compute percentage out of range by fatty acid, predator, and overall.
mean_by_fa <- apply(X = beyond_prey$beyond_mean, MARGIN = 1, FUN = mean)
mean_by_pred <- apply(X = beyond_prey$beyond_mean, MARGIN = 2, FUN = mean)
mean_all <- mean(beyond_prey$beyond_mean)
pred_beyond_prey()
returns a list with the following
objects:
Although pred_beyond_prey()
performs an important
diagnostic check, how best to use the results may not always be obvious.
Finding that all the predator proportions are within the range of the
prey proportions is obviously desirable, but that may not be a realistic
expectation when working with real data. A relatively small number of
proportions being out of range may not be problematic. One might find
that signs of problems are limited to a small number of predators or to
one or two fatty acids across many predators. In such cases, a
reasonable way forward may not be difficult to devise. However, as the
signs of problems become increasingly prevalent, the best way to deal
with them may not be clear. For example, Bromaghin et al. (2015)
summarised an instance in which nearly half of all predator signature
proportions were outside the range of the mean prey proportions. I can
offer no general advice for how to proceed in such extreme cases.
In diet estimation (Diet estimation), a predator signature is modeled as a linear mixture of prey signatures and diet composition is estimated as the mixture proportions that minimize a measure of distance between observed and modeled predator signatures (Iverson et al. 2004). Consequently, one might expect that how well modeled signatures approximate observed signatures would be of interest, and potentially informative with respect to interpretation of the resulting diet estimates. While that seems likely to be true, methods to assess how well predator signatures are modeled have received almost no attention in the literature (but see Bromaghin et al. 2015).
A potentially informative byproduct of diet estimation is the value of the minimized distance measure associated with each estimated diet. The value will tend to be small when a modeled signature closely approximates an observed signature, and larger when a modeled signature approximates an observed signature less well. However, what value of the minimized distance measure might provide a warning flag for a potentially poor fit is an open question.
The function gof()
represents a first attempt to address
that question. The algorithm implemented in the function is based on the
following logic. First, I assume that a predator consumes the mixture of
prey specified by its estimated diet composition. Given that assumption,
the expected value of the distance measure is, in a sense, fixed by the
combination of mean prey signatures and the assumed diet composition
(Bromaghin 2015). Large values of the distance measure are then most
likely to occur when there is a high degree of random variation in a
predator signature, which results only from the selection of individual
prey within prey types if model assumptions are met. Within the
framework of simulating predator signatures, variation is maximized when
bootstrap sample sizes of the prey signatures used to construct a
predator signature are minimized (Bromaghin 2015; Bromaghin et
al. 2016b).
Implementing the above logic, gof()
randomly samples a
single signature from each prey type and weights those signatures by a
predator’s estimated diet composition to construct a modeled signature.
The modeled signature is then used to estimate diet. If the optimization
function converges, the value of the minimized distance measure obtained
with the modeled signature is compared to the value of the minimized
distance measure obtained while estimating diet with the observed
signature. This process is repeated a user-specified number of times
(argument boot_gof
) and the proportion of the simulated
distance measure values that exceed the observed value of the distance
measure is computed. gof()
therefore constructs a statistic
similar to a p-value, with small values being suggestive of predator
signatures that may not have been closely approximated during diet
estimation.
gof()
is designed to be called with objects returned by
est_diet()
and prep_sig()
, as well as
associated constants; please refer to the function documentation for
details. An example call to gof()
is contained in the
following code snippet.
# Check goodness-of-fit diagnostics.
gof_diag <- gof(prey_sigs = diet_est$prey_sigs,
prey_loc = prey_info$loc,
mean_sigs = diet_est$mean_sigs,
diet_est = diet_est$est_ind,
conv = diet_est$conv,
obj_func = diet_est$obj_func,
dist_meas = my_dist_meas,
gamma = my_gamma,
boot_gof = boot_gof)
str(gof_diag)
hist(gof_diag$p_val)
gof()
returns a list containing the following
objects:
NOTE: this algorithm implements an idea whose performance has not been explored. It has been included in qfasar to support future research on this topic. Also, like diet estimation itself, this function involves considerable bootstrap resampling and numerical optimization, which can result in very long run times.
One of the primary motivations for the development of qfasar was to facilitate future research into potentially improved methods, work that is normally accomplished using computer simulation. The existence of a “toolbox” of ready-to-use simulation functions was expected to make future research easier and faster to complete.
The simulation capabilities of qfasar are summarized in the following subsections.
A majority of simulations designed to investigate the performance of QFASA will probably involve some method of establishing predators with known diet compositions, simulating predator signatures based on those known diets (see Simulating predator signatures), and then evaluating the performance of diet estimators relative to the known “true diet”. Potential methods of establishing known diets, and their implementation in qfasar, are described in the following subsections:
In some circumstances, one might might wish to work with a small number of “realistic” diets (e.g., Bromaghin et al. 2015). Realistic diets might be taken from the literature, developed using expert opinion, based on biological hypotheses, or constructed to test certain features of a prey library.
qfasar does not implement any special methods to facilitate
use of realistic diets. They could be manually entered using the base
function concatenate c()
or read from a file. In either
case, diet proportions should be stored in a matrix in column-major
format, i.e., one set of diet proportions per column, to be consistent
with qfasar design.
NOTE: it is critical that the order of diet proportions be identical
to the alphanumeric order of the prey types (check the object
uniq_types
returned by prep_sig()
Preparing signature
data or by make_prey_part()
Using a prey
partition to verify the order). If the diets are not
entered or read in that order, they should be sorted using the base
function order()
prior to calling qfasar
functions.
Given one or more diet compositions, predator fatty acid signatures
can be generated using the function make_pred_sigs()
(Simulating predator
signatures). The diets of such simulated predators can then
be estimated (Diet
estimation), and the resulting estimates can be compared to
the known diet composition to evaluate bias, variance, or other
properties.
As an example, the realistic diets of adult male and female Chukchi Sea polar bears used by Bromaghin et al. (2015) could be loaded into memory as follows.
# Enter data.
diets <- matrix(c(0.065, 0.005, 0.051, 0.000, 0.873, 0.000, 0.006,
0.207, 0.014, 0.077, 0.000, 0.658, 0.000, 0.044),
ncol = 2)
rownames(diets) <- c("Bearded", "Beluga", "Bowhead", "Ribbon", "Ringed", "Spotted", "Walrus")
colnames(diets) <- c("AF", "AM")
# Check results.
print(diets)
colSums(diets)
An alternative to working with a small number of realistic diets (Realistic diets) is to work with random diets. This might be useful if there is truly no information on which to base realistic diets or if one wishes to explore the performance of a prey library under a wide range of possible diets (e.g., Bromaghin et al. 2015). One advantage of working with diets distributed throughout the space of all possible diets is that it can reveal broad scale patterns in the performance of diet estimators (e.g., Bromaghin et al. 2016b).
Given one or more diet compositions, predator fatty acid signatures
can be generated using the function make_pred_sigs()
(Simulating predator
signatures). The diets of such simulated predators can then
be estimated (Diet
estimation), and the resulting estimates can be compared to
the known diet composition to evaluate bias, variance, or other
properties.
qfasar has the capability of generating diet compositions
randomly throughout the space of all possible diets for a given
collection of prey types, implemented in the function
make_diet_rand()
. This function takes a factor of the
unique prey-type names, intended to be the object
uniq_types
returned by a call to the function
prep_sig()
with a prey data frame (Preparing signature
data) or a call to make_prey_part()
(Using a prey
partition), and the number of diet compositions to generate
as input.
An example call to make_diet_rand()
is contained in the
following code snippet.
# Generate some random diets.
diet_rand <- make_diet_rand(uniq_types = prey_info$uniq_types,
n_diet = 100000)
str(diet_rand)
make_diet_rand()
returns a list containing the following
objects.
An alternative to working with a small number of realistic diets (Realistic diets) is to work with a systematic grid of regularly-spaced diet compositions. A diet grid is similar to random diets (Random diets), but diet compositions are evenly, rather than randomly, distributed throughout the space of all possible diets. This method of generating diet compositions might be useful if there is truly no information on which to base realistic diets or if one wishes to explore the performance of a prey library under a wide range of possible diets (e.g., Bromaghin et al. 2015). One advantage of working with diets distributed throughout the space of all possible diets is that it can reveal broad scale patterns in the performance of diet estimators (e.g., Bromaghin et al. 2016b).
Given one or more diet compositions, predator fatty acid signatures
can be generated using the function make_pred_sigs()
(Simulating predator
signatures). The diets of such simulated predators can then
be estimated (Diet
estimation), and the resulting estimates can be compared to
the known diet composition to evaluate bias, variance, and perhaps other
properties.
qfasar has the capability of generating a grid of diet
compositions evenly distributed throughout the space of all possible
diets for a given collection of prey types, implemented in the function
make_diet_grid()
. This function takes a factor of the
unique prey-type names, intended to be the object
uniq_types
returned by a call to the function
prep_sig()
with a prey data frame (Preparing signature
data) or make_prey_part()
(**Using a prey partition), and
the inverse of the desired spacing between diet proportions as input.
For example, an inverse increment of 10 would produce diet compositions
with proportions shifted by an increment of 0.1. A small example of a
diet grid with three prey types and a diet increment of 0.25 (integer
inverse 4), modified from Bromaghin et al (2016b), follows.
Diet | Prey 1 | Prey 2 | Prey 3 |
---|---|---|---|
1 | 1.00 | 0.00 | 0.00 |
2 | 0.75 | 0.25 | 0.00 |
3 | 0.75 | 0.00 | 0.25 |
4 | 0.50 | 0.50 | 0.00 |
5 | 0.50 | 0.25 | 0.25 |
6 | 0.50 | 0.00 | 0.50 |
7 | 0.25 | 0.75 | 0.00 |
8 | 0.25 | 0.50 | 0.25 |
9 | 0.25 | 0.25 | 0.50 |
10 | 0.25 | 0.00 | 0.75 |
11 | 0.00 | 1.00 | 0.00 |
12 | 0.00 | 0.75 | 0.25 |
13 | 0.00 | 0.50 | 0.50 |
14 | 0.00 | 0.25 | 0.75 |
15 | 0.00 | 0.00 | 1.00 |
NOTE: The number of possible diets grows extremely quickly as the number of prey types increases and the diet increment decreases, and internal memory limits may be exceeded.
An example call to make_diet_grid()
is contained in the
following code snippet.
# Generate a diet grid.
diet_grid <- make_diet_grid(uniq_types = prey_info$uniq_types,
inv_inc = 8)
str(diet_grid)
make_diet_grid()
returns a list containing the following
objects.
Investigations to study performance characteristics of QFASA may require predator signatures to be simulated so that thay have known properties, such as a known diet composition. Such simulated predator signatures have been called “pseudo-predators” (Iverson et al. 2004). The diet compositions of the simulated predators are then estimated, allowing the observed characteristics of the estimates to be compared with the known characteristics of the specified diet (e.g., Bromaghin et al. 2016a).
Predator signatures are simulated using a procedure that that closely parallels diet estimation. A diet composition must first be specified (see Predator diet composition). Conditioned on the specified diet, a bootstrap sample of signatures from each prey type is drawn (sampling with replacement) and the mean signature of each prey type is computed. A predator signature is constructed as a linear mixture of the mean prey signatures, with the diet composition as the mixture proportions.
An important question in simulating predator signatures is the number of signatures to bootstrap from each prey type. In the literature, the choice of bootstrap sample sizes has been made subjectively. As examples, Iverson et al. (2004) specified three levels of prey sample size (30, 60, and 90) and Bromaghin et al. (2015) used bootstrap sample sizes equal to the sample sizes in the prey library. Some authors apparently used subjectively-determined sample sizes but did not report the values (e.g., Thiemann et al. 2008; Wang et al. 2010; Haynes et al. 2015). However, Bromaghin et al. (2016b) found that bootstrap sample sizes strongly influence both the bias and variance of diet composition estimates. Consequently, the selection of bootstrap sample sizes is an important aspect of simulation design. Using sample sizes that are too small or too large may lead to an overly pessimistic or optimistic assessment, respectively, of model capabilities.
Bromaghin (2015) presented an objective method of establishing a bootstrap sample size for each prey type that produces simulated predator signatures having a realistic level of between-signature variation. That algorithm and the function implementing it are briefly described in a subsequent subsection (Bootstrap sample size algorithm).
qfasar implements the simulation of predator signatures in
the function make_pred_sigs()
. Predator signatures are
first generated in the prey space and then transformed to their natural
predator space. make_pred_sigs()
is called using the
following arguments:
sig_scale
returned by a call to the
function prep_sig()
, or the object sig_part
returned by a call to the function make_prey_part()
.prey_sigs
, intended to
be the object loc
returned by a call to one of the
functions prep_fa()
or make_prey_part()
.cc
returned by a call to one of
the functions prep_fa()
or cc_aug()
.make_pred_sigs()
returns a list with the following
components:
Please see the code block in the section Bootstrap sample
size algorithm for an example call to
make_pred_sigs()
.
Bromaghin (2015) presented an objective method of establishing a
bootstrap sample size for each prey type that produces simulated
predator signatures having a realistic level of between-signature
variation. The algorithm, implemented in the function
find_boot_ss()
, is based on the conceptualization of
variation in predator signatures being partitioned into two sources,
variation in diet between predators and variation in prey consumed given
a common diet. Constructing predator signatures by bootstrapping prey
signatures mimics the selection of prey by predators, so the objective
is to find the level of variation in predator diets that is attributable
only to the selection of prey.
To derive a target level of variation in predator signatures,
conditioned on diet, a measure of variation between the estimated diets
of all possible pairs of predators is computed. Similarly, a measure of
variation between the signatures of all possible pairs of predators is
computed. A loess smooth is used to approximate the empirical
relationship between these two measures of variance. Conceptually, as
the variation between diet estimates approaches zero, the corresponding
level of variation between signatures approaches the variation
attributable to prey selection given a common diet. In practice,
find_boot_ss()
finds the pair of predators with the
smallest variation between their diets and takes the value of variation
from the loess smooth for that pair as the target level of
variation.
Given the target level of variation, the algorithm is initialized with a sample size of 1 from each prey type. A sample of predator signatures is generated using those sample sizes and the measure of variance among the signatures is computed. If the variance measure exceeds the target level, the prey type contributing most to the variance measure is identified and its sample size is increased by 1. The algorithm then iterates, increasing the sample size by one at each iteration, until the measure of variation drops below the target level for the first time. Please refer to Bromaghin (2015) for additional details.
find_boot_ss()
is designed to take objects returned by
est_diet()
(see Diet
estimation) as input. est_diet()
transforms
signatures to a common estimation space, as selected by the user, so
find_boot_ss()
operates in the estimation space used to
estimate predator diet composition. Similarly, if diets were estimated
with a partitioned prey library, the results of
find_boot_ss()
will be applicable to the prey types in the
partitioned library.
Example calls to find_boot_ss()
and
make_pred_sigs()
are contained in the following code
snippet.
# Find bootstrap sample sizes suitable for simulation work.
boot_ss <- find_boot_ss(pred_sigs = diet_est$pred_sigs,
pred_diets = diet_est$est_ind,
prey_sigs = diet_est$prey_sigs,
prey_loc = prey_info$loc,
n_pred_boot = 500)
str(boot_ss)
print(boot_ss$boot_ss)
# Simulate predator signatures.
my_sim_preds <- make_pred_sigs(prey_sigs = prey_info$sig_scale,
prey_loc = prey_info$loc,
cc = new_cc$cc,
diet = apply(X = diet_est$est_ind, MARGIN = 1,
FUN = mean),
prey_ss = boot_ss$boot_ss,
n_pred = 1000)
str(my_sim_preds)
NOTE: the argument n_pred_boot
controls the number of
predator signatures to simulate. The measure of variance over the
simulated predators is compared to the target level of variation.
Consequently, the value of n_pred_boot
should be large
enough to return an estimate of the variance measure that itself has low
variance, so that the algorithm returns a numerically stable estimate of
suitable sample sizes over repeated function calls. I suspect that the
default value of 1000 errs on the side of caution, but this has not been
tested.
find_boot_ss()
returns a list with the following
components:
The most important object returned by find_boot_ss()
is
the vector of sample sizes for the prey types (boot_ss
).
However, the other objects may also be of some interest. For example,
they can be used to create plots similar to those of Bromaghin
(2015).
# Figure 1 of Bromaghin (2015).
plot(x = boot_algo$var_diet, y = boot_algo$var_sig, cex.lab=1.5,
xlab="Variation in predator diets",
ylab="Variation in predator signatures")
lines(x = boot_algo$var_diet, y = boot_algo$var_sig_smooth, lwd=3, col="red")
# Figure 2 of Bromaghin (2015).
iteration <- 1:length(boot_algo$mod_sig_var)
plot(x = iteration, y = boot_algo$mod_sig_var,
pch = 16, cex.lab = 1.5, type = "b",
xlab = "Iteration",
ylab = "Modeled predator signature variance")
abline(boot_algo$var_target, 0)
One of the major assumptions of QFASA is that the prey library
contains representatives of all prey types consumed by a predator.
Bromaghin et al. (2016a) investigated the robustness of diet estimators
to violations of this assumption. qfasar implements their
method of constructing a signature for a prey type not in the library,
termed a ghost prey, in the function make_ghost()
.
The ghost prey signature is constructed by maximizing the summed
distance between the ghost prey signature and the mean prey signatures,
while constraining the ghost signature proportions within reasonable
bounds to ensure that the signature is somewhat realistic for the prey
library. The definition of reasonable bounds is embodied in the argument
ghost_err
. ghost_err
is a proportion greater
than or equal to zero and less than 1 used to construct box constraints
for each fatty acid in the signatures.
(1 - ghost_err)*(smallest mean prey proportion)
(1 + ghost_err)*(largest mean prey proportion)
The ghost prey signature is obtained by maximizing the summed distance between the signature and the mean prey signatures, constraining the signature to lie within the box constraints and sum to 1.
This method ensures that the ghost prey signature is somewhat
distinct from the other prey types, but not so wildly different that it
represents a completely different pattern from the other prey types.
Although research into suitable values for ghost_err
has
not been conducted, it is probably advisable to use small to moderate
values. Bromaghin et al. (2016) used a value of 0.25. As the value of
ghost_err
is increased, the resulting ghost prey signature
will tend to become increasing different from any prey type in the
library. If the value of ghost_err
is too small, the ghost
prey signature could potentially be similar to one or more prey types in
the library, which is probably not what one would want for a simulation
study.
The following code snippet contains an example call to
make_ghost()
.
# Specify a level of error.
ghost_err <- 0.25
# Construct a ghost prey signature.
ghost <- make_ghost(prey_sigs = prey_info$sig_scale,
loc = prey_info$loc,
ghost_err = ghost_err,
dist_meas = 1,
gamma = NA)
str(ghost)
The function make_ghost()
returns a list containing the
following objects.
The following code snippet illustrates how to simulate predator signatures incorporating a specified contribution of ghost prey to their diets.
## Generate predator signatures with a contribution from the ghost prey.
# Find bootstrap sample sizes suitable for simulation work.
# There is only one ghost signature, so its sample size should be 1.
boot_algo <- find_boot_ss(pred_sigs = diet_orig$pred_sigs,
pred_diets = diet_orig$est_ind,
prey_sigs = diet_orig$prey_sigs,
prey_loc = prey_info$loc,
n_pred_boot = 500)
ghost_ss <- c(boot_algo$boot_ss, 1)
# Append ghost signature to prey data.
ghost_sigs <- cbind(prey_info$sig_scale, ghost$sig)
colnames(ghost_sigs) <- c(colnames(prey_info$sig_scale), "ghost")
last_prey <- prey_info$loc[prey_info$n_types,2]+1
ghost_loc <- rbind(prey_info$loc, c(last_prey, last_prey))
rownames(ghost_loc) <- c(rownames(prey_info$loc), "ghost")
colnames(ghost_loc) <- c("first", "last")
# Generate a random diet for the non-ghost prey components. These proportions
# could be established in whatever way makes sense in a particular application.
non_ghost <- make_diet_rand(uniq_types = prey_info$uniq_types, n_diet = 1)
# Specify ghost prey component and add to the non-ghost components.
ghost_prop <- 0.15
ghost_diet <- c(non_ghost$diet_rand*(1 - ghost_prop), ghost_prop)
print(ghost_diet)
sum(ghost_diet)
# Generate predator signatures.
ghost_pred_sigs <- make_pred_sigs(prey_sigs = ghost_sigs,
prey_loc = ghost_loc,
cc = new_cc$cc,
diet = ghost_diet,
prey_ss = ghost_ss,
n_pred = 500)
str(ghost_pred_sigs)
One of the major assumptions of QFASA is that the calibration
coefficients are known perfectly. Bromaghin et al. (2016a) investigated
the robustness of diet estimators to violations of this assumption. The
function add_cc_err()
uses the methods of Bromaghin et
al. (2016a) to add error to a set of calibration coefficients.
The argument err_bound
is used to compute box
constraints for the calibration coefficients:
(1 - err_bound)*cc_true
(1 + err_bound)*cc_true
where the argument cc_true
is the set of true
calibration coefficients provided as input. A uniformly distributed
random number is generated between the bounds for each calibration
coefficient and the vector of coefficients is scaled so that their sum
equals the sum of the true calibration coefficients. The mean relative
absolute difference between the true and error-added calibration
coefficients is computed as a measure of error for the entire vector
(Bromaghin et al 2016a).
The influence of calibration coefficient error can be investigated by
generating predator signatures with the true set of calibration
coefficients and estimating diet with a set of coefficients
incorporating error (Bromaghin et al. 2016a). An example call to
add_cc_err()
is contained in the following code block.
# Find bootstrap sample sizes suitable for simulation work.
boot_ss <- find_boot_ss(pred_sigs = diet_est$pred_sigs,
pred_diets = diet_est$est_ind,
prey_sigs = diet_est$prey_sigs,
prey_loc = prey_info$loc,
n_pred_boot = 500)
# Compute the mean diet to use in a simulation.
mean_diet <- apply(X = diet_est$est_ind, MARGIN = 1, FUN = mean)
# Simulate predator signatures using the true calibration coefficients.
my_sim_preds <- make_pred_sigs(prey_sigs = prey_info$sig_scale,
prey_loc = prey_info$loc,
cc = new_cc$cc,
diet = mean_diet,
prey_ss = boot_ss$boot_ss,
n_pred = 1000)
# Add error to the calibration coefficients.
cc_err <- add_cc_err(cc_true = new_cc$cc, err_bound = 0.25)
str(cc_err)
print(cc_err$err)
# Estimate diets of the simulated predators using the error-added calibration coefficients.
err_est <- est_diet(pred_sigs = pred_info$sig_scale,
pred_uniq_types = pred_info$uniq_types,
pred_loc = pred_info$loc,
prey_sigs = prey_info$sig_scale,
prey_uniq_types = prey_info$uniq_types,
prey_loc = prey_info$loc,
cc = cc_err$cc,
space = my_space,
dist_meas = my_dist_meas,
gamma = my_gamma,
ind_boot = my_ind_boot,
mean_meth = my_mean_meth,
var_meth = my_var_meth,
mean_boot = my_mean_boot)
The function add_cc_err()
returns a list containing the
following objects.
Aitchison, J., C. Barcelo-Vidal, J.A. Martin-Fernandez, and V. Pawlowsky-Glahn. 2000. Logratio analysis and compositional distance. Mathematical Geology 32:271-275.
Beck, C.A., S.J. Iverson, W.D. Bowen, and W. Blanchard. 2007. Sex differences in grey seal diet reflect seasonal variation in foraging behaviour and reproductive espenditure: evidence from quantitative fatty acid signature analysis. Journal of Animal Ecology 76:490-502.
Bromaghin, J.F. In Press. qfasar: quantitative fatty acid signature analysis in R.
Bromaghin, J.F. 2015. Simulating realistic predator signatures in quantitative fatty acid signature analysis. Ecological Informatics 30:68-71.
Bromaghin, J.F. 2008. BELS: Backward elimination locus selection for studies of mixture composition or individual assignment. Molecular Ecology Resources 8:568-571.
Bromaghin, J.F., S.M. Budge, and G.W. Thiemann. Under review. Detect and exploit hidden structure in fatty acid signature data.
Bromaghin, J.F., S.M. Budge, and G.W. Thiemann. 2016b. Should fatty acid signature proportions sum to 1 for diet estimation? Ecological Research 31:597-606.
Bromaghin, J.F., S.M. Budge, G.W. Thiemann, and K.D. Rode. 2016a. Assessing the robustness of quantitative fatty acid signature analysis to assumption violations. Methods in Ecology and Evolution 7:51-59.
Bromaghin, J.F., K.D. Rode, S.M. Budge, and G.W. Thiemann. 2015. Distance measures and optimization spaces in quantitative fatty acid signature analysis. Ecology and Evolution 5:1249-1262.
Bromaghin, J.F., M.M. Lance, E.W. Elliott, S.J. Jeffries, A. Acevedo-Gutierrez, and J.M. Kennish. 2013. New insights into the diets of harbor seals (Phoca vitulina) in the Salish Sea revealed by analysis of fatty acid signatures. Fishery Bulletin 111:13–26.
Budge, S.M., S.J. Iverson, and H.N. Koopman. 2006. Studying trophic ecology in marine ecosystems using fatty acids: a primer on analysis and interpretation. Marine Mammal Science 22:759–801.
Budge, S.M., S.J. Iverson, W.D. Bowen, and R.G. Ackman. 2002. Among- and within-species variability in fatty acid signatures of marine fish and invertebrates on the Scotian Shelf, Georges Bank, and southern Gulf of St. Lawrence. Canadian Journal of Fisheries and Aquatic Sciences 59:886-898.
Ghalanos, A., and S. Theussl. 2014. Rsolnp: general non-linear optimization using augmented Lagrange multiplier method. R package version 1:15.
Haynes, T.B., J.A. Schmutz, J.F. Bromaghin, S.J. Iverson, V.M. Padula, and A.E. Rosenberger. 2015. Diet of yellow-billed loons (Gavia adamsii) in Arctic lakes during the nesting season inferred from fatty acid analysis. Polar Biology 38:1239-1247.
Iverson, S.J., C. Field, W.D. Bowen, and W. Blanchard. 2004. Quantitative fatty acid signature analysis: A new method of estimating predator diets. Ecological Monographs 74:211-235.
Maechler, M., P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. 2016. cluster: cluster analysis basics and extensions. R package version 2.0.4.
Martin-Fernandez, J.A., J. Palarea-Albaladejo, and R.A. Olea. 2011. Dealing with zeros. P. 43-58 in V. Pawlowsky-Glahn and A. Buccianto, eds. Compositional data analysis: theory and application. John Wiley, Chichester.
Meynier, L., P.C.H. Morel, B.L. Chilvers, D.D.S. Mackenzie, and P. Duignan. 2010. Quantitative fatty acid signature analysis on New Zealand sea lions: model sensitivity and diet estimates. Journal of Mammalogy 91:1484–1495.
Nordstrom, C.A., L.J. Wilson, S.J. Iverson, and D.J. Tollit. 2008. Evaluating quantitative fatty acid signature analysis (QFASA) using harbour seals Phoca vitulina richardsi in captive feeding studies. Marine Ecology Progress Series 360:245–263.
Rode, K.D., E.V. Regehr, D.C. Douglas, G. Durner, A.E. Derocher, G.W. Thiemann, and S.M. Budge. 2014. Variation in the response of an arctic top predator experiencing habitat loss: feeding and reproductive ecology of two polar bear populations. Global Change Biology 20:76–88.
Rosen, D.A.S. & D.J. Tollit. 2012. Effects of phylogeny and prey type on fatty acid calibration coefficients in three pinniped species: implications for the QFASA dietary quantification technique. Marine Ecology Progress Series 467:263–276.
Seber, G.A.F. 1982. The Estimation of Animal Abundance and Related Parameters, second edition. Macmillan Publishing Co., New York.
Stewart, C., and C. Field. 2011. Managing the essential zeros in quantitative fatty acid signature analysis. Journal of Agricultural, Biological, and Environmental Statistics 16:45?69.
Stewart, C., S. Iverson, and C. Field. 2014. Testing for a change in diet using fatty acid signatures. Environmental and Ecological Statistics 21:775-792.
Thiemann, G.W., S.J. Iverson, and I. Stirling. 2008. Polar bear diets and Arctic marine food webs: insights from fatty acid analysis. Ecological Monographs 78:591-613.
Wang, S.W., T.E. Hollmen, and S.J. Iverson. 2010. Validating quantitative fatty acid signature analysis to estimate diets of spectacled and Steller’s eiders (Somateria fischeri and Polysticta stelleri). Journal of Comparative Physiology B 180:125–139.