Control Options for FOCEi

foceiControl(
  sigdig = 3,
  ...,
  epsilon = NULL,
  maxInnerIterations = 1000,
  maxOuterIterations = 5000,
  n1qn1nsim = NULL,
  method = c("liblsoda", "lsoda", "dop853"),
  transitAbs = NULL,
  atol = NULL,
  rtol = NULL,
  atolSens = NULL,
  rtolSens = NULL,
  ssAtol = NULL,
  ssRtol = NULL,
  ssAtolSens = NULL,
  ssRtolSens = NULL,
  minSS = 10L,
  maxSS = 1000L,
  maxstepsOde = 500000L,
  hmin = 0L,
  hmax = NA_real_,
  hini = 0,
  maxordn = 12L,
  maxords = 5L,
  cores,
  covsInterpolation = c("locf", "linear", "nocb", "midpoint"),
  print = 1L,
  printNcol = floor((getOption("width") - 23)/12),
  scaleTo = 1,
  scaleObjective = 0,
  normType = c("rescale2", "mean", "rescale", "std", "len", "constant"),
  scaleType = c("nlmixr", "norm", "mult", "multAdd"),
  scaleCmax = 1e+05,
  scaleCmin = 1e-05,
  scaleC = NULL,
  scaleC0 = 1e+05,
  derivEps = rep(20 * sqrt(.Machine$double.eps), 2),
  derivMethod = c("switch", "forward", "central"),
  derivSwitchTol = NULL,
  covDerivMethod = c("central", "forward"),
  covMethod = c("r,s", "r", "s", ""),
  hessEps = (.Machine$double.eps)^(1/3),
  eventFD = sqrt(.Machine$double.eps),
  eventType = c("gill", "central", "forward"),
  centralDerivEps = rep(20 * sqrt(.Machine$double.eps), 2),
  lbfgsLmm = 7L,
  lbfgsPgtol = 0,
  lbfgsFactr = NULL,
  eigen = TRUE,
  addPosthoc = TRUE,
  diagXform = c("sqrt", "log", "identity"),
  sumProd = FALSE,
  optExpression = TRUE,
  ci = 0.95,
  useColor = crayon::has_color(),
  boundTol = NULL,
  calcTables = TRUE,
  noAbort = TRUE,
  interaction = TRUE,
  cholSEtol = (.Machine$double.eps)^(1/3),
  cholAccept = 0.001,
  resetEtaP = 0.15,
  resetThetaP = 0.05,
  resetThetaFinalP = 0.15,
  diagOmegaBoundUpper = 5,
  diagOmegaBoundLower = 100,
  cholSEOpt = FALSE,
  cholSECov = FALSE,
  fo = FALSE,
  covTryHarder = FALSE,
  outerOpt = c("nlminb", "bobyqa", "lbfgsb3c", "L-BFGS-B", "mma", "lbfgsbLG", "slsqp",
    "Rvmmin"),
  innerOpt = c("n1qn1", "BFGS"),
  rhobeg = 0.2,
  rhoend = NULL,
  npt = NULL,
  rel.tol = NULL,
  x.tol = NULL,
  eval.max = 4000,
  iter.max = 2000,
  abstol = NULL,
  reltol = NULL,
  resetHessianAndEta = FALSE,
  stateTrim = Inf,
  gillK = 10L,
  gillStep = 4,
  gillFtol = 0,
  gillRtol = sqrt(.Machine$double.eps),
  gillKcov = 10L,
  gillStepCov = 2,
  gillFtolCov = 0,
  rmatNorm = TRUE,
  smatNorm = TRUE,
  covGillF = TRUE,
  optGillF = TRUE,
  covSmall = 1e-05,
  adjLik = TRUE,
  gradTrim = Inf,
  maxOdeRecalc = 5,
  odeRecalcFactor = 10^(0.5),
  gradCalcCentralSmall = 1e-04,
  gradCalcCentralLarge = 10000,
  etaNudge = qnorm(1 - 0.05/2)/sqrt(3),
  etaNudge2 = qnorm(1 - 0.05/2) * sqrt(3/5),
  stiff,
  nRetries = 3,
  seed = 42,
  resetThetaCheckPer = 0.1,
  etaMat = NULL,
  repeatGillMax = 3,
  stickyRecalcN = 5,
  gradProgressOfvTime = 10,
  addProp = c("combined2", "combined1"),
  singleOde = TRUE,
  badSolveObjfAdj = 100
)

Arguments

sigdig

Optimization significant digits. This controls:

  • The tolerance of the inner and outer optimization is 10^-sigdig

  • The tolerance of the ODE solvers is 0.5*10^(-sigdig-2); For the sensitivity equations and steady-state solutions the default is 0.5*10^(-sigdig-1.5) (sensitivity changes only applicable for liblsoda)

  • The tolerance of the boundary check is 5 * 10 ^ (-sigdig + 1)

  • The significant figures that some tables are rounded to.

...

Ignored parameters

epsilon

Precision of estimate for n1qn1 optimization.

maxInnerIterations

Number of iterations for n1qn1 optimization.

maxOuterIterations

Maximum number of L-BFGS-B optimization for outer problem.

n1qn1nsim

Number of function evaluations for n1qn1 optimization.

method

The method for solving ODEs. Currently this supports:

  • "liblsoda" thread safe lsoda. This supports parallel thread-based solving, and ignores user Jacobian specification.

  • "lsoda" -- LSODA solver. Does not support parallel thread-based solving, but allows user Jacobian specification.

  • "dop853" -- DOP853 solver. Does not support parallel thread-based solving nor user Jacobain specification

  • "indLin" -- Solving through inductive linearization. The RxODE dll must be setup specially to use this solving routine.

transitAbs

boolean indicating if this is a transit compartment absorption

atol

a numeric absolute tolerance (1e-8 by default) used by the ODE solver to determine if a good solution has been achieved; This is also used in the solved linear model to check if prior doses do not add anything to the solution.

rtol

a numeric relative tolerance (1e-6 by default) used by the ODE solver to determine if a good solution has been achieved. This is also used in the solved linear model to check if prior doses do not add anything to the solution.

atolSens

Sensitivity atol, can be different than atol with liblsoda. This allows a less accurate solve for gradients (if desired)

rtolSens

Sensitivity rtol, can be different than rtol with liblsoda. This allows a less accurate solve for gradients (if desired)

ssAtol

Steady state absolute tolerance (atol) for calculating if steady-state has been archived.

ssRtol

Steady state relative tolerance (rtol) for calculating if steady-state has been achieved.

ssAtolSens

Sensitivity absolute tolerance (atol) for calculating if steady state has been achieved for sensitivity compartments.

ssRtolSens

Sensitivity relative tolerance (rtol) for calculating if steady state has been achieved for sensitivity compartments.

minSS

Minimum number of iterations for a steady-state dose

maxSS

Maximum number of iterations for a steady-state dose

maxstepsOde

Maximum number of steps for ODE solver.

hmin

The minimum absolute step size allowed. The default value is 0.

hmax

The maximum absolute step size allowed. When hmax=NA (default), uses the average difference + hmaxSd*sd in times and sampling events. The hmaxSd is a user specified parameter and which defaults to zero. When hmax=NULL RxODE uses the maximum difference in times in your sampling and events. The value 0 is equivalent to infinite maximum absolute step size.

hini

The step size to be attempted on the first step. The default value is determined by the solver (when hini = 0)

maxordn

The maximum order to be allowed for the nonstiff (Adams) method. The default is 12. It can be between 1 and 12.

maxords

The maximum order to be allowed for the stiff (BDF) method. The default value is 5. This can be between 1 and 5.

cores

Number of cores used in parallel ODE solving. This is equivalent to calling setRxThreads()

covsInterpolation

specifies the interpolation method for time-varying covariates. When solving ODEs it often samples times outside the sampling time specified in events. When this happens, the time varying covariates are interpolated. Currently this can be:

  • "linear" interpolation, which interpolates the covariate by solving the line between the observed covariates and extrapolating the new covariate value.

  • "constant" -- Last observation carried forward (the default).

  • "NOCB" -- Next Observation Carried Backward. This is the same method that NONMEM uses.

  • "midpoint" Last observation carried forward to midpoint; Next observation carried backward to midpoint.

print

Integer representing when the outer step is printed. When this is 0 or do not print the iterations. 1 is print every function evaluation (default), 5 is print every 5 evaluations.

printNcol

Number of columns to printout before wrapping parameter estimates/gradient

scaleTo

Scale the initial parameter estimate to this value. By default this is 1. When zero or below, no scaling is performed.

scaleObjective

Scale the initial objective function to this value. By default this is 1.

normType

This is the type of parameter normalization/scaling used to get the scaled initial values for nlmixr. These are used with scaleType of.

With the exception of rescale2, these come from Feature Scaling. The rescale2 The rescaling is the same type described in the OptdesX software manual.

In general, all all scaling formula can be described by:

v_scaled = (v_unscaled-C_1)/C_2

Where

The other data normalization approaches follow the following formula

v_scaled = (v_unscaled-C_1)/C_2;

  • rescale2 This scales all parameters from (-1 to 1). The relative differences between the parameters are preserved with this approach and the constants are:

    C_1 = (max(all unscaled values)+min(all unscaled values))/2

    C_2 = (max(all unscaled values) - min(all unscaled values))/2

  • rescale or min-max normalization. This rescales all parameters from (0 to 1). As in the rescale2 the relative differences are preserved. In this approach:

    C_1 = min(all unscaled values)

    C_2 = max(all unscaled values) - min(all unscaled values)

  • mean or mean normalization. This rescales to center the parameters around the mean but the parameters are from 0 to 1. In this approach:

    C_1 = mean(all unscaled values)

    C_2 = max(all unscaled values) - min(all unscaled values)

  • std or standardization. This standardizes by the mean and standard deviation. In this approach:

    C_1 = mean(all unscaled values)

    C_2 = sd(all unscaled values)

  • len or unit length scaling. This scales the parameters to the unit length. For this approach we use the Euclidean length, that is:

    C_1 = 0

    C_2 = sqrt(v_1^2 + v_2^2 + ... + v_n^2)

  • constant which does not perform data normalization. That is

    C_1 = 0

    C_2 = 1

scaleType

The scaling scheme for nlmixr. The supported types are:

  • nlmixr In this approach the scaling is performed by the following equation:

    v_scaled = (v_current - v_init)/scaleC[i] + scaleTo

    The scaleTo parameter is specified by the normType, and the scales are specified by scaleC.

  • norm This approach uses the simple scaling provided by the normType argument.

  • mult This approach does not use the data normalization provided by normType, but rather uses multiplicative scaling to a constant provided by the scaleTo argument.

    In this case:

    v_scaled = v_current/v_init*scaleTo

  • multAdd This approach changes the scaling based on the parameter being specified. If a parameter is defined in an exponential block (ie exp(theta)), then it is scaled on a linearly, that is:

    v_scaled = (v_current-v_init) + scaleTo

    Otherwise the parameter is scaled multiplicatively.

    v_scaled = v_current/v_init*scaleTo

scaleCmax

Maximum value of the scaleC to prevent overflow.

scaleCmin

Minimum value of the scaleC to prevent underflow.

scaleC

The scaling constant used with scaleType=nlmixr. When not specified, it is based on the type of parameter that is estimated. The idea is to keep the derivatives similar on a log scale to have similar gradient sizes. Hence parameters like log(exp(theta)) would have a scaling factor of 1 and log(theta) would have a scaling factor of ini_value (to scale by 1/value; ie d/dt(log(ini_value)) = 1/ini_value or scaleC=ini_value)

  • For parameters in an exponential (ie exp(theta)) or parameters specifying powers, boxCox or yeoJohnson transformations , this is 1.

  • For additive, proportional, lognormal error structures, these are given by 0.5*abs(initial_estimate)

  • Factorials are scaled by abs(1/digamma(inital_estimate+1))

  • parameters in a log scale (ie log(theta)) are transformed by log(abs(initial_estimate))*abs(initial_estimate)

These parameter scaling coefficients are chose to try to keep similar slopes among parameters. That is they all follow the slopes approximately on a log-scale.

While these are chosen in a logical manner, they may not always apply. You can specify each parameters scaling factor by this parameter if you wish.

scaleC0

Number to adjust the scaling factor by if the initial gradient is zero.

derivEps

Forward difference tolerances, which is a vector of relative difference and absolute difference. The central/forward difference step size h is calculated as:

h = abs(x)*derivEps[1] + derivEps[2]

derivMethod

indicates the method for calculating derivatives of the outer problem. Currently supports "switch", "central" and "forward" difference methods. Switch starts with forward differences. This will switch to central differences when abs(delta(OFV)) <= derivSwitchTol and switch back to forward differences when abs(delta(OFV)) > derivSwitchTol.

derivSwitchTol

The tolerance to switch forward to central differences.

covDerivMethod

indicates the method for calculating the derivatives while calculating the covariance components (Hessian and S).

covMethod

Method for calculating covariance. In this discussion, R is the Hessian matrix of the objective function. The S matrix is the sum of individual gradient cross-product (evaluated at the individual empirical Bayes estimates).

  • "r,s" Uses the sandwich matrix to calculate the covariance, that is: solve(R) %*% S %*% solve(R)

  • "r" Uses the Hessian matrix to calculate the covariance as 2 %*% solve(R)

  • "s" Uses the cross-product matrix to calculate the covariance as 4 %*% solve(S)

  • "" Does not calculate the covariance step.

hessEps

is a double value representing the epsilon for the Hessian calculation.

eventFD

Finite difference step for forward or central difference estimation of event-based gradients

eventType

Event gradient type for dosing events; Can be "gill", "central" or "forward"

centralDerivEps

Central difference tolerances. This is a numeric vector of relative difference and absolute difference. The central/forward difference step size h is calculated as:

h = abs(x)*derivEps[1] + derivEps[2]

lbfgsLmm

An integer giving the number of BFGS updates retained in the "L-BFGS-B" method, It defaults to 7.

lbfgsPgtol

is a double precision variable.

On entry pgtol >= 0 is specified by the user. The iteration will stop when:

max(\| proj g_i \| i = 1, ..., n) <= lbfgsPgtol

where pg_i is the ith component of the projected gradient.

On exit pgtol is unchanged. This defaults to zero, when the check is suppressed.

lbfgsFactr

Controls the convergence of the "L-BFGS-B" method. Convergence occurs when the reduction in the objective is within this factor of the machine tolerance. Default is 1e10, which gives a tolerance of about 2e-6, approximately 4 sigdigs. You can check your exact tolerance by multiplying this value by .Machine$double.eps

eigen

A boolean indicating if eigenvectors are calculated to include a condition number calculation.

addPosthoc

Boolean indicating if posthoc parameters are added to the table output.

diagXform

This is the transformation used on the diagonal of the chol(solve(omega)). This matrix and values are the parameters estimated in FOCEi. The possibilities are:

  • sqrt Estimates the sqrt of the diagonal elements of chol(solve(omega)). This is the default method.

  • log Estimates the log of the diagonal elements of chol(solve(omega))

  • identity Estimates the diagonal elements without any transformations

sumProd

Is a boolean indicating if the model should change multiplication to high precision multiplication and sums to high precision sums using the PreciseSums package. By default this is FALSE.

optExpression

Optimize the RxODE expression to speed up calculation. By default this is turned on.

ci

Confidence level for some tables. By default this is 0.95 or 95% confidence.

useColor

Boolean indicating if focei can use ASCII color codes

boundTol

Tolerance for boundary issues.

calcTables

This boolean is to determine if the foceiFit will calculate tables. By default this is TRUE

noAbort

Boolean to indicate if you should abort the FOCEi evaluation if it runs into troubles. (default TRUE)

interaction

Boolean indicate FOCEi should be used (TRUE) instead of FOCE (FALSE)

cholSEtol

tolerance for Generalized Cholesky Decomposition. Defaults to suggested (.Machine$double.eps)^(1/3)

cholAccept

Tolerance to accept a Generalized Cholesky Decomposition for a R or S matrix.

resetEtaP

represents the p-value for reseting the individual ETA to 0 during optimization (instead of the saved value). The two test statistics used in the z-test are either chol(omega^-1) %*% eta or eta/sd(allEtas). A p-value of 0 indicates the ETAs never reset. A p-value of 1 indicates the ETAs always reset.

resetThetaP

represents the p-value for reseting the population mu-referenced THETA parameters based on ETA drift during optimization, and resetting the optimization. A p-value of 0 indicates the THETAs never reset. A p-value of 1 indicates the THETAs always reset and is not allowed. The theta reset is checked at the beginning and when nearing a local minima. The percent change in objective function where a theta reset check is initiated is controlled in resetThetaCheckPer.

resetThetaFinalP

represents the p-value for reseting the population mu-referenced THETA parameters based on ETA drift during optimization, and resetting the optimization one final time.

diagOmegaBoundUpper

This represents the upper bound of the diagonal omega matrix. The upper bound is given by diag(omega)*diagOmegaBoundUpper. If diagOmegaBoundUpper is 1, there is no upper bound on Omega.

diagOmegaBoundLower

This represents the lower bound of the diagonal omega matrix. The lower bound is given by diag(omega)/diagOmegaBoundUpper. If diagOmegaBoundLower is 1, there is no lower bound on Omega.

cholSEOpt

Boolean indicating if the generalized Cholesky should be used while optimizing.

cholSECov

Boolean indicating if the generalized Cholesky should be used while calculating the Covariance Matrix.

fo

is a boolean indicating if this is a FO approximation routine.

covTryHarder

If the R matrix is non-positive definite and cannot be corrected to be non-positive definite try estimating the Hessian on the unscaled parameter space.

outerOpt

optimization method for the outer problem

innerOpt

optimization method for the inner problem (not implemented yet.)

rhobeg

Beginning change in parameters for bobyqa algorithm (trust region). By default this is 0.2 or 20 parameters when the parameters are scaled to 1. rhobeg and rhoend must be set to the initial and final values of a trust region radius, so both must be positive with 0 < rhoend < rhobeg. Typically rhobeg should be about one tenth of the greatest expected change to a variable. Note also that smallest difference abs(upper-lower) should be greater than or equal to rhobeg*2. If this is not the case then rhobeg will be adjusted.

rhoend

The smallest value of the trust region radius that is allowed. If not defined, then 10^(-sigdig-1) will be used.

npt

The number of points used to approximate the objective function via a quadratic approximation for bobyqa. The value of npt must be in the interval [n+2,(n+1)(n+2)/2] where n is the number of parameters in par. Choices that exceed 2*n+1 are not recommended. If not defined, it will be set to 2*n + 1

rel.tol

Relative tolerance before nlminb stops.

x.tol

X tolerance for nlmixr optimizers

eval.max

Number of maximum evaluations of the objective function

iter.max

Maximum number of iterations allowed.

abstol

Absolute tolerance for nlmixr optimizer

reltol

tolerance for nlmixr

resetHessianAndEta

is a boolean representing if the individual Hessian is reset when ETAs are reset using the option resetEtaP.

stateTrim

Trim state amounts/concentrations to this value.

gillK

The total number of possible steps to determine the optimal forward/central difference step size per parameter (by the Gill 1983 method). If 0, no optimal step size is determined. Otherwise this is the optimal step size determined.

gillStep

When looking for the optimal forward difference step size, this is This is the step size to increase the initial estimate by. So each iteration the new step size = (prior step size)*gillStep

gillFtol

The gillFtol is the gradient error tolerance that is acceptable before issuing a warning/error about the gradient estimates.

gillRtol

The relative tolerance used for Gill 1983 determination of optimal step size.

gillKcov

The total number of possible steps to determine the optimal forward/central difference step size per parameter (by the Gill 1983 method) during the covariance step. If 0, no optimal step size is determined. Otherwise this is the optimal step size determined.

gillStepCov

When looking for the optimal forward difference step size, this is This is the step size to increase the initial estimate by. So each iteration during the covariance step is equal to the new step size = (prior step size)*gillStepCov

gillFtolCov

The gillFtol is the gradient error tolerance that is acceptable before issuing a warning/error about the gradient estimates during the covariance step.

rmatNorm

A parameter to normalize gradient step size by the parameter value during the calculation of the R matrix

smatNorm

A parameter to normalize gradient step size by the parameter value during the calculation of the S matrix

covGillF

Use the Gill calculated optimal Forward difference step size for the instead of the central difference step size during the central difference gradient calculation.

optGillF

Use the Gill calculated optimal Forward difference step size for the instead of the central difference step size during the central differences for optimization.

covSmall

The covSmall is the small number to compare covariance numbers before rejecting an estimate of the covariance as the final estimate (when comparing sandwich vs R/S matrix estimates of the covariance). This number controls how small the variance is before the covariance matrix is rejected.

adjLik

In nlmixr, the objective function matches NONMEM's objective function, which removes a 2*pi constant from the likelihood calculation. If this is TRUE, the likelihood function is adjusted by this 2*pi factor. When adjusted this number more closely matches the likelihood approximations of nlme, and SAS approximations. Regardless of if this is turned on or off the objective function matches NONMEM's objective function.

gradTrim

The parameter to adjust the gradient to if the |gradient| is very large.

maxOdeRecalc

Maximum number of times to reduce the ODE tolerances and try to resolve the system if there was a bad ODE solve.

odeRecalcFactor

The factor to increase the rtol/atol with bad ODE solving.

gradCalcCentralSmall

A small number that represents the value where |grad| < gradCalcCentralSmall where forward differences switch to central differences.

gradCalcCentralLarge

A large number that represents the value where |grad| > gradCalcCentralLarge where forward differences switch to central differences.

etaNudge

By default initial ETA estimates start at zero; Sometimes this doesn't optimize appropriately. If this value is non-zero, when the n1qn1 optimization didn't perform appropriately, reset the Hessian, and nudge the ETA up by this value; If the ETA still doesn't move, nudge the ETA down by this value. By default this value is qnorm(1-0.05/2)*1/sqrt(3), the first of the Gauss Quadrature numbers times by the 0.95% normal region. If this is not successful try the second eta nudge number (below). If +-etaNudge2 is not successful, then assign to zero and do not optimize any longer

etaNudge2

This is the second eta nudge. By default it is qnorm(1-0.05/2)*sqrt(3/5), which is the n=3 quadrature point (excluding zero) times by the 0.95% normal region

stiff

a logical (TRUE by default) indicating whether the ODE system is stiff or not.

For stiff ODE systems (`stiff = TRUE`), `RxODE` uses the
LSODA (Livermore Solver for Ordinary Differential Equations)
Fortran package, which implements an automatic method switching
for stiff and non-stiff problems along the integration
interval, authored by Hindmarsh and Petzold (2003).

For non-stiff systems (`stiff = FALSE`), `RxODE` uses
DOP853, an explicit Runge-Kutta method of order 8(5, 3) of
Dormand and Prince as implemented in C by Hairer and Wanner
(1993).

If stiff is not specified, the `method` argument is used instead.
nRetries

If FOCEi doesn't fit with the current parameter estimates, randomly sample new parameter estimates and restart the problem. This is similar to 'PsN' resampling.

seed

seed for random number generator

resetThetaCheckPer

represents objective function % percentage below which resetThetaP is checked.

etaMat

Eta matrix for initial estimates or final estimates of the ETAs.

repeatGillMax

If the tolerances were reduced when calculating the initial Gill differences, the Gill difference is repeated up to a maximum number of times defined by this parameter.

stickyRecalcN

The number of bad ODE solves before reducing the atol/rtol for the rest of the problem.

gradProgressOfvTime

This is the time for a single objective function evaluation (in seconds) to start progress bars on gradient evaluations

addProp

one of "combined1" and "combined2"; These are the two forms of additive+proportional errors supported by monolix/nonmem:

combined1: transform(y)=transform(f)+(a+b*f^c)*eps

combined2: transform(y)=transform(f)+(a^2+b^2*f^(2c))*eps

singleOde

This option allows a single ode model to include the PK parameter information instead of splitting it into a function and a RxODE model

badSolveObjfAdj

The objective function adjustment when the ODE system cannot be solved. It is based on each individual bad solve.

Value

The control object that changes the options for the FOCEi family of estimation methods

Details

Note this uses the R's L-BFGS-B in optim for the outer problem and the BFGS n1qn1 with that allows restoring the prior individual Hessian (for faster optimization speed).

However the inner problem is not scaled. Since most eta estimates start near zero, scaling for these parameters do not make sense.

This process of scaling can fix some ill conditioning for the unscaled problem. The covariance step is performed on the unscaled problem, so the condition number of that matrix may not be reflective of the scaled problem's condition-number.

See also

Author

Matthew L. Fidler