Calculations
Calculations describe the specific options of the calculation, including the basis set, level of theory, and properties to calculate.
The meta: class_name: option is used to specify which computational engine the calculation is designed for:
calculation:
  meta:
    class_name: Gaussian

calculation:
  meta:
    class_name: Turbomole

calculation:
  meta:
    class_name: Orca
However, most of the main calculation options are the same for all engines. This means the same
calculation definition can often be re-used across multiple programs, so long as the class_name
is set appropriately.
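As an illustration, a definition like the sketch below could be pointed at a different engine by changing only class_name. It combines options described later on this page (method, basis_set, and properties). The functional and basis set are arbitrary choices for the example, and whether a given combination is actually available depends on the chosen engine.

```yaml
calculation:
  meta:
    class_name: Gaussian  # Or Turbomole, Orca.
  method:
    dft:
      calc: True
      functional: B3LYP
  basis_set:
    internal: 6-31G**
  properties:
    opt:
      calc: True
```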
Calculation method
The method defines the type of calculation, i.e. what level of theory to use. Only one method can be chosen at a time.
Hartree–Fock (HF)
To run a calculation at the Hartree–Fock (also known as the self-consistent field) level, set the method: hf: calc: option:
calculation:
  method:
    hf:
      calc: True
There are no other settings for the HF method, but the SCF procedure itself can be configured with the SCF block.
Density-functional theory (DFT)
To run a calculation at the DFT level, set the method: dft: calc: option:
calculation:
  method:
    dft:
      calc: True
The functional is set with method: dft: functional:, which is the common name of the functional to use. Digichem will make reasonable attempts to convert the given name to the program-specific version, if applicable. For example, both of the following definitions select the well-known hybrid functional of Perdew, Burke, and Ernzerhof[14, 15] and Adamo:[1]
calculation:
  method:
    dft:
      calc: True
      functional: PBE0

calculation:
  method:
    dft:
      calc: True
      functional: PBE1PBE
Digichem does not place restrictions on the allowed functional names. Any functional that is exposed by the underlying computational program can be set here. Refer to the manual of your chosen computational engine for a full list of supported functionals.
An empirical dispersion correction can be included with the method: dft: dispersion: option:
calculation:
  method:
    dft:
      calc: True
      dispersion: GD3BJ
The following values are recognised:
Note, however, that not all dispersion models are compatible with all computational engines and/or functionals.
Note
Some DFT functionals explicitly include empirical dispersion as part of their definition (for example B97D in Gaussian). For these functionals, the method: dft: dispersion: option should be omitted.
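As a sketch of that case, a Gaussian calculation with B97D would leave the dispersion option out entirely. Only options already described in this section are used here.

```yaml
calculation:
  meta:
    class_name: Gaussian
  method:
    dft:
      calc: True
      functional: B97D  # Dispersion is part of the functional, so no dispersion: option is set.
```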
The size of the numerical integration grid can be controlled with the method: dft: grid: option:
calculation:
  method:
    dft:
      calc: True
      grid: Default  # Or Medium, Large, Huge etc...
Common grid sizes can be selected using the keywords Default, Tiny, Small, Medium, Large, and Huge, and appropriate values will be selected for the computational program. Larger (denser) grids generally result in higher integration accuracy (up to a plateau) at the cost of increased calculation duration.
Alternatively, a grid size can be specified in the format native to the program, for example:
calculation:
  meta:
    class_name: Gaussian
  method:
    dft:
      calc: True
      grid: 99302  # Gaussian's (99,302) grid.

calculation:
  meta:
    class_name: Turbomole
  method:
    dft:
      calc: True
      grid: m5  # Turbomole's multiple grid 5.
Møller–Plesset perturbation theory
To run a calculation at the MPn level, set the method: mp: calc: option:
calculation:
  method:
    mp:
      calc: True
The ‘order’ of the MP calculation can be selected with the method: mp: level: option:
calculation:
  method:
    mp:
      calc: True
      level: MP2  # Or MP3, MP4, MP5 etc.
MP2 is the default. Higher levels (up to MP5) can be selected with the relevant option (MP3, MP4, or MP5), if supported by the chosen calculation engine. Some engines (Gaussian) also support truncated MP methods, including MP4(DQ) (only double and quadruple expansions) and MP4(SDQ) (only single, double, and quadruple expansions), and these can also be selected with the relevant keyword.
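For example, a truncated MP4 calculation in Gaussian might be requested as in the sketch below, which only combines the class_name and level options already described:

```yaml
calculation:
  meta:
    class_name: Gaussian
  method:
    mp:
      calc: True
      level: MP4(SDQ)  # Single, double, and quadruple expansions only.
```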
Coupled-cluster theory
To run a calculation at the CC level, set the method: cc: calc: option:
calculation:
  method:
    cc:
      calc: True
Similar to MP theory, the type of coupled-cluster calculation can be selected with the method: cc: level: option:
calculation:
  method:
    cc:
      calc: True
      level: CCSD  # Or CCSD(T) etc.
The most common options are CC2 (second-order approximate coupled-cluster[6]), CCSD, and CCSD(T), but specific calculation engines may support additional CC calculations. Refer to the manual of your chosen computational engine for a full list of supported CC methods.
Self-consistent field
Note
The scf options will move in a future version.
Options for controlling the self-consistent field (SCF) procedure are set in the scf block.
These settings apply to all calculation methods that rely, at least in part, on HF or DFT (HF, DFT, MP and CC).
Note
In most cases, the SCF options can be left as their defaults.
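When they are needed, SCF settings sit alongside the method block in the same definition, including for post-HF methods. The sketch below pairs an MP2 calculation with the iterations option described later in this section. The value is arbitrary and shown only for illustration.

```yaml
calculation:
  method:
    mp:
      calc: True
      level: MP2
  scf:
    iterations: 200  # Limit for the SCF step that precedes the MP2 correction.
```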
The SCF algorithm can be selected with the scf: method: option.
calculation:
  scf:
    method: Default
All programs support a direct SCF algorithm (Simple) and the Direct Inversion in the Iterative Subspace (DIIS) method, and for Orca and Turbomole the latter is the default. Gaussian instead defaults to a hybrid method based on the energy-DIIS (eDIIS) and standard DIIS methods. Pure DIIS can instead be selected with the CDIIS keyword:
calculation:
  scf:
    method: CDIIS
The available SCF algorithms are highly dependent on the chosen calculation engine:
Gaussian supports a number of SCF method keywords, some of which are challenging to decipher. See the Gaussian manual for more information. Digichem recognises the following keywords for Gaussian:
calculation:
  scf:
    # Pick one:
    method: Simple   # Direct SCF
    method: Default  # eDIIS and DIIS hybrid
    method: DIIS     # Enable DIIS, may have no effect
    method: CDIIS    # DIIS only (disable eDIIS)
    method: DM       # Direct minimization, may be equivalent to Simple
    method: SD       # Steepest descent
    method: SSD      # Scaled steepest descent
    method: QC       # Quadratically convergent, recommended for difficult convergence
    method: XQC      # Attempt QC if standard convergence fails
    method: YQC      # A hybrid method involving QC
Orca supports the Simple, DIIS, KDIIS[10], DIIS-SOSCF and TRAH methods. Additionally, Orca can start with one of the DIIS-like methods, and switch to TRAH if this initial convergence fails. This mode is supported with the auto_trah: option (which has no effect if TRAH is explicitly chosen):
calculation:
  scf:
    # Pick one:
    method: Simple      # Direct SCF
    method: DIIS        # Default
    method: KDIIS       # Kollmar's variant DIIS
    method: DIIS-SOSCF  # (Approximate) second-order SCF
    method: TRAH        # Trust-region augmented Hessian, recommended for difficult convergence
    # Optional:
    auto_trah: True     # Defaults to True, disable with False
Turbomole only supports the DIIS and Simple (direct SCF) procedures:
calculation:
  scf:
    # Pick one:
    method: Simple  # Direct SCF
    method: DIIS    # Default
The maximum number of SCF iterations can be set with the scf: iterations: option. If omitted (or set to null) then program-specific defaults will be used:
calculation:
  scf:
    iterations: 100  # null for default
The SCF convergence criteria can be modified with the scf: convergence: option, which accepts one of a number of keywords to automatically set the energy and density (if relevant) convergence criteria, for any program:
calculation:
  scf:
    # Pick one:
    convergence: Loose
    convergence: Weak
    convergence: Medium
    convergence: Strong
    convergence: Tight
    convergence: VTight
    convergence: VVTight
    convergence: Extreme
Alternatively, the energy and density convergence criteria can be set manually with the scf: energy: and scf: density: options:
calculation:
  scf:
    energy: -8   # < 10^-8
    density: -7  # < 10^-7
These options set the exponent n of the convergence criteria (10^n). More negative values result in exponentially tighter (smaller change) criteria.
Note
scf: convergence: is mutually exclusive with either scf: energy: or scf: density:.
A damping procedure (in which part of the matrix from the previous SCF cycle is mixed into the next cycle) can be enabled with the scf: damping: calc: option:
calculation:
  scf:
    damping:
      calc: True
Specific options for the damping procedure depend on the underlying computational engine:
Only one option is supported for Gaussian, iterations:, which controls how many initial SCF iterations damping is applied for. After this number, the damping procedure is stopped.
calculation:
  scf:
    damping:
      calc: True
      iterations: 10  # The default.
Orca supports two different damping procedures: ‘Static’, in which the same damping procedure is performed for every SCF cycle, and ‘Dynamic’, in which stronger damping is applied at the start of the calculation, and less (or none) towards the end. Dynamic damping is normally recommended. The damping method can be selected with the scf: damping: method: option, using either the Static or Dynamic keywords.
For the Static method, there are two additional options. The fraction of the old matrix to mix with the new matrix is controlled by the scf: damping: weight: option, while the threshold after which to disable damping is controlled by scf: damping: threshold:
calculation:
  scf:
    damping:
      calc: True
      method: Static  # Same damping weight throughout.
      weight: 0.5     # Fraction of old matrix to incorporate.
      threshold: 0.1  # Disable damping after DIIS error drops below 0.1 Eh.
Any decimal number can be used for weight, but values above 1.0 will likely make convergence impossible, and are strongly discouraged. The exact interpretation of threshold depends on the SCF converger being used. For DIIS, this value refers to the error in the calculated energy. For DIIS-SOSCF, it is the ‘orbital gradient value’. Refer to the Orca manual for more information.
For the Dynamic damping procedure, the weight option sets the initial mixing fraction to use at the start of the calculation. Orca will then automatically vary this weighting depending on the estimated convergence of the SCF procedure, with progressively less weight being applied as SCF approaches convergence. The amount by which weight is allowed to vary is controlled by the scf: damping: max: and scf: damping: min: options. scf: damping: weight: should fall between these values.
calculation:
  scf:
    damping:
      calc: True
      method: Dynamic  # Variable damping weight.
      weight: 0.5      # Starting weight.
      max: 0.75        # Don't rise above this weight.
      min: 0.25        # Don't drop below this weight.
      threshold: 0.1   # Disable damping after DIIS error drops below 0.1 Eh.
Note
Digichem does not currently support deactivating SCF damping in Turbomole. If this feature would be useful for you, please consider letting us know.
In Turbomole, SCF damping is enabled by default and cannot be turned off. The damping procedure uses a ‘Dynamic’ scheme, in which progressively less damping is applied as the SCF approaches convergence. The starting mixing fraction is controlled by the scf: damping: weight: option. This is decreased in each subsequent iteration by scf: damping: step:, until scf: damping: min: is reached, at which point the weight value remains constant:
calculation:
  scf:
    damping:
      calc: True   # The default, cannot be turned off.
      weight: 0.5  # Starting weight.
      step: 0.1    # Decrease by this amount each step.
      min: 0.1     # Don't drop below this weight.
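Read literally, the Turbomole description above amounts to a simple linear schedule. As a sketch (the symbols are introduced here for illustration only: $w_k$ is the weight in iteration $k$, and $w_0$, $s$, and $w_{\min}$ stand for the weight, step, and min options), the damping weight would be:

```latex
w_k = \max\left(w_0 - k\,s,\; w_{\min}\right), \qquad k = 0, 1, 2, \dots
```

With weight: 0.5, step: 0.1, and min: 0.1, this gives 0.5, 0.4, 0.3, 0.2, 0.1, and then 0.1 for all later iterations. How Turbomole applies the reduction internally may differ in detail, so refer to the Turbomole manual for the exact behaviour.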
Spin-component scaling
Spin-component scaling (SCS) is a correction scheme in which electron correlation energies between electron pairs with the same spin are weighted differently to those with opposite spin. The correction originated in MP2 calculations, and is most commonly applied to post-HF wavefunction methods. SCS often gives better accuracy than the uncorrected wavefunction.
SCS settings are controlled in the method: scs: block. SCS is enabled with the calc sub-option:
calculation:
  method:
    scs:
      calc: True  # Use spin-component scaling.
If no other options are specified, the calculation engine will use appropriate default values for the same-spin and opposite-spin scaling factors. Alternatively, these can be tuned with the following program specific options:
Important
SCS is not supported for Gaussian.
In Orca, the scaling factors are controlled with the opposite and same sub-options:
calculation:
  method:
    scs:
      calc: True
      same: 0.3333
      opposite: 1.2
In Turbomole, spin-opposite scaling (in which the same-spin component is entirely ignored) and spin-component scaling (in which both components are scaled) are recognised separately. The SCS method can be switched between them with the method sub-option:
calculation:
  method:
    scs:
      calc: True
      # Choose one:
      method: SCS  # Scale both, the default.
      method: SOS  # Ignore same-spin.
The exact weights can still be controlled with the method: scs: opposite: and method: scs: same: options. If method is SOS, then the same option is ignored:
calculation:
  method:
    scs:
      calc: True
      #
      # Choose one:
      method: SCS  # Scale both, the default.
      same: 0.3333
      opposite: 1.2
      #
      #
      method: SOS  # Ignore same-spin.
      opposite: 1.3
Resolution of the identity
The resolution of the identity approximation (RI), also known as density fitting, is a name given to a family of methods that can drastically reduce the duration of a calculation, normally doing so while only introducing minor errors.
Note
No resolution of the identity approximations are supported for Gaussian.
RI can be used to accelerate all types of computational methods (HF, DFT, MP, and CC), so long as it is supported by the underlying computational engine, and different parts of the method can separately choose to use RI or not.
RI for the Coulomb interaction (J) can be used for HF, DFT, MP, and CC calculations. This type of RI is controlled with the method: ri: coulomb: block:
calculation:
  method:
    ri:
      coulomb:
        calc: True
RI can also be applied to both the Coulomb (J) and exchange (K) interactions, for HF, DFT, MP, and CC calculations. This type of RI is controlled with the method: ri: hartree_fock: block:
calculation:
  method:
    ri:
      hartree_fock:
        calc: True
For post-HF calculations (double-hybrid DFT, MP, CC), the RI approximation can additionally be used for the correlated wavefunction part. This type of RI is controlled with the method: ri: correlated: block:
calculation:
  method:
    ri:
      correlated:
        calc: True
RI-J or RI-JK can optionally also be included in a post-HF calculation:
calculation:
  method:
    ri:
      hartree_fock:
        calc: True  # Use RI for HF...
      correlated:
        calc: True  # And MP/CC
For each type of RI, an appropriate auxiliary basis set will be automatically chosen by the underlying computational engine. An explicit auxiliary basis set can be set for each RI block with the basis_set: sub-option:
calculation:
  method:
    cc:
      calc: True
      level: CCSD
    ri:
      hartree_fock:
        calc: True           # Use RI for HF.
        basis_set: def2      # Use the general density-fitting basis set for JK.
      correlated:
        calc: True           # And MP/CC
        basis_set: def2-SVP  # Use the def2-SVP auxiliary basis set for RI-CC.
Orca additionally supports the ‘chain-of-spheres’ (COSX) approximation, which is a different (but related) acceleration scheme. COSX can be enabled with the method: ri: chain_of_spheres: calc: option:
calculation:
  method:
    ri:
      chain_of_spheres:
        calc: True
The auxiliary basis set can be specified with the basis_set: sub-option as normal (or omitted to use a default):
calculation:
  method:
    ri:
      chain_of_spheres:
        calc: True
        basis_set: def2  # Use the general density-fitting basis set.
Basis set
The basis set (definition of atomic-like orbitals) is controlled by the basis_set block. In most cases, a basis set can be specified using its common name in the basis_set: internal: option:
calculation:
  basis_set:
    internal: 6-31G**
Digichem doesn’t restrict the possible basis set names that can be specified to basis_set: internal:, so any keyword that is recognised by the underlying computational engine can be specified here. In addition, Digichem will attempt to translate common basis set names to program-specific versions, if applicable:
calculation:
  basis_set:
    # Pick one:
    # Pople style:
    internal: 6-31G(d,p)   # Same as 6-31G**
    internal: 6-31G**      # Same as 6-31G(d,p)
    internal: 6-31++G**    # With diffuse.
    # etc...
    #
    # Karlsruhe style:
    internal: def2-SV(P)   # Equivalent to Def2SVPP in Gaussian.
    internal: def2-SVP     # Equivalent to Def2SVP in Gaussian.
    internal: def2-TZVP
    # etc.
    #
    # Correlation consistent:
    internal: cc-pVDZ
    internal: aug-cc-pVDZ  # With diffuse.
    # etc.
The basis_set: internal: option sets the same basis set for all atoms in the molecule. Alternatively, different basis sets can be specified for different elements using the basis_set: exchange: option. This option accepts a list of basis sets, with each specifying the elements it should apply to. Elements can be identified either by atomic (proton) number, or by element symbol. For both, ranges can be specified with the - (dash) character:
Note
The basis_set: exchange: option is currently only supported for Gaussian. If supporting this feature in another calculation engine would be useful for you, please consider letting us know.
calculation:
  basis_set:
    exchange:
      - "6-31G**": 1-18               # 6-31G** for light elements (hydrogen to argon)
      - LANL2DZ: K-Bi, La, U, Np, Pu  # LANL2DZ for heavy elements
Note
The basis_set: internal: and basis_set: exchange: options are mutually exclusive.
Calculated properties
The properties block is used to specify what properties (of the molecule) to calculate.
Single point
A ‘single point’ calculation calculates the total energy of the molecule at the current geometry, and little else. Other properties that are trivial to calculate from the converged density/wavefunction may also be calculated, depending on other options (such as multipoles, orbital occupations, population analysis etc.).
A single point calculation can be requested with the properties: sp: calc: option:
calculation:
  properties:
    sp:
      calc: True
There are no other options for a single point calculation, and this type of calculation is mutually exclusive with all others.
Gradient of the energy
The first derivative of the energy (the gradient) can be calculated with the properties: grad: calc: option:
calculation:
  properties:
    grad:
      calc: True
The gradient can either be calculated analytically or numerically. In most cases, the analytical gradient should be preferred if it is available, as calculating the gradient numerically can be extremely time consuming for large systems. The method to use to calculate the gradient can be set with the properties: grad: numerical: option. If unset (or set to null), then the calculation engine will normally default to the analytical gradient, falling back to a numerical method if an analytical one is not available.
A numerical gradient can be explicitly requested by setting properties: grad: numerical: to True:
calculation:
  properties:
    grad:
      calc: True
      numerical: True
Likewise, a numerical gradient can be disallowed by setting numerical: to False. In this case, the calculation will fail if no analytical gradient is available for the chosen method.
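For completeness, disallowing the numerical fallback would look like the sketch below, which simply mirrors the previous example with the value flipped:

```yaml
calculation:
  properties:
    grad:
      calc: True
      numerical: False  # Analytical gradient only; fail if it is unavailable.
```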
Note that several other properties, most notably opt and freq, also depend on the gradient. When calculating these properties, the gradient will be automatically calculated without the properties: grad: calc: option being set. However, an analytical/numerical gradient can still be selected with the numerical option:
calculation:
  properties:
    opt:
      calc: True
    grad:
      numerical: True  # Force use of a numerical gradient for the optimisation, not recommended.
Geometry optimisation
A geometry optimisation can be requested with the properties: opt: calc: option:
calculation:
  properties:
    opt:
      calc: True
The maximum number of optimisation cycles to attempt can be set with the properties: opt: iterations: option. If the geometry does not converge before this limit, then the calculation will be aborted:
calculation:
  properties:
    opt:
      calc: True
      iterations: 100  # Useful for difficult convergence cases.
If the iterations option is not set, then program-specific defaults will be used instead. Note that these defaults may not be sufficient for larger and/or difficult to converge structures.
Vibrational frequencies
The vibrational frequencies, and intensities of the vibrational transitions, can be calculated with the properties: freq: calc: option:
calculation:
  properties:
    freq:
      calc: True
There are no other options for the freq block, but the method used to calculate the derivatives (analytical or numerical) can be set in the grad block:
calculation:
  properties:
    freq:
      calc: True
    grad:
      numerical: True  # Force use of a numerical gradient for the frequencies, not recommended.
A common use for calculating vibrational frequencies is to check for the presence of imaginary (negative) frequencies at an optimised geometry, which would indicate the geometry has not actually reached a local minimum. This type of compound job can be requested by specifying both properties: freq: calc: and properties: opt: calc: simultaneously.
calculation:
  properties:
    opt:
      calc: True  # First, optimise the geometry.
    freq:
      calc: True  # Then check the frequencies.
Note
The order of the opt and freq blocks does not matter.
Nuclear magnetic resonance
Important
NMR calculations are currently only supported for Orca. If supporting this feature in another calculation engine would be useful for you, please consider letting us know.
Settings for controlling an NMR calculation are set in the properties: nmr: block. Chemical shifts can be calculated with the properties: nmr: calc: and properties: nmr: nuclei: options:
calculation:
  properties:
    nmr:
      calc: True
      nuclei:
        # Pick as many as you like:
        #
        # Calculate chemical shifts for protons and carbon-13.
        - 1H
        - 13C
Spin-spin coupling constants can additionally be calculated by setting properties: nmr: coupling: to True:
calculation:
  properties:
    nmr:
      calc: True
      coupling: True
      nuclei:
        # Pick as many as you like:
        #
        # Calculate chemical shifts and coupling constants for protons and carbon-13.
        - 1H
        - 13C
Coupling constants will be calculated between all elements that NMR properties have been calculated for.
Excited states
Options for calculating excited states can be set in the properties: es: block. An excited states calculation can be requested by setting properties: es: calc: to True and properties: es: method: to the type of excited states calculation that is desired.
All calculation engines support the TD-HF (time-dependent Hartree–Fock, also known as RPA) and CIS (configuration interaction singles) methods in conjunction with a hf wavefunction:
calculation:
  method:
    hf:
      calc: True
  properties:
    es:
      calc: True
      # Pick one:
      method: TD-HF  # Time-dependent Hartree–Fock theory (RPA).
      method: CIS    # Configuration interaction singles.
As well as the TD-DFT (time-dependent density functional theory) and TDA (Tamm–Dancoff approximation) methods in conjunction with a dft density:
calculation:
  method:
    dft:
      calc: True
      functional: B3LYP
  properties:
    es:
      calc: True
      # Pick one:
      method: TD-DFT  # Time-dependent DFT.
      method: TDA     # Tamm–Dancoff approximation.
At the coupled-cluster level, all engines also support the equations of motion (EOM) methodology:
calculation:
  method:
    cc:
      calc: True
      level: CCSD
  properties:
    es:
      calc: True
      method: EOM-CCSD
Other excited state methods are supported on a program-by-program basis:
Gaussian also supports the CIS(D) method (configuration interaction singles with doubles correction):
calculation:
  method:
    hf:
      calc: True
  properties:
    es:
      calc: True
      method: CIS(D)
Important
There are several Orca-specific options in this block that are currently under review, and will likely be renamed and/or moved in a future version.
In Orca, the CIS(D) method (configuration interaction singles with doubles correction) can be selected with the Orca-specific dcorr option:
calculation:
  method:
    hf:
      calc: True
  properties:
    es:
      calc: True
      method: CIS
      dcorr: 2  # Doubles correction algorithm.
This option can also be used to include doubles correction in double-hybrid DFT calculations:
calculation:
  method:
    dft:
      calc: True
      functional: wPBEPP86
  properties:
    es:
      calc: True
      method: TDA
      dcorr: 2  # Doubles correction algorithm.
There are four possible values for this option (1, 2, 3, and 4), corresponding to four different possible algorithms. Refer to the Orca manual for more information.
At the coupled-cluster level, Orca supports a modified EOM method called STEOM-CCSD (similarity transformed equations of motion):
calculation:
  method:
    cc:
      calc: True
      level: CCSD
  properties:
    es:
      calc: True
      method: STEOM-CCSD
For both EOM-CCSD and STEOM-CCSD, double excitations can be included through the mdci: double_excitations: option:
calculation:
method:
cc:
calc: True
level: CCSD
properties:
es:
calc: True
method: EOM-CCSD # Or STEOM-CCSD
mdci:
double_excitations: True
Turbomole supports a number of excited states methods. These include the CIS(D) method (configuration interaction singles with doubles correction), as well as a modified, iterative variant called CIS(D∞), which can be requested with CIS(Dinf):
calculation:
  method:
    hf:
      calc: True
  properties:
    es:
      calc: True
      # Pick one:
      method: CIS(D)
      method: CIS(Dinf)
The algebraic diagrammatic construction method (second order) can be requested with the ADC(2) keyword. For this type of calculation, the method should be set to mp, at the ADC(2) level:
calculation:
  method:
    mp:
      calc: True
      level: ADC(2)
  properties:
    es:
      calc: True
      method: ADC(2)
Excited states can be calculated with the approximate coupled-cluster method using the CC2 keyword. The calculation method should be set to cc, at the CC2 level:
calculation:
  method:
    cc:
      calc: True
      level: CC2
  properties:
    es:
      calc: True
      method: CC2
The number and type of excited states to calculate are controlled by the properties: es: num_states: and properties: es: multiplicity: options. The multiplicity option accepts three values, Singlet, Triplet, and 50-50, with the last of these calculating both singlets and triplets at the same time. num_states refers to the number of excited states to calculate of each multiplicity, so if 50-50 is specified, then num_states singlets and num_states triplets will be calculated.
calculation:
  properties:
    es:
      calc: True
      num_states: 5
      # Pick one:
      multiplicity: Singlet  # Calculate the first five singlet excited states.
      multiplicity: Triplet  # Calculate the first five triplet excited states.
      multiplicity: 50-50    # Calculate five singlet and five triplet states.
Note
Some excited state methods with some calculation engines do not support calculating singlets and triplets simultaneously (50-50). This includes the EOM-CCSD and STEOM-CCSD methods in Orca.
For unrestricted and/or open-shell calculations, the multiplicity option should be omitted (or set to Singlet, the default). The calculation will then attempt to calculate excited states of the same multiplicity as the ground state, with spin being conserved. Note that spin-contamination and other effects can result in some excited states having unexpected multiplicities, or even non-integer multiplicity.
Spin-orbit coupling can be calculated with the properties: es: soc: option:
calculation:
  properties:
    es:
      calc: True
      num_states: 5
      multiplicity: 50-50  # Calculate five singlet and five triplet states.
      soc: True
SOC naturally requires both singlet and triplet excited states to be calculated (multiplicity: 50-50). The properties: es: soc: option is only supported for Orca, but SOC will also be automatically calculated from a Gaussian calculation using PySOC if both singlets and triplets are available. Calculations with PySOC also require the following additional keywords to be specified:
calculation:
properties:
es:
calc: True
num_states: 5
multiplicity: 50-50 # Calculate five singlet and five triplet states.
# Gaussian specific keywords.
keywords:
# Options required for PySOC.
6D:
10F:
GFInput:
SOC is not currently supported for Turbomole.
Excited state geometries
The geometry of an excited state can be calculated by requesting an optimisation and an excited states calculation simultaneously. The excited state to optimise is indicated with the properties: es: state_of_interest: option:
calculation:
  properties:
    opt:
      calc: True
    es:
      calc: True
      multiplicity: Singlet
      num_states: 5         # Calculate five singlet excited states.
      state_of_interest: 1  # And optimise the geometry of the lowest energy state (S1).
During the geometry optimisation, the order of the calculated excited states can change significantly. Because the target excited state is only tracked via its index, this means the nature of the state_of_interest can change from one optimisation cycle to the next. For this reason, it is recommended to only calculate one type of excited state (multiplicity: Singlet or multiplicity: Triplet), so the multiplicity of the state_of_interest: remains constant. It is also recommended to calculate more excited states than are needed, for the same reason.
Performance
Options relating to resource usage and performance (i.e. how long the calculation takes) are set in the performance block. The number of CPUs to use in parallel for the calculation is set with the performance: num_cpu: option:
calculation:
  performance:
    num_cpu: 16  # Use 16 CPUs in parallel.
Meanwhile, the maximum amount of memory that should be used by the calculation is set with the performance: memory: option. Memory amounts can be given using common units, such as terabyte (TB), gigabyte (GB), megabyte (MB), kilobyte (KB), and byte (B):
calculation:
  performance:
    # Pick one:
    #
    # Equivalent ways of specifying 40 GB:
    memory: 0.04 TB
    memory: 40 GB
    memory: 40000 MB
    memory: 40000000 KB
    memory: 40000000000 B
The memory option refers to the total amount of memory shared by all CPUs.
When using a batch submission system, the scheduler also needs to know how much memory and how many CPUs to allocate to the calculation before it enters the queue. By default, Digichem will request the same resources from the scheduler as are set in the calculation definition. However, essentially all calculation engines tend to use slightly more memory than is allocated to them in their input file. Because of this, it is normally recommended to request more memory in the batch script than the calculation is expected to use. This is supported by the performance: memory_over_allocate: option, which specifies a multiplier that will be applied when submitting to the batch program:
calculation:
  performance:
    num_cpu: 16                 # Use 16 CPUs in parallel.
    memory: 40 GB               # Use max 40 GB in the calculation.
    memory_over_allocate: 1.15  # 15%, the default.
In the above specification, 40 GB × 1.15 = 46 GB would be requested from the scheduler, even though the calculation should only use ~40 GB maximum. Because 16 CPUs have been requested, this equates to 46 GB ÷ 16 = 2875 MB of memory per CPU.
Important
A performance: memory_over_allocate: value of less than 1.0 will request less memory from the scheduler than the calculation is expected to use. In most cases, this is strongly discouraged and will likely lead to the calculation being cancelled by the scheduler when it exceeds its memory allocation.
Polling frequency
Digichem automatically monitors the resource usage (CPU load, memory usage, free file space etc.) of the calculation as it progresses. The frequency with which Digichem logs this data can be controlled with the performance: poll_interval: option:
calculation:
  performance:
    poll_interval: 60  # In seconds (every 1 min, the default).
A shorter poll_interval results in more accurate logging, but will consume more file space. Very short polling intervals (< 1 s) may place noticeable load on the CPU and/or filesystem, and are likely to negatively impact the performance of the calculation itself. The default interval of 60 s is normally sufficient.
Implicit solvation
Options for using a simulated solvent environment are set in the solution block. The solvent to use is specified by the solution: solvent: option. The solvent can be identified either via its name (water, toluene, ethylEthanoate, etc.), or via its dielectric constant:
calculation:
  solution:
    calc: True
    # Choose one:
    solvent: water    # Solvent keyword.
    solvent: 78.3553  # Dielectric constant.
    # etc.
The following solvent keywords are supported:
Name | Aliases | Epsilon | Refractive Index |
---|---|---|---|
1-Bromo-2-MethylPropane | | 7.7792 | |
1-BromoOctane | | 5.0244 | |
1-BromoPentane | | 6.269 | |
1-BromoPropane | | 8.0496 | |
1-Butanol | | 17.332 | |
1-ChloroHexane | | 5.9491 | |
1-ChloroPentane | | 6.5022 | |
1-ChloroPropane | | 8.3548 | |
1-Decanol | | 7.5305 | |
1-FluoroOctane | | 3.89 | |
1-Heptanol | | 11.321 | |
1-Hexanol | | 12.51 | |
1-Hexene | | 2.0717 | |
1-Hexyne | | 2.615 | |
1-IodoButane | | 6.173 | |
1-IodoHexaDecane | | 3.5338 | |
1-IodoPentane | | 5.6973 | |
1-IodoPropane | | 6.9626 | |
1-NitroPropane | | 23.73 | |
1-Nonanol | | 8.5991 | |
1-Pentanol | | 15.13 | |
1-Pentene | | 1.9905 | |
1-Propanol | | 20.524 | |
1,1,1-TriChloroEthane | | 7.0826 | |
1,1,2-TriChloroEthane | | 7.1937 | |
1,2-DiBromoEthane | | 4.9313 | |
1,2-EthaneDiol | | 40.245 | |
1,2,4-TriMethylBenzene | | 2.3653 | |
1,4-Dioxane | Dioxane | 2.2099 | |
2-BromoPropane | | 9.361 | |
2-Butanol | | 15.944 | |
2-ChloroButane | | 8.393 | |
2-Heptanone | | 11.658 | |
2-Hexanone | | 14.136 | |
2-MethoxyEthanol | | 17.2 | |
2-Methyl-1-Propanol | | 16.777 | |
2-Methyl-2-Propanol | | 12.47 | |
2-MethylPentane | | 1.89 | |
2-MethylPyridine | | 9.9533 | |
2-NitroPropane | | 25.654 | |
2-Octanone | | 9.4678 | |
2-Pentanone | | 15.2 | |
2-Propanol | | 19.264 | |
2-Propen-1-ol | | 19.011 | |
2,2,2-TriFluoroEthanol | | 26.726 | |
2,2,4-TriMethylPentane | | 1.9358 | |
2,4-DiMethylPentane | | 1.8939 | |
2,4-DiMethylPyridine | | 9.4176 | |
2,6-DiMethylPyridine | | 7.1735 | |
3-MethylPyridine | | 11.645 | |
3-Pentanone | | 16.78 | |
4-Heptanone | | 12.257 | |
4-Methyl-2-Pentanone | | 12.887 | |
4-MethylPyridine | | 11.957 | |
5-Nonanone | | 10.6 | |
a-ChloroToluene | | 6.7175 | |
AceticAcid | EthanoicAcid | 6.2528 | |
Acetone | Propanone | 20.493 | 1.359 |
Acetonitrile | | 35.688 | 1.344 |
AcetoPhenone | | 17.44 | |
Aniline | | 6.8882 | |
Anisole | | 4.2247 | |
Argon | | 1.43 | |
Benzaldehyde | | 18.22 | |
Benzene | | 2.2706 | 1.501 |
BenzoNitrile | | 25.592 | |
BenzylAlcohol | | 12.457 | |
BromoBenzene | | 5.3954 | |
BromoEthane | | 9.01 | |
Bromoform | | 4.2488 | |
Butanal | | 13.45 | |
ButanoicAcid | | 2.9931 | |
Butanone | | 18.246 | |
ButanoNitrile | | 24.291 | |
ButylAmine | | 4.6178 | |
ButylEthanoate | | 4.9941 | |
CarbonDiSulfide | | 2.6105 | |
CarbonTetraChloride | | 2.228 | 1.466 |
ChloroBenzene | | 5.6968 | |
Chloroform | | 4.7113 | 1.45 |
Cis-1,2-DiMethylCycloHexane | | 2.06 | |
Cis-Decalin | | 2.2139 | |
CycloHexane | | 2.0165 | 1.425 |
CycloHexanone | | 15.619 | |
CycloPentane | | 1.9608 | |
CycloPentanol | | 16.989 | |
CycloPentanone | | 13.58 | |
Decalin-mixture | | 2.196 | |
DiBromomEthane | | 7.2273 | |
DiButylEther | | 3.0473 | |
DiChloroEthane | | 10.125 | |
DiChloroMethane | DCM | 8.93 | 1.424 |
DiEthylAmine | | 3.5766 | |
DiethylEther | | 4.24 | |
DiEthylSulfide | | 5.723 | |
DiIodoMethane | | 5.32 | |
DiIsoPropylEther | | 3.38 | |
DiMethylDiSulfide | | 9.6 | |
DiMethylSulfoxide | DMSO | 46.826 | 1.479 |
DiPhenylEther | | 3.73 | |
DiPropylAmine | | 2.9112 | |
e-1,2-DiChloroEthene | | 2.14 | |
e-2-Pentene | | 2.051 | |
EthaneThiol | | 6.667 | |
Ethanol | | 24.852 | 1.361 |
EthylBenzene | | 2.4339 | |
EthylEthanoate | | 5.9867 | |
EthylMethanoate | | 8.331 | |
EthylPhenylEther | | 4.1797 | |
FluoroBenzene | | 5.42 | |
Formamide | | 108.94 | |
FormicAcid | | 51.1 | |
Heptane | | 1.9113 | |
HexanoicAcid | | 2.6 | |
IodoBenzene | | 4.547 | |
IodoEthane | | 7.6177 | |
IodoMethane | | 6.865 | |
IsoPropylBenzene | | 2.3712 | |
IsoQuinoline | | 11 | |
Krypton | | 1.519 | |
m-Cresol | | 12.44 | |
m-Xylene | | 2.3478 | |
Mesitylene | | 2.265 | |
Methanol | | 32.613 | 1.329 |
MethylBenzoate | | 6.7367 | |
MethylButanoate | | 5.5607 | |
MethylCycloHexane | | 2.024 | |
MethylEthanoate | | 6.8615 | |
MethylMethanoate | | 8.8377 | |
MethylPropanoate | | 6.0777 | |
n-ButylBenzene | ButylBenzene | 2.36 | |
n-Decane | Decane | 1.9846 | |
n-Dodecane | Dodecane | 2.006 | |
n-Hexadecane | Hexadecane | 2.0402 | |
n-Hexane | Hexane | 1.8819 | 1.375 |
n-MethylAniline | MethylAniline | 5.96 | |
n-MethylFormamide-mixture | MethylFormamide-mixture | 181.56 | |
n-Nonane | Nonane | 1.9605 | |
n-Octane | Octane | 1.9406 | |
n-Octanol | Octanol | 9.8629 | 1.421 |
n-Pentadecane | Pentadecane | 2.0333 | |
n-Pentane | Pentane | 1.8371 | |
n-Undecane | Undecane | 1.991 | |
n,n-DiMethylAcetamide | DMA | 37.781 | |
n,n-DiMethylFormamide | DMF | 37.219 | 1.43 |
NitroBenzene | | 34.809 | |
NitroEthane | | 28.29 | |
NitroMethane | | 36.562 | |
o-ChloroToluene | | 4.6331 | |
o-Cresol | | 6.76 | |
o-DiChloroBenzene | | 9.9949 | |
o-NitroToluene | | 25.669 | |
o-Xylene | | 2.5454 | |
p-IsoPropylToluene | | 2.2322 | |
p-Xylene | | 2.2705 | |
Pentanal | | 10 | |
PentanoicAcid | | 2.6924 | |
PentylAmine | | 4.201 | |
PentylEthanoate | | 4.7297 | |
PerFluoroBenzene | | 2.029 | |
Propanal | | 18.5 | |
PropanoicAcid | | 3.44 | |
PropanoNitrile | | 29.324 | |
PropylAmine | | 4.9912 | |
PropylEthanoate | | 5.5205 | |
Pyridine | | 12.978 | 1.51 |
Quinoline | | 9.16 | |
sec-ButylBenzene | | 2.3446 | |
tert-ButylBenzene | | 2.3447 | |
TetraChloroEthene | | 2.268 | |
TetraHydroFuran | THF | 7.4257 | 1.407 |
TetraHydroThiophene-s,s-dioxide | | 43.962 | |
Tetralin | | 2.771 | |
Thiophene | | 2.727 | |
Thiophenol | | 4.2728 | |
Toluene | | 2.3741 | 1.497 |
trans-Decalin | | 2.1781 | |
TriButylPhosphate | | 8.1781 | |
TriChloroEthene | | 3.422 | |
TriEthylAmine | | 2.3832 | |
Water | | 78.3553 | 1.33 |
Xenon | | 1.706 | |
Xylene-mixture | | 2.3879 | |
z-1,2-DiChloroEthene | | 9.2 | |
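As an illustration of how a solvent given by name, alias, or dielectric constant could be resolved to a single value, here is a minimal sketch using a few rows from the table. The names `SOLVENTS` and `resolve_epsilon` are hypothetical, and case-insensitive matching is an assumption based on the mixed capitalisation used in the examples. Digichem's own lookup may work differently, and the underlying program may use more solvent parameters than the dielectric constant alone.

```python
# A small excerpt of the solvent table: canonical name -> (aliases, epsilon).
SOLVENTS = {
    "Water": ((), 78.3553),
    "DiChloroMethane": (("DCM",), 8.93),
    "TetraHydroFuran": (("THF",), 7.4257),
    "1,4-Dioxane": (("Dioxane",), 2.2099),
}

# Index every name and alias in lower case (case-insensitivity is an assumption).
_LOOKUP = {
    key.lower(): epsilon
    for name, (aliases, epsilon) in SOLVENTS.items()
    for key in (name, *aliases)
}


def resolve_epsilon(solvent) -> float:
    """Return the dielectric constant for a solvent keyword, alias, or number."""
    try:
        # A numeric value is taken to be the dielectric constant itself.
        return float(solvent)
    except (TypeError, ValueError):
        pass
    try:
        return _LOOKUP[str(solvent).lower()]
    except KeyError:
        raise ValueError(f"Unknown solvent: {solvent!r}") from None


print(resolve_epsilon("water"), resolve_epsilon("DCM"), resolve_epsilon(78.3553))
```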
The solvent model can be selected with the solution: model:
option:
calculation:
  solution:
    calc: True
    model: PCM
    solvent: water
The list of supported solvent models depends on the underlying computational engine:
Gaussian supports several variants of the polarizable continuum model (PCM), as well as the solvation model based on density (SMD), which are selected with the following keywords:
calculation:
  solution:
    calc: True
    solvent: water
    # Pick one:
    model: PCM     # The default, uses an IEFPCM method.
    model: CPCM
    model: IPCM
    model: SCIPCM
    model: SMD
Refer to the Gaussian manual for more information.
Orca supports two variants of the polarizable continuum model (PCM), as well as the solvation model based on density (SMD). These are selected with the following keywords:
calculation:
  solution:
    calc: True
    solvent: water
    # Pick one:
    model: PCM        # The default.
    model: PCM-COSMO
    model: SMD
Note
The PCM-COSMO
method is a CPCM method using a COSMO epsilon function. Versions prior to Orca 5
supported a true Conductor-like Screening Model (COSMO), but this was removed in version 5
and is not supported by Digichem.
Turbomole only supports the Conductor-like Screening Model (COSMO):
calculation:
  solution:
    calc: True
    solvent: water
    # Only one choice:
    model: COSMO  # The default.
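The per-engine choices listed above can be summarised in a small lookup. The following is a sketch only: `SUPPORTED_MODELS` and `pick_model` are hypothetical names, the model lists are copied from this section, and Digichem's own validation may behave differently.

```python
# Solvent models per engine, as listed in this section.
# The first entry for each engine is the documented default.
SUPPORTED_MODELS = {
    "Gaussian": ["PCM", "CPCM", "IPCM", "SCIPCM", "SMD"],
    "Orca": ["PCM", "PCM-COSMO", "SMD"],
    "Turbomole": ["COSMO"],
}


def pick_model(engine: str, model=None) -> str:
    """Return the model to use for an engine, falling back to its default."""
    models = SUPPORTED_MODELS[engine]
    if model is None:
        return models[0]
    if model not in models:
        raise ValueError(f"{engine} does not support the {model} model; choose from {models}")
    return model


print(pick_model("Orca"))             # Falls back to the Orca default.
print(pick_model("Gaussian", "SMD"))  # An explicit, supported choice.
```

A lookup like this makes it easy to see which parts of a calculation definition need changing when the same definition is re-used with a different class_name.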
Excited states
When calculating excited states, the solvent model can either be applied in an equilibrium or non-equilibrium fashion. In non-equilibrium solvation, the solvent environment does not react to the change in electron density from the excitation, while in equilibrium solvation it does.
Equilibrium solvation can be requested with the solution: solvent_equilibrium:
option:
calculation:
  solution:
    calc: True
    solvent: water
    solvent_equilibrium: True
Equilibrium solvation is normally preferred when modelling ‘slow’ effects, such as excited state geometries.
Note
Equilibrium solvation is currently only supported for Gaussian. If supporting this feature in another calculation engine would be useful for you, please consider letting us know.
Symmetry
The use of symmetry in the calculation is controlled by the symmetry: calc:
option:
calculation:
  symmetry:
    # Pick one:
    calc: True   # The default for Gaussian and Turbomole.
    calc: False  # Don't use symmetry.
Other options to the symmetry
block are program specific:
In Gaussian, the symmetry: rotate:
option controls whether to rotate the molecular geometry according to the
determined symmetry:
calculation:
  symmetry:
    calc: True
    # Pick one:
    rotate: True   # The default.
    rotate: False  # Don't rotate.
The sensitivity of the symmetry algorithm can be controlled with the symmetry: threshold:
option:
calculation:
  symmetry:
    calc: True
    # Pick one:
    threshold: Tight  # The default.
    threshold: False  # Be more generous when determining what is symmetric.
In Orca, symmetry detection is disabled by default. If symmetry detection is used, the sensitivity
of the algorithm can be controlled via the threshold:
option:
calculation:
  symmetry:
    calc: True
    threshold: 0.05
Refer to the Orca manual for the exact meaning of this threshold value.
In Turbomole, symmetry detection cannot be disabled. There are no additional options.
Electron occupations
By default, the charge and multiplicity of the molecule are determined from the input coordinate file (i.e. they are not
set in the method at all). However, it is sometimes useful to force a specific charge/multiplicity in the calculation,
such as when calculating ΔSCF excited states or ionisation potentials/electron affinities. This can be achieved with
the electron: multiplicity:
and electron: charge:
options:
calculation:
  electron:
    # Pick one:
    #
    # Force a triplet ground state calculation.
    multiplicity: 3
    charge: 0
    #
    # Force a radical cation calculation.
    multiplicity: 2
    charge: 1
    #
    # Force a radical anion calculation.
    multiplicity: 2
    charge: -1
    #
    # etc.
By default, open-shell molecules will use an unrestricted wavefunction/functional, while closed-shell molecules will use a restricted wavefunction/functional.
An unrestricted calculation can be explicitly requested with the electron: unrestricted:
option:
calculation:
  electron:
    unrestricted: True
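When forcing a charge and multiplicity, the two must be consistent with the number of electrons: a multiplicity of M implies M − 1 unpaired electrons, and the remaining electrons must pair up. The sketch below checks this and mirrors the default restricted/unrestricted choice described above. Both function names are hypothetical, and treating every molecule with a multiplicity other than 1 as open-shell is a simplifying assumption.

```python
def is_valid_multiplicity(total_protons: int, charge: int, multiplicity: int) -> bool:
    """Check that a charge/multiplicity pair is possible for a given molecule."""
    electrons = total_protons - charge
    unpaired = multiplicity - 1
    # The electrons that are not unpaired must form pairs.
    return multiplicity >= 1 and unpaired <= electrons and (electrons - unpaired) % 2 == 0


def default_unrestricted(multiplicity: int) -> bool:
    """Assumed default: unrestricted for open-shell (non-singlet) molecules."""
    return multiplicity != 1


benzene_protons = 6 * 6 + 6 * 1  # C6H6 has 42 protons.
print(is_valid_multiplicity(benzene_protons, charge=0, multiplicity=3))  # Triplet.
print(is_valid_multiplicity(benzene_protons, charge=1, multiplicity=2))  # Radical cation.
print(is_valid_multiplicity(benzene_protons, charge=0, multiplicity=2))  # Not possible.
```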
Scratch files
Scratch filespace is used for IO (input/output) heavy operations during the calculation. This normally involves the reading and writing of large, temporary files that contain intermediate results of the calculation. The scratch filespace should be large (at least several hundred GB per calculation) and fast, so it should not be mounted over the network (for example via NFS) if at all possible.
Options to control the use of scratch files are set in the scratch:
block. The two most important sub-options are
use_scratch:
, which enables/disables the use of scratch files entirely, and path:
, which specifies the directory
to store the scratch files:
Warning
Disabling the use of scratch is not recommended. Most calculation engines will resort to a default location which is difficult to control, and may result in a significant drop in performance. Gaussian in particular acts unpredictably when a scratch directory is not specified, and is highly likely to crash.
calculation:
  scratch:
    use_scratch: True  # The default. Do not turn this off unless you know what you are doing.
    path: /tmp         # The top-level location under which scratch files will be stored.
The path:
option specifies the top-level scratch directory. Each calculation will create and use a separate sub-directory under
this top-level directory, with a name that is guaranteed to be unique.
On shared computing clusters, it is common to group each user’s calculations together in the scratch directory. This can be
easily achieved by including the $USER
environment variable in the path name. Likewise, some computing clusters use
a dynamic scratch location which is stored in the $SCRATCH
variable. These (or any other) environment variables
can be set in the path:
option as needed:
calculation:
  scratch:
    use_scratch: True  # The default. Do not turn this off unless you know what you are doing.
    #
    # Examples:
    #
    # Use the /tmp/ directory, with a sub-directory for each user.
    path: /tmp/$USER
    #
    # Use a dynamic scratch location (set externally in $SCRATCH), with a sub-directory for each user.
    path: $SCRATCH/$USER
    #
    # Use a custom location.
    path: $MY_SUPER_SECRET_LOCATION
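The behaviour described above (expanding environment variables in the path, then creating a uniquely named sub-directory for each calculation) can be approximated with the Python standard library. This is a sketch under stated assumptions rather than Digichem's actual implementation. `make_scratch_dir` is a hypothetical helper, and the real sub-directory naming scheme is not documented here.

```python
import os
import tempfile


def make_scratch_dir(path_option: str) -> str:
    """Expand environment variables in the path and create a unique sub-directory."""
    # os.path.expandvars replaces $NAME and ${NAME}; unset variables are left unchanged.
    top_level = os.path.expandvars(path_option)
    os.makedirs(top_level, exist_ok=True)
    # mkdtemp creates a new directory with a unique name under top_level.
    return tempfile.mkdtemp(dir=top_level)


os.environ.setdefault("USER", "example_user")
os.environ.setdefault("SCRATCH", tempfile.gettempdir())
print(make_scratch_dir("$SCRATCH/$USER"))
```

Because unset variables are left in place by `os.path.expandvars`, a typo in a variable name would create a literal directory such as `$SCRATHC`, so it is worth checking that the variable exists in the environment the calculation actually runs in.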
Normally, only a subset (defined by each calculation program) of the files written during the calculation are stored in the
scratch directory. Alternatively, the all_output:
sub-option can be specified so ALL files are written directly to the scratch directory.
The only exception is the main calculation output file (the .log file) which is always written to the main calculation output folder:
calculation:
  scratch:
    all_output: True  # Write everything to scratch, except the .log file.
The sub-options keep:
, rescue:
, and force_delete:
control what happens to the scratch files at the end of the calculation.
Most scratch files are temporary files and can be safely deleted at the end of the calculation, and many (although not all, and not all of the time)
calculation engines will delete them automatically. However, if the calculation stops suddenly, the scratch files may be useful in restarting
the calculation, in which case it may be beneficial to keep them.
keep:
and rescue:
specify whether to copy any leftover scratch files back to the main calculation directory once the calculation has stopped.
keep:
controls this behaviour only if the calculation finishes successfully, while rescue:
only takes effect if the calculation stops unexpectedly.
The default behaviour is like this:
calculation:
  scratch:
    keep: False   # Delete scratch files if the calculation finishes successfully.
    rescue: True  # But copy the scratch files back to the starting calculation directory if something goes wrong.
Note
The keep:
and rescue:
options only apply to the temporary scratch files written by the calculation.
The files written by all_output: True
are always copied back to the main directory.
Additionally, force_delete:
specifies what Digichem should do if something goes badly wrong and the scratch files cannot be copied back
to the main calculation directory, even though a copy was requested. This situation is rare, but can happen (for instance) if the filesystem of the main
calculation directory runs out of space.
To avoid losing data, the default option is force_delete: False
. However, this can lead to a domino effect if the scratch directory and
main calculation directory are mounted on the same filesystem, as the large scratch files consume valuable file-space that could be freed
to prevent other calculations from crashing. This can be avoided by setting force_delete: True
:
calculation:
  scratch:
    keep: False         # Delete scratch if everything is fine.
    rescue: True        # Save the scratch if we stop suddenly so we can restart later.
    force_delete: True  # Unless we're out of file-space, in which case delete anyway.
Warning
If all_output:
is True
, force_delete: True
WILL also delete these other output files if they cannot be copied back successfully.
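The interplay between keep:, rescue:, and force_delete: can be summarised as a small decision function. This is a sketch of the logic as described in this section, not Digichem's code. The function name and the returned labels ("copy", "delete", "leave") are hypothetical.

```python
def scratch_outcome(finished_ok: bool, copy_ok: bool = True, keep: bool = False,
                    rescue: bool = True, force_delete: bool = False) -> str:
    """Decide what happens to leftover scratch files when a calculation stops."""
    # keep applies to successful calculations, rescue to unexpected stops.
    wants_copy = keep if finished_ok else rescue
    if not wants_copy:
        return "delete"
    if copy_ok:
        return "copy"  # Copied back to the main calculation directory.
    # The copy was requested but failed (e.g. the destination is out of space).
    return "delete" if force_delete else "leave"


print(scratch_outcome(finished_ok=True))                    # Defaults, success.
print(scratch_outcome(finished_ok=False))                   # Defaults, crash.
print(scratch_outcome(finished_ok=False, copy_ok=False))    # Crash, copy failed.
```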
When transferring files between the scratch and main calculation directories, the compress:
option can be used to first compress (archive)
the files before copying. If one (or both) of the scratch directory or main directory are mounted on a networked filesystem, this option can
be useful to save bandwidth.
calculation:
  scratch:
    keep: True
    rescue: True    # Always copy scratch files.
    compress: True  # And compress when doing so.
Folder structure
Each calculation submitted with Digichem is run in a separate directory. The name of this directory is based on the chosen calculation options
(the base name is taken from the meta: name:
option), and can be modified in the structure
block. These options are purely cosmetic.
By default, the name of the chosen calculation program is added at the start of the folder name.
This can be disabled by setting structure: prepend_program_name:
to False
:
calculation:
  structure:
    prepend_program_name: False  # Do not include the program name.
Alternatively, the program name can be added at the end of the folder name:
calculation:
  structure:
    prepend_program_name: False
    append_program_name: True  # Include the program name at the end of the folder.
Or, a separate sub-directory can be created for each calculation:
calculation:
  structure:
    prepend_program_name: False
    program_sub_folder: True
This results in a folder structure like this:
Benzene/Orca/Optimisation
Although uncommon, these options can be combined as desired:
calculation:
  structure:
    append_program_name: True
    prepend_program_name: True  # Include the program name at the start and end of the folder name.
    program_sub_folder: True    # and use a separate sub-directory.
Resulting in:
Benzene/Orca/Orca Optimisation (Orca)
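The way these three options combine can be sketched as follows. The function `folder_path` is hypothetical. The formats used for the prepended name ("Orca Optimisation") and the appended name ("Optimisation (Orca)") are inferred from the combined example above and may not match Digichem exactly in every case.

```python
def folder_path(molecule: str, program: str, name: str,
                prepend_program_name: bool = True,
                append_program_name: bool = False,
                program_sub_folder: bool = False) -> str:
    """Build the calculation folder path from the structure options."""
    folder = name
    if prepend_program_name:
        folder = f"{program} {folder}"
    if append_program_name:
        folder = f"{folder} ({program})"
    parts = [molecule]
    if program_sub_folder:
        parts.append(program)  # One extra directory level per program.
    parts.append(folder)
    return "/".join(parts)


print(folder_path("Benzene", "Orca", "Optimisation",
                  prepend_program_name=False, program_sub_folder=True))
print(folder_path("Benzene", "Orca", "Optimisation",
                  prepend_program_name=True, append_program_name=True,
                  program_sub_folder=True))
```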
Two options can be set to further modify the folder name. The safe_name:
option replaces non-alphanumeric characters
with underscores, while short_name:
uses a shorter overall folder name. One or both of these options can be enabled
to make browsing the folders easier:
calculation:
  structure:
    safe_name: True   # Remove 'strange' characters.
    short_name: True  # Use a short name.
Note
Enabling safe_name:
and/or short_name:
may help to avoid bugs in some versions of MobaXterm.
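The effect of safe_name: can be pictured with a one-line regular expression. This is only a sketch of the described behaviour (every non-alphanumeric character becomes an underscore). The function name is hypothetical, and Digichem may treat some characters, such as hyphens or existing underscores, differently.

```python
import re


def safe_name(folder_name: str) -> str:
    """Replace every character that is not an ASCII letter or digit with an underscore."""
    return re.sub(r"[^A-Za-z0-9]", "_", folder_name)


print(safe_name("Orca Optimisation (Orca)"))
```

The short_name: option is not sketched here, because the shortening scheme is not described in this section.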
Post processing
Options to control the post processing of the completed calculation results are set in the post_process:
block.
By default, all the options in this block are set to True
, and you do not normally need to change them.
The text files normally found in the ‘Results’ folder can be disabled by setting write_summary:
to False
.
Likewise, the main Digichem output file (in .sir format) can be disabled by setting write_result:
to False
:
calculation:
  post_process:
    write_summary: False  # Don't write .txt or .csv files.
    write_result: False   # Don't write the .sir file.
The PDF report file (and associated images) can be disabled by setting write_report:
to False
:
calculation:
  post_process:
    write_report: False  # Don't write the PDF report or any images.
This option may be useful to speed-up post-processing, as the image rendering included as part of the report generation process can be slow.
Finally, the storing of the completed calculation results to any configured databases can be disabled by setting
store_in_db:
to False
:
calculation:
  post_process:
    store_in_db: False  # Don't save in any databases.
This option can be useful when running test calculations, to avoid polluting the database with unnecessary results.