Skip to content
Open
55 changes: 36 additions & 19 deletions ipython/local_uncertainty.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -114,11 +114,29 @@
"\n",
"$$\\Delta G = \\frac{1}{\\sqrt{12}}(G_{max} - G_{min})$$\n",
"\n",
"Several parameters are used to formulate $\\Delta G$. These are $\\Delta G_\\mathrm{library}$, $\\Delta G_\\mathrm{QM}$, $\\Delta G_\\mathrm{GAV}$, and $\\Delta _\\mathrm{group}$.\n",
"But first, to formulate $G$, we need to compile all the uncertainty sources. These are $ G_\\mathrm{library}$, $ G_\\mathrm{QM}$, $ G_\\mathrm{GAV}$, and $G _\\mathrm{group}$. We treat each as continuous random variable and add them to get the species uncertainty.\n",
" \n",
"$$\\Delta G = \\delta_\\mathrm{library} \\Delta G_\\mathrm{library} + \\delta_\\mathrm{QM} \\Delta G_\\mathrm{QM} + \\delta_\\mathrm{GAV} \\left( \\Delta G_\\mathrm{GAV} + \\sum_{\\mathrm{group}\\; j} d_{j} \\Delta G_{\\mathrm{group},j} \\right)$$\n",
"$$G = \\delta_\\mathrm{library} G_\\mathrm{library} + \\delta_\\mathrm{QM} G_\\mathrm{QM} + \\delta_\\mathrm{GAV} \\left( G_\\mathrm{GAV} + \\sum_{\\mathrm{group}\\; j} d_{j} G_{\\mathrm{group},j} \\right)$$\n",
"\n",
"Here $\\delta$ is the Kronecker delta function which equals one if the species thermochemistry parameter contains the particular source type and $d_{j}$ is the degeneracy (number of appearances) of the thermo group used to construct the species thermochemistry in the group additivity method.\n",
"\n",
"To compute $\\Delta G$, we'll use the property that the variance of the sum of two random variables, $X$ and $Y$, is:\n",
"$$Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)$$\n",
"\n",
"If we assume that each uncertainty source is uncorrelated with the others, then the covariance term drops out:\n",
"\n",
"$$Var(X+Y)=Var(X)+Var(Y), \\ \\ \\ \\ \\ X \\neq Y$$\n",
"\n",
"And we can compute the variance of $G$:\n",
"\n",
"$$Var(G) = \\delta_\\mathrm{library} Var(G_\\mathrm{library}) + \\delta_\\mathrm{QM} Var(G_\\mathrm{QM})+\\delta_\\mathrm{GAV} \\left(Var(G_\\mathrm{GAV}) +\\sum_{group\\ j} d_j^2 Var(G_\\mathrm{group\\ j}) \\right)$$\n",
"\n",
"\n",
"The standard deviation $\\Delta G$ is then the square root of that (this is an example of adding in quadrature):\n",
"$$\\Delta(G) = \\sqrt{\\delta_\\mathrm{library} Var(G_\\mathrm{library}) + \\delta_\\mathrm{QM} Var(G_\\mathrm{QM})+\\delta_\\mathrm{GAV} \\left(Var(G_\\mathrm{GAV}) +\\sum_{group\\ j} d_j^2 Var(G_\\mathrm{group\\ j}) \\right)}$$\n",
"\n",
"\n",
"\n",
"where $\\delta$ is the Kronecker delta function which equals one if the species thermochemistry parameter contains the particular source type and $d_{j}$ is the degeneracy (number of appearances) of the thermo group used to construct the species thermochemistry in the group additivity method.\n",
"\n",
"### Kinetics Uncertainty\n",
"\n",
Expand All @@ -130,13 +148,20 @@
"\n",
"$$\\Delta \\ln(k) = \\frac{1}{\\sqrt{12}}(\\ln k_{max} - \\ln k_{min})$$\n",
"\n",
"The parameters used to formulate $\\Delta \\ln k$ are $\\Delta \\ln k_\\mathrm{library}$, $\\Delta \\ln k_\\mathrm{training}$, $\\Delta \\ln k_\\mathrm{pdep}$, $\\Delta \\ln k_\\mathrm{family}$, $\\Delta \\ln k_\\mathrm{non-exact}$, and $\\Delta \\ln k_\\mathrm{rule}$.\n",
"The sources used to formulate $ \\ln k$ are $ \\ln k_\\mathrm{library}$, $ \\ln k_\\mathrm{training}$, $ \\ln k_\\mathrm{pdep}$, $ \\ln k_\\mathrm{family}$, $ \\ln k_\\mathrm{non-exact}$, and $ \\ln k_\\mathrm{rule}$.\n",
"\n",
"For library, training, and pdep reactions, the kinetic uncertainty is assigned according to their uncertainty type. For kinetics estimated using RMG's rate rules, the following formula is used to calculate the uncertainty:\n",
"\n",
"For library, training, and pdep reactions, the kinetic uncertainty is assigned according to their uncertainty type. For kinetics estimated using RMG's rate rules, the following formula is used to calculate the uncertainty:\n",
"$$ \\ln k_\\mathrm{rate\\; rules} = \\ln k_\\mathrm{family} + \\log_{10}(N+1) \\left(\\ln k_\\mathrm{non-exact}\\right) + \\sum_{\\mathrm{rule}\\; i} w_i \\ln k_{\\mathrm{rule},i}$$\n",
"\n",
"$$\\Delta \\ln k_\\mathrm{rate\\; rules} = \\Delta\\ln k_\\mathrm{family} + \\log_{10}(N+1) \\left(\\Delta\\ln k_\\mathrm{non-exact}\\right) + \\sum_{\\mathrm{rule}\\; i} w_i \\Delta \\ln k_{\\mathrm{rule},i}$$\n",
"where N is the total number of rate rules used and $w_{i}$ is the weight of the rate rule in the averaging scheme for that kinetics estimate.\n",
"\n",
"where N is the total number of rate rules used and $w_{i}$ is the weight of the rate rule in the averaging scheme for that kinetics estimate. "
"Once again we use the property of adding variances with the assumption that sources are uncorrelated to each other. The standard deviation then adds in quadrature.\n",
"\n",
"$$\\Delta\\ln k_\\mathrm{rate\\; rules} = \\sqrt{(\\Delta \\ln k_\\mathrm{family})^2 + \\left(\\log_{10}(N+1) \\right)^2\\left(\\Delta \\ln k_\\mathrm{non-exact}\\right)^2 + \\sum_{\\mathrm{rule}\\; i} w_i^2 (\\Delta \\ln k_{\\mathrm{rule},i})^2}$$\n",
"\n",
"\n",
"\n"
]
},
{
Expand Down Expand Up @@ -178,9 +203,7 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"metadata": {},
"outputs": [],
"source": [
"result = uncertainty.local_analysis(sensitive_species, correlated=False, number=5, fileformat='.png')\n",
Expand All @@ -190,9 +213,7 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"metadata": {},
"outputs": [],
"source": [
"# Show the uncertainty plots\n",
Expand Down Expand Up @@ -230,9 +251,7 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"metadata": {},
"outputs": [],
"source": [
"uncertainty.assign_parameter_uncertainties(correlated=True)\n",
Expand All @@ -243,9 +262,7 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"metadata": {},
"outputs": [],
"source": [
"# Show the uncertainty plots\n",
Expand Down
9 changes: 6 additions & 3 deletions rmgpy/data/kinetics/family.py
Original file line number Diff line number Diff line change
Expand Up @@ -4440,8 +4440,9 @@ def extract_source_from_comments(self, reaction):
[Family_Label, {'template': originalTemplate,
'degeneracy': degeneracy,
'exact': boolean_exact?,
'rules': a list of (original rate rule entry, weight in average)
'training': a list of (original rate rule entry associated with training entry, original training entry, weight in average)}]
'rules': a list of (original rate rule entry, weight in average),
'training': a list of (original rate rule entry associated with training entry, original training entry, weight in average),
'autogenerated': boolean for whether kinetics come from autogenerated subgraph isomorphic decision tree}]


where Exact is a boolean of whether the rate is an exact match, Template is
Expand All @@ -4455,6 +4456,7 @@ def extract_source_from_comments(self, reaction):
rules = None
training_entries = None
degeneracy = 1
autogenerated = False

training_reaction_pattern = r'Matched reaction\s*(\d+).*in.*training'
degeneracy_pattern = r'Multiplied by reaction path degeneracy\s*(\d+)'
Expand Down Expand Up @@ -4494,6 +4496,7 @@ def extract_source_from_comments(self, reaction):
autogen_node_matches = re.search(autogen_node_search_pattern, full_comment_string)
template_matches = re.search(template_pattern, full_comment_string)
if autogen_node_matches is not None: # autogenerated trees
autogenerated = True
template_str = autogen_node_matches.group(1).split('Multiplied by reaction path degeneracy')[0].strip()
template_str = template_str.split('in family')[0].strip()
tokens = template_str.split()
Expand All @@ -4510,7 +4513,7 @@ def extract_source_from_comments(self, reaction):
raise ValueError(f'Could not find rate rule in comments for reaction {reaction}.')
rules, training_entries = self.get_sources_for_template(template)
source_dict = {'template': template, 'degeneracy': degeneracy, 'exact': exact_rule,
'rules': rules, 'training': training_entries}
'rules': rules, 'training': training_entries, 'autogenerated': autogenerated}

# Source of the kinetics is from rate rules
return False, [self.label, source_dict]
Expand Down
38 changes: 36 additions & 2 deletions rmgpy/data/thermo.py
Original file line number Diff line number Diff line change
Expand Up @@ -2724,7 +2724,8 @@ def extract_source_from_comments(self, species):

source = {'Library': String_Name_of_Library_Used,
'QM': String_of_Method_Used,
'GAV': Dictionary_of_Groups_Used
'GAV': Dictionary_of_Groups_Used,
'ADS': Dictionary_of_Adsorption_Group_Used,
}

The Dictionary_of_Groups_Used looks like
Expand All @@ -2743,9 +2744,42 @@ def extract_source_from_comments(self, species):
# Store the level of the calculation, which is the 2nd token in the comments
source['QM'] = tokens[1]

elif comment.startswith('Gas phase thermo'):
# Handle adsorption correction thermo data of the following format:
# Library example
# Gas phase thermo for C(T) from Thermo library: primaryThermoLibrary.
# Adsorption correction: + Thermo group additivity estimation: adsorptionPt111(Cq*)

# GAV example
# Gas phase thermo for [CH]CC from Thermo group additivity estimation: group(Cs-CsCsHH) + group(Cs-CsHHH) + group(Cs-CsHHH) + radical(CCJ2_triplet).
# Adsorption correction: + Thermo group additivity estimation: adsorptionPt111(C=*RCR3)"

comment = comment.replace(r'\n', ' ')
comment = comment.replace('\n', ' ')
assert 'Adsorption correction:' in comment, f'adsorption correction in unrecognized format {comment}'

# Handle the gas-phase portion first
gas_comment = comment.split('Adsorption correction: + ')[0].strip()
gas_comment = gas_comment.replace('.', '', -1) # delete the . at the end if it exists
gas_comment = gas_comment[gas_comment.find('from ', len('Gas phase thermo for ')) + len('from '):]
dummy_gas_phase_species = Species()
dummy_gas_phase_species.thermo = NASA()
dummy_gas_phase_species.thermo.comment = gas_comment
source = self.extract_source_from_comments(dummy_gas_phase_species)

# This is an adsorption correction
# comment is split into two parts: the gas phase, and the surface adsorption corection
ads_correction_comment = comment.split('Adsorption correction: +')[-1].strip()
dummy_adsorption_correction_species = Species()
dummy_adsorption_correction_species.thermo = NASA()
dummy_adsorption_correction_species.thermo.comment = ads_correction_comment
source['ADS'] = self.extract_source_from_comments(dummy_adsorption_correction_species)['GAV']

return source

# Check for group additivity contributions to the thermo in this species

# The contribution of the groups can be either additive or substracting
# The contribution of the groups can be either additive or subtracting
# after changes to the polycyclic algorithm

comment = comment.replace(' + ', ' +')
Expand Down
7 changes: 5 additions & 2 deletions rmgpy/tools/loader.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ def load_rmg_job(
use_java=False,
use_chemkin_names=False,
check_duplicates=True,
surface_path=None,
):
# The argument is an RMG-Py input file
rmg = load_rmg_py_job(
Expand All @@ -59,13 +60,14 @@ def load_rmg_job(
generate_images,
use_chemkin_names=use_chemkin_names,
check_duplicates=check_duplicates,
surface_path=surface_path,
)

return rmg


def load_rmg_py_job(input_file, chemkin_file=None, species_dict=None, generate_images=True,
use_chemkin_names=False, check_duplicates=True):
use_chemkin_names=False, check_duplicates=True, surface_path=None):
"""
Load the results of an RMG-Py job generated from the given `input_file`.
"""
Expand All @@ -83,7 +85,8 @@ def load_rmg_py_job(input_file, chemkin_file=None, species_dict=None, generate_i
species_dict = os.path.join(os.path.dirname(input_file), 'chemkin', 'species_dictionary.txt')
species_list, reaction_list = load_chemkin_file(chemkin_file, species_dict,
use_chemkin_names=use_chemkin_names,
check_duplicates=check_duplicates)
check_duplicates=check_duplicates,
surface_path=surface_path)

# Created "observed" versions of all reactive species that are not explicitly
# identified as "constant" species
Expand Down
Loading
Loading