Exporting to MUSICA configuration #337

Open
K20shores opened this issue Jul 16, 2024 · 4 comments

@K20shores

Hi @stulacy

I'm a software developer in NCAR's ACOM lab, working on the MUSICA project. We are very interested in running the MCM in our box model, music box.

We read JSON files that define the mechanism used to run our box model. Rather than providing equations compiled into code, our format defines the data needed to run the mechanism. For instance, an Arrhenius reaction would be defined like this:

{
    "type": "ARRHENIUS",
    "A": 8.018e-17,
    "reactants": {
        "O": {},
        "O2": {}
    },
    "products": {
        "O3": {}
    },
    "MUSICA name": "R2"
}
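As a rough illustration of what "data rather than equations" means in practice, here is a minimal Python sketch that builds an entry of the same shape from plain values. The helper name `arrhenius_entry` is hypothetical, and only the keys shown in the example above are used; any other parameters the format may support are left out.

```python
import json


def arrhenius_entry(name, A, reactants, products):
    """Build one reaction entry shaped like the example above (hypothetical helper)."""
    return {
        "type": "ARRHENIUS",
        "A": A,
        # Each species maps to an empty object, as in the example entry.
        "reactants": {species: {} for species in reactants},
        "products": {species: {} for species in products},
        "MUSICA name": name,
    }


entry = arrhenius_entry("R2", 8.018e-17, ["O", "O2"], ["O3"])
print(json.dumps(entry, indent=4))
```

The point is that every number the solver needs is a separate field, so nothing has to be parsed out of an expression string.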

I see that you have a database file which seems to define all of your reactions. It seems that the rates are defined as equations (e.g., 2.20D-13*KMT06*EXP(600/TEMP)) rather than as data.

Is there possibly a separate database which contains only the data, without the equations written out? If not, would you happen to have any documentation which explains your database schema? We are interested in creating a valid MUSICA mechanism that would allow the MCM to be run in music box, and having the reaction data or understanding your schema would greatly help us do so.

@stulacy
Contributor

stulacy commented Jul 19, 2024

Hi @K20shores, this looks like a really interesting project and we'd be happy to help integrate the MCM.

We unfortunately don't currently have any publicly accessible documentation regarding the database schema, but if you don't mind getting your hands dirty a little you can run .schema in the SQLite REPL to view the full schema, or .schema <table> for a specific table.
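If you would rather inspect the schema from a script than from the REPL, the same information is available through SQLite's built-in `sqlite_master` table. The sketch below uses Python's standard `sqlite3` module against an in-memory stand-in database. The `CREATE TABLE` statements are illustrative assumptions only, not the real MCM schema, and `table_schemas` is a hypothetical helper.

```python
import sqlite3


def table_schemas(conn, table=None):
    """Return {table name: CREATE statement}, roughly what .schema prints."""
    query = "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    params = ()
    if table is not None:
        # Equivalent in spirit to `.schema <table>`.
        query += " AND name = ?"
        params = (table,)
    return dict(conn.execute(query, params).fetchall())


# Stand-in database: these column definitions are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rates (Rate TEXT PRIMARY KEY, RateType TEXT)")
conn.execute(
    "CREATE TABLE Reactions ("
    "ReactionID INTEGER, Rate TEXT REFERENCES Rates(Rate), Mechanism TEXT)"
)

for name, sql in table_schemas(conn).items():
    print(name, "->", sql)
```

Against the real database file you would pass its path to `sqlite3.connect` instead of `":memory:"`.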

As an example, the main table is Reactions, which stores all the reactions in the MCM (and the CRI). It has a field called Rate, and yes, our rates are stored there as plain text rather than as discrete parameters. This is to allow easy integration with modelling software that can parse these rates directly. This field is an FK to the Rates table.

There are 17,224 reactions in the MCM.

SELECT COUNT(DISTINCT ReactionID) FROM Reactions WHERE Mechanism = 'MCM';
17224

which correspond to 2,955 unique rate constants.

SELECT COUNT(DISTINCT RATE) FROM Reactions INNER JOIN Rates USING(Rate) WHERE Mechanism = 'MCM';
2955

Looking at the schema for Rates shows that it has an FK, RateType, to the RateTypes table. The vast majority of rates don't have a rate type (we could probably have put a default value here for explicitness' sake), while the other 2 types are Tokenized and Photolysis.

SELECT RateType, COUNT(*) FROM Rates GROUP BY RateType;
RateType    COUNT(*)
----------  --------
NULL        2705    
Photolysis  92      
Tokenized   462     

The rates that don't have a rate type are a mixture of what would be Arrhenius in your taxonomy (3.10D-12*EXP(340/TEMP)*0.2), first-order loss (2.23D-13), and possibly others.
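To give a feel for how the untyped rates might be split into discrete parameters, here is a minimal Python sketch that recognises only the two shapes quoted above: a bare constant, and a constant times EXP(c/TEMP) with an optional trailing scale factor. The function `classify` and the output keys are hypothetical, the patterns will not cover every rate in the database, and how `exp_coeff` maps onto MUSICA's Arrhenius parameters is deliberately not assumed here.

```python
import re

# A number, optionally with a Fortran-style D (or E) exponent, e.g. 3.10D-12.
NUM = r"[0-9]*\.?[0-9]+(?:[DE][+-]?[0-9]+)?"
CONSTANT = re.compile(rf"^{NUM}$")
ARRHENIUS = re.compile(rf"^({NUM})\*EXP\((-?{NUM})/TEMP\)(?:\*({NUM}))?$")


def to_float(text):
    """Convert 2.23D-13 style literals to a Python float."""
    return float(text.replace("D", "E"))


def classify(rate):
    """Classify a rate string as constant, arrhenius, or unknown."""
    rate = rate.replace(" ", "")
    if CONSTANT.match(rate):
        return {"kind": "constant", "k": to_float(rate)}
    match = ARRHENIUS.match(rate)
    if match:
        # Fold an optional trailing factor (e.g. a branching ratio) into A.
        scale = to_float(match.group(3)) if match.group(3) else 1.0
        return {
            "kind": "arrhenius",
            "A": to_float(match.group(1)) * scale,
            "exp_coeff": to_float(match.group(2)),  # the c in EXP(c/TEMP)
        }
    return {"kind": "unknown"}


for text in ("3.10D-12*EXP(340/TEMP)*0.2", "2.23D-13", "2.20D-13*KMT06*EXP(600/TEMP)"):
    print(text, "->", classify(text)["kind"])
```

Anything that references another named rate, such as KMT06, falls through to `unknown` and would need the hierarchical handling described for the Tokenized rates.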

The Photolysis rates are parameterised as described on the website, with these parameters defined in the PhotolysisRates and PhotolysisParameter tables.

The Tokenized rates are what we refer to on the website as simple/generic rates and complex rates. These are rates that are defined hierarchically, and I think they encompass the ternary and Troe groups that you use.

Let me know if you need anything explaining in more detail.

@K20shores
Author

K20shores commented Jul 23, 2024

Hi @stulacy, thank you for that. Several of the reactions have the @ symbol in them, like this one: 5.6D-34*N2*(TEMP/300)@(-2.6)*O2

Is @ used to indicate a power here so that the rate would be this?

$$ 5.6\cdot10^{-34} \cdot [\mathbf{N2}] (\frac{\mathbf{TEMP}}{300})^{-2.6} \cdot [\mathbf{O2}] $$

This code leads me to believe this is true, but the comment seems to suggest that there's another case, in which what's in the parentheses is not just a number:

# Replace @ with exponent when it's just a number to the power
parsed = rate.gsub(/@\(([0-9.+-]+)\)/, '^{\\1}')
parsed = parsed.gsub(/\*\*\(([0-9.+-]+)\)/, '^{\\1}')
# Use Latex exp markup
parsed = parsed.gsub('EXP', '\\exp')
# Replace a / b with marked up fractions
parsed = parsed.gsub(%r{([a-zA-Z0-9.+-{}]+)/([a-zA-Z0-9.+-{}]+)}, '{\\frac{\1}{\2}}')
# Replace LOG10 with log_10
# Ideally would use lookahead/behind
parsed = parsed.gsub(/LOG10\(([a-zA-Z0-9.]+)\)/, '\\log_{10}(\\1)')
# Convert D to scientific notation
parsed = parsed.gsub(/([0-9.+-]+)[D|E]([0-9+-]+)/, '\1\\times10^{\2}')
# Replace @ with exponent when there's an expression in parentheses
parsed = replace_capture_group_multiple(parsed, /@\((.+)\)/, '^{\\1}')
# Replace TEMP with T for brevity
parsed = parsed.gsub('TEMP', '{T}')

Also, just out of curiosity, is there a rhyme or reason for the numbering of the J values? For example, J<9> doesn't appear to exist.

@K20shores
Author

Ah, I see further down about the expression inside of an @ expression.

@stulacy
Contributor

stulacy commented Jul 24, 2024

Exactly that regarding the @. The rates are written as FACSIMILE-compatible expressions; the Technical Reference has full details of the syntax.
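With @ confirmed as the power operator, a rate string can be evaluated numerically after two textual substitutions. The Python sketch below is a rough illustration under stated assumptions: it handles only D exponents, @, EXP and LOG10, it assumes @ binds more tightly than * and / (as Python's ** does), and the helper names are hypothetical. The Technical Reference remains the authority on the full syntax.

```python
import math
import re


def facsimile_to_python(rate):
    """Rewrite a FACSIMILE-style rate string as a Python expression (partial)."""
    # 5.6D-34 -> 5.6E-34: D marks the exponent of a double-precision literal.
    expr = re.sub(r"(?<=[0-9.])D(?=[+-]?[0-9])", "E", rate)
    # @ is the power operator.
    return expr.replace("@", "**")


def evaluate_rate(rate, **variables):
    """Evaluate a rate string given values for TEMP, N2, O2, and so on."""
    namespace = {"__builtins__": {}, "EXP": math.exp, "LOG10": math.log10}
    namespace.update(variables)
    # eval is acceptable only because the rate strings come from a trusted database.
    return eval(facsimile_to_python(rate), namespace)


rate = "5.6D-34*N2*(TEMP/300)@(-2.6)*O2"
print(facsimile_to_python(rate))
# N2 and O2 values here are arbitrary placeholders, not real number densities.
k = evaluate_rate(rate, TEMP=300.0, N2=2.0, O2=3.0)
print(k)
```

At TEMP = 300 the temperature factor is exactly 1, so the result reduces to 5.6e-34 times the N2 and O2 values, which makes the substitution easy to check by hand.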

There was a reason for the photolysis rates being non-sequential, but it was before my time, so I'm afraid I can only chalk it up to 'historical reasons'.
