Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NAO functionality, FCIDUMP & spherical->cartesian converter #30

Merged
merged 45 commits into from
Dec 15, 2023

Conversation

joguenzl
Copy link
Contributor

This pull request contains a number of smaller changes.

The converter from TREXIO to FCIDUMP is implemented into the Python codebase. The Fortran implementation is removed since it did not handle core orbitals correctly and is now obsolete. The new implementation was tested extensively with FCIQMC calculations.

The conversion script from spherical harmonics to cartesian angular functions is changed. The counting of orbitals in the target format is altered so that it works if the values in the ao_shell array are not monotonous, which is an implied assumption in the current implementation. Futhermore, basis_shell_factor and basis_r_power are modified during the conversion. This breaks compatibility with older code, but is to my understanding necessary as discussed in part with @scemama. This was tested with DMC calculations using QMCkl and QMC=Chem on NAO basis sets.

Functionality for NAO basis sets is added to check_basis. check_basis is made to print the diagonal entries at the end since those are the most important ones.

Some of the commits are about a converter script to generate a TREXIO file from FHI-aims output. This converter has since been made obsolete because FHI-aims now directly generates TREXIO output. It is kept in the commit history should it be of relevance
for future testing.

A TREXIO file containing the water molecule in a NAO basis set (in spherical harmonics) is provided in the attached archive for testing: water-nao-tier2.zip

@q-posev
Copy link
Member

q-posev commented Dec 11, 2023

@joguenzl thank you! Great contribution 👍
@scemama are you going to review it or should I do it?

@scemama
Copy link
Member

scemama commented Dec 12, 2023

Thanks @joguenzl !
@q-posev : I will review it, but as I am involved in the project with @joguenzl , it will be good it you can also make a second pass after me.

@scemama scemama self-assigned this Dec 12, 2023
@scemama scemama self-requested a review December 12, 2023 07:49
@scemama scemama requested a review from q-posev December 12, 2023 07:50
@q-posev
Copy link
Member

q-posev commented Dec 12, 2023

OK @scemama, ping me when you are done. We need to upload also test files to the data folder and adapt the CI accordingly to test the new functionalities. Would be nice also to add a rigorous test of spherical->cartesian too since to avoid regressions in the future.

out_index = np.array([i+1 for i in range(n_act)])

# If the orbitals are spin-dependent, the order alpha-beta-alpha-beta...
# should be used for ouput (as it is expected e.g. by NECI)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the FCIDUMP file that you will produce will not correspond any more to the TREXIO file. If there is already a CI wave function inside the file, the signs of the CI coefficients will be wrong because the phase of the determinants will change.

I don't think that the convention chosen by NECI is universal, because when you use the non-interleaved ordering, you can organize easily the integral tensor in blocks.

It would be better to have a command line option that enables swapping orbitals, which would be false by default. When the orbitals are swapped, we should print a warning on the screen to inform the user that the orbital indices in the FCIDUMP don't correspond to those of the TREXIO file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we fixing this here or in another MR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that we fix it here because it is important.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I wait for the fix before reviewing

src/trexio_tools/converters/convert_to.py Show resolved Hide resolved
src/trexio_tools/converters/convert_to.py Outdated Show resolved Hide resolved
@q-posev
Copy link
Member

q-posev commented Dec 12, 2023

@scemama I just remembered that delete_group in TEXT back end does not remove the sparse datasets, only the per-group file. I did not find a straightforward solution back in the days because TEXT back end was already very convoluted. For HDF5 back end it should work as expected, but would be nice if you or @joguenzl can verify.

As a temporary workaround for the text back end - remove the sparse data files manually in the converter. You should be able to get the file back end from the trexio.File class attributes.

@scemama
Copy link
Member

scemama commented Dec 13, 2023

@scemama I just remembered that delete_group in TEXT back end does not remove the sparse datasets, only the per-group file. I did not find a straightforward solution back in the days because TEXT back end was already very convoluted. For HDF5 back end it should work as expected, but would be nice if you or @joguenzl can verify.

As a temporary workaround for the text back end - remove the sparse data files manually in the converter. You should be able to get the file back end from the trexio.File class attributes.

OK. Let's not do this in this PR. For the moment, we can leave

    if trexio.has_ao_2e_int_eri(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int_eri(out)

    if trexio.has_ao_2e_int_eri_lr(inp):
      trexio.delete_ao_2e_int_eri_lr(out)

as it was before. I will try to find a solution in TREXIO for the text backend, and then we can make the change in trexio_tools.

@q-posev
Copy link
Member

q-posev commented Dec 13, 2023

OK. Let's not do this in this PR. For the moment, we can leave

It does not make sense. The way it was before is wrong and broken. It was removing the dataset (ao_2e_int_eri), but this functionality is not supported. You can only remove groups (ao_2e_int), so you need to do this:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

And remove manually sparse dataset files in case of text back end.

@scemama
Copy link
Member

scemama commented Dec 13, 2023

OK. Let's not do this in this PR. For the moment, we can leave

It does not make sense. The way it was before is wrong and broken. It was removing the dataset (ao_2e_int_eri), but this functionality is not supported. You can only remove groups (ao_2e_int), so you need to do this:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

And remove manually sparse dataset files in case of text back end.

OK, I didn't get it. So if we do:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

and don't remove the files, the TREXIO file will be correct, and there will just be unused files present in the directory, but the group will not be accessible anymore. Correct?

@q-posev
Copy link
Member

q-posev commented Dec 14, 2023

@scemama sorry for the confusion. Look at the current code (and also what you suggested above):

    if trexio.has_ao_2e_int_eri(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int_eri(out)

    if trexio.has_ao_2e_int_eri_lr(inp):
      trexio.delete_ao_2e_int_eri_lr(out)

The trexio.delete_ao_2e_int_eri(out) and trexio.delete_ao_2e_int_eri_lr(out) lines are wrong. If you have a look at the API - we do not have these functions (we only remove groups, not datasets). So this code will never work. The fact that this went unnoticed is because this part was never executed and Python is interpreted, not compiled. That's what I meant.

Now even if we use the existing function trexio.delete_ao_2e_int(out) instead of non-existing trexio.delete_ao_2e_int_eri(out), you still need to manually remove sparse dataset TXT files when TEXT back end is used.

@scemama
Copy link
Member

scemama commented Dec 14, 2023

@q-posev OK, so let's make the code correct by removing the group, and then we will fix the text back-end to remove the files.

@joguenzl
Copy link
Contributor Author

The two points discussed above have been addressed.

@q-posev
Copy link
Member

q-posev commented Dec 15, 2023

Thanks @joguenzl ! I can merge this PR as is, we have a problem in the CI due to a bug in PySCF->TREXIO converter, but this will be fixed elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants