Releases: PhasesResearchLab/pySIPFENN
v0.16.2
Minor Changes:
- The `OPTIMADEAdjuster` model tuning class now includes more explicit provenance tracking through its `references` attribute, populated if the provider response includes "relations" to the references. As of now, many popular providers do not include these, but some, like `alexandria`, can be used and should populate `references` with a list of lists of DOI strings.
- The `KS2022_randomSolutions`'s optional metadata dictionary, returned when `returnMeta=True` is passed to the `generate_descriptor` function, now includes a list of KS2022 featurizations of each supercell expansion (i.e., iteration) under the `individualResults` field, which can be used in addition to the converged global ensemble. Among several use cases, users can use these to run ML models over subsets of the global ensemble and look at the prediction distributions to gain additional insights.
- `KS2022_randomSolutions` implementation and code style have been generally improved.
- Appropriate documentation and tests were added.
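The prediction-distribution idea mentioned above can be sketched as follows. This is a minimal illustration, not pySIPFENN code: the short vectors stand in for the per-iteration featurizations stored under `individualResults`, and `toy_model` is a hypothetical stand-in for a real ML model.

```python
import statistics

def toy_model(features):
    """Hypothetical stand-in for an ML model: a fixed linear map."""
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, features))

# Stand-in for meta['individualResults']: one feature vector per iteration.
# Real KS2022 vectors are much longer; these numbers are made up.
individual_results = [
    [1.00, 2.00, 0.10],
    [1.10, 1.90, 0.12],
    [0.95, 2.05, 0.09],
    [1.05, 1.95, 0.11],
    [1.02, 2.01, 0.10],
]

# One prediction per supercell expansion, then summarize the spread
preds = [toy_model(features) for features in individual_results]
print(f"mean={statistics.fmean(preds):.4f}, stdev={statistics.stdev(preds):.4f}")
```

A wide spread across iterations would suggest that the predicted property is sensitive to the particular local arrangements sampled in each expansion.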
Full Changelog: v0.16.1...v0.16.2
v0.16.1
Minor Changes:
- Mitigate `pymatgen`'s incompatibility (currently preventing installation) with `numpy>=2` by setting a maximum allowed `numpy` version.
- Added a convenience function in misc for patching pymatgen's `core/periodic_table.json` for exotic elements like Livermorium or Hassium.
- Added a simple cubic (`SC`) prototype of Po to the library.
- Other minor updates.
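To check whether an existing environment already satisfies the `numpy` cap described above, something like the following sketch may help. The helper names `major_version` and `numpy_is_below_2` are hypothetical and not part of pySIPFENN; only the standard-library `importlib.metadata` is used.

```python
from importlib.metadata import PackageNotFoundError, version

def major_version(version_string):
    """Return the leading integer of a version string like '1.26.4'."""
    return int(version_string.split(".")[0])

def numpy_is_below_2():
    """Return True/False if numpy is installed, or None if it is not."""
    try:
        return major_version(version("numpy")) < 2
    except PackageNotFoundError:
        return None

print(numpy_is_below_2())
```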
Full Changelog: v0.16.0...v0.16.1
v0.16.0
This version introduces 3 exciting changes! (1) The all-new `ModelAdjusters` submodule automates tuning and can fetch data directly from OPTIMADE API; (2) a new manuscript detailing advantages of our featurization tools has been put on arXiv:2404.02849; and (3) the name of the software was updated to python toolset for Structure-Informed Property and Feature Engineering with Neural Networks to retain the pySIPFENN acronym but better reflect our strengths and development direction.
Major Changes:
- Submodule of `ModelAdjusters` has been set up for all kinds of model adjusting efforts.
- The `LocalAdjuster` implements local model tuning, plotting of results, and hyperparameter matrix search.
- The `OPTIMADEAdjuster` class combines it with the powerful OPTIMADE API to automate data fetching. Now, you can quickly fetch data and adjust a model with:

      from pysipfenn import Calculator, OPTIMADEAdjuster

      c = Calculator(autoLoad=False)
      c.loadModels("SIPFENN_Krajewski2022_NN30")

      ma = OPTIMADEAdjuster(
          c,
          model="SIPFENN_Krajewski2022_NN30",
          provider="mp",
          targetPath=("attributes", "_mp_stability", "gga_gga+u", "formation_energy_per_atom"),
          device="mps"  # MPS is for Apple M-series GPU
      )

      ma.fetchAndFeturize(
          'elements HAS "Hf" AND elements HAS "Mo" AND NOT elements HAS ANY "O","C","F","Cl","S"',
          parallelWorkers=4)

      ma.adjust()
      ma.plotStarting()  # See the starting performance
      ma.plotAdjusted()  # See the adjusted performance

  or, to perform a hyperparameter search, replace the `ma.adjust()` with:

      ma.matrixHyperParameterSearch()
      ma.adjust(learningRate=0.0001, optimizer='AdamW', weightDecay=1e-05, epochs=37)

- The new manuscript on Efficient Structure-Informed Featurization and Property Prediction of Ordered, Dilute, and Random Atomic Structures has been uploaded to arXiv:2404.02849 and will be submitted to a journal in a couple of days, after comments from collaborators.
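For readers unfamiliar with the idea behind a hyperparameter matrix search, the sketch below shows the general technique of evaluating every combination of candidate settings. It is not how `matrixHyperParameterSearch` is implemented; `toy_validation_loss` is a hypothetical loss surface standing in for a real training run, and the candidate values are arbitrary.

```python
from itertools import product

def toy_validation_loss(learning_rate, optimizer, weight_decay):
    """Hypothetical loss surface standing in for a real training run."""
    penalty = {"Adam": 0.02, "AdamW": 0.0, "SGD": 0.05}[optimizer]
    return abs(learning_rate - 1e-4) * 100 + abs(weight_decay - 1e-5) * 1000 + penalty

def matrix_search(learning_rates, optimizers, weight_decays):
    """Evaluate every combination and return the best one with its loss."""
    best_params, best_loss = None, float("inf")
    for lr, opt, wd in product(learning_rates, optimizers, weight_decays):
        loss = toy_validation_loss(lr, opt, wd)
        if loss < best_loss:
            best_params, best_loss = (lr, opt, wd), loss
    return best_params, best_loss

best, loss = matrix_search(
    learning_rates=[1e-3, 1e-4, 1e-5],
    optimizers=["Adam", "AdamW", "SGD"],
    weight_decays=[1e-4, 1e-5, 1e-6],
)
print(best, loss)
```

The cost grows with the product of the list lengths (27 evaluations here), so in practice the candidate lists are kept short.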
Minor Changes:
- Added `writeDescriptorsToNPY` function to streamline persisting feature data into NumPy (NPY) files for our end-users. Appropriate tests were added. Thanks @rdamaral for making this contribution!
- Improved numerous docstrings.
- Random solution featurizer now exits gently on `KeyboardInterrupt`.
- The `LocalAdjuster` and `OPTIMADEAdjuster` were (optionally) connected to ClearML for neat tracking of ML training histories.
- Minor bugfixes in several spots.
New Contributors
- @rdamaral made their first contribution in #15
Full Changelog: v0.15.1...v0.16.0
v0.15.1
Major Changes:
- The printout of `Calculator` initialization, model parsing, prototype parsing, etc., is now colorized (provided `verbose=True`) for quick and effortless reading of critical settings and information.
- Model download is now handled by our in-house-maintained `pysmartdl2`, instead of `pySmartDL`. It allows us to better test CI with modern Python versions and ensure long-term stability.
Minor Changes:
- Requires `pymatgen` version `2024.2.20`, which fixed a serious security issue in `pymatgen.io`.
- Small bug fixes for two rare cases.
- Workshop materials in `examples` updated to be compatible with the current `pymatgen` version.
- Minor style improvements.
Full Changelog: v0.15.0...v0.15.1
v0.15.0
Welcome to pySIPFENN v0.15, which is our biggest single jump since 2020, spanning 196 commits since v0.13.0 🎉 It covers improvements in nearly every part of the codebase, including new features, better performance, and a more convenient high-level API. Two key new features are (1) a neat prototype library (v0.14) simplifying the handling of atomic structures, and (2) a random solid solution (RSS) featurizer / descriptor calculator (v0.15), which allows you to featurize a chemical composition occupying a prototype structure.
Now, you can efficiently utilize pySIPFENN to, for instance, featurize high entropy alloys (HEAs) occupying BCC, FCC, C14, and C15 structures. Have a look below for details!
Major Changes:
Random Solid Solution Featurization
- KS2022 Random Solution Featurization (`KS2022_randomSolutions`) added by @amkrajewski (mostly in #10), which allows users to quickly obtain the KS2022 feature vector corresponding to a random solid solution with composition `comp` occupying arbitrary structure `struct` (including library `BCC`, `FCC`, `HCP`) through iterative expansion of an ensemble of local chemical environments (LCEs) until all features, i.e., statistics over LCEs, and composition converge onto stable values.
  - This method is philosophically similar to the SQS approach (10.1103/PhysRevLett.65.353), whose structures one can generate with Alloy Theoretic Automated Toolkit (ATAT) (10.1016/j.calphad.2013.06.006) or approximate with sqsgenerator; however, here, we not only encode the correlations between anonymous species types but also (1) consider differences between them (thus, e.g., solutions of elements with similar electronic configurations will converge faster) and (2) allow arbitrary complexity of the chemical composition under a single methodology.
  - It should also be better at capturing randomly occurring uncommon extreme-case events, which may not occur in low-correlation cases.
  - We have tuned the default values of the method for complex cases of (a) high entropy alloys (HEAs) with 6+ components and (b) multicomponent steels with several minor alloying elements.
- Added matching tests for all methods of `KS2022_randomSolutions` featurizer. Added documentation for all tests, so that they can serve as runnable examples.
- Added complete API documentation of all methods and all parameters, with both meaning and motivation, for `KS2022_randomSolutions`.
- Added top-level `Calculator.calculate_KS2022_randomSolutions()` and `Calculator.runModels_randomSolutions()` for conveniently calling everything on complex problems, even with mixed-type inputs like:

      c.calculate_KS2022_randomSolutions(
          ['FCC', myDistortedBCC, 'BCC', 'HCP'],
          ['WMo', Composition.from_weight_dict({'Mo': 20, 'W': 70, 'Zr': 10}), 'FeNi', 'CrNi'],
          mode='parallel',
          max_workers=4)
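The "expand until converged" idea behind the random solution featurizer can be illustrated with a heavily simplified sketch. It is not the pySIPFENN algorithm: instead of KS2022 features over local chemical environments, it tracks only the mean of one per-element property over randomly occupied sites. The `PROPERTY` values, the function name, and all parameters are assumptions made for illustration.

```python
import random

# Illustrative per-element property values (assumed for this example only)
PROPERTY = {"W": 2.36, "Mo": 2.16, "Zr": 1.33}

def converge_mean_property(composition, batch_size=50, tolerance=1e-3,
                           max_batches=1000, seed=0):
    """Grow a random ensemble until its mean property stops changing."""
    rng = random.Random(seed)
    elements = list(composition)
    weights = [composition[el] for el in elements]
    total, count, mean, previous_mean = 0.0, 0, None, None
    for _ in range(max_batches):
        # Expansion step: draw another batch of randomly occupied sites
        batch = rng.choices(elements, weights=weights, k=batch_size)
        total += sum(PROPERTY[el] for el in batch)
        count += batch_size
        mean = total / count
        # Stop once consecutive expansions agree within the tolerance
        if previous_mean is not None and abs(mean - previous_mean) < tolerance:
            break
        previous_mean = mean
    return mean, count

mean, n_sites = converge_mean_property({"W": 0.5, "Mo": 0.3, "Zr": 0.2})
print(f"mean={mean:.3f} after {n_sites} sites")
```

Compositions whose elements have similar property values need fewer expansions to settle, which mirrors the remark above that solutions of similar elements converge faster.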
Prototype Library:
- The new prototype library functionality allows you to store commonly used structures in a consistent fashion and pleasant schema under the `misc/prototypeLibrary.yaml` file in your pySIPFENN installation. It is initialized alongside every `Calculator` instance.
- Parsing custom libraries is very easy and can be done from a file in your directory or from a remote repository like Zenodo! Simply use the `parsePrototypeLibrary` to fetch it. Or, to append your library with an external one and persist it in your installation, use `appendPrototypeLibrary`. Each entry follows this schema:

      - name: BCC
        origin: https://www.oqmd.org/materials/prototype/A2_W
        POSCAR: |
          W
          1.0
          -1.58250 1.58250 1.58250
          1.58250 -1.58250 1.58250
          1.58250 1.58250 -1.58250
          W
          1
          Direct
          0.00000 0.00000 0.00000

- Added complete API documentation of all methods and all parameters, with both meaning and motivation, for everything prototypeLibrary-related.
- Added matching tests for all new prototypeLibrary-related methods. Added documentation for all tests, so that they can serve as runnable examples.
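As a quick sanity check on the `POSCAR` field of the BCC entry above, the following sketch reads the lattice vectors with plain Python and computes the cell volume. It does not use pySIPFENN or pymatgen; `parse_lattice` and `cell_volume` are hypothetical helpers, and the parser assumes a positive scale factor on the second line.

```python
POSCAR = """W
1.0
-1.58250 1.58250 1.58250
1.58250 -1.58250 1.58250
1.58250 1.58250 -1.58250
W
1
Direct
0.00000 0.00000 0.00000
"""

def parse_lattice(poscar_text):
    """Return the three scaled lattice vectors from a POSCAR string."""
    lines = poscar_text.strip().splitlines()
    scale = float(lines[1])  # assumed positive (a plain scaling factor)
    return [[scale * float(x) for x in line.split()] for line in lines[2:5]]

def cell_volume(lattice):
    """Volume as the absolute scalar triple product |a . (b x c)|."""
    a, b, c = lattice
    cross = [
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    ]
    return abs(sum(ai * ci for ai, ci in zip(a, cross)))

lattice = parse_lattice(POSCAR)
print(f"{cell_volume(lattice):.4f}")
```

The result equals half the volume of the conventional cubic cell with edge 3.165, as expected for a one-atom primitive BCC cell.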
General:
- Complete overhaul of the documentation in terms of consistent styling, type-hinting, and extent of the content. Most functions can be well-understood just by looking at information displayed by IDE like PyCharm or VSCode.
- The `Calculator` class now can self-destruct and deallocate by a call to its `destroy()` function, allowing users to better manage memory used in calculations on low-power systems.
- Performance improvements in chemical attributes calculation of all `KS2022`-calculating featurizers.
- Performance improvements and minor fixes in `KS2022_dilute` featurizer.
- Performance and feature improvements in the `modelExporters`, including imports optimization so that they now run only when needed by a given exporter class.
- Optimized testing workflows in terms of events triggering them and general performance.
- Improved testing clarity and platform compatibility, by splitting them into individual workflows.
- Dependencies were analyzed for compatibility and set in a way to ensure long-term maintainability and near-future compatibility with Python 3.12.
- Added instructions for contributing to pySIPFENN, explaining philosophy of the software and several design choices.
Minor Changes:
- Improved printouts from `Calculator` initialization and current state reporting.
- Improved verbosity handling in `Calculator` instances.
- `CITATION.cff` added for clear citation guidance in the future.
- Documentation now includes miscellaneous notes users may find useful.
- `.gitignore` improvements as requested by users.
- Added testing for several smaller functions like `ward2ks2022`, to increase coverage by 1%. Now it is over 95%.
- Code style improvements.
- Commenting improvements scattered throughout the codebase to make modifying our code easier.
- And many more added on the way :)
Notices:
- In the near future (likely next release), pySIPFENN will drop official support for Python 3.9, due to (1) noticeably lower performance, (2) deprecation of 3.9 on some platforms (macOS 14), and (3) next release using several pure-Python features added in Python 3.10.
- In the near future (likely next release), pySIPFENN will add support for Python 3.12 to keep it up-to-date and take advantage of its performance boost. Our codebase is ready for the switch. We are waiting for PyTorch support for it in its stable release.
Full Changelog: v0.13.1...v0.15.0
v0.13.1
Major Changes
- All tests are now documented under API pySIPFENN Tests page with hyperlinks to their sources, acting as minimalistic examples of using the software.
Minor Changes
- Improvements to FAQ, docs index, and README.
- Polishing of the documentation page and general maintenance.
Full Changelog: v0.13.0...v0.13.1
v0.13.0
Major Changes:
- Per [add] modelExporters by @amkrajewski, 3 model exporter classes have been added to pySIPFENN in the `modelExporters` module:
  - `ONNXExporter` allowing (1) exporting back to ONNX after models were adjusted to new datasets or new properties (transfer learning), (2) automated simplification of model architecture for improved performance, and (3) adjusting model internal precision to FP16 to reduce its size by half with only minor performance impact.
  - `TorchExporter` to export models to the `PyTorch` format, which is used internally by pySIPFENN.
  - `CoreMLExporter` to export models to Apple's CoreML format developed for use in their devices, where it provides the most seamless integration with existing apps and can harness very efficient Neural Engine hardware acceleration. At the same time, it can be used on other platforms as well, such as Linux or Windows. Note that by default, models will be converted to FP16 precision, similar to one of the `ONNXExporter` options.
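The size reduction from FP16 conversion mentioned above can be seen with a small standard-library sketch. It does not touch ONNX or CoreML; it only packs a few made-up weight values as 32-bit and 16-bit floats with `struct` and measures the rounding error introduced.

```python
import struct

# Made-up example weights, all within the normal FP16 range
weights = [0.1234567, -1.5, 3.1415926, 42.0]

fp32 = struct.pack(f"<{len(weights)}f", *weights)  # 4 bytes per value
fp16 = struct.pack(f"<{len(weights)}e", *weights)  # 2 bytes per value

# Read the FP16 values back to see how much precision was lost
restored = struct.unpack(f"<{len(weights)}e", fp16)
max_rel_error = max(abs(r - w) / abs(w) for r, w in zip(restored, weights))

print(len(fp32), len(fp16))
print(f"max relative rounding error: {max_rel_error:.2e}")
```

Storage is exactly halved, while each value keeps roughly three significant decimal digits. How that rounding affects a given model's predictions has to be checked on the model itself.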
The above changes are a step in our effort to make advanced use of pySIPFENN easier for end-users. Next items on the roadmap include (1) automation of transfer learning on small datasets and (2) OPTIMADE integrations, which will make pySIPFENN API a one-stop solution for model fine-tuning.
Minor Changes:
- Small improvements in the `Calculator` object printout with the `verbose` option (true by default) added to its initialization.
- Matching tests were added to the CI pipelines.
- `dev` extra dependencies have been established, alongside appropriate documentation, for all future dependencies that are not required for core pySIPFENN functionalities to keep it as light as possible while not limiting advanced users. One of the future directions will be to make core dependencies lighter.
- Model exports and motivations were added to the documentation under Exporting pySIPFENN Models page.
Full Changelog: v0.12.2...v0.13.0
v0.12.2
Major Changes:
- This is a minor release that changes the license to LGPLv3 in order to allow for integration with proprietary software developed by the CALPHAD community (shipping pySIPFENN and then calling it from closed-source tools will be allowed) while supporting the development of new pySIPFENN features for all users, as modifications to pySIPFENN (except for models, see below) itself need to be open.
- This change does not affect the default pySIPFENN models, which are distributed under Creative Commons 4.0 that already comes with no restrictions.
- Many thanks to our colleagues from GTT-Technologies and other participants of 50th CALPHAD 2023 conference in Boston for fruitful discussions.
Full Changelog: v0.12.1...v0.12.2
v0.12.1
Minor Changes:
- FIX: This update addresses network loading issues, affecting some SIPFENN models (with Dropout), caused by `onnx2torch=1.5.7` recently introducing a new naming schema for the "insides" of ONNX neural networks translated to PyTorch.
- Improved testing, including automated benchmarking of featurizer performance across Python versions and platforms.
Full Changelog: v0.12.0...v0.12.1
v0.12.0
Major Changes:
- Automated matrix-testing on Linux / Mac / Windows with Python 3.9 / 3.10 / 3.11 through GitHub Actions CI. Core functions are tested across all of them, and badges in the README indicate test status after every code change.
- Automated test coverage analysis through GitHub Actions CI and reporting through Codecov service.
- Many improvements in the testing procedures and additional tests bringing the coverage up from 74% (in v0.11.0) to 86%.
- (affects backward compatibility) The models download and run functions built around MxNet, which have been deprecated for a while since v0.9.0, have been removed.
- (affects backward compatibility) Small change in the behavior of the runModels_dilute() function. Now it expects the descriptor / feature vector input "KS2022" to run the "KS2022_dilute" descriptor calculator / featurizer. This change is due to a few new featurizers being in the works, including for approximating random solid solutions and quasicrystals, and all of them will use the "KS2022" descriptor, so this will make workflows much more clear.
- Added official Python 3.11 support and tests using it.
- Added small automated benchmarking on Linux using different Python versions so that users can select one that works best. Generally, Python 3.10 is the fastest. Across all 3 featurizers (KS2022, KS2022_dilute, and Ward2017), relative to the Python 3.9 baseline, 3.10 is around 35-40% faster, while 3.11 is 25-30% faster, based on the tests in GitHub Actions CI.
Minor Changes:
- Minor bug fixes, mostly in tests, not the user code.
- The wget dependency has been removed, as we moved to the multi-threaded pySmartDL package for model download.
- Documentation updates.
Full Changelog: v0.11.0...v0.12.0