-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Base class inheritance #158
base: master
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 6169389406
💛 - Coveralls |
The converter already picks up on this. It's just that it will nexus.get_node_at_nxdl_path with the following NXDL path: We need to make sure nexus.py understands that NXbeam_optical inherits from NXbeam and such can be replaced. But is this what we want? Do we want to allow specialized beams be used in AppDefs that list the parent beam? Also, then do we allow using a different specialized beam or just the parent? |
Yes, I think this is the better formulation of this point. I will update it.
I would say yes. It should be possible to use any sub-class for a parent class (e.g., for |
We will then need scripts to make sure we set up such compatibilty at all times. This can be hectic especially if I want to have a more specialized NXbeam_optical that switches around a bunch of stuff. The framework then should be able to either reject such changes and explain why or provide it as an incompatible to an NXbeam entry during writing. I also want to bring another point which always puzzles me with NeXus: Why don't we tell what base class/type we exactly used in the data instance aka the h5 file. Currently, we only provide an AppDef and the framework has to resolve the rest of the dataset types using the AppDef as a source of truth. This means whenever I would like to add another group that is used in another technique or so, I have to go and make a new AppDef. This seems to have been designed for applications to know exactly what to expect. But my argument for that is that we can easily write an application that looks through the h5 file with its pointers to what type/concept every dataset is. And then act upon it if it knows how to deal with this concept. If it doesn't, it just ignores it for specialized functionality but still provides access to the user if they know what to do with it. The entanglement of data instance/storage structure dictating what concept we are dealing with is quite a hassle. With the above strategy, one can easily use NXbeam_optical or whatever they like. If the app understands it aka finds compatible fields in this group, it works with it. But if it doesn't, then it ignores it. |
That's why I would mainly use inheritance for adding additional fields to the parent class and not renaming or moving anything (because this changes our notion of the information contained in this class). The question is when we write it as
This information is written, no? There is the
This is one of my main reasons to have base class inheritance. With this appdefs can be as easy as just stating which combination of classes I expect. We even could go towards a generic experiment which we manually compose. However, we just offload the specifics into the base classes then (i.e., we would still want to require certain fields for an, e.g., mpes experiment with a collection of fields we don't want to (re-)use in any other experiment). I would also argue that our current process is removing the differences between appdefs and base classes, except that we still name them such.
We really can push towards trying also having a generic representation without relying on an appdef (or only a generic appdef, or not using the terms of appdefs and base classes anymore and just consider them using the same modeling principles). And I tend to like this idea as it gets closer to what is done in owl, that is, only relating some properties to an experiment. But again, I think we just offload the variability into the base class taxonomy.
I think these are all good points. I think we should just discuss/test some of them and see which one works best for us. I'm currently reading also a bit in owl/ontologies modelling so maybe I can draw some inspiration from this as well. |
So you mean that an AppDef asks for an NXbeam_optical but a user provides an NXbeam in the data? I think we should discuss this over a meeting when we have a consensus on the examples you've setup on the definitions repo. I would say if NXbeam_optical is incompatible with NXbeam then we cannot just accept an NXbeam as the data. But I'm not too confident with this. I need more clarification.
The resolution depends on how we modify that NX_class in our AppDef. If we change the doc string etc or we add fields to it this becomes a separate entity in the AppDef than what it is in the base class. I think with this base_class inheritance this will become simpler as we can take the specialized base class and list it in the AppDef rather than modifying it there. The only limitation would be that some fields could be named the same in different groups. Their meaning comes from their hierarchy. That is much easier to handle if we have direct pointers for every single field. So I could mix match my data and still point to the right concept.
I agree with you. The current AppDefs seem to be an unnecessary limitation. If we have good base classes, any combination of them can be read correctly and understood. AppDefs would just be our h5 files using these base classes.
It will be nice to experiment with this. Please share some resources if you find something interesting. Thanks! |
No. My thought was, that the appdef requires an IMO an
The easiest would be to tell the writer which classname to write to NX_class, like in the example above.
Yes, we still might want to have some notion of an experiment, i.e., as a collection of N scan-axis with M sensors at each scan point + sample + operator + instrument. But we could combine it as we like with what we scan and what instrument we use.
I think the W3C primers on rdf, rdfs and owl are very extensive. Currently I'm reading this book which is very nice to get an overview. |
I did not follow all the discussions so far, but I somehow have the feeling that we should separate two things in the discussion:
The reason for doing the second is not so eminentily clear to me, as indeed it somewhat mixes the purpose of application definition and base classes, and I think there are very good reasons so have the two. We should think very carefully about why we want to introduce requirements in the first place, because any requirement very severely limits the usability of everything we crease, as we have seen in the discussion at the mpes-workshop. |
My feeling is that having a base class inheritance without having required fields does not solve too much. Essentially, we simply add some additional vocabulary in an additional file. While I see that this can declutter our glossary a little it does not give us the notion of something representing an entity (e.g., a detector). We could very well use a class and its subclass interchangeably and just requiring the same fields in an appdef. For me it feels just adding some more files and complicating where a person has to look for glossary terms when writing or reading an appdef without adding any logical abstraction.
I agree that it is good to have a glossary of terms as we currently have it in the base classes. And I'm not fully arguing that we should get rid of the base classes entirely (even though we can discuss whether we can collect glossary terms also in a hierarchical way). Besides this fact, it is not so evident to me that its good to have the distinction of base classes and appdefs. We had several occurrences in the past where we had problems due to their nature, here are some of the problems we already had (which is not a complete list):
I think the last point is a crucial one for the requirements you mention. I think we agree that we have to require some fields, but not too much to not be too restrictive. With a taxonomy we can build a very general top-level class where we don't require much and then add more and more specifics into the sub-classes. This is the point where a general glossary might be useful, because we might want to add only one specific field to the parent class, but we don't have everything available for a specific sub-class. We still want to write this field properly defined and show that its part of this taxonomy. We could also keep the current base classes entirely but simply add this taxonomy features we are currently thinking about as an additional layer in between appdefs and base classes. In this case the base classes would give us the vocabulary for a taxonomy and an appdef would basically just be a specific class in the taxonomy which we attach the notion of being a standardised data collection. This would also allow to be compatible with the current way things are in nexus and we could seamlessly add taxonomies where we need it. |
It is possible by appdef inheritance, but I agree it's probably not elegant, if it only applies to a small part of the tree
Well, I think we do that e.g. with NXelectronanalyzer, which has NXdetectioncolumn, NXenergydispersion, etc. as parts, no? That's all defined in the base classes, it would be sufficient to even only cite NXintrument.
One could just write optionality everywhere explicitly, that can easily be done with the yaml2nxdl
Okay, I think I understand your point now. You want to cite NXbeam, and have every subclass require a few fields to specify as e.g. NXbeam_poalized, and only if these fields are present, you are allowed to replace NXbeam by NXbeam_polarized, right? In principle, that is not a bad idea, as it would allow to build very generic applications, that could work based on the existence of these specialized versions of the generic base classes. Still, we need to make sure to not restrict the file format too much, and not require unnecessary fields in the "base" version of our taxonomy. |
It's also relevant between different methods, i.e., most experiments have some form of detector or sensor without having an inheritance between themselves (even though we have to ask whether different detectors in different fields have some commonality or should remain in their specific domain - but there are other examples, e.g., NXsample).
I mean it slightly different here. We have this different classes, yes. However, one could not just build an NXenergydispersion without getting the requirements from the appdef. With what I have in mind we would have a base class which automatically requires the fields to be a proper NXenergydispersion and we would just have to state in the appdef that we want one of such NXenergydispersions. The benefit is that an application could just write an NXenergydispersion which is in compliance with NXmpes without ever needing to know about NXmpes (I know we could just require additional fields in the base class. This is admittedly still a problem here, because we always need to build the union between a class or one of it's possible subclasses and the required fields in the appdef). Currently, we can only do it by using the
That's true. I also have to admit that part of the confusion is also about how the web view of our definitions is structured and that everything is nested and one has to know that there is relevant information somewhere else. So this might rather be a point of documentation.
Exactly. Yes, we have to be careful to not require to much if we are far up in the tree. But I think when we carefully design this, it can be extremely helpful. I also would like this glossary+taxonomy approach as this would allow us to test and change stuff without breaking any of our current definitions. Because I think we need to do a few iterations to get a good solution which balances requiredness and optionality. There is also an current example in NXpeem which could benefit from this: there are certain fields only required/set for the dispersive mode of the NXenergydispersion. With inheritance there this is very easily solvable. Currently, we just keep things optional and state the situation in the docstrings (there are also other examples where we did it in this fashion already). |
I very much want this. It will also help us with comparing/searching across data. We know that no matter where we find this entity, it's supposed to be read and written in the same way. |
This is an example implementation for base class inheritance.
It is related to FAIRmat-NFDI/nexus_definitions#66
What we need to do:
NXbeam
group can be filled with, e.g., the path/ENTRY[entry]/BEAM_OPTICAL[beam]
.