Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC Complex access rights for records #70

Open
egabancho opened this issue Jul 14, 2016 · 11 comments
Open

RFC Complex access rights for records #70

egabancho opened this issue Jul 14, 2016 · 11 comments
Milestone

Comments

@egabancho
Copy link
Member

egabancho commented Jul 14, 2016

Current (legacy) situation

On Invenio 1.X records are restricted based on collection rights (if the record is not in any collection it is restricted to the owner and the admins).
Also there is the possibility to grant access to a specific record inside a restricted collection by using certain MARC tags, i.e 506

Migration problem

Sometimes two collections were created artificially to restrict a set of records, i.e. "Videos" and "Restricted Videos".
Moreover in the situation where one gives access to a record inside the restricted collection to a given person, this person can't search for the record she/he is allow to see inside the restricted collection. This kind os technic allows the users to see the record only if they know the URL.
This artificial separation between public and restricted records can now be achieve using facets.

Proposal

Store the access rights inside the metadata of the record, i.e.

{
  "_access": {
    "read": {
      "system": ["group_a", "group_b"], 
      "user": ["[email protected]"]
    }
  }
}

The so called system field stores the access rights derived by the system from, for example, the collections/communities that the record belongs to. This field should not be modified by anyone except by the system.
On the other hand the user field is meant to be modify by the users, for example upon submission, so the user submitting something can decide to share her/his work with some else which is not part of the community.

If both fields are empty, the record is public.

Possible problems with this solution

  • Potentially the records will get modify just to change the access rights and this will generate unnecessary versions.
  • If we combine this solution with collections (or any other thing defining access rights) we need to be sure the access rights are correctly propagated to the record.

Possible benefits of this solution

  • Potentially the records will get modify just to change the access rights and this will generate a useful trace of the evolution of the access rights (maybe easier to combine with memento API if we want to expose all the versions).
  • Makes migration much easier eliminating the headache of cleaning the collection tree and the access right definitions for viewrestrcoll action.
  • Access rights are defined in one place, so answering questions like "why don't I have access to this record?" are much easier to answer.

We have made a small proof of concept on the CDS overlay

Counterproposal
Migrate the viewrestrcoll action as it is, creating a parametrised action, and when migrating the records also migrate the collections they belong to so they can be used to decide whether a user has access to a given record or not.

Possible problems with this solution

  • The "user" access rights definitions have to be migrated and treated in the same way as in the first proposal.
  • All the access rights definitions are going to be migrated, meaning a cleanup in the current system needs to be done (always talking from the point of view of v1.2 -> v3.0 migration)

Possible benefits of this solution

  • Access rights don't need to be propagated to the record.

Note for the reader
The RFC might seems "migration driven" but, regardless if one starts clean or from an existing system, the way access rights are defined IMHO should be the same for the sake of everyone 😉

CC @inveniosoftware/triagers @omelkonian

@tiborsimko
Copy link
Member

Storing ACL in records has many benefits indeed. Two questions:

  • What about serialisation/deserialisation?

    1. Would you keep ACL completely separated from metadata in a separate input/output section, using our own ACL language?
    2. Or would you try to map them into "master format" ACL language if possible, say MARC21 field 540 and friends?

    (E.g. thinking of a use case when a cataloguer may want to edit ACL rights when editing a record in the separated MARC editor such as used by TIND. CC @audub @Kennethhole @PXke)

  • When musing about these topics, it may be useful to ponder W3C WebAccessControl practices that e.g. Fedora Commons uses.

@aw-bib
Copy link

aw-bib commented Aug 1, 2016

In our current system we do access rights by a lot of collection juggling. That is, even today we have access rights in the record (sort of), however in a quite obscured way as one needs to fiddle out what rights result from the intersection of the collections exposed in the bibliographic. Indeed, even today this results in quite a bunch of versions that result only from fixing access rights. (Point is, that in our setups at join² it is a requirement that records are not publicly available till they reach a certain curation stage.)

Right now, all this is done by using websubmit with quite a bit of program logic that is not accessible to the cataloguers + the restriction to most of the cataloguers to use websubmit only to ensure all rights are handled properly.

Given this, I follow @egabancho argument. I would argue that access rights should live in the record from the start in a clear manner, replacing the current collection handling by an easier readable method. Furthermore I do not see the drawbacks outlined by @egabancho as such grave issues while I quite agree with the benefits he pointed out above. Even a history of changes that stem only from access right handling is probably not that much of an drawback. Probably the largest issue is the outlined forced compliance with collections, however this should happen anyway.

Concerning serialization: if one want to give cataloguers an easy view and access to the ACLs I think mapping it to the master format is imperative. From our experience I'd strongly argue to give cataloguers access to the ACL. (Maybe this can be restricted to a sub set of specially privileged cataloguers, if in doubt, but it should not require programmers or admins.) In fact we wrote quite a bit of websubmit-code to make this sort of possible. We'd happily drop this code ;)

Some thoughts:

  • What happens on exchange of records? Ie. how to ensure that rights are propagated properly. In 1.x InstallInvenio ensures in join² that all partners have the same collections with the same access rights to them. As access is driven by collections we just leave them untouched thus the records get proper rights. If I understand @egabancho proposal correctly this would not be necessary anymore and everything would just work. I'd consider this an advantage.
  • One should probably consider using sort of templates for ACL assignments. To hook it up with cataloguing tools for cataloguers a simple association of an authority record comes to mind. E.g. I could well imagine enhanding 540 by the currently still missing $0 to assign an authority ID. Effectively, this linked record can contain all the technical stuff like static firerole definitions.
  • One should consider to handle files similarly to the bibliographic. E.g. if one goes for som 5xx in Marc to handle bibliographic ACL one should consider going for holdings in 8xx instead of bibdocfile to handle files and ACLs.

@audub
Copy link

audub commented Aug 1, 2016

For me the proposal of @egabancho looks like the way to go. And I don't think the "problem" of modified records on change in access rights is a problem at all. I would say it is good preservation and one can easily see who has had access to the record in the past as well.

We have not decided on how we will deal with ACL in the MARC editor yet, but I am pretty sure we will end up with another way than setting it in the MARC tags in the editor.

@aw-bib
Copy link

aw-bib commented Aug 2, 2016

I am pretty sure we will end up with another way than setting it in the MARC tags in the editor.

I admit I'd prefer to see it "like any other field". In the past, especially when it came to invenios file handling, not having it in the Marc was always one of the major drawbacks we experienced.

@lnielsen
Copy link
Member

lnielsen commented Aug 2, 2016

If we go with storing ACL in the record, then the implementation should be agnostic to your preferred data model (i.e. something along the lines in the example). However, it should be fairly straight forward to provide some extra dojson rules in Invenio-MARC21 which exposes the ACL in MARC.

When musing about these topics, it may be useful to ponder W3C WebAccessControl practices that e.g. Fedora Commons uses.

👍

@aw-bib
Copy link

aw-bib commented Aug 2, 2016

If we go with storing ACL in the record, then the implementation should be agnostic to your preferred data model

Agree. Just %s/Marc21/your data model/g and %s/bibedit/your data editor/g.

Just bear with me poor librarian who only has The Data Model(tm) (singular) that for some reason it is called Marc21. Nothing else relevant or in sight for the forseeable future.

What I wanted to say is that the fancy interfaces that are not accessible in your data editor are basically inaccessible in our experience.

Additionally, my librarians tell me that editing Marc is fast but clicking through buttons and masks usually is slow. So I'd strongly vote for making stuff available through the editor in their usual language. (Most of the time this is just a MLE element showing Marc tags and subfields in free form and some non-proporional font.)

@nharraud
Copy link
Member

nharraud commented Aug 2, 2016

First a question: If this design is accepted, would it replace the existing code or be an alternative implementation, giving to overlay developers the possibility to choose one or the other?

And 3 different notes:

1/ It seems to me that this proposal describes how the permissions are represented in the record, but not how they would be checked in Invenio-access's code.

We have a requirement for B2Share which will probably be overkill for most other Invenio services but maybe it can work with the design suggested by @egabancho.
In B2Share some actions on the REST API can require one or more permissions depending.

Example:
We have a publication status field in our deposits. To publish a record the user changes this field's value from draft or submitted to published. This will trigger the creation of the final record.
Thus a PATCH action can either:

  • modify the deposit's metadata
  • change the state of the deposit.
  • do both
    And the tricky part is that the permissions are not the same for changing the state of the deposit and to modify the its metadata. A user can have one and not the other, none or both.

2/ @aw-bib I see one problem if the users start modifying permissions: access rights are given for specific reasons. Sometime multiple reasons give the same access right (example: user is owner and is in a specific group of users). By manipulating permissions instead of the original reasons, the result can be extremely difficult to understand or change programmatically (ex: during migrations).

3/ Another thing which could be useful, but is more a nice to have, is to somehow be able to recompute why a given user has a given permission.
Here is an example of how salesforce UI looks like when explaining permissions.

@egabancho
Copy link
Member Author

When musing about these topics, it may be useful to ponder W3C WebAccessControl practices that e.g. Fedora Commons uses.

👍

1/ It seems to me that this proposal describes how the permissions are represented in the record, but not how they would be checked in Invenio-access's code.

@nharraud we have a small proposal in our overlay, is this what you meant?

@aw-bib
Copy link

aw-bib commented Aug 3, 2016

@nharraud

I see one problem if the users start modifying permissions: access rights are given for specific reasons.

Sure. I do not think about exposing access rights to every user as I would not expose bibedit to them. However, I have three kinds of users in mind (actually we have four).

  • Real Users. In our case at join² these are scientists, secretaries and so on. People who originally submit stuff to the database by some fancy web form (aka websubmit or webdeposit etc.)
  • Priviledge users. These people still only have the same ability as users, but they can approve a record to make it public.
  • Librarians. So to say second level users. Trained in what they do in cataloguing.
  • Programmers.

The number of people in these ranks decrease dramatically from top to bottom.

By manipulating permissions instead of the original reasons, the result can be extremely difficult to understand or change programmatically

I see your point. Basically what I have in mind is in our workflow:

  • websubmit by users: they see a predefined form that hides all the technicalities. They only get a pretty limited subset of valid fields and there is a programmatic logic that would assign rights to the records and stuff. (At join² this is basically done by websubmit-functions that implement what upstream would become the workflow.) These users would never ever get an interface to change permissions or do complex things beyond what is allowed in the GUI. They live in the GUI: they may use any button or field, but they are restricted to the buttons and fields. So: all rights assignments and what not is done programmatically.
  • The same is true for the more priviledge users. Just the automatically assigned rights change.
  • My librarians act as sort of privileged users as well. They use this very same GUI for >90% of their tasks, ie. they see nothing else than the normal user. They do the same things, it's just in our worflow their "submit" again changes some access rights (e.g. they can approve an attached document to be public.)

However, for this <10% of their tasks, ie. once my fellow libraians (and only they have access to it) open bibedit I would like to give them the possibility to see and assign rights if needs be. I see your point that this may get messy, however, at this stage I have trained people who know what they do. And they'd usually only fix some issue that arose.

In my world bibedit is the low level editor and it should allow low level by having access to the necessary stuff. So, Basically, I'd like to see sort of full access on the record level in an interface easier than ipyhton.

Why? There are tasks I could easily give to a librarian to fix that otherwise require a programmer to the same end. We are always quite a bit short of the latter. (Not that we have too many librarians either.) So, it can be much easier and faster to fix 500 records manually than getting someone writing the python for it. (Frankly: till about ~1000 records I tend not to write code, they are faster.) If there is an interface they understand. One of the worst issues in Invenio1.x bibedit e.g. is it's entire inability to handle bibdocs and their permissions. By assigning records to collections we can work around the limitations of not accessing the rights directly. However, it is as @egabancho describes, sometimes quite obscure what rights result from which assignment.

@nharraud
Copy link
Member

nharraud commented Nov 4, 2016

@egabancho What is the status of this issue?
Would this new feature be added to invenio-access or in a separate module?
Maybe we can change the milestone to I3B-API-Docs as this issue includes probably a new API and the current "I3B-Milestone" contains only this issue.

@lnielsen lnielsen modified the milestones: v1.1, someday Jun 14, 2017
@lnielsen
Copy link
Member

I'm adding milestone v1.1 to deal with this shortly after a first final v3 release. In principle this is not such much about Invenio-Access but more about defining actions and default permission factories in Invenio-Records.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants