CINECA publishes recommendations on authentication and authorisation
In spring 2019, CINECA WP2 interviewed the cohorts participating in the project about how they authenticate and authorise a researcher who wants to access data. The goal was to study the commonalities and differences in the cohorts’ practices and to make recommendations for future work items.
The interview focused on three aspects:
Authentication, i.e. how the cohorts validate a researcher’s identity when the researcher applies for data access for secondary use
Data access authorisation, i.e. how a researcher applies for controlled-access data and how the data access committee (DAC) reviews the application
Access control enforcement, i.e. how researchers can use their access rights, for instance to download the datasets or to work with them in a secure environment where the data are made available
The project published seven recommendations addressed to cohorts, federated authentication providers, DAC tools providers and the GA4GH’s DURI workstream, which prepares the relevant standards for global use.
Support federated authentication. Cohorts and DAC tools providers should support federated authentication, which reduces the number of credentials researchers need to manage. Using home organisation credentials also increases the trustworthiness of the credentials and of the attributes attached to them. Federated authentication systems, such as ELIXIR AAI, are already in place.
Develop and improve electronic tools to manage the DAC process. DAC tools providers should develop tools that let researchers apply for data access electronically and let data access committees review the applications. The tools should cover the whole process, from filing and amending a data access application to approving it, signing a data access agreement and delivering the data access permissions to the computing environments where the access rights can be used. During the process, the tools should enable each co-applicant and a representative of their home organisation to sign the data access application or agreement.
Define a claim to express a researcher’s home organisation’s signing official. Most cohorts expect the data access application to be signed both by the principal investigator and by an authorised representative of their home organisation, often called a signing official. To support automation of the process described above, GA4GH should define, and federated authentication providers should deploy, a claim stating that its holder is the signing official of their home organisation. The signing official could then log in to the electronic tool described above and approve the data access application or agreement on behalf of the applicant’s home organisation.
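As an illustration only, a DAC tool could check such a claim roughly as follows. The claim name `signing_official_of`, the `Application` class and the signature labels are hypothetical; none of them is defined by GA4GH or by the recommendation itself.

```python
from dataclasses import dataclass, field

# Hypothetical claim name: its value is assumed to be the organisation
# for which the logged-in user acts as signing official.
SIGNING_OFFICIAL_CLAIM = "signing_official_of"


@dataclass
class Application:
    applicant_id: str
    applicant_organisation: str
    signatures: set = field(default_factory=set)  # roles that have signed


def sign_as_official(application: Application, user_claims: dict) -> bool:
    """Record the organisation's signature if the user holds the claim for it."""
    if user_claims.get(SIGNING_OFFICIAL_CLAIM) != application.applicant_organisation:
        return False
    application.signatures.add("signing_official")
    return True


def is_fully_signed(application: Application) -> bool:
    """An application needs both the PI's and the signing official's signature."""
    return {"principal_investigator", "signing_official"} <= application.signatures
```

In this sketch a signing official of a different organisation cannot sign, because the claim value must match the applicant's home organisation.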
Develop a way to manage a researcher’s fresh affiliation information. All cohorts reported that access to datasets is granted to a researcher as a representative of their home organisation and that the access rights must be revoked if the researcher later leaves that organisation. Federated authentication providers should develop ways to keep a researcher’s affiliation information fresh, and DAC tools providers should bind that affiliation to the researcher’s data access permissions. Computing environments can then consume the two together and revoke a researcher’s access to data promptly when they leave their home organisation.
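A minimal sketch of how a computing environment might combine the two pieces of information is shown below. The class names, the fields and the 30-day freshness window are assumptions made for illustration; the recommendation does not prescribe any of them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AffiliationAssertion:
    researcher_id: str
    organisation: str
    asserted_at: datetime  # when the home organisation last confirmed it


@dataclass(frozen=True)
class AccessPermission:
    researcher_id: str
    dataset_id: str
    organisation: str  # organisation on whose behalf access was granted


# Hypothetical policy value: how old an affiliation assertion may be.
MAX_AFFILIATION_AGE = timedelta(days=30)


def is_access_allowed(
    permission: AccessPermission,
    affiliation: Optional[AffiliationAssertion],
    now: datetime,
    max_age: timedelta = MAX_AFFILIATION_AGE,
) -> bool:
    """Allow access only while a fresh, matching affiliation is on record."""
    if affiliation is None:
        return False
    same_person = affiliation.researcher_id == permission.researcher_id
    same_org = affiliation.organisation == permission.organisation
    fresh = now - affiliation.asserted_at <= max_age
    return same_person and same_org and fresh
```

Because the check runs on each access decision, a permission stops working once the home organisation no longer confirms the affiliation, without the DAC having to act.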
Develop a way to express data access permissions to complex data objects. The interview revealed that the cohorts have deployed three strategies for data access granularity: (1) permissions are granted to all samples in a dataset, (2) permissions are granted to particular variables (attributes) of all samples, or (3) permissions are granted to particular variables of particular samples (for instance, a researcher has permission to access age, body mass index and genotype for all samples where the patient has a positive diagnosis of diabetes and has consented to the use of their data for non-profit research). The GA4GH should agree on, and DAC tools providers should deploy, an interoperable syntax and semantics for expressing a researcher’s permission to access a complex controlled-access object. This would facilitate programmatic access control enforcement in computing environments that hold replicas of cohort data, while control of the data access permissions remains centralised in a DAC.
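The three strategies can be sketched with a single permission structure in which both the variable list and the sample conditions are optional. This is not a proposed standard syntax; the `Permission` class, its field names and the equality-only conditions are simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, List, Mapping, Optional

Sample = Mapping[str, Any]


@dataclass
class Permission:
    dataset_id: str
    # None means "all variables" (strategy 1).
    variables: Optional[FrozenSet[str]] = None
    # None means "all samples" (strategies 1 and 2); otherwise every
    # listed variable must equal the given value (strategy 3).
    sample_conditions: Optional[Mapping[str, Any]] = None


def visible_view(permission: Permission, samples: List[Sample]) -> List[dict]:
    """Return only the samples and variables the permission covers."""
    view = []
    for sample in samples:
        conditions = permission.sample_conditions or {}
        if any(sample.get(key) != value for key, value in conditions.items()):
            continue  # sample is outside the permitted subset
        if permission.variables is None:
            view.append(dict(sample))
        else:
            view.append({k: v for k, v in sample.items() if k in permission.variables})
    return view
```

For the diabetes example in the text, `variables` would hold age, body mass index and genotype, and `sample_conditions` would require the diagnosis and the consent category. A real syntax would need richer conditions (ranges, ontology terms) than the equality tests assumed here.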
Develop a data model to describe the relation of projects, researchers and data access permissions. GA4GH should develop, and federated authentication providers should deploy, a data model describing research projects, researchers’ roles in the projects and the data access permissions granted to the combination of the two. When researchers log in to the computing environment, they would need to select the project they are working on during that session, enabling the computing environment to apply dynamic separation of duties (for instance, if a researcher has permission to dataset X in the context of project A and to dataset Y in the context of project B, the computing environment could block their access to dataset Y while they are active in project A).
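The dataset X / dataset Y example can be sketched as a session that is bound to one active project. The `Grant` and `Session` classes below are hypothetical and stand in for whatever data model GA4GH might eventually define.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Grant:
    researcher_id: str
    project_id: str
    dataset_id: str


class Session:
    """A login session bound to the one project selected at login."""

    def __init__(self, researcher_id: str, active_project: str, grants: Iterable[Grant]):
        self.researcher_id = researcher_id
        self.active_project = active_project
        self.grants = list(grants)

    def can_access(self, dataset_id: str) -> bool:
        # Only grants tied to the active project count in this session.
        return any(
            g.researcher_id == self.researcher_id
            and g.project_id == self.active_project
            and g.dataset_id == dataset_id
            for g in self.grants
        )
```

With grants for (project A, dataset X) and (project B, dataset Y), a session opened for project A returns `True` for dataset X and `False` for dataset Y, even though the same researcher holds both grants.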
Adopt common qualifications for registered access. Cohorts are considering registered access (e.g. beacons) as a lightweight data access model, for instance for access to summary-level and frequency-level data. Cohorts should adopt GA4GH’s criteria for researchers who want to access registered-access data. Without widely adopted criteria, approaches would fragment, hindering cohort interoperability and complicating data access for researchers. The criteria should cover a definition of a researcher in good standing (bona fide researcher) and the attestations they need to make. Federated authentication providers should adopt tools to manage researchers’ registered access qualifications and deliver them to cohorts.
The recommendations were presented to the GA4GH DURI workstream at the 7th GA4GH plenary in Boston on 21 October 2019.