About the FDP | Funder Data Platform

Please note that this page is very much a work in progress. It is currently used to document the most important information for users.

2. Information architecture

On this page you can read about how the different components of the Funder Data Platform are connected, what their functions are, and how they enable funders to control data sharing. The figure below outlines an abstraction of these elements. Further information on each element and its interactions is listed below the figure.

Figure: Illustration of connections between the main components of the Funder Data Platform.

2.1. Main Structure

The Funder Data Platform is built around two main components:

A PostgreSQL database, and
A JupyterHub workspace

The database is split into two schemas:

fdp - contains administrative data and metadata (e.g. users, organisations, collaboration agreements, and data policies)
data - contains the actual research data tables and their associated access views

Data access is always mediated by project-defined views and role-based access, ensuring that data is only shared within clearly defined boundaries.

2.2. Details

2.2.1 Organisations

Organisations are the highest-level unit of the Funder Data Platform. All other elements are ultimately connected to an organisation. All organisations in the Funder Data Platform are trusted, official organisations that have signed legally binding terms with the Funder Data Platform, outlining what they can do and what they can expect from the system. Organisations are typically either research funding organisations or research producing organisations, such as universities. Every organisation must have at least one administrator, who maintains its users and stays in contact with the FDP administrators.

2.2.2 Users

All users in the Funder Data Platform are members of exactly one organisation. Users are either regular members or organisation administrators. All users can create projects and data sets, but only administrators can invite new users into their organisation.
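The rules above can be sketched as a small data model. This is only an illustration and not the platform's actual schema: the class names, fields, and the `invite` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    organisation: str        # a user belongs to exactly one organisation
    is_admin: bool = False   # organisation administrator vs regular member


@dataclass
class Organisation:
    name: str
    users: list = field(default_factory=list)

    def invite(self, inviter: User, new_user: User) -> None:
        """Add new_user, but only if inviter is an administrator of this organisation."""
        if inviter not in self.users or not inviter.is_admin:
            raise PermissionError("only organisation administrators can invite users")
        self.users.append(new_user)


# Hypothetical example data: an organisation with one administrator.
admin = User("alice", "Example Funder", is_admin=True)
org = Organisation("Example Funder", users=[admin])
org.invite(admin, User("bob", "Example Funder"))
```

In this sketch, a regular member calling `invite` gets a `PermissionError`, which mirrors the rule that only administrators can bring new users into their organisation.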

2.2.3 Data sets

A data set is an abstract entity that does not itself contain data. Instead, it describes one or more data tables: what they contain and how they can be accessed. Some of these descriptions are formal, machine-readable definitions that regulate access, while others are human-readable descriptions that help users understand the characteristics of the linked data.

Populating data sets with data tables is currently done in collaboration with the FDP team, in a hands-on process. This ensures correct formatting and access regulation, and shifts some of the work away from the organisations.

2.2.4 Projects

Projects can in many ways be seen as the working directories of users: this is where data sets are shared and where access is granted to specific data views (see below). This is also where researchers and funders agree on the terms of their collaboration. These terms are flexible: you can work with a common collaboration agreement for all parties in a project, or you can define bilateral agreements outside of the Funder Data Platform and simply refer to them in your project definition.

2.2.5 Project members

Project owners can assign per-project user roles, which are separate from the organisational roles described above. Each role permits different actions:

Project owners - can invite new users to the project, approve/reject project membership applications, modify user roles, attach data sets that they have created or that belong to their organisation, and access the workspace folder and any granted data views.
Investigators - can do the same as project owners, except for member-related actions (inviting, changing roles etc.).
Org - can attach data sets that they have created or that belong to their organisation, and grant/revoke access to views.
Members - cannot do anything in the project. This is a temporary role that applies until a project owner assigns the correct role to a member.
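The role list above can be read as a permission matrix. The sketch below is one possible encoding, not the platform's implementation: the role keys, the `Action` names, and the `is_allowed` helper are hypothetical. It assumes that investigators have the same permissions as owners minus the member-related actions.

```python
from enum import Enum


class Action(Enum):
    MANAGE_MEMBERS = "invite users, approve/reject applications, modify roles"
    ATTACH_DATA_SETS = "attach own or organisation data sets"
    GRANT_VIEW_ACCESS = "grant/revoke access to views"
    ACCESS_WORKSPACE = "access the workspace folder and granted data views"


# Assumed mapping, derived from the role descriptions above.
ROLE_ACTIONS = {
    "owner": {Action.MANAGE_MEMBERS, Action.ATTACH_DATA_SETS, Action.ACCESS_WORKSPACE},
    "investigator": {Action.ATTACH_DATA_SETS, Action.ACCESS_WORKSPACE},
    "org": {Action.ATTACH_DATA_SETS, Action.GRANT_VIEW_ACCESS},
    "member": set(),  # temporary role with no permissions
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True if the given project role permits the action."""
    return action in ROLE_ACTIONS.get(role, set())
```

For example, `is_allowed("investigator", Action.MANAGE_MEMBERS)` is false, while `is_allowed("org", Action.GRANT_VIEW_ACCESS)` is true. Unknown roles fall back to an empty permission set.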

2.2.6 Data tables

Data tables contain the data that users and organisations supply. They cannot be created by users; instead, the FDP administrators create them upon request. Users can make data available through their own systems (if in place) or through a secure file upload system offered by the FDP administrators. We create the tables as supplied, with some optimisation and sanitisation, to ensure that data are imported consistently and correctly, with appropriate data types.

No one except the administrators can access data tables directly. The tables are therefore unaffected by any user operations and maintain their integrity, regardless of how many projects they are attached to.

2.2.7 Data views

Read access to data tables is granted through views. A "view" is a technical SQL term for a stored query that defines what data can be read from a given table. In other words, a data table can have multiple associated views that each expose specific columns or make only slices of the observations available. One project may only require grant amounts, while another needs the names of applicants; views allow us to make only the relevant data available on a per-project basis.
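As a minimal sketch of this idea, the example below defines two views over one table: one exposes only selected columns, the other only a slice of the rows. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs anywhere. The table, column, and view names, as well as the data, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grants (
    grant_id  INTEGER PRIMARY KEY,
    applicant TEXT,
    amount    INTEGER,
    year      INTEGER
);
INSERT INTO grants VALUES
    (1, 'A. Jensen', 500000, 2021),
    (2, 'B. Olsen',  750000, 2022),
    (3, 'C. Berg',   300000, 2022);

-- Project A only needs grant amounts: expose two columns, all rows.
CREATE VIEW project_a_grants AS
SELECT grant_id, amount FROM grants;

-- Project B needs applicant names, but only for 2022: a slice of the rows.
CREATE VIEW project_b_grants AS
SELECT grant_id, applicant FROM grants WHERE year = 2022;
""")

amounts = conn.execute("SELECT * FROM project_a_grants ORDER BY grant_id").fetchall()
names = conn.execute("SELECT * FROM project_b_grants ORDER BY grant_id").fetchall()
```

Here `amounts` never contains applicant names and `names` never contains amounts, even though both views read from the same underlying table.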

Views also allow us to join tables before making their data available. If a funder has tables with applicants, grants, and publications, all linked through a grant ID, we can join these tables into one, and let users read the combined data.
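A joined view of this kind could look roughly like the sketch below. It again uses `sqlite3` in place of PostgreSQL, and all table names, column names, and rows are hypothetical. Only the shared grant ID is taken from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applicants   (grant_id INTEGER, applicant TEXT);
CREATE TABLE grants       (grant_id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE publications (grant_id INTEGER, title TEXT);

INSERT INTO applicants   VALUES (1, 'A. Jensen'), (2, 'B. Olsen');
INSERT INTO grants       VALUES (1, 500000), (2, 750000);
INSERT INTO publications VALUES (1, 'Paper one'), (1, 'Paper two');

-- One combined view: every grant with its applicant and any publications.
-- LEFT JOIN keeps grants that have no publications yet.
CREATE VIEW grant_overview AS
SELECT g.grant_id, a.applicant, g.amount, p.title
FROM grants AS g
JOIN applicants AS a ON a.grant_id = g.grant_id
LEFT JOIN publications AS p ON p.grant_id = g.grant_id;
""")

rows = conn.execute(
    "SELECT * FROM grant_overview ORDER BY grant_id, title"
).fetchall()
```

Users who read `grant_overview` see one combined result and do not need access to the three underlying tables.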

2.2.8 JupyterHub

In the JupyterHub workspace, users have access to Python notebooks, which can be used to read and analyse available data. We set project-specific access keys for data sets, which allow users to read the associated views into their analysis. Without the key, any attempt to read the data in a view gives an empty result.
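The behaviour "no key, empty result" can be illustrated with a view that only returns rows when the session has presented the right key. This page does not describe how the platform implements its keys, so the sketch below is purely an assumption: the `project_key` and `session_key` tables, the `read_view` helper, and the key value are all hypothetical, and `sqlite3` stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grants (grant_id INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO grants VALUES (1, 500000), (2, 750000);

-- Hypothetical bookkeeping: the key issued to the project, and the key
-- that the current session has presented (if any).
CREATE TABLE project_key (access_key TEXT);
CREATE TABLE session_key (access_key TEXT);
INSERT INTO project_key VALUES ('s3cret');

-- The view returns rows only when the session key matches the project key.
CREATE VIEW project_grants AS
SELECT g.grant_id, g.amount
FROM grants AS g
WHERE EXISTS (
    SELECT 1
    FROM project_key AS p
    JOIN session_key AS s ON s.access_key = p.access_key
);
""")


def read_view(key):
    """Present a key (or None) for this session, then read the view."""
    conn.execute("DELETE FROM session_key")
    if key is not None:
        conn.execute("INSERT INTO session_key VALUES (?)", (key,))
    return conn.execute(
        "SELECT * FROM project_grants ORDER BY grant_id"
    ).fetchall()
```

In this sketch, `read_view(None)` and `read_view("wrong")` return an empty list rather than raising an error, while `read_view("s3cret")` returns the rows of the view.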

All users have a personal work folder ("your workspace") and a folder per project, which is shared between all project owners and investigators. Other project members do not have access to this folder, which also means "Org" members are not able to see data of other members (only their own).
