Norwegian roadmap for Research Infrastructure
E-infrastructure
Electronic infrastructure (e-infrastructure) consists of ICT-based infrastructures that enable advanced, collaboration-oriented research. Data infrastructures are relevant to most subject areas, and there is an increasing need for them to become interoperable across geographical and subject boundaries.
Research objectives
Examples of e-infrastructure include high-capacity data networks and associated services such as authentication and authorisation, tools for efficient workflow and software for simulations and data analysis. E-infrastructure also refers to digital registries and databases for storing large amounts of data as well as computational resources for High Performance Computing (HPC). HPC is an important tool for addressing major scientific and societal challenges, including in the fields of marine research, climate research and health research.
E-infrastructure that promotes data sharing and reuse is often referred to as data infrastructures.
E-infrastructure is especially important for research that requires high-performance computing or where simulation and analysis activities generate massive amounts of data. E-infrastructure must also be able to handle, securely and efficiently, sensitive data that cannot or must not be freely shared, and specially adapted data platforms are needed for this purpose.
The objectives for e-infrastructure are divided into three parts:
- deliver services for research projects and other research infrastructures;
- deliver area-specific infrastructure;
- provide secure storage and accessibility of data in line with the international FAIR principles [1].
The magnitude of the resources invested in procuring and analysing data makes it necessary to ensure that the data is protected, that its value is enhanced through cataloguing and generation of metadata and that it is made accessible to other users in keeping with the FAIR principles. Thus, access to, and the effective use of, e-infrastructure in all subject areas is a cornerstone of data-intensive research.
Better access to research data will enhance the quality of research in that results can be validated and verified in a more effective manner and data sets can be used in new ways and in combination with other data sets. Open access to research data helps to prevent unnecessary duplication of results or efforts, and paves the way for more wide-ranging interdisciplinary research. Open access to research data is a national and international priority area. In 2017, the Ministry of Education and Research launched a national strategy on access to and sharing of research data. From 2021, the European Open Science Cloud (EOSC) will be important in realising Horizon Europe’s open research goals. Norwegian institutions participate in EOSC Nordic, which coordinates relevant EOSC initiatives in the Nordic and Baltic countries.
The Research Council’s Policy on Open Access to Research Data emphasises making research data more accessible to relevant users, on equal terms, and at the lowest possible cost while adhering to the international FAIR principles for added value from data. This is also an express goal of the Research Council’s Policy on Open Science, applicable from 2020. Projects receiving Research Council funding need to draw up a data management plan as a means of providing a framework for the secure management of research data [2], not just during the lifetime of a project, but also for future reuse. The policy guidelines apply, with a few exceptions, to all data generated under projects that receive funding from the Research Council. Various e-infrastructures have developed digital tools that research projects can use to generate data management plans.
Existing research infrastructure
Norwegian research institutions currently have the benefit of cost-effective, coordinated e-infrastructures for research and higher education in many subject areas. UNINETT AS develops and operates the Norwegian high-performance network for research and education, connecting over 200 Norwegian institutions and over 300,000 users and linking them to international research networks. UNINETT AS is a non-commercial enterprise owned by the Ministry of Education and Research. Affiliation with the research network forms the basis for most other services provided by UNINETT AS.
UNINETT Sigma2 AS, a subsidiary of UNINETT AS, is responsible for the procurement, operation and further development of the generic national e-infrastructure for high-performance computing and data storage in Norway. In the period from 2016 to 2019, the four national high-performance computing facilities acquired in 2012 were phased out and replaced by two new computational facilities (E-INFRA at UNINETT Sigma2 AS). The National e-Infrastructure for Research Data (NIRD) is directly connected to the computing facilities, which enables more efficient delivery of data analysis and visualisation services. NIRD provides storage resources that are upgraded annually, data security through dual-site storage, support for multiple storage protocols and migration to third-party cloud service providers.
Through close cooperation with Norway’s four oldest universities, Sigma2 offers several related high-performance computing and data storage services to Norwegian universities and university colleges, as well as to other publicly funded research organisations. Sigma2 also heads and coordinates Norway’s participation in international collaborations on e-infrastructure, such as the Nordic e-Infrastructure Collaboration (NeIC), the Partnership for Advanced Computing in Europe (PRACE) and the European Data Infrastructure (EUDAT).
In certain areas where management of sensitive personal data is involved, solutions are needed to meet requirements concerning secure data while also providing researchers access to the data for the purpose of analysis. Solutions of this type are provided by Services for Sensitive Data (TSD), among others, which is operated and developed in a collaboration between Sigma2 and the University of Oslo. Funding from the National Financing Initiative for Research Infrastructure has been allocated for investments in new equipment for both computing and data storage facilities for sensitive personal data (TSD).
The Norwegian Centre for Research Data (NSD) archives and prepares data for dissemination to research groups, both nationally and internationally, and develops technological solutions to provide open access to research data within the research sector. NSD serves as Data Protection Officer for the nation’s universities, most Norwegian university colleges and numerous health trusts and research institutes. In 2003, NSD was established as a limited company owned by the Ministry of Education and Research. NSD has received funding from the National Financing Initiative for Research Infrastructure for the Norwegian Open Research Data Infrastructure (NORDi), a solutions provider for research data storage and access.
Another generic data infrastructure of note is UiT Open Research Data, an open research data archive established at UiT The Arctic University of Norway. This infrastructure is available to UiT researchers and other institutions as well as individual researchers. In addition, BIBSYS BIRD is a generic tool for storing, documenting, sharing and publishing research data, developed by BIBSYS (now part of UNIT, the Norwegian Directorate for ICT and Joint Services in Higher Education & Research) in cooperation with BI Norwegian Business School.
There are a number of other subject area-specific data infrastructures that provide services targeting specific needs among different communities of users. These subject-specific data infrastructures are adapted for data that is to be made accessible within the various subject areas. In order to achieve the highest possible reuse of previously collected data, it is imperative to have good infrastructures that make it easy to locate relevant data and link together different data sets. More information about subject-specific data infrastructures is provided in the various area strategies.
Need for new infrastructure, upgrades and coordination
Steadily improving measurement and sensor technologies, more extensive measurements, and more advanced data analysis tools all add to the need for high-performance computing and storage of massive amounts of research data. This does not just apply to traditionally data-intensive subject areas; an increasingly wide range of research fields is generating or using very large amounts of data. A more data-driven research sector combined with a tendency towards more open research entails an increasing need for good infrastructure that enables access to and reuse of data. This also encompasses better utilisation of data collected for management purposes that would be of great value to research if made accessible. There are also less distinct boundaries between traditional subject areas in the increasingly digitalised research sector, and this flow of data between the subject areas provides new opportunities for innovative research. Investment in efficient, secure data infrastructure that safeguards inherent capabilities while also ensuring interoperability between datasets will contribute to this flow of data.
Machine learning and artificial intelligence are research fields where ICT researchers and researchers from other subjects and disciplines meet, for example in the context of precision medicine, economics and finance, societal security and media and consumer research. Artificial intelligence R&D requires research infrastructure with large storage and processing capacity that satisfies the requirements for protection of privacy, security and ownership of data and results. In particular, artificial intelligence, machine learning and deep learning often require a combination of modern processors with powerful data-parallel accelerators, e.g. Graphics Processing Unit (GPU) capacity, and the expertise to utilise them, which is not part of a traditional high-performance computing facility.
Computing facilities need to be replaced approximately every four years due to the rapid technological development in HPC, and to ensure cost-efficient operation and provide cutting-edge research services. Sigma2 works continuously on replacing and upgrading both the computing and data storage facilities used in Norwegian research. Based on projections using historical data on demand and requests from new user groups, Sigma2 calculates the amount of computing capacity the new facilities should have in order to meet the needs of Norwegian researchers. In recent years, the need for e-infrastructure services for research communities in most subject areas has increased and we expect a further increase in demand in the coming years [3]. To secure opportunities for Norwegian researchers going forward, Norway also participates in international collaborations, such as the European collaboration on establishing joint HPC resources through the EuroHPC Joint Undertaking.
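As a purely illustrative sketch of a demand projection of the kind described above, the Python snippet below fits a straight line to past yearly demand and extrapolates it to a target year, then adds capacity requested by new user groups. The roadmap does not state how Sigma2 actually performs its calculation, so the linear model, the function names `fit_line` and `project_demand`, the unit and all numbers are assumptions made only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project_demand(years, demand, target_year, new_user_requests=0.0):
    """Extrapolate the historical trend to target_year and add
    capacity requested by new user groups (hypothetical model)."""
    slope, intercept = fit_line(years, demand)
    return slope * target_year + intercept + new_user_requests


# Hypothetical figures in an arbitrary unit (e.g. million core hours);
# these are NOT real Sigma2 statistics.
years = [2019, 2020, 2021, 2022]
demand = [100.0, 120.0, 140.0, 160.0]

projected = project_demand(years, demand, target_year=2026,
                           new_user_requests=30.0)
print(f"Projected demand in 2026: {projected:.1f}")
```

In practice a projection would probably also need to account for non-linear growth and for uncertainty in the requests from new user groups; the straight-line fit is used here only because it is the simplest trend model.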
Coordination of initiatives and collaboration between actors will be essential in the future development of e-infrastructure and data infrastructures. It makes sense to see data infrastructures as part of a digital ecosystem in which equipment, components and services are decentralised but together form a unified whole within a common framework. In this context, data infrastructures that meet the EOSC (European Open Science Cloud) requirements will be particularly relevant. Such infrastructures will contribute to improving the flow of data across national borders and disciplines.
Interface with other areas
The Research Council encourages cooperation between actors in establishing services for data management in order to capitalise as much as possible on prior investments. Such cooperation may take the form of project collaboration or direct use of existing services. Collaborations of this type are not limited to national efforts. In some areas, the best approach will be to cooperate on international data infrastructures, as exemplified by the many ESFRI projects on data management.
The Research Council will not normally contribute funding for investment in, and operation of, computing resources for data-intensive computing unless the investment has been coordinated with, or comes entirely from, Sigma2. Research groups with a need for computing resources are advised to contact Sigma2 at the outset in order to clarify whether their needs can be met through existing or planned Sigma2 investments. For applications for new national research infrastructure requiring storage or computing resources, the Research Council expects the Project Owner to establish a dialogue with Sigma2 on how these needs can be met and to incorporate the costs into the budget for the infrastructure being sought.
Project | Status |
---|---|
E-INFRA at UNINETT Sigma2 – a national e-Infrastructure for science | Under establishment/in operation |
Project | Status |
---|---|
eX3 – Experimental Infrastructure for Exploration of Exascale Computing | Under establishment/in operation |
Microdata.no – Microdata Platform for Norwegian and International Research and Analysis | Under establishment/in operation |
NORDi – Norwegian Open Research Data Infrastructure | Under establishment/in operation |
[1] The international FAIR principles were developed as a set of guidelines to facilitate increased value from data. FAIR is an acronym for the words findable, accessible, interoperable and reusable. Data and metadata should be findable, accessible and reusable, and it should be possible to process them by machine.
[2] Research data that can be made accessible is not only the dataset itself, but may also include metadata, method descriptions, algorithms, code and the like.