Background
Improving the systematic management of research data is expected to be a journey, because:
Adoption will occur at different rates in different institutions;
The goals and approaches will evolve as understanding improves; and
Different beneficiaries will have different interests in the gains provided.
Multiple meetings held by the RDCC have confirmed a growing interest in addressing the challenge. There is strong agreement that current research data management practice contains a paradox, as follows.
Research Data Management Plans (RDMPs: information provided by researchers) created at research initiation are not used to inform subsequent decision making.
The contrary would be expected. The continuous improvement of a ‘container of information around data, about that data’, targeting the capture of value from that data over the course of its lifetime, must be able to inform the decision making required by that data’s life cycles.
RDMPs previously focussed on the data layer. It is anticipated that decisions at the collection layer can reduce the labour required to manage data and facilitate a more cohesive response across a variety of institutional approaches to the management of research data. To support meeting these macro-perspective challenges, an RDMP-2.0 should focus on collection layer management.
An RDMP-2.0 process is envisaged, whereby research data management would transition from a plan-based (waterfall) approach to a continuous-improvement approach.
The RDMP-2.0 process would:
Translate best practice improvements arising from institutional efforts into common national practice by coordinating aligned activities;
Realise FAIR principles for data management through the harmonisation of metadata and improving the coherence of the multiple systems supporting access to that data; and
Improve efficiency and compliance within support “pillars” (archives, data privacy, eResearch, ethics, IT, library, records, the research office and security):
By using a machine-driven approach which leverages common metadata to inform better data management decisions as data progresses through its life cycles; and
By providing actionable reporting relating cost to benefit.
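As a purely illustrative sketch of the machine-driven approach described above, the following Python shows how common collection-level metadata might drive a management decision and a simple cost-to-benefit figure. Everything in it is an assumption made for illustration: the field names, the one-year idle threshold, the action labels and the use of recorded reuse as a stand-in for benefit are hypothetical and are not part of any agreed RDMP-2.0 design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CollectionRecord:
    """Hypothetical collection-level metadata; field names are illustrative only."""
    collection_id: str
    last_accessed: date
    retention_until: date
    annual_storage_cost: float  # currency units per year (assumed measure of cost)
    reuse_count: int            # recorded reuses (assumed stand-in for benefit)


def recommend_action(record: CollectionRecord, today: date) -> str:
    """Return an illustrative recommendation derived from metadata alone."""
    # Past its retention date and never reused: flag for human review.
    if today > record.retention_until and record.reuse_count == 0:
        return "review-for-disposal"
    # Not accessed for more than a year (assumed threshold): cheaper storage.
    if (today - record.last_accessed).days > 365:
        return "move-to-lower-cost-storage"
    return "keep-in-place"


def cost_per_reuse(record: CollectionRecord) -> Optional[float]:
    """Relate cost to benefit as annual storage cost per recorded reuse."""
    if record.reuse_count == 0:
        return None  # no recorded benefit, so the ratio is undefined
    return record.annual_storage_cost / record.reuse_count
```

In practice the rules and thresholds would be set by each pillar and institution; the point of the sketch is only that the same metadata record can inform both a decision and a report.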
Challenges
-
Improvement is dependent on the implementation of systems and processes within and between institutions. This in turn depends heavily on the involvement of institutions and of all pillars. The showbag activity started by the RDCC supports this.
─────────────────────────────────────────────
Dimensioning and then measuring current state
Over time, delivering trends, hot spots, etc.
This is a repetitive, ongoing activity.
Implementation of information capture around data
Enabling smart decision making in all the pillars that support research data management within a university.
This is likely to be a bespoke activity, dependent on the organisational structures, information systems and technologies in place at each university.
Architecting a data migration framework (that meets policy and access realities)
In which duplication is reduced, more data is held on lower-cost solutions, and the business case for low-durability options can be determined.
This is a collaborative activity aimed at creating a common framework that can still be applied to the more bespoke systems in each university.
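One way to picture the duplication-reduction element of such a framework is content-based duplicate detection. The sketch below is a minimal illustration only, assuming that data objects can be read as bytes and compared by SHA-256 checksum; the function name and input shape are hypothetical and do not describe any existing institutional system.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicates(objects: Dict[str, bytes]) -> List[List[str]]:
    """Group object names whose content has the same SHA-256 digest.

    `objects` maps an illustrative object name to its content. Only groups
    with more than one member (that is, duplicates) are returned.
    """
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for name, content in objects.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```

A real framework would also need to account for policy and access constraints before any copy is removed or moved, as the heading above notes.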
-
Improvement is dependent on the knowledge and capabilities of researchers. We need to define more effective life cycle concepts and reach community agreement around them. This work needs to be carried forward with the participation of researchers and of domain-specific data investments.
─────────────────────────────────────────────
Articulation of ‘normal practice’
Starting with a small number of exemplary data streams, identify the decision points at which changes in access, durability and use occur. Broaden to an increasing variety, volume and complexity of data over time.
Data stream selection could be arranged to ensure a spread across FOR codes, data sources, scales of data produced and reuse potential. Data streams related to ARDC-supported collections can be included.
This is a stepwise refinement activity, carried out research domain by research domain or discipline by discipline. It should be expected that some learnings from specific cases will have broad value across many other cases.
Maturing of Smart Decision Making in research data management
Articulating statements of functionality, extracted from bespoke implementations, to create technology-independent guidelines and how-tos for automating decision making.
This is a knowledge capture, encoding and dissemination process, which builds a stock of knowledge regarding effective and affordable practice in research data management.
-
Improvement is dependent on the perceived value of research data. We need to understand the measures and decisions required in assigning value, clarify the goals of research data management with respect to the value and cost of data, and cross-inform and connect the two preceding challenges.
─────────────────────────────────────────────
Development of RDMP-2.0
Identify or develop reference implementations of components of RDMP-2.0.
Develop implementation and practitioner guidelines.
Establish compliance expectations and standards for RDM processes in universities.
This is a collaborative standards development process.
The means for making improvements would need to be developed. Some indicative concepts are set out below.
Concept
Many government policy, university and national infrastructure goals could be advanced. For example: improved research integrity; more certain access to data underpinning published research; university-held prestige collections; the flow of data into science agency collections; and support for the data life cycles associated with data in all of the NCRIS capabilities.
An overall summary of the themes clustering around the development of the RDMP-2.0 is depicted below.