Data Management
Dataset Provision and Citation in the Digital Age
In the Cluster of Excellence UWA, many datasets are provided through the research data repository (RDR) of the University of Hamburg. However, submitting files (or archives of files) to a research data repository is a necessary but not a sufficient condition for fulfilling the FAIR principles. Data must also be made transparent to human users so that they can evaluate its usefulness for a particular project. Suitable presentations of research data are therefore required, rather than listings of directories in archives or generic table displays. In addition, presentations of datasets must ensure that single data items, which are persistent in RDR, can be cited by researchers and others. Citations must be traceable, i.e., given a citation DOI, the presentation of the cited data item should be accessible with minimum effort.
Therefore, in the Cluster of Excellence UWA, scholars work together with computer scientists to adapt generic viewers to project-specific requirements or to develop project-specific viewers for datasets. Scholars can generate archives (so-called CSMC files, which are similar to DOCX files) for the data acquired in their scholarly work. They can then submit these CSMC files to RDR, and the project-specific views of the data are shown on the web directly from the RDR system of the University of Hamburg. The so-called CSMC dataset generators are developed in co-operation with computer scientists; once a generator is available, scholars can use it without having to contact a computer scientist. Once a CSMC dataset file has been submitted to RDR, the returned DOIs can be used to access the data presentations on the web.
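The comparison with DOCX files suggests a packaged archive. As a minimal sketch, and assuming (this is not stated above) that a CSMC file is a ZIP container in the way a DOCX file is, a scholar could check this assumption and look at the packaged members with Python's standard library. The function name list_package_members is hypothetical and is not part of any CSMC tooling.

```python
import zipfile
from typing import List, Optional


def list_package_members(path: str) -> Optional[List[str]]:
    """Return the member names of a package file, or None if it is not a ZIP.

    ASSUMPTION: CSMC files are ZIP containers, as DOCX files are.
    The text only says the two formats are "similar", so a None
    result simply means the assumption does not hold for this file.
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        # namelist() returns the stored paths in archive order.
        return archive.namelist()
```

Calling list_package_members on a downloaded file either lists what the package contains or returns None, so the sketch does not fail if the format turns out to be different.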
As an example of work in the Cluster of Excellence UWA, a published CSMC dataset from the NETamil project is available in RDR at https://doi.org/10.25592/mdq0-7x79. When you click on the DOI link, you will see an archived CSMC file submitted to the RDR of the University of Hamburg.
After you click on View Data (below the feather icon; scroll down and look under Files if required), the data will be displayed as intended by the scholars who created the dataset.
The link behind the View Data button can also be put on a web page as shown here.
The system for creating a CSMC file in a project runs on a web server and is developed in co-operation with a computer scientist. Afterwards, scholars can generate new dataset packages on their own. With the software technology developed by UWA in Research Field F (the CSMC App), humanities researchers can also check the data in their CSMC files locally on their own computer first. Scholars without internet access can benefit from UWA datasets as well, provided that they have access to a CSMC file and the installer file for the CSMC App (e.g., sent on a DVD). The same holds for UWA researchers working without an internet connection (e.g., on excavation trips).
The CSMC App can be downloaded for Windows and macOS (Apple Silicon).
Scholars can easily make a CSMC file available to the public by submitting it to the RDR system of the University of Hamburg. RDR automatically recognizes a CSMC submission and provides a View Data button whenever a user retrieves such a dataset on the web. Thus, by submitting data to the RDR of the University of Hamburg, scholars automatically receive a representation of their data on the web in a form that they have defined themselves (with some initial co-operation with computer scientists for creating project-specific views based on generic components and for making the CSMC file generator available).
With CSMC files, any kind of data can be visualised within RDR or with the CSMC App. As another example, the next figure shows a CSMC file containing a dataset from FTIR spectroscopy in the CSMC App.
See https://staging-rdm.fdr.uni-hamburg.de/records/rp3g0-6zy30 for the RDR entry.
Another example, concerning relational data, is available at https://doi.org/10.25592/5mx6-1k15.
After clicking on View Data (below the feather icon), the data will be displayed as the creating scholars intended.
Filtering and navigation facilities for the dataset are provided in this case. Clicking on a row shows a detailed view of the selected data item.
The link behind the View Data button can also be integrated into a web page, as shown here.
External users interested in specific data from RDR datasets can also cite an individual data record (and not just the entire archive via the RDR system). See, e.g., the citation links https://doi.org/10.25592/mdq0-7x79#8 or https://doi.org/10.25592/5mx6-1k15#1033 for citation DOIs. Simply click on the DOIs or copy them into the address bar of your browser, and when the dataset is shown in RDR, click on the button Show Citation (to be found just below the feather icon). To obtain a citation link, click on a data item, which is then shown in detail, and click the button Copy Citation.
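The two citation links above share a visible pattern: the DOI of the dataset, followed by "#" and an item number. As a hedged sketch of how such a link could be taken apart in a script (for instance, to group cited items by dataset), the following uses only Python's standard library. The function name split_citation_link is hypothetical, and the reading of the fragment as an item identifier is inferred from the two examples, not from RDR documentation.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DOI_RESOLVER_HOST = "doi.org"


def split_citation_link(link: str) -> Tuple[str, Optional[str]]:
    """Split a citation link into (dataset DOI, item identifier or None).

    ASSUMPTION: the part after '#' identifies a single data item,
    as in the example links https://doi.org/10.25592/mdq0-7x79#8.
    """
    parts = urlsplit(link)
    if parts.netloc.lower() != DOI_RESOLVER_HOST:
        raise ValueError(f"not a doi.org link: {link}")
    doi = parts.path.lstrip("/")    # e.g. "10.25592/mdq0-7x79"
    item = parts.fragment or None   # e.g. "8"; None if no '#...' part
    return doi, item


if __name__ == "__main__":
    print(split_citation_link("https://doi.org/10.25592/mdq0-7x79#8"))
    print(split_citation_link("https://doi.org/10.25592/5mx6-1k15#1033"))
```

In general, the fragment of a URL is not sent to the web server but is handled by the browser, so the item part of a citation link is presumably evaluated on the client side once the dataset page has been loaded.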
Citing datasets this way makes sense because data submitted to RDR is persistent and cannot be changed (one can only upload new versions, and the old ones are retained). By contrast, keeping data in a database with a web interface built for data presentation, as was often done in the past, does not lead to citable presentations: the content of a database can be changed at any time, so a citation may no longer point to the data that was originally cited.