FAQ - Frequently Asked Questions


General

Can I change files in already published uploads?

Once your record has been published, the files in the record can no longer be changed.

Do I need to register for Rodare?

No, all HZDR employees can access Rodare using their existing HZDR account. Once you click Log in with HZDR, you will be redirected to the HZDR identity provider. There you need to authorize Rodare to access your user information. Your username, email address and full name will be transferred.

Which license should I choose?

For open access or embargoed records you need to select a suitable license. We suggest using a Creative Commons license for your dataset. Use the License Chooser to get more information about Creative Commons licenses. The HZDR approved licenses for research data are:

These licenses should not be used for software. For open-source software, different licenses are commonly used, e.g. GNU GPLv3, MIT License or Apache License 2.0. The HZDR approved and suggested licenses for open-source software are: Also have a look at choosealicense.com. It is a good starting point for learning more about licensing open-source software.

In the upload form, just start typing "recommended" to get the list of approved licenses for software and data.

Can anyone create communities?

The HZDR institutes as well as the research topics energy, health and matter have been available as communities since Rodare became publicly available. Apart from that, anybody can create their own community with their own curation policy. Each community has an OAI-PMH interface for metadata harvesting.
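
As a rough illustration of metadata harvesting, the following sketch builds an OAI-PMH ListRecords request URL for one community. The base URL and the set name are assumptions made for this example, not confirmed Rodare values. The verb, metadataPrefix and set parameters are defined by the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; check the Rodare documentation for the real one.
OAI_BASE_URL = "https://rodare.example.org/oai2d"


def list_records_url(community_set, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL restricted to one set."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": community_set,
    })
    return f"{OAI_BASE_URL}?{query}"


# "user-energy" is a made-up set name used only for this example.
print(list_records_url("user-energy"))
```

A harvester would send a GET request to this URL and follow any resumption tokens returned in the response.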

I want to download a very large file. What is the best way to do this?

We suggest using a download manager that supports pausing and resuming downloads. For Windows or Mac OS there is Free Download Manager. On Linux-based systems you can have a look at the multi-platform download manager uGet.
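
Download managers typically resume an interrupted download with an HTTP range request. The sketch below shows the header such a request would carry. It assumes that the server honours Range requests, which this FAQ does not state explicitly; the function name is hypothetical.

```python
def resume_headers(bytes_on_disk):
    """Return the HTTP headers needed to continue a partial download."""
    if bytes_on_disk <= 0:
        # Nothing downloaded yet: request the whole file.
        return {}
    # Ask the server for everything from the first missing byte onwards.
    return {"Range": f"bytes={bytes_on_disk}-"}


print(resume_headers(1_048_576))
```

In practice, bytes_on_disk would be the current size of the partially downloaded file, for example obtained with os.path.getsize.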

Where does the name Rodare come from?

Rodare is an acronym and stands for Rossendorf Data Repository.

How to provide access to records under embargoed or restricted access conditions in a double-blind peer-review process?

Read the tutorial which explains in detail how to anonymously give access to records for reviewers in a double-blind peer-review process.

Policies

What are the file and data size limits in Rodare?

We currently accept up to 50 GB per single file and 100 GB per dataset (you can have multiple files and datasets). If you would like to upload larger files, please contact us, and we will do our best to help you. Please be aware that we cannot offer infinite space.
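
If you want to check a dataset against these limits before uploading, a small script along the following lines may help. It is a sketch that assumes decimal gigabytes (10^9 bytes); whether the limits are enforced in GB or GiB is not specified here. The function name is hypothetical.

```python
GB = 10**9  # assumption: decimal gigabytes
MAX_FILE_BYTES = 50 * GB
MAX_DATASET_BYTES = 100 * GB


def check_upload(file_sizes):
    """Return a list of problems for a dataset given its file sizes in bytes."""
    problems = []
    for index, size in enumerate(file_sizes):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} exceeds the per-file limit")
    if sum(file_sizes) > MAX_DATASET_BYTES:
        problems.append("dataset exceeds the per-dataset limit")
    return problems


print(check_upload([40 * GB, 40 * GB]))  # no problems: []
```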

Which type of data can I upload to Rodare?

All research output from all fields of research at HZDR is welcome. The upload form allows you to choose between dataset, images, video/audio and software.

Can I consider my data to be safe on Rodare?

Yes, your data is stored in the HZDR data centre. Both metadata and files are kept in multiple independent locations. Data files are backed up to tape on a nightly basis.

Which restrictions apply if I wish to publish in Rodare?

Before publishing your files you must make sure that your content is suitable for open dissemination, and that it complies with these terms and applicable laws, including, but not limited to, privacy, data protection and intellectual property rights.

Technical

How can I change the files of my published record?

Once your record is published you are no longer able to change the uploaded files. To edit or add files, please create a new version. To do so, visit the record's metadata landing page and click the green New Version button in the top-right corner. You will be redirected to a prefilled upload form. You will only be able to publish your new version if at least one file was updated or added.

How can I edit the metadata of a published record?

To edit the metadata of your published record, visit the record's landing page and click the orange Edit button in the top-right corner of the page. You will be redirected to the upload form. There you can edit almost all of the record's metadata. Once you are done, click Save and then Publish.

I want to get access to a dataset with embargoed or restricted access, how can I do that?

Visit the landing page of the record you want to get access to. Click the blue Request Access... button and enter the required information.

  • Ensure that you fulfil the conditions under which the owner grants access to the upload.
  • Fill in the form including a proper justification.
  • You will receive an email to confirm your request.
  • You will receive an email once the owner either grants or denies your request. If you are granted access this email contains a secret link you can use to access the files.


DOI versioning

What is DOI versioning?

DOI versioning allows you to

  • add, edit or update files after they have been published.
  • cite a specific version of a record.
  • cite all versions of a record.

When should I create a new version?

You should create a new version if you wish to add, edit or update files of your record after it has been published.

How does DOI versioning work?

The first time you publish a record in Rodare, two DOIs will be registered:

  • one DOI representing the specific version of your record.
  • another DOI representing all versions of your record.
Afterwards, Rodare registers one DOI for every new version of your upload.

Let us demonstrate this by an example of a software package. The software has two releases (v1.0 and v1.1) on Rodare. The following DOIs would have been registered:

  • v1.0 (specific version): 10.14278/rodare.100
  • v1.1 (specific version): 10.14278/rodare.152
  • Concept (all versions): 10.14278/rodare.99
The first two DOIs represent the specific versions of the software, the third DOI represents all the versions of the given Rodare entry. The DOI representing a specific version is called a Version DOI, the DOI representing all versions is called Concept DOI. Technically, both are just normal DOIs. The Concept DOI currently resolves to the latest version of your record.
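
The relationship between the Concept DOI and the Version DOIs in this example can be written down as a small lookup. This is only a model of the behaviour described above, not Rodare's implementation; the function name and the release_order parameter are hypothetical.

```python
# DOIs taken from the example above.
CONCEPT_DOI = "10.14278/rodare.99"
VERSION_DOIS = {
    "v1.0": "10.14278/rodare.100",
    "v1.1": "10.14278/rodare.152",
}


def resolve(doi, release_order=("v1.0", "v1.1")):
    """Return the Version DOI that a given DOI points to."""
    if doi == CONCEPT_DOI:
        # The Concept DOI resolves to the latest version.
        return VERSION_DOIS[release_order[-1]]
    if doi in VERSION_DOIS.values():
        # A Version DOI always points to itself.
        return doi
    raise KeyError(f"unknown DOI: {doi}")


print(resolve(CONCEPT_DOI))  # 10.14278/rodare.152
```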

Which DOI should I use in citations?

You should usually use the DOI for the specific version of your record. Other researchers wish to be able to properly reproduce your research output and therefore need access to the exact research artifacts. When creating citations, Rodare always uses the DOI for the specific version.

The Concept DOI can be used in situations where you wish to link to an evolving research artifact. For example, you can add a badge to your GitHub/GitLab repository linking to the latest version of your record in Rodare.

I only want to change the title of my upload, do I need to create a new version?

No, you do not need to create a new version. You still can edit the metadata of your published record. You should only create a new version if you need to update the files of your record.

Are the files duplicated for every new version of a record?

No, they are not. If you add a 100 kB README file to your 100 GB dataset, only the additional 100 kB are stored on disk. The Invenio 3 framework on which Rodare is built handles the underlying storage efficiently.
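
One way to picture this is a store in which each version only references file content that is already present. The following toy model assumes that content is identified by its checksum; it is a simplified illustration and not how Invenio actually implements its storage.

```python
import hashlib


class Store:
    """Toy storage in which identical content is kept only once."""

    def __init__(self):
        self.blobs = {}      # checksum -> file content
        self.versions = []   # each version: {filename: checksum}

    def publish(self, files):
        """Publish a version given a {filename: bytes} mapping."""
        manifest = {}
        for name, content in files.items():
            checksum = hashlib.sha256(content).hexdigest()
            # Content that is already stored is referenced, not copied.
            self.blobs.setdefault(checksum, content)
            manifest[name] = checksum
        self.versions.append(manifest)

    def bytes_on_disk(self):
        return sum(len(content) for content in self.blobs.values())


store = Store()
store.publish({"data.bin": b"x" * 1000})
store.publish({"data.bin": b"x" * 1000, "README": b"hello"})
print(store.bytes_on_disk())  # 1005
```

The second version adds only the five bytes of the README, even though it lists both files.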


Background upload

What is meant by background upload?

Uploading files via web browser is a well-known feature of many services, such as Dropbox or Google. Rodare also provides file upload via your web browser. However, uploading via web browser can be error-prone, especially for large files. Furthermore, much data is already located on central storage systems in the HZDR data center, and for such data uploading via web browser is not the best option. Therefore, we implemented the background upload feature in Rodare. Background upload transfers your files via SFTP from the storage system to the Rodare servers. You just need to wait until you receive the notification mail.

How can I upload a file via background upload?

Visit the upload user interface. At the top of the form there is the upload form. Click the blue Background upload button. A window will open showing the available remote servers. If you have not yet connected your account with the remote server, you will be asked to connect it. If your account is connected, you can browse the filesystem. Navigate to the file you wish to upload and click the blue Upload file button beneath the file. Once the background task is finished, you will be notified by email.

I have files on a server which is not yet available via Rodare. What can I do?

If you have files which are not accessible via the available remote servers, please contact us. Please provide detailed information about which server needs to be made available. Also provide information about the administrator of the system in case we need to contact that person to properly test the integration.

Does the file browser collect all elements in directories with thousands of files and folders?

The file browser will collect at most 10,000 elements within one directory. If you wish to upload a specific file from such a large directory, please move it first or zip the whole directory.
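
The effect of such a cap can be sketched as follows. The function is hypothetical and only illustrates the idea of stopping after a fixed number of entries; it is not taken from the Rodare code base.

```python
from itertools import islice

MAX_ELEMENTS = 10_000  # limit stated in this FAQ
_MISSING = object()    # sentinel marking an exhausted listing


def list_capped(entries, limit=MAX_ELEMENTS):
    """Return at most `limit` entries and whether the listing was cut off."""
    iterator = iter(entries)
    listed = list(islice(iterator, limit))
    # If one more entry exists, the directory was larger than the limit.
    truncated = next(iterator, _MISSING) is not _MISSING
    return listed, truncated


listed, truncated = list_capped(f"file_{i}.dat" for i in range(10_005))
print(len(listed), truncated)  # 10000 True
```

Files beyond the cap never appear in the listing, which is why moving the file or zipping the directory is suggested.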

How can I solve common errors on my own?

  • SSH authentication failed: this error commonly occurs if the public key was removed from authorized_keys on the remote server.
    • Possible solution: Visit the Remote Server settings page and copy the SSH public key. Add it to ~/.ssh/authorized_keys and try again. Contact us if the problem persists.
  • The given file is too large: By default, the maximum size is currently 50 GiB per file and 100 GiB per dataset. Please contact us and request an increase of the limits for your upload. Do not forget to mention the ID of your deposit.

How does background upload work?

Before you are able to use the background upload feature you need to connect your account with a registered SFTP-capable remote server. Connect your account here. When connecting your account the following steps are performed:

  1. You enter your username and password for the remote server. This information is used to connect to the remote server once to transfer an SSH public key. The information is not stored and only used for this specific purpose.
  2. Rodare generates an SSH keypair. The SSH private key is stored encrypted in the database. The public key is transferred to the remote server.
  3. Rodare is now able to use the private key for SSH key-based authentication to perform SFTP operations.
When disconnecting your account, the private key is deleted from the database and the public key is removed from the remote server.
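
On the remote server, connecting and disconnecting can be pictured as adding a line to and removing a line from ~/.ssh/authorized_keys. The sketch below works on the file content as a string. How Rodare actually transfers and removes the key is not described here, and the key strings are placeholders.

```python
def add_key(authorized_keys, public_key):
    """Return the file content with the key appended if it is missing."""
    lines = authorized_keys.splitlines()
    if public_key not in lines:
        lines.append(public_key)
    return "\n".join(lines) + "\n"


def remove_key(authorized_keys, public_key):
    """Return the file content without the given key."""
    lines = [line for line in authorized_keys.splitlines() if line != public_key]
    return "\n".join(lines) + "\n" if lines else ""


# Placeholder keys, not real key material.
existing = "ssh-rsa AAAAother user@host\n"
rodare_key = "ssh-ed25519 AAAAexample rodare"

connected = add_key(existing, rodare_key)
disconnected = remove_key(connected, rodare_key)
print(disconnected == existing)  # True
```

Keys belonging to other tools or users stay untouched in both operations.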

How can I upload files from e.g. Dropbox, ...?

At the top of the upload form, click the blue Background upload button. A window with two tabs will open. Click on the URL tab. You can now see two input fields, one for the URL and one for a filename. In the URL field, enter the URL of the file you wish to upload. By default, the filename is extracted from the last part of the URL. If you would like to use another name, enter your filename into the second input field. Finally, click the green Start Background Upload button. Then wait until you receive a notification mail.
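
Extracting a default filename from the last part of a URL could look roughly like this. Whether Rodare ignores the query string and decodes percent-escapes in the same way is an assumption; the function name is hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_filename(url):
    """Derive a filename from the last path segment of a URL."""
    # urlsplit separates the path from the query string and fragment.
    path = urlsplit(url).path
    return unquote(posixpath.basename(path))


print(default_filename("https://example.org/files/results%20v2.csv?dl=1"))
# results v2.csv
```

If the URL ends with a slash, the result is an empty string, in which case entering a filename by hand is the safer choice.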


GitHub integration

What do I need to do to activate the integration for my repository on GitHub?

  1. Connect your account first.
  2. Register your software in ROBIS using the publication form Software in the HZDR data repository RODARE. Remember the numeric ID you receive.
  3. Add a file called .rodare.json in the root directory of your repository on GitHub. An example can be found here. Important: Change the value of pub_id to the ID you received from ROBIS. If you do not specify this ID, the publication in RODARE will fail.
  4. Once you added the .rodare.json file in the root of your repository in GitHub toggle the switch to activate preservation of your software.
  5. Create a new release on GitHub. RODARE will then automatically download a .zip-ball of each new release and register a DOI for you.
If you need help, please feel free to contact us!
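
Before creating a release, you may want to check locally that your .rodare.json contains a usable pub_id. The following sketch only looks at the pub_id key named above. The remaining structure of the file is not covered, and whether the ID is stored as a JSON number or as a string is an assumption, so both are accepted here.

```python
import json


def read_pub_id(rodare_json_text):
    """Return the numeric pub_id from the content of a .rodare.json file."""
    config = json.loads(rodare_json_text)
    pub_id = config.get("pub_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(pub_id, int) and not isinstance(pub_id, bool):
        return pub_id
    if isinstance(pub_id, str) and pub_id.isdigit():
        return int(pub_id)
    raise ValueError("pub_id must be the numeric ID received from ROBIS")


# 12345 is a made-up ID used only for this example.
print(read_pub_id('{"pub_id": 12345}'))  # 12345
```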

My organizational repository does not show up on the GitHub list.

In order to see and archive your organizational repositories on Rodare you will need to have "Admin" permissions on said repository, either as an Admin of the organization or an Admin of one of your organization's repositories. Additionally, please make sure that the OAuth application on GitHub is granting permissions not only to your personal repositories but also to your organizational ones - to verify that go to your GitHub OAuth settings in your profile, and click on the Rodare application to see more details. Make sure that Rodare is given access (green tick) to your organization under "Organization access".

After that, navigate to your Rodare GitHub settings page and click the "Sync now" button at the top.


Usage Statistics

What information is collected?

Two types of events are tracked:

  1. Visits to a record page.
  2. Download of a file.
For both event types, four values are tracked:
  1. Visitor: An anonymized visitor ID.
  2. Type of visitor: a) Human, b) machine or c) robot
  3. Country: The request's country of origin (based on the IP address).
  4. Referrer: The referrer domain.

What is a view?

A user (human or machine) visiting a record. Robots and double-clicks are excluded.

What is a unique view?

A unique view is defined as one or more visits by a user within a 1-hour time frame. If the same user accesses a record multiple times within the same time frame, it is counted as one unique view.

What is a download?

A user (human or machine) downloading a file from a record, excluding double-clicks and robots. If a record has multiple files and you download all of them, each file counts as one download.

What is a unique download?

A unique download is defined as one or more downloads of files from a single record within a 1-hour time frame. If a user downloads multiple files of a single record within the same time frame, it is counted as one unique download.
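
The counting rule for unique views and unique downloads can be sketched as follows. The sketch assumes that time frames are aligned to clock hours, which this FAQ does not specify; the visitor IDs and record IDs are made up.

```python
from datetime import datetime


def count_unique(events):
    """Count unique events from (visitor_id, record_id, timestamp) tuples.

    All events by one visitor on one record within the same clock hour
    collapse into a single unique event.
    """
    seen = set()
    for visitor_id, record_id, timestamp in events:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        seen.add((visitor_id, record_id, hour))
    return len(seen)


events = [
    ("a1", 100, datetime(2018, 7, 20, 9, 5)),
    ("a1", 100, datetime(2018, 7, 20, 9, 40)),   # same hour: not counted again
    ("a1", 100, datetime(2018, 7, 20, 10, 15)),  # next hour: counted
    ("b2", 100, datetime(2018, 7, 20, 9, 10)),   # other visitor: counted
]
print(count_unique(events))  # 3
```

Four raw events therefore yield three unique events in this example.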

What is downloaded data volume?

The total data volume that has been downloaded for all files in a record by users (human or machine), excluding robots. If a user cancels a download midway, the full file size is still counted towards the total downloaded volume.

When are usage statistics updated?

Every hour.

Can I see the most viewed records in RODARE?

You can sort the search results by the "most viewed" filter.

How are robots identified?

Requests made by robots (e.g. crawlers, bots) are filtered out from the usage statistics. Robots are detected based on the COUNTER-Robots module made by Invenio. This Python library provides a tiny API to check if a given user agent string is considered a robot according to the Code of Practice for Research Data and the COUNTER Code of Practice.

How are robots and machines differentiated?

A machine request is a request initiated by a human user, e.g. a script downloading a file via curl and performing an analysis on the data. A robot is considered an automated request, e.g. by a monitoring engine or search engine crawlers.
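
The three visitor types can be illustrated with a small classifier based on the user agent string. The patterns below are simplified examples chosen for this sketch; they are not the lists used by COUNTER-Robots or by Rodare.

```python
import re

# Illustrative patterns only; the real lists are far more extensive.
ROBOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)
MACHINE_PATTERN = re.compile(r"^(curl|wget|python-requests)/", re.IGNORECASE)


def classify(user_agent):
    """Classify a user agent string as 'robot', 'machine' or 'human'."""
    if ROBOT_PATTERN.search(user_agent):
        return "robot"
    if MACHINE_PATTERN.search(user_agent):
        return "machine"
    return "human"


print(classify("curl/7.58.0"))  # machine
```

Only requests classified as robot would be dropped from the statistics; machine requests are counted like human ones.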

Can I deactivate usage statistics?

No, it is not possible to opt out. Usage statistics are fully anonymized and are collected on the server side.

How are users anonymized?

For each event (download, view) an anonymized visitor ID is generated. The anonymized visitor ID of a user changes every 24 hours. Thus, a user visiting a record on two different days will have two different anonymized visitor IDs. It is necessary to track this ID to count unique views and downloads.

How is the anonymized visitor ID generated?

The anonymized visitor ID is generated from a personal identifier, e.g.:

  1. a user ID (your user ID when you are logged in to RODARE),
  2. a session ID,
  3. or an IP address and your browser's user-agent string.
The personal identifier is combined with a random text value (called a salt), and afterwards a one-way cryptographic hash function is applied in order to scramble the information. The salt is thrown away and regenerated every 24 hours. Using a salt and discarding it afterwards ensures that the anonymized visitor ID cannot be traced back to the personal identifier.
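
The salting and hashing step can be sketched in a few lines. SHA-256, the salt length and the function names are assumptions made for this example; the FAQ does not name the hash function that is actually used.

```python
import hashlib
import secrets


def new_salt():
    """Create a random salt; the real salt is regenerated every 24 hours."""
    return secrets.token_hex(16)


def anonymized_visitor_id(personal_identifier, salt):
    """Hash a personal identifier together with the current salt."""
    data = (salt + personal_identifier).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


salt_day_1 = new_salt()
salt_day_2 = new_salt()

# "user-42" stands in for a user ID, session ID or IP address plus user agent.
same_day = anonymized_visitor_id("user-42", salt_day_1)
next_day = anonymized_visitor_id("user-42", salt_day_2)
print(same_day != next_day)  # True
```

Within one day the same identifier always maps to the same ID, which is what makes counting unique views and downloads possible. Once the salt is discarded, IDs from different days can no longer be linked.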

Which time period is covered by the usage statistics feature?

Usage statistics have been captured since July 20, 2018. The short time period between the launch of RODARE in April 2018 and the launch of the statistics feature cannot be incorporated, since the information was not captured in any system before.
