
Research Data Management Process

Research data management involves the active organisation and maintenance of data throughout the research process, and suitable archiving of the data at the project’s completion. It is an ongoing activity throughout the data lifecycle.


Data analysis tools/software

There are a number of tools and software packages that one can choose from to analyse data. Such tools include the following, to name a few:

  • Datapine is popular business intelligence software focused on delivering simple yet powerful analysis features to both beginners and advanced users who need a fast and reliable online data analysis solution.
  • Python is extremely accessible to code in compared with other popular languages such as Java, and its relatively easy syntax makes it popular among users looking for an open-source solution with simple coding processes. In data analysis, Python is used for data crawling, cleaning, modelling, and constructing analysis algorithms based on business scenarios. One of its best features is its user-friendliness: as a high-level language, Python abstracts away low-level details such as system architecture and memory management, so programmers do not need to handle them directly.
  • SQL is a programming language used to manage and query data held in relational databases, and it is particularly effective at handling structured data. It is highly popular in the data science community and is one of the tools analysts use across a variety of business cases and data scenarios.
  • ETL (extract, transform, load) is a process used by companies of every size across the world; as a business grows, chances are you will need to extract, transform and load data into another database to be able to analyse it and build queries. There are several core types of ETL tools, such as batch ETL, real-time ETL, and cloud-based ETL, each with its own specifications and features that suit different business needs. These tools are used by analysts who take part in the more technical processes of data management within a company; one of the best-known examples is Talend.
  • Spreadsheets are one of the most traditional forms of data analysis.
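As a minimal sketch of the Python data-cleaning and analysis workflow described above, the snippet below uses only the standard library; the readings are invented for illustration, with `None` standing in for missing values that a cleaning step removes.

```python
import statistics

# Invented raw readings; None marks missing values typical of raw data
raw = [12.0, None, 15.5, 14.0, None, 13.5]

# Cleaning step: drop the missing values
clean = [x for x in raw if x is not None]

# Simple analysis: mean and standard deviation of the cleaned data
mean = statistics.mean(clean)    # 13.75
spread = statistics.stdev(clean)
print(mean, round(spread, 2))
```

In practice this kind of pipeline is usually built with libraries such as pandas, but the shape is the same: load, clean, then model or summarise.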

Excel needs a category of its own, since this powerful tool has been in the hands of analysts for a very long time. Often considered a traditional form of analysis, Excel is still widely used across the globe.
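The SQL approach described above can also be sketched from Python using the built-in sqlite3 module. The table and values below are invented for illustration; the query groups structured rows and averages them, the kind of task SQL handles well.

```python
import sqlite3

# Create an in-memory database with a small, invented sample table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sample TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("A", 1.5), ("A", 2.5), ("B", 4.0)],
)

# A structured query: average value per sample group
rows = conn.execute(
    "SELECT sample, AVG(value) FROM measurements GROUP BY sample ORDER BY sample"
).fetchall()
print(rows)  # [('A', 2.0), ('B', 4.0)]
conn.close()
```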

Citing Data

It is very important to acknowledge the sources that the researcher has used in their research. Information that should be included in a citation is as follows: the author/creator, the title, the publisher, the date of publication, the DOI, the page numbers if it is an article, and the date accessed if it is an online article.
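A minimal sketch of assembling those elements into a citation string is shown below. The helper function, its ordering of elements, and the example values are all invented for illustration; adapt the format to the citation style your discipline requires.

```python
def format_data_citation(creator, year, title, publisher, doi):
    """Assemble a simple dataset citation from the elements listed above.

    Hypothetical helper: the element ordering loosely follows common
    dataset citation practice, not any one official style.
    """
    return f"{creator} ({year}). {title}. {publisher}. https://doi.org/{doi}"

# Invented example values, for illustration only
citation = format_data_citation(
    "Dlamini, T.", 2023, "Rainfall observations, 2010-2020",
    "Example Data Repository", "10.0000/example.1234",
)
print(citation)
```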


Metadata

Metadata is made up of a number of elements, which can be categorised by the functions they support. A metadata standard will normally support a number of defined functions and will specify the elements that make these possible. A metadata standard may support some or all of the following functions:

Descriptive Metadata enables identification, location and retrieval of information resources by users, often including the use of controlled vocabularies for classification and indexing and links to related resources.

Technical Metadata describes the technical processes used to produce, or required to use a digital object.

Administrative Metadata is used to manage administrative aspects of the digital object such as intellectual property rights and acquisition. Administrative Metadata also documents information concerning the creation, alteration and version control of the metadata itself. This is sometimes known as meta-metadata!

Use Metadata manages user access, user tracking and multi-versioning information.

Preservation Metadata, amongst other things, documents actions that have been undertaken to preserve a digital resource, such as migrations and checksum calculations.
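To make the categories above concrete, here is a sketch of a single metadata record grouping fields by function. The field names loosely follow Dublin Core and the values are invented; real standards define their own element sets.

```python
# A sketch of a metadata record combining several of the functions above.
# Field names loosely follow Dublin Core; all values are invented.
record = {
    "descriptive": {
        "title": "Rainfall observations, 2010-2020",
        "creator": "Dlamini, T.",
        "subject": ["rainfall", "climatology"],  # controlled vocabulary terms
    },
    "technical": {
        "format": "text/csv",
        "software": "Python 3 csv module",       # software used to produce it
    },
    "administrative": {
        "rights": "CC-BY-4.0",                   # intellectual property rights
        "version": "1.2",                        # version control of the metadata itself
    },
    "preservation": {
        "checksum_md5": "d41d8cd98f00b204e9800998ecf8427e",  # example value
    },
}
print(sorted(record))
```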

Data Publishing

Data publishing is the act of releasing research data in published form for use by others. During the publishing process, the publisher can also choose whether the data will be private or publicly accessible.

Preserving Data

Data preservation refers to maintaining access to data and files over time. For data to be preserved, at a minimum it must be stored in a secure location, stored across multiple locations, and saved in file formats that are likely to have the greatest utility in the future.
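One preservation action mentioned above, checksum calculation, can be sketched with Python's built-in hashlib. Recording a checksum when a file is deposited lets you verify later that the preserved copy has not been corrupted or altered.

```python
import hashlib

def fixity_checksum(path, chunk_size=65536):
    """Compute a SHA-256 checksum for a file, reading it in chunks.

    Sketch of a fixity check: comparing this value against a previously
    recorded one confirms the file is unchanged.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```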

Data Storage

There are a number of data storage methods that one can use to store data. How (in what format) and where you store your data is very important. You can store your data in any of the following:

  • Flash drive arrays
  • Hybrid flash arrays
  • Hybrid cloud storage
  • Backup software (backing up your files is also an important part of storage)
  • Cloud storage (where you can access your data remotely)
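The backup principle above, keeping a verified copy in a second location, can be sketched in a few lines of standard-library Python. The function below is a hypothetical illustration; real backup software adds scheduling, versioning, and off-site copies.

```python
import hashlib
import shutil
from pathlib import Path

def backup_and_verify(source, backup_dir):
    """Copy a file to a second location and verify the copy by checksum.

    A sketch of the 'store across multiple locations' principle, not a
    substitute for dedicated backup software.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps

    def sha256(path):
        return hashlib.sha256(path.read_bytes()).hexdigest()

    # Verify the copy matches the original before trusting it as a backup
    if sha256(source) != sha256(target):
        raise IOError(f"Backup of {source} failed verification")
    return target
```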