There are several important elements to digital preservation, including data protection, backup, and archiving. In this lesson, these concepts are introduced and best practices are highlighted, with case-study examples of how things can go wrong. By exploring the logistical, technical, and policy implications of data preservation, participants will be able to identify their preservation needs and be ready to implement good data preservation practices by the end of the module.
Different types of new data may be created in the course of a project, for instance visualizations, plots, statistical outputs, or a new dataset created by integrating multiple source datasets. Whenever possible, document your workflow (the process used to clean, analyze, and visualize data), noting what data products are created at each step. Depending on the nature of the project, this documentation might take the form of a computer script, or it may be notes in a text file describing the process you used (i.e. process metadata). If workflows are preserved along with the data products, they can be re-executed, enabling those products to be reproduced.
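A scripted workflow can serve as its own process metadata. The sketch below is a minimal illustration, not a prescribed method: the field names, value ranges, and sample records are all hypothetical, and the step comments note which data product each stage creates.

```python
"""Workflow script that doubles as process metadata.

Step comments record which data product each stage creates.
Field names, ranges, and sample values are illustrative only.
"""

def clean(rows):
    # Step 1 -> data product: cleaned records.
    # Drop records with missing or physically implausible temperatures.
    return [r for r in rows
            if r.get("temp_c") not in (None, "")
            and -60.0 <= float(r["temp_c"]) <= 60.0]

def summarize(rows):
    # Step 2 -> data product: summary statistics derived from cleaned data.
    temps = [float(r["temp_c"]) for r in rows]
    return {"n": len(temps), "mean_temp_c": sum(temps) / len(temps)}

raw = [  # stand-in for records read from a raw data file
    {"temp_c": "12.1"}, {"temp_c": ""},
    {"temp_c": "999"}, {"temp_c": "14.3"},
]
cleaned = clean(raw)
summary = summarize(cleaned)
print(summary)
```

Because each step is a function with a documented product, the same script can be re-run later to regenerate the derived data from the raw inputs.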
When describing the process for creating derived data products, the following information should be included in the data documentation or the companion metadata file.
When searching for data, whether locally on one’s machine or in external repositories, one may use a variety of search terms. In addition, data are often housed in databases or clearinghouses where a query is required in order to access the data. To reproduce the search and obtain similar, if not identical, results, it is necessary to document which terms and queries were used.
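One lightweight way to document searches is to append each one to a log as it is run. The helper below is a hypothetical sketch (the function and field names are not from any particular tool); the point is simply that the exact source, query string, and result count are captured with a timestamp.

```python
import datetime
import json

def log_query(logfile, source, query, n_results):
    """Append one search record to a log file, one JSON object per line.

    Illustrative only: field names are assumptions, not a standard.
    """
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "source": source,        # repository or database searched
        "query": query,          # exact terms or query string used
        "n_results": n_results,  # number of results returned at the time
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

A line-per-record log like this can later be replayed to re-run each documented query against the same source.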
Follow the steps below to choose the most appropriate software to meet your needs:
Identify what you want to achieve (discover data, analyze data, write a paper, etc.)
Identify the necessary software features for your project (i.e. functional requirements)
Identify logistics features of the software that are required, such as licensing, cost, time constraints, user expertise, etc. (i.e. non-functional requirements)
Determine what software has been used by others with similar requirements
Ask around (yes, really); find out what people like
Find out what software your institution has licensed
Search the web (e.g. directory services, open source sites, forums)
Follow up with an independent assessment
Generate a list of software candidates
Evaluate the list; iterate back to Step 1 as needed
As feasible, try a few software candidates that seem promising
Outliers may not be the result of actual observations, but rather of errors in data collection, data recording, or other parts of the data life cycle. Statistical and graphical methods can be used to identify outliers for closer examination.
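One common statistical screen is the interquartile-range (IQR) rule: flag values falling outside [Q1 − k·IQR, Q3 + k·IQR], conventionally with k = 1.5. The sketch below is a minimal, dependency-free version of that rule; a flagged value is a candidate for review against field notes, not automatically an error.

```python
def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] for closer examination."""
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # Simple linear-interpolation quantile of the sorted data.
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

print(iqr_outliers([10, 12, 11, 13, 12, 95]))
```

Here the value 95 is flagged, prompting a check of the original data sheet, while a dataset with no extreme values yields an empty list.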
Understand the input geospatial data parameters, including scale, map projection, geographic datum, and resolution, when integrating data from multiple sources. Care should be taken to ensure that the geospatial parameters of the source datasets can be legitimately combined.
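A simple pre-combination check is to compare those parameters field by field across the source datasets. The sketch below assumes the parameters are held in plain metadata dicts (a hypothetical structure, not a standard API); any mismatches it reports should be reconciled, e.g. by reprojection or resampling, before the data are integrated.

```python
def incompatible_fields(a, b, fields=("projection", "datum", "resolution_m")):
    """Return the geospatial parameters on which two dataset
    descriptions disagree. `a` and `b` are plain metadata dicts
    (illustrative structure only)."""
    return [f for f in fields if a.get(f) != b.get(f)]

# Hypothetical parameter records for two source datasets.
site_a = {"projection": "UTM Zone 10N", "datum": "NAD83", "resolution_m": 30}
site_b = {"projection": "UTM Zone 10N", "datum": "WGS84", "resolution_m": 30}

print(incompatible_fields(site_a, site_b))  # the datum differs
```

In practice a GIS library would perform the actual datum transformation; the check here only makes the mismatch visible before combining.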
Understanding the types, processes, and frameworks of workflows and analyses is helpful for researchers seeking to understand more about research, how it was created, and what it may be used for. This lesson uses a subset of data analysis types to introduce reproducibility, iterative analysis, documentation, provenance and different types of processes. Described in more detail are the benefits of documenting and establishing informal (conceptual) and formal (executable) workflows.
Data citation is a key practice that supports the recognition of data creation as a primary research output rather than as a mere byproduct of research. Providing reliable access to research data should be a routine practice, similar to the practice of linking researchers to bibliographic references. After completing this lesson, participants should be able to define data citation and describe its benefits; identify the roles of various actors in supporting data citation; recognize common metadata elements and persistent data locators and describe the process for obtaining one; and summarize best practices for supporting data citation.
When entering data, common goals include creating data sets that are valid, that have gone through an established process to ensure quality, and that are organized and reusable. This lesson outlines best practices for creating data files. It details options for data entry and integration, and provides examples of processes used for data cleaning, organization, and manipulation.
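One way to build quality into data entry is to validate each row against a controlled vocabulary and plausible value ranges as it is entered. The sketch below is illustrative only: the species codes, column names, and range limits are hypothetical stand-ins for whatever rules a real project defines.

```python
VALID_SPECIES = {"PIPO", "PSME", "ABCO"}  # hypothetical controlled vocabulary

def validate_row(row):
    """Return a list of problems found in one data-entry row (empty if valid)."""
    problems = []
    if row.get("species") not in VALID_SPECIES:
        problems.append("unknown species code: %r" % row.get("species"))
    try:
        dbh = float(row.get("dbh_cm", ""))
        if not 0 < dbh < 1000:  # plausible range for diameter at breast height
            problems.append("dbh_cm out of range: %s" % dbh)
    except ValueError:
        problems.append("dbh_cm is not numeric: %r" % row.get("dbh_cm"))
    return problems

print(validate_row({"species": "XXXX", "dbh_cm": "abc"}))
```

Rejecting or flagging rows at entry time is far cheaper than tracking down the same errors during analysis.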
Data management planning is the starting point in the data life cycle. Creating a formal document that outlines what you will do with the data during and after the completion of research helps to ensure that the data is safe for current and future use. This lesson describes the benefits of a data management plan (DMP), outlines the components of a DMP, details tools for creating a DMP, provides NSF DMP information, and demonstrates the use of an example DMP.
The Data Management Skillbuilding Hub is a repository for open educational resources regarding data management, meaning that it is a collection of learning resources freely contributed by anyone willing to share them. Materials such as lessons, best practices, and videos are stored in the DataONEorg GitHub repository and are also searchable through the Data Management Training Clearinghouse. We invite you to submit your own educational resources so that the Data Management Skillbuilding Hub can remain an up-to-date and sustainable educational tool for all to benefit from. You can easily contribute learning materials to the Skillbuilding Hub via GitHub online.
Quality assurance and quality control are phrases used to describe activities that prevent errors from entering or staying in a data set. These activities ensure the quality of the data before it is collected, entered, or analyzed, and they involve actively monitoring and maintaining data quality throughout the study. In this lesson, we define and provide examples of quality assurance, quality control, data contamination, and types of errors that may be found in data sets. After completing this lesson, participants will be able to describe best practices in quality assurance and quality control and relate them to different phases of data collection and entry.
When first sharing research data, researchers often raise questions about the value, benefits, and mechanisms for sharing. Many stakeholders and interested parties, such as funding agencies, communities, other researchers, or members of the public may be interested in research, results and related data. This lesson addresses data sharing in the context of the data life cycle, the value of sharing data, concerns about sharing data, and methods and best practices for sharing data.
As rapidly changing technology enables researchers to collect large, complex datasets with relative ease, the need to effectively manage these data increases in kind. This is the first lesson in a series of education modules intended to provide a broad overview of various topics related to research data management. It covers: trends in data collection, storage and loss, the importance and benefits of data management, and an introduction to the data life cycle.
Conversations regarding research data often intersect with questions related to ethical, legal, and policy issues for managing research data. This lesson will define copyrights, licenses, and waivers, discuss ownership and intellectual property, and describe some reasons for data restriction. After completing this lesson, participants will be able to identify ethical, legal, and policy considerations that surround the use and management of research data.
What is metadata? Metadata is data (or documentation) that describes and provides context for data, and it is everywhere around us. Metadata allows us to understand the details of a dataset, including: where it was collected, how it was collected, what gaps in the data mean, what the units of measurement are, who collected the data, how it should be attributed, etc. By creating and providing good descriptive metadata for our own data, we enable others to efficiently discover and use the data products from our research. This lesson explores the importance of metadata to data authors, data users, and organizations, and highlights the utility of metadata. It provides an overview of the different metadata standards that exist, and the core elements that are consistent across them. It guides users in selecting a metadata standard to work with and introduces the best practices needed for writing a high-quality metadata record.
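The core elements shared across standards can be sketched as a simple checklist. The record and element names below are illustrative and standard-agnostic, not drawn from any particular schema; a real record would follow the chosen standard's own element names.

```python
# Core elements common to most metadata standards (names are
# illustrative, not taken from any specific schema).
CORE_ELEMENTS = ("title", "creator", "abstract", "coverage", "units", "methods")

def missing_core_elements(record):
    """Return the core elements absent from (or empty in) a metadata record."""
    return [e for e in CORE_ELEMENTS if not record.get(e)]

# Hypothetical partial record for a dataset.
record = {
    "title": "Hourly soil moisture at Site A, 2019-2021",
    "creator": "J. Researcher",
    "abstract": "Hourly volumetric soil moisture from buried probes.",
    "coverage": {"spatial": "Site A", "temporal": "2019-01-01/2021-12-31"},
    "units": {"soil_moisture": "m3/m3"},
}

print(missing_core_elements(record))  # "methods" has not been written yet
```

A completeness check like this catches gaps, such as the missing methods description above, before the record is published alongside the data.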