National Park Service Data Release Reports

Resources and Guides for generating NPS DRRs associated with data packages

Background

Recognizing the broad move toward open science, we’ve seen the following changes in expectations (outside NPS) since the inception of the I&M program.

Data release reports are designed to parallel external peer-reviewed scientific journals dedicated to facilitating the reuse of reproducible scientific data, in recognition that the primary reason IMD data are collected is to support science-based decisions.

Note that publication in a data release report series (not mandated) is distinct from requirements to document data collection, processing, and quality evaluation (mandated; see below). The establishment of a data release report series is intended to facilitate and encourage this type of reporting in a standard format, and in a manner commensurate with current scientific norms.

We provide a template for creating data release reports to be published in the DRR series. We have also developed procedures for authoring DRRs in Microsoft Word and porting them to the appropriate format.

Definitions

Reproducibility. The degree to which scientific information, modeling, and methods of analysis could be evaluated by an independent third party to arrive at the same, or substantially similar, conclusion as the original study or information, and that the scientific assessment can be repeated to obtain similar results (Plesser 2017). A study is reproducible if you can take the original data and the computer code used to analyze the data and reproduce all of the numerical findings from the study. This may initially sound like a trivial task, but experience has shown that it's not always easy to achieve this seemingly minimal standard (ASA 2017; Plesser 2017).
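As a concrete illustration of this standard, the sketch below reruns an archived analysis on the archived data and checks that it returns the published finding. It is written in Python; the file name, column name, and reported value are hypothetical placeholders, not drawn from any actual DRR.

```python
# Minimal sketch of a reproducibility check (hypothetical file, column, and
# reported value): rerun the archived analysis code on the archived data and
# confirm it returns the finding published in the report.
import csv
import statistics

def mean_stem_density(path):
    """Recompute the summary statistic reported in the original study."""
    with open(path, newline="") as f:
        values = [float(row["stems_per_ha"]) for row in csv.DictReader(f)]
    return statistics.mean(values)

if __name__ == "__main__":
    reported = 412.7  # value as published in the report (hypothetical)
    recomputed = mean_stem_density("archived_data/plots_2019.csv")
    # The finding is reproducible if archived data plus archived code give
    # back the published number (within rounding tolerance).
    assert abs(recomputed - reported) < 0.05, "finding could not be reproduced"
    print(f"Reproduced: {recomputed:.1f} stems/ha")
```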

Transparency. Full disclosure of the methods used to obtain, process, analyze, and review scientific data and other information products, the availability of the data that went into and came out of the analysis, and the computer code used to conduct the analysis. Documenting this information is crucial to ensure reproducibility and requires, at minimum, the sharing of analytical data sets, relevant metadata, analytical code, and related software.
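As a minimal sketch of what this looks like in practice, the Python snippet below lists the analytical data set, its metadata, and the analysis code for a release and records a checksum for each, so reviewers can confirm they received exactly the files that produced the results. The paths and release identifier are hypothetical.

```python
# Minimal sketch of a transparency manifest: enumerate the analytical data
# set, its metadata, and the analysis code, with checksums so others can
# verify they have exactly what was analyzed. Paths are hypothetical.
import hashlib
import json
from pathlib import Path

RELEASE_FILES = [
    "data/invertebrate_counts.csv",                # analytical data set
    "metadata/invertebrate_counts_metadata.xml",   # relevant metadata
    "code/analysis.py",                            # analytical code
]

def sha256(path):
    """Checksum used to confirm a file is byte-for-byte the released version."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

manifest = {
    "release": "example-drr-2024-01",  # hypothetical identifier
    "files": [{"path": p, "sha256": sha256(p)} for p in RELEASE_FILES],
}

Path("manifest.json").write_text(json.dumps(manifest, indent=2))
```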

Fitness for Use. The utility of scientific information (in this case a dataset) for its intended users and its intended purposes. Agencies must review and communicate the fitness of a dataset for its intended purpose, and should provide the public sufficient documentation about each dataset to allow users to determine whether the data are fit for the purposes for which they may consider using them.
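The sketch below shows one way a prospective user might check a dataset's fitness for an intended analysis: confirm the required variables are present, the records cover the analysis period, and the records are reasonably complete. The file path, column names, and analysis window are hypothetical.

```python
# Minimal sketch of a fitness-for-use check (hypothetical file, columns, and
# analysis window): does the dataset contain the variables, time coverage,
# and completeness the intended analysis needs?
import csv
from datetime import date

def check_fitness(path, required_cols, window_start, window_end):
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    missing_cols = [c for c in required_cols if c not in rows[0]]
    present = [c for c in required_cols if c not in missing_cols]
    # Assumes a visit_date column is present in the dataset.
    dates = [date.fromisoformat(r["visit_date"]) for r in rows]
    return {
        "missing_columns": missing_cols,
        "covers_analysis_period": min(dates) <= window_start and max(dates) >= window_end,
        "fraction_complete_records": round(
            sum(1 for r in rows if all(r[c] for c in present)) / len(rows), 3
        ),
    }

print(check_fitness("data/veg_plots.csv",
                    ["plot_id", "visit_date", "cover_pct"],
                    date(2010, 1, 1), date(2020, 12, 31)))
```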

Decisions. The types of decisions that must be based on publicly available, reproducible, and peer-reviewed science have not been defined. At a minimum, they include any influential decisions, but they may also include any decisions subject to public review and comment.

Descriptive Reporting. The policies listed above are consistent in the requirement to provide documentation that describes the methods used to collect, process, and evaluate science products, including data. Note that this is distinct from (and in practice may significantly differ from) prescriptive documents such as protocols, procedures, and study plans. Descriptive reporting should cite or describe relevant science planning documents, methods used, deviations, and mitigations. In total, descriptive reporting provides a clear “line of sight” on precisely how data were collected, processed, and evaluated. Although deviations may warrant revisions to prescriptive documents, changes in prescriptive documents after the fact do not meet reproducibility and transparency requirements.

Policy Requirements

NPS Requirements

DO 11B-a, DO 11B-b, OMB M-05-03 (Peer review and information quality):

OMB M-19-15 (Updates to Implementing the Information Quality Act):

NPS Guidelines

NPS guidelines on Use of Scientific Information: Multiple policy and guidance documents require the use of best available science in decision-making. Additional requirements include:

SO 3369 (Promoting Open Science):

DO 11B (Ensuring Objectivity, Utility, and Integrity of Information Used and Disseminated by the National Park Service):

I&M Requirements

NPS-75 (Inventory and Monitoring Guidelines):

IMD Reporting and Analysis Guidance

Implications

Because all of the data IMD collects are intended to support science-based decisions as per our program's five goals, and are intended for use in planning (where decisions are subject to public comment under NEPA requirements), this means that by default:

Scope

(for the NPS Inventory & Monitoring Program)

General Studies

General studies include any project that involves the collection of scientific data for use in supporting decisions to be made by NPS personnel. General study data may or may not be collected based on documented or peer-reviewed study plans or defined quality standards, but they are in most cases purpose-driven, and the resultant information should be evaluated for its suitability for decision support prior to use. These data may be reused for secondary purposes, including similar decisions at other locations or times, and portions of general study data may be reused in or contribute to other scientific work (for example, observations from a deer browsing study may contribute to an inventory or may be used as ancillary data to explain monitoring observations).

Figure 1: Workflow for data collection, processing, dissemination, and use for general studies. Teal-colored boxes are subject to reproducibility requirements.

Vital Signs Monitoring

Vital signs monitoring data are collected by IMD and park staff to address specific monitoring objectives following methods designed to ensure long-term comparability of data. Procedures are established to ensure that data quality standards are maintained in perpetuity. However, because monitoring data are collected over long periods of time in dynamic systems, the methods employed may differ from those prescribed in monitoring protocols, procedures, or sampling plans, and any deviations (and resultant mitigations to the data) must be documented. Data should be evaluated to ensure that they meet prescribed standards and are suitable for analyses designed to test whether monitoring objectives have been met. Monitoring data may be reused for secondary purposes including synthesis reports and condition assessments, and portions of monitoring data may contribute to inventories.

Figure 2: Workflow for data collection, processing, dissemination, and use for vital sign monitoring efforts. Teal-colored boxes are subject to reproducibility requirements.
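The sketch below illustrates one way such deviations could be documented during quality evaluation: each record is given a flag noting whether it met a prescribed standard, so downstream analyses can include or exclude flagged records explicitly. The input file, field names, and the standard itself are hypothetical, not drawn from any actual protocol.

```python
# Minimal sketch of flagging deviations from a prescribed standard during
# quality evaluation of monitoring data. The file, fields, and the 30-minute
# standard are hypothetical.
import csv

PRESCRIBED_MAX_SEARCH_MINUTES = 30  # hypothetical standard from the protocol

with open("data/bird_point_counts.csv", newline="") as src, \
     open("data/bird_point_counts_flagged.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["qc_flag", "qc_note"])
    writer.writeheader()
    for row in reader:
        deviated = float(row["search_minutes"]) > PRESCRIBED_MAX_SEARCH_MINUTES
        row["qc_flag"] = "deviation" if deviated else "accepted"
        row["qc_note"] = "search time exceeded protocol standard" if deviated else ""
        writer.writerow(row)
```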

Inventory Studies

Inventory study data are similar to general study data in that they are time- and area-specific efforts designed to address specific management needs as well as broader inventory objectives outlined in project-specific study plans and inventory science plans. Inventory studies typically follow well-documented data collection methods or procedures, and resultant data should be evaluated for whether they are suitable for use in supporting study-specific and broader inventory-level objectives. Inventory study data are expected to be reused to meet broader inventory-level goals, but may also support other scientific work and decision support.

Figure 3: Workflow for data collection, processing, dissemination, and use for inventory studies. Teal-colored boxes are subject to reproducibility requirements.

References

American Statistical Association (ASA). 2017. Recommendations to funding agencies for supporting reproducible research. https://www.amstat.org/asa/files/pdfs/POL-ReproducibleResearchRecommendations.pdf.

Plesser, H. E. 2017. Reproducibility vs. Replicability: A brief history of a confused terminology. Frontiers in Neuroinformatics 11:76. https://doi.org/10.3389/fninf.2017.00076.