NEWS.md
2024-07-16
* Added experimental function `document_missing_values()`, which searches a file for multiple missing value codes, replaces them all with NA, and generates a new column with the missing value codes so that they can be properly documented in EML. This is a workaround for the fact that there is currently no good way to record multiple missing value codes in a single column via EMLassemblyline. This function is still under development; expect substantial changes and improvements, up to and including removing the function entirely.

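
As a rough illustration of the idea behind `document_missing_values()`, the base R sketch below replaces several missing value codes in one column with NA and records the original code in a companion column. The helper name `flag_missing_codes()`, its arguments, the `_code` column suffix, and the example data are all hypothetical; this is not QCkit's actual interface or implementation.

```r
# Hypothetical helper (not part of QCkit): replace missing value codes in one
# column with NA and keep the original code in a new companion column.
flag_missing_codes <- function(df, col, codes) {
  values <- as.character(df[[col]])
  is_code <- !is.na(values) & values %in% codes
  # Companion column: the code that was found, or NA where the value was real.
  df[[paste0(col, "_code")]] <- ifelse(is_code, values, NA_character_)
  # Original column: every missing value code becomes NA.
  values[is_code] <- NA_character_
  df[[col]] <- values
  df
}

# Made-up example data with two different missing value codes.
dat <- data.frame(depth_m = c("1.5", "-999", "2.0", "NoData"),
                  stringsAsFactors = FALSE)
cleaned <- flag_missing_codes(dat, "depth_m", codes = c("-999", "NoData"))
print(cleaned)
```

The sketch leaves the cleaned column as character; converting it back to numeric is a separate step.
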
2024-07-09
* Added function `get_user_email()`, which accesses NPS Active Directory via a PowerShell function to return the user’s email address. It probably won’t work for non-NPS users or for non-Windows users.
* Updated the REST API from legacy v6 to current v7.

2024-06-28
* Updated `get_park_polygon()` to use the new API (it had been using a legacy API). Added documentation to specify that the function returns the convex hull for the park, which may not work particularly well for some parks.

2024-06-27
* Bug fixes for `generate_ll_from_utm()`.
* Added function `remove_empty_tables()` (and associated unit tests).
* Updated documentation for `replace_blanks()` to indicate that it can replace blanks with more than just NA.

2024-05-08
* Updated the `replace_blanks()` function to accept any missing value code a user inputs (it still defaults to NA).

2024-04-18
* Added the function `generate_ll_from_utm()`, which supersedes `convert_utm_to_ll()` and improves upon it in several ways, including accepting a column of UTMs and returning a column of CRS along with the decimal degrees latitude and longitude.

2024-04-17
* Major updates to the DRR template, including: using snake case instead of camel case for variables; updating Table 3 to display filenames only when there are multiple files; fixing multiple issues with footnotes; adding citations to NPSdataverse packages; and adding a section that prints the R code needed to download the data package and load it into R.
* Updated the DRR documentation to account for new variable names.

2024-03-07
* Updated the error warning in `check_te()` to not reference VPN, since NPS no longer uses VPN.
* Added private function `.get_unit_boundaries()`, which hits the ArcGIS API to pull more precise park unit boundaries than `get_park_polygon()`.
* Added the `validate_coord_list()` function, which takes advantage of the improved precision of `.get_unit_boundaries()` and is vectorized, enabling users to input multiple coordinate combinations and park units directly from a data frame.

2024-02-09
* This version adds the DRR template, example files, and associated documentation to the QCkit package.
* Bugfix in `get_custom_flags()`: it was counting both A (Accepted) and AE (Accepted, estimated) as Accepted. Fixed the regex so that Accepted includes all cells that start with A followed by nothing or by any character other than E, so that flags can have explanation codes added to them (e.g. A_jenkins if “Jenkins” flagged the data as accepted).

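
The matching rule described above can be sketched in base R as follows. The pattern and the example flag values are assumptions chosen to illustrate the rule; they are not copied from the QCkit source.

```r
# Example flag values (illustrative only).
flags <- c("A", "AE", "A_jenkins", "AE_smith", "R", "P")

# "^A($|[^E])": starts with A, then either the end of the string or any
# character other than E. This pattern is an assumption, not QCkit's own.
is_accepted <- grepl("^A($|[^E])", flags)

print(data.frame(flag = flags, accepted = is_accepted))

# Number of cells counted as Accepted under this rule.
accepted_count <- sum(is_accepted)
```

With this pattern, "A" and "A_jenkins" count as Accepted, while "AE" and "AE_smith" do not.
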
2024-01-23
* Maintenance on `get_custom_flags()` to align with updated DRR requirements.
* Added function `replace_blanks()` to ingest a directory of .csv files and write them back out to .csv (overwriting the original files) with blanks converted to NA. The exception is a file that has no data at all: it remains blank and needs to be dealt with manually.

2023-11-20
* Added the function `create_datastore_script()`, which, given a username and repo for GitHub, will generate a draft Script Reference on DataStore based on the information found in the latest Release on GitHub.

24 April 2023
* Fixed bug in `get_custom_flags()`.

17 April 2023
* `get_elevation()`: new function for getting elevation from GPS coordinates via USGS API.

21 March 2023
* Added `order_cols()`, a new function for ordering columns.

16 March 2023
* `get_taxon_rank()` takes a column of scientific names and generates a new column with the most specific scientific name rank listed. It does this purely based on recognizing patterns in the scientific naming scheme and not by matching a list of known genera, families, etc.
* Moved `get_taxon_rank()` and `te_check()` into a single file, taxonomy.R.
* Deprecated `te_check()` in favor of `check_te()`.
* Deprecated `DC_col_check()` in favor of `check_dc_cols()`.
* Deprecated `utm_to_ll()` in favor of `convert_utm_to_ll()`.
* Deprecated `long2UTM()` in favor of `convert_long_to_utm()`.

28 February 2023
* `te_check()` bug fix: exact column name filtering allows for multiple columns with similar names in the input data column. Improved documentation for transparency.

23 February 2023
* `te_check()` now supports searching multiple park units.

22 February 2023
* `te_check()` now prints the source of the federal match list data and the date it was accessed to the console. Made the output format prettier. Added an “expansion” option to the function. It defaults to `expansion = FALSE`, which checks for exact matches between the scientific binomial supplied by the user and the full scientific binomial in the match list. When `expansion = TRUE`, the genera in the data supplied are checked against the match list and all species from a given genus are returned, regardless of whether a given species is actually in the supplied data set. A new column “InData” tells the user whether a given species is actually in their data or has been expanded to.

02 February 2023
* Fixed a bug in `te_check()` that was causing the function to return species that were not threatened or endangered. The function now returns a tibble containing all species that are threatened, endangered, or considered for listing, specifies the status code of each species, and gives a brief explanation of the federal Endangered Species Act status code returned.
* Deprecated `get_dp_flags()`, `get_df_flags()`, and `get_dc_flags()` in favor of `get_custom_flags()`. The new `get_custom_flags()` function returns 1-3 data frames, depending on user input, that contain the output of the 3 previous functions. It also allows the user to specify additional non-flagged columns to be included in the QC summary.

* `get_custom_flags()` is experimental.
* `get_dp_flags()` returns counts of each data flag (A, AE, R, P) across the whole data package (as well as all cells in the data package).
* `get_df_flags()` returns counts of data flags within each data file of the data package (as well as counts for all cells within the data package).
* `get_dc_flags()` returns the name of each flagging column within each data package and the count of each flag within each column, as well as the total number of cells across all the data flagging columns.
* The `force` option defaults to `force = FALSE`, which prints the results to the screen. Setting `force = TRUE` suppresses the on-screen output.