
NIST Study Guide

Background

The study guide is a small practice dataset developed by NIST. Screeners can use it to try out their screening tools before evaluating them against the full NIST datasets. It consists of approximately 50 sequences and is downloadable from the website.

Dataset Format

The dataset is a downloadable FASTA file in which each entry has the following:

  • UUID: The universally unique identifier assigned to the sequence. It is generated directly from the FASTA sequence and a NIST-generated namespace, which makes entries traceable across dataset versions.
  • Sequence: The nucleotide sequence of the respective entry.
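
As a minimal sketch of reading such a file, the Python snippet below maps each UUID to its sequence. It assumes the UUID is the first token on each FASTA header line (after ">"), which this page does not state, and the function name parse_fasta and the sample records are hypothetical. Check the layout against the downloaded file before relying on it.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header token (assumed to be the UUID) to its sequence."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: the first whitespace-delimited token after '>'
            # is the entry's UUID.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # A sequence may wrap across several lines; join the pieces.
            records[current_id] += line
    return records


# Made-up sample data, not taken from the NIST study guide.
example = """>123e4567-e89b-12d3-a456-426614174000
ATGCGT
ACGT
>00000000-0000-0000-0000-000000000001
GGGCCC
"""
records = parse_fasta(example.splitlines())
```

For a real file, the same function can be called on an open file handle, for example parse_fasta(open(path)), where path points at the downloaded FASTA file.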

Submission Format

Screeners should submit their results in CSV or TSV format through the “Upload Study Guide Results” button on the dashboard. Each submission should include a boolean Flag column, with 1 for a positive sequence and 0 for a negative one. A downloadable example of a results submission for the study guide dataset is available on the website.
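
A results file along these lines could be written with Python's standard csv module. This is a sketch only: the Flag column comes from the description above, but the UUID column name, the column order, and the helper name write_submission are assumptions. The downloadable example on the website is the authoritative reference for the exact layout.

```python
import csv
import io
from typing import Dict, TextIO


def write_submission(flags: Dict[str, bool], out: TextIO) -> None:
    """Write one row per sequence: its UUID and a 1/0 flag."""
    writer = csv.writer(out)
    # Assumption: the identifier column is named "UUID" and comes first.
    writer.writerow(["UUID", "Flag"])
    for uuid, flagged in flags.items():
        # 1 marks a positive result, 0 a negative one.
        writer.writerow([uuid, 1 if flagged else 0])


# Made-up screening results for two hypothetical entries.
results = {
    "123e4567-e89b-12d3-a456-426614174000": True,
    "00000000-0000-0000-0000-000000000001": False,
}
buffer = io.StringIO()
write_submission(results, buffer)
csv_text = buffer.getvalue()
```

To produce a file on disk, pass a handle opened with open(path, "w", newline="") instead of the in-memory buffer. For TSV output, csv.writer accepts delimiter="\t".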

Evaluation Results

The evaluation returns a Pass or Fail result with detailed statistics on accuracy and recall, as well as a downloadable file showing whether each UUID in the submission was classified correctly. A passing score requires:

  • Accuracy ≥ 75% ((TP + TN) / Total)
  • Recall ≥ 95% (TP / (TP + FN))
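
The two thresholds can be checked locally on data you have labeled yourself. The sketch below applies the formulas above, where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives. The function name evaluate and the sample labels are hypothetical, and counting a UUID that is missing from the predictions as a negative is an assumption, not documented behavior of the NIST evaluation.

```python
from typing import Dict, Tuple


def evaluate(predicted: Dict[str, int], truth: Dict[str, int]) -> Tuple[float, float, bool]:
    """Return (accuracy, recall, passed) for 1/0 flags keyed by UUID."""
    tp = tn = fp = fn = 0
    for uuid, actual in truth.items():
        # Assumption: a UUID missing from the predictions counts as 0.
        guess = predicted.get(uuid, 0)
        if actual == 1 and guess == 1:
            tp += 1
        elif actual == 0 and guess == 0:
            tn += 1
        elif actual == 0 and guess == 1:
            fp += 1
        else:
            fn += 1
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Both thresholds from the list above must be met.
    passed = accuracy >= 0.75 and recall >= 0.95
    return accuracy, recall, passed


# Made-up labels: two positives, two negatives, one false positive.
truth = {"a": 1, "b": 1, "c": 0, "d": 0}
predicted = {"a": 1, "b": 1, "c": 1, "d": 0}
accuracy, recall, passed = evaluate(predicted, truth)
```

In this made-up case, three of four entries are classified correctly and no positive is missed, so both thresholds are met. Because recall must reach 95%, even a single missed positive among a small number of positives would fail the check.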

Please note that a result does not constitute certification, accreditation, or endorsement by NIST or IBBIS.
