NIST Study Guide
Background
The study guide is a smaller practice dataset developed by NIST. Screeners can use it to try out their screening tools before evaluating them against the full NIST datasets. It consists of approximately 50 sequences and is downloadable from the website.
Dataset Format
The dataset is a downloadable FASTA file in which each entry has the following:
- UUID: The universally unique identifier assigned to the sequence. This is generated directly from the FASTA sequence and a NIST-generated namespace to enable traceability across dataset versions.
- Sequence: The nucleotide sequence of the respective entry.
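As a rough illustration of working with this format, the Python sketch below reads FASTA text into a mapping from UUID to sequence. It assumes the UUID is the first token on each header line (the line starting with ">"), which the description above does not confirm. The derive_uuid helper is purely hypothetical: it uses Python's name-based uuid.uuid5 with a placeholder namespace to show how an identifier can be derived deterministically from a sequence and a namespace. The actual NIST namespace and UUID version are not stated here.

```python
import uuid
from typing import Dict, Iterable, Optional

# Placeholder only: the real NIST-generated namespace is not given in this guide.
HYPOTHETICAL_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000000")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header identifier (assumed to be the UUID) to its sequence."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the first whitespace-delimited token after '>' is the UUID.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequences may span several lines; join them into one string.
            records[current_id] += line
    return records


def derive_uuid(sequence: str) -> uuid.UUID:
    """Illustrative name-based UUID; not necessarily the scheme NIST uses."""
    return uuid.uuid5(HYPOTHETICAL_NAMESPACE, sequence)


# Made-up example entries, not taken from the real dataset.
example = """>123e4567-e89b-12d3-a456-426614174000
ACGTACGT
TTGA
>123e4567-e89b-12d3-a456-426614174001
GGCC
""".splitlines()

records = parse_fasta(example)
```

Because a name-based UUID depends only on the namespace and the sequence text, the same sequence would receive the same identifier each time it is generated, which is one way the traceability described above could work.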
Submission Format
Screeners should submit their results in CSV or TSV format through the “Upload Study Guide Results” button on the dashboard. Each submission should include a boolean Flag column, with 1 for a positive result and 0 for a negative result. A downloadable example of a results submission for the study guide dataset is available on the website.
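A minimal Python sketch of producing such a file is shown below. Only the Flag column is named in this guide. The "UUID" column name and the column order are assumptions for illustration, so the downloadable example on the website should be treated as the authoritative layout.

```python
import csv
import io
from typing import Dict, TextIO


def write_submission(flags: Dict[str, bool], out: TextIO, delimiter: str = ",") -> None:
    """Write one row per sequence; pass delimiter="\t" for TSV output."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    # Assumed header: "UUID" is a hypothetical column name, "Flag" comes from the guide.
    writer.writerow(["UUID", "Flag"])
    for uuid_str, flagged in flags.items():
        # Flag is 1 for a positive result and 0 for a negative result.
        writer.writerow([uuid_str, 1 if flagged else 0])


# Made-up identifiers standing in for real study guide UUIDs.
buffer = io.StringIO()
write_submission({"id-1": True, "id-2": False}, buffer)
submission_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, the same function can be given a file opened with open(path, "w", newline="").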
Evaluation Results
The evaluation returns a Pass or Fail result with detailed accuracy and recall statistics, as well as a downloadable file showing whether the result for each UUID in the submission was correct or incorrect. A submission passes when:
- Accuracy ≥ 75% ((TP + TN) / Total)
- Recall ≥ 95% (TP / (TP + FN))
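The two formulas above can be sketched in Python as follows. This is only an illustration: it assumes both thresholds must be met to pass and that the submission contains every UUID in the dataset, neither of which is spelled out here. The function name and the example data are made up.

```python
from typing import Dict, Tuple


def evaluate(truth: Dict[str, int], predicted: Dict[str, int]) -> Tuple[float, float, bool]:
    """Return (accuracy, recall, passed) for 0/1 flags keyed by UUID."""
    tp = tn = fp = fn = 0
    for uid, actual in truth.items():
        guess = predicted[uid]  # assumes every UUID appears in the submission
        if actual == 1 and guess == 1:
            tp += 1
        elif actual == 0 and guess == 0:
            tn += 1
        elif actual == 0 and guess == 1:
            fp += 1
        else:
            fn += 1
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: a pass requires both thresholds to be met.
    passed = accuracy >= 0.75 and recall >= 0.95
    return accuracy, recall, passed


# Made-up data: 20 positive and 30 negative sequences.
truth = {f"pos-{i}": 1 for i in range(20)}
truth.update({f"neg-{i}": 0 for i in range(30)})

# Case A: every positive is caught, but 10 negatives are flagged by mistake.
over_flagging = dict(truth)
for i in range(10):
    over_flagging[f"neg-{i}"] = 1
result_a = evaluate(truth, over_flagging)  # accuracy 0.8, recall 1.0

# Case B: no false positives, but 2 positives are missed.
missed = dict(truth)
missed["pos-0"] = 0
missed["pos-1"] = 0
result_b = evaluate(truth, missed)  # accuracy 0.96, recall 0.9
```

Under these assumptions, Case A passes despite its lower accuracy, while Case B fails because its recall falls below 95%. The thresholds tolerate some false positives far more readily than missed positives.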
Please note that a result does not constitute certification, accreditation, or endorsement by NIST or IBBIS.