15 Feb 2022

subsetting the data

Alright, I got a bit ahead of myself importing the data yesterday. Because I am using the data to run tests, I want to create a subset containing only about 20 datapoints. To do this I want to get random data points from the set:

import random

x = 20
selected_elements = []

while x > 0:
    selected_elements.append(
        random.randrange((1, 309812, 4))
    x -= 1

[140445, 279801, 99949, 128401, 22385, 113897, 7493, 143441, 245621, 220889,
 155237, 205217, 92565, 77941, 173313, 181325, 263481, 49605, 178705, 153261]

Then, save these sequences(all 4 lines!) into a new test data fastq file, put this in the manifest and import and run the tests on this file.

import test data

qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path zymo_manifest.tsv \
--output-path zymo_test_data.qza \
--input-format SingleEndFastqManifestPhred33V2