Analysis of 329,942 SARS-CoV-2 Records Retrieved from GISAID Database
Quantori is excited to share research findings now available on bioRxiv, Cold Spring Harbor Laboratory's preprint server for biology: “Analysis of 329,942 SARS-CoV-2 Records Retrieved from GISAID Database.”
- The analysis yielded 155 genome variations (SNPs and deletions) in more than 0.3% of the sequences.
- Nine common SNPs were present in more than 20% of the samples.
- Clustering results suggested that a proportion of people (2.46%) were infected with a distinct subtype of the B.1.1.7 variant.
- The subtype may be characterized by four to six additional mutations, with four being a more frequent option (G28881A, G28882A, and G28883C in the N gene, A23403G in S, A28095T in ORF8, G25437T in ORF3a).
- Two clusters were formed by mutations in the samples uploaded predominantly by Denmark and Australia, which may indicate the emergence of “Danish” and “Australian” variants.
- Five clusters were linked to increased/decreased age, shifted gender ratio, or both.
- Metadata mining analysis has led to a hypothesis about gender inequality in medical care in certain countries.
- ORF6 and E were the most conserved genes (96.15% and 94.66% of the sequences, respectively, exactly match the reference), making them potential targets for vaccines and treatment.
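The frequency cut-offs in the findings above ("more than 0.3% of the sequences", "more than 20% of the samples") and the per-gene conservation percentages can be illustrated with a small sketch. This is not the preprint's pipeline: the function names, the toy samples, and the variation-to-gene mapping layout are all assumptions made for illustration, and the toy data does not reproduce any reported number. Only the mutation labels and their genes are taken from the list above.

```python
from collections import Counter

def variation_frequencies(samples):
    """Fraction of samples that carry each variation.

    `samples` is a list of sets; each set holds the variation labels
    (for example "A23403G") found in one sequence relative to the reference.
    """
    counts = Counter()
    for variations in samples:
        counts.update(set(variations))  # count each variation once per sample
    total = len(samples)
    return {label: n / total for label, n in counts.items()}

def above_threshold(freqs, threshold):
    """Variations present in strictly more than `threshold` of the samples."""
    return sorted(label for label, f in freqs.items() if f > threshold)

def matches_reference_fraction(samples, gene_of, gene):
    """Fraction of samples with no variation at all in `gene`."""
    unchanged = sum(
        1 for variations in samples
        if not any(gene_of.get(label) == gene for label in variations)
    )
    return unchanged / len(samples)

# Hypothetical toy data: four sequences, not real GISAID records.
toy_samples = [
    {"A23403G", "G28881A"},
    {"A23403G"},
    {"A23403G", "G25437T"},
    set(),  # identical to the reference
]

# Gene assignments as stated in the findings above.
toy_gene_of = {"A23403G": "S", "G28881A": "N", "G25437T": "ORF3a"}

toy_freqs = variation_frequencies(toy_samples)
common = above_threshold(toy_freqs, 0.20)   # analogous to the "more than 20%" cut-off
s_conserved = matches_reference_fraction(toy_samples, toy_gene_of, "S")
```

With real data, the same functions would be called with a threshold of 0.003 for the 0.3% cut-off, and `matches_reference_fraction` would be evaluated once per gene to rank genes by conservation.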
Read the preprint here
Image Description: Forty-three clusters were revealed by HDBSCAN. The legend on the right contains cluster numbers and color schemes.