Preprocess direct-to-consumer (DTC) genomes for research


Data | Num. | Desc. | Time freeze | Link | Size
OpenSNP | 100 | Random 100 individual VCFs for quick download. See an example file here. | 19.02.2020 | | 684M
OpenSNP | 5081 | Individual VCF, GRCh37. List of genomes processed: here, and log. | 19.02.2020 | | 36G
PGP | 734 | Individual VCFs, GRCh37. List of genomes processed: here, and log. | 19.02.2020 | | 5.6G
OpenSNP | 5393 | Combined Plink format (bim, fam, bed) in GRCh37, including those that were originally deposited in GRCh36 format. | 19.02.2020 | | 570M

Quick check of the variants seen in the OpenSNP genomes: openSNP.bim
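To check whether particular variants appear in the combined dataset, the .bim file can be scanned directly. The sketch below is a minimal illustration, assuming the standard six-column Plink .bim layout (chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2). The function name `load_bim_variants` and the sample records are hypothetical and are not taken from openSNP.bim.

```python
import io


def load_bim_variants(handle):
    """Map variant ID -> (chromosome, base-pair position, allele 1, allele 2).

    Assumes the standard whitespace-delimited, six-column Plink .bim layout.
    """
    variants = {}
    for line in handle:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        chrom, variant_id, _genetic_distance, bp, allele1, allele2 = fields
        variants[variant_id] = (chrom, int(bp), allele1, allele2)
    return variants


# Made-up records for illustration only; not real openSNP.bim content.
example = io.StringIO(
    "1\trs0000001\t0\t12345\tA\tG\n"
    "2\trs0000002\t0\t67890\tC\tT\n"
)
variants = load_bim_variants(example)

# Look up a variant of interest by its ID.
query = "rs0000001"
if query in variants:
    print(query, "found:", variants[query])
else:
    print(query, "not present")
```

For the downloaded file, the same function can be given an open file handle, for example `open("openSNP.bim")`, where the local path is an assumption about where the file was saved.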

Additional data support can be requested by contacting us.


C. Lu, B. Greshake Tzovaras, J. Gough, A survey of direct-to-consumer genotype data, and quality control tool (GenomePrep) for research, Computational and Structural Biotechnology Journal (2021), doi:



Privacy notice: UK Research and Innovation (UKRI) understands the importance of protecting personal information and is committed to complying with the General Data Protection Regulation 2016/679 (GDPR). It is committed to fostering a culture of transparency and accountability by demonstrating compliance with the principles set out in the Regulation, as laid out below and in the UKRI privacy policy. Your genotype data will be stored temporarily for processing and permanently deleted between 12 and 36 hours later.

© 2021 Chang Lu