This dataset contains whole-genome nucleotide diversity estimates for 269 primate species.

DATA
The data/ folder is organized by species. Each species folder follows the structure below:

|
+-- Species_name/
|   |
|   +-- cram/
|   |
|   +-- gVCF/
|   |
|   +-- gVCF_vo/

where the cram/ subfolder contains individual-specific CRAM files of aligned sequencing reads, the gVCF/ subfolder contains Genomic VCF (gVCF) files at base-pair resolution, and the gVCF_vo/ subfolder contains gVCF files restricted to variant-only positions. gVCF files are organized by batch and male/female ploidy (see the reference metadata in metadata/references/).
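
The layout above can be traversed programmatically. The following is a minimal Python sketch, not part of the dataset, that lists the files found in each species folder. The function name index_species_files and the data_root argument are hypothetical, and no assumption is made about file names or extensions inside the subfolders.

```python
from pathlib import Path

# Subfolder names taken from the layout described above.
SUBFOLDERS = ("cram", "gVCF", "gVCF_vo")


def _files_in(folder):
    """Return the sorted file names in a folder, or an empty list if it is absent."""
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


def index_species_files(data_root):
    """Map each species folder name to the files in its three subfolders."""
    index = {}
    for species_dir in sorted(Path(data_root).iterdir()):
        if species_dir.is_dir():
            index[species_dir.name] = {
                sub: _files_in(species_dir / sub) for sub in SUBFOLDERS
            }
    return index
```

Calling index_species_files("data") would return one dictionary entry per species folder, each with the keys cram, gVCF and gVCF_vo.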

REFERENCE ALIGNMENT
The reference_alignment/ folder contains the Cactus output (primates_49_species.hal) of the whole-genome alignment of 49 assemblies (47 non-human primate references used for mapping short reads, the T2T H. sapiens reference, and the T. belangeri reference used as the outgroup), as well as the filtered MAF file of the alignment (primates_49_species.Homo_sapiens.maf.gz).
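
As an illustration of how the filtered MAF file might be read, the following Python sketch iterates over alignment blocks in a gzipped MAF file. It assumes the file follows the standard MAF conventions ("a" lines open a block, "s" lines carry source, start, size, strand, source size and aligned text); the function name iter_maf_blocks is hypothetical, and the naming of sequence sources in this particular file has not been verified here.

```python
import gzip


def iter_maf_blocks(path):
    """Yield one list of sequence records per alignment block of a gzipped MAF file."""
    block = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # blank lines separate blocks
            if fields[0] == "a":
                # An "a" line opens a new block; emit the previous one if non-empty.
                if block:
                    yield block
                block = []
            elif fields[0] == "s" and len(fields) == 7:
                # "s" line: src, start, size, strand, srcSize, aligned text.
                _, src, start, size, strand, src_size, text = fields
                block.append({
                    "src": src,
                    "start": int(start),
                    "size": int(size),
                    "strand": strand,
                    "src_size": int(src_size),
                    "text": text,
                })
    if block:
        yield block
```

Because the function is a generator, blocks are produced one at a time and the whole alignment never has to be held in memory.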

METADATA
The metadata/ folder contains two subfolders: individuals/ with individual-specific metadata and references/ with reference-specific metadata.

The files within the individuals/ subfolder are organized by genus. Each {GENUS}_individuals.txt file contains the ID, mapping reference, sex and basic coverage info for the individuals within that genus, while each {GENUS}_short_read_accession.txt file contains info on the short-read data for each individual within the genus.
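
The naming scheme above makes it possible to pair the two files of each genus. The following Python sketch groups the files in the individuals/ subfolder by genus using only the two file name suffixes given above; the function name group_by_genus is hypothetical, and the file contents are not parsed because their column layout is not described here.

```python
from pathlib import Path

# File name suffixes taken from the naming scheme described above.
SUFFIXES = {
    "individuals": "_individuals.txt",
    "short_read_accession": "_short_read_accession.txt",
}


def group_by_genus(individuals_dir):
    """Map each genus to the paths of its metadata files, keyed by file kind."""
    groups = {}
    for path in sorted(Path(individuals_dir).iterdir()):
        for kind, suffix in SUFFIXES.items():
            if path.name.endswith(suffix):
                genus = path.name[: -len(suffix)]
                groups.setdefault(genus, {})[kind] = path
    return groups
```

A genus for which only one of the two files is present will simply lack the other key, which makes incomplete pairs easy to spot.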

The files within the references/ subfolder are organized by the reference assembly used in read mapping and reference alignment. The _REFERENCE_LIST.txt file contains basic info on each reference assembly. Each {REFERENCE_SPECIES}_contigs.txt file contains info on sex-linkage, male/female ploidy and batch for all contigs that belong to the specific reference.
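
A metadata table such as a {REFERENCE_SPECIES}_contigs.txt file could be loaded with a few lines of Python. The sketch below assumes the file is tab-delimited with a header row, which is an assumption and should be checked against the actual files; the function name read_metadata_table is hypothetical, and no column names are presumed.

```python
import csv


def read_metadata_table(path, delimiter="\t"):
    """Read a delimited text table with a header row into a list of dicts.

    The tab delimiter and the presence of a header row are assumptions;
    pass a different delimiter if the files turn out to use another one.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

Each returned row is a dictionary keyed by whatever column names appear in the file's first line.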