Curation of Human, Yeast, and E. coli proteins

We've set up a new way to look at PiQSi's data: you can now browse and curate proteins by organism. We currently include Human, Yeast, and E. coli.

We chose these three organisms because one out of three proteins in the PDB belongs to one of them! This high coverage offers a unique opportunity to carry out genomic studies that take the quaternary structure of proteins into account. However, such studies require a dataset in which the quaternary structure is described accurately.

The idea is simple: the table below gives the number of genes in each of the three organisms for which a structure was found (sequence identity >= 98%, gene coverage > 50%). You can either click on the "#Total" number to display the structures that still need to be curated, or click on (d) to download each dataset.

Each cell gives #Total (d = download), followed by #correct - #error.

Organism   #Genes (Unique)       #PDBs (Total)       #PDBs (NR100)      #PDBs (NR90)
Human      1019 (d), 300 - 136   5473, 2217 - 379    2478, 848 - 201    1376, 289 - 141
Yeast      239 (d), 87 - 30      701, 334 - 50       382, 157 - 33      241, 72 - 20
E. coli    690 (d), 204 - 77     2935, 1114 - 148    1502, 536 - 106    837, 216 - 80
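
To get a rough idea of how much curation is left, the counts in the table can be combined as in the sketch below. It rests on an assumption that the page does not state explicitly: that "#correct" and "#error" are the curated entries in each category, so that whatever is left of "#Total" still needs curation. The names `COUNTS`, `remaining`, and `curated_fraction` are made up for this illustration and are not part of PiQSi.

```python
# Counts transcribed from the table above, as (total, correct, error).
COUNTS = {
    "Human": {
        "genes": (1019, 300, 136),
        "pdbs_total": (5473, 2217, 379),
        "pdbs_nr100": (2478, 848, 201),
        "pdbs_nr90": (1376, 289, 141),
    },
    "Yeast": {
        "genes": (239, 87, 30),
        "pdbs_total": (701, 334, 50),
        "pdbs_nr100": (382, 157, 33),
        "pdbs_nr90": (241, 72, 20),
    },
    "E. coli": {
        "genes": (690, 204, 77),
        "pdbs_total": (2935, 1114, 148),
        "pdbs_nr100": (1502, 536, 106),
        "pdbs_nr90": (837, 216, 80),
    },
}


def remaining(total, correct, error):
    """Entries still to curate, ASSUMING correct + error covers all curated entries."""
    return total - correct - error


def curated_fraction(total, correct, error):
    """Share of entries already curated, under the same assumption."""
    return (correct + error) / total


if __name__ == "__main__":
    for organism, columns in COUNTS.items():
        counts = columns["genes"]
        print(
            f"{organism}: {remaining(*counts)} genes left to curate, "
            f"{curated_fraction(*counts):.0%} curated"
        )
```

Under this reading, for example, 1019 - 300 - 136 = 583 Human genes would still be awaiting curation.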