RefSeq release 80 now available; GI identifiers to be removed in next release (March 2017)
Friday, January 13, 2017
RefSeq release 80 is now accessible online, via FTP, and through NCBI's programming utilities. This full release incorporates genomic, transcript, and protein data available as of January 9, 2017 and contains 118,059,547 records, including 78,028,152 proteins, 17,862,608 RNAs, and sequences from 66,224 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings.
As announced in March 2016, NCBI has implemented the removal of GI numbers from some presentations of nucleotide and protein sequence records. GI sequence identifiers will be removed from flatfile and FASTA formats in the RefSeq FTP release in March 2017.
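For scripts that read accessions from RefSeq FASTA files, the change can be absorbed by accepting both header styles. The following is a minimal sketch, not an NCBI-provided tool. It assumes that legacy headers look like `>gi|<number>|ref|<accession.version>| description` and that GI-less headers start with the accession.version followed by a space. The function name `parse_defline` and the example headers are hypothetical.

```python
import re

# Assumed legacy defline shape: >gi|<number>|ref|<accession.version>| description
_LEGACY_DEFLINE = re.compile(r"^>?gi\|\d+\|ref\|([^|]+)\|\s*(.*)$")


def parse_defline(line):
    """Return (accession_version, description) for either header style.

    Hypothetical helper: handles the assumed legacy 'gi|...|ref|...|' form
    and the assumed GI-less form where the accession is the first token.
    """
    line = line.rstrip("\n")
    match = _LEGACY_DEFLINE.match(line)
    if match:
        # Legacy style: discard the GI and keep only the accession.version.
        return match.group(1), match.group(2)
    # GI-less style: accession.version is the first space-delimited token.
    body = line[1:] if line.startswith(">") else line
    accession, _, description = body.partition(" ")
    return accession, description.strip()


if __name__ == "__main__":
    # Illustrative headers only; the identifiers are made up.
    for header in (
        ">gi|12345|ref|NP_000000.1| example protein [Example organism]",
        ">NP_000000.1 example protein [Example organism]",
    ):
        print(parse_defline(header))
```

Keying downstream lookups on the accession.version rather than the GI means the same code should keep working on files from before and after the March 2017 release.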
RefSeq plans to start a comprehensive reannotation of all prokaryotic genomes in a few weeks. The complete reannotation will be included in the May 2017 release.
For more information on RefSeq release 80, please see the release notes.