Geneticists push for global data-sharing
International organization aims to promote exchange and linking of DNA sequences and clinical information.
Now, a consortium of 69 institutions in 13 countries hopes to overcome the obstacles to data-sharing by creating an organization to enable the free flow of information in genomic medicine. On 5 June, the consortium, which is calling itself the ‘global alliance’, announced that the organization will develop standards and policies to encourage the sharing of people’s DNA sequences combined with clinical information. The alliance’s founders are basing their model on the World Wide Web Consortium, which in the 1990s established standards for the markup language HTML and spurred the growth of web pages across the Internet.
“This alliance steps into what otherwise might be a real void,” says Francis Collins, director of the US National Institutes of Health (NIH) in Bethesda, Maryland, which is a member of the alliance. For example, Collins says, there are no standards for storing genetic sequences or for assessing their accuracy.
The alliance also hopes to tackle privacy and informed-consent issues that prevent researchers from sharing data, and plans to create a network of cloud-computing platforms and analysis tools in an effort to provide access to the shared data.
A big question for the group is whether it can convince institutions to share their most meaningful data. “The mission is unquestionably worthy,” says cardiologist Eric Topol, director of the Scripps Translational Science Institute in La Jolla, California, which has not yet considered joining the alliance. But, he adds, “it means taking the walls down, and that’s tricky — because you’ve got each centre wanting to hold on to its own data, and the loss of control is a very difficult concept”.
The effort has gained support from some of the world’s most influential sequence-data holders, including the NIH, the Wellcome Trust Sanger Institute in Hinxton, UK, and the BGI (formerly the Beijing Genomics Institute) in Shenzhen, China. David Altshuler, a geneticist at the Broad Institute in Cambridge, Massachusetts, who led an eight-person organizational committee for the project, is keen to add more members. “We’re saying, ‘This is bigger than any group or institution — let’s figure out how to get it right’,” he says.
With the cost of sequencing falling with each passing year, the number of sequenced human genomes is now poised to reach into the millions. But researchers can’t gain a complete picture of how genes influence disease unless those data are linked to clinical information and different institutions share data with each other.
Researchers are often reluctant to share this hard-won information, however. And on occasion, because of privacy concerns, they are legally prevented from doing so. That blocks scientists’ ability to use the world’s collective data to find answers to simple questions, such as how often a particular genetic variant is linked to a disease.
The establishment of technical standards for storage and sharing will go part of the way towards making genomic data easier to share and analyse. But the alliance also hopes to surmount some of the legal barriers by establishing how anonymity is handled and what information needs to be kept secure. Institutions that abide by core principles could then share data even if their policies differed in other, less central ways.
Moreover, the alliance wants to encourage the development of tools that allow patients to maintain control over their own medical and genetic data. Harold Varmus, director of the National Cancer Institute (NCI) in Bethesda, suggests that institutions should be able to tag their data so that they are accessible only for certain studies, a step that is “going to be incredibly important”, he says.
Some major genomic-medicine projects have signed up to the alliance, but others have not yet joined, and have limited outsiders’ access to their data. That is partly to head off privacy and security concerns, but also because the information is such a valuable commodity (see ‘Precious data’).
Precious data
| Project | Enrolled participants | Joined global alliance? |
|---|---|---|
| US Million Veteran Program | 213,000 | No |
| Vanderbilt University BioVU | 165,000 | No |
| Kaiser Permanente Research Program on Genes, Environment, and Health | 430,000 | No |
| Deciphering Developmental Disorders | 12,000 | Yes |