
Improving access to proteogenomic data for the National Cancer Institute

We developed a research data management solution that ensures speed and quality when handling large volumes of mass spectrometry data in the Clinical Proteomic Tumor Analysis Consortium (CPTAC).

RESULTS AT A GLANCE
18 tumor types
99 public studies

21st-century medicine involves the integration of data from many sources as researchers and physicians work to address diseases, like cancer, in a comprehensive fashion. The National Cancer Institute has initiated an extensive analysis of the proteins expressed in cancer cells. The CPTAC Data Coordinating Center and Proteomic Data Commons serve as the central repository for the proteomics data and distribute it to physicians, clinicians, and scientists in the cancer research community. This repository is the largest cancer proteomic data warehouse in the world.

Challenge

The National Cancer Institute receives large volumes of mass spectrometry data from research groups in the . The agency needed a way to store this data in one central location to make the information accessible to all cancer researchers interested in the tumor proteome, and to maintain the results for future research after the conclusion of each CPTAC cancer program.

In addition, the proteomic data needed to be moved securely, with no loss of content. The proteomic data storage site previously used by the research community suffered from slow data transfer times and some file loss.

Solution highlights
  • Human-centered design

Solution

Our team created a secure data portal for researchers by combining a web server, a database, a file storage system, and an IBM Aspera high-speed data transfer server. We also developed daily transfer logs to track and troubleshoot errors.

The portal allows as many researchers as possible to access this important proteogenomic data. We built quality control and security into data receipt by encrypting data in transit and then verifying it with a checksum file. Because of this focus on data integrity, researchers can trust that files correctly map back to the right sample and accurately capture the information associated with tumor acquisition. Our team also harmonizes clinical data from many different sources so that the data are usable and can be compared across cancer programs.
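The checksum step described above can be sketched in a few lines of Python. This is an illustration only, not the portal's actual implementation: the source does not say which hash algorithm or manifest layout is used, so SHA-256, the `sha256sum`-style manifest format (one `<hex digest>  <file name>` pair per line), and the function names `sha256_of` and `verify_manifest` are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> dict[str, str]:
    """Return "ok", "mismatch", or "missing" for each file listed in the manifest.

    Assumed (hypothetical) manifest format: "<hex digest>  <file name>" per line,
    with file names relative to the manifest's own directory.
    """
    results: dict[str, str] = {}
    for line in manifest.read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes binary-mode names with '*'
        target = manifest.parent / name
        if not target.is_file():
            results[name] = "missing"
        elif sha256_of(target) == expected.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_manifest(Path("incoming/manifest.sha256"))` on a received batch (the path here is a placeholder) would flag any file that was lost or altered in transfer, which is the kind of file loss the previous storage site experienced.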

Results

The CPTAC Data Coordinating Center and Proteomic Data Commons are providing information about the cancer proteome to researchers around the world so they can use these data in their work. The site provides private areas for each research team to exchange data, as well as a public Proteomic Data Commons portal for distribution of data from the CPTAC program and from collaborators in the .

The portal regularly manages 29 terabytes, with 785 terabytes of data downloaded in 140 countries. The impact of the CPTAC has been showcased in 18 scholarly publications, which highlight the breadth of researchers using this technology and data resource to advance our understanding of proteogenomics across many cancers, including ovarian, breast, colon, lung, pediatric and adult brain cancer, and others.

[Figure: Contributions to cancer research, Proteomic Data Commons (PDC)]

“The technical savvy and personable staff provides tremendous value to projects involving multi-center coordination and high dimensional data management. They merge a fundamental understanding of biology with expertise in data quality control and data security. Integrating all of these factors is key to delivering a secure and fast data portal for the scientific community.”

Program manager
NCI
