Improving Proteomics Data Access for the National Cancer Institute | Client Story

Improving access to proteogenomic data for the National Cancer Institute

We developed a research data management solution that ensures speed and quality when handling large volumes of mass spectrometry data in the Clinical Proteomic Tumor Analysis Consortium (CPTAC).


21st-century medicine involves the integration of data from many sources as researchers and physicians work to address diseases, like cancer, in a comprehensive fashion. The National Cancer Institute has initiated an extensive analysis of the proteins expressed in cancer cells. The Data Coordinating Center serves as the central repository for the proteomics data and distributes it to physicians, clinicians, and scientists in the cancer research community. It is the largest cancer proteomic data warehouse in the world.

Challenge

The National Cancer Institute receives large volumes of mass spectrometry data from research groups. The agency needed a way to store this data in one central location to make the information accessible to all cancer researchers interested in the tumor proteome, and to maintain the results for future research after the conclusion of each CPTAC cancer program.

In addition, the proteomic data needed to be moved securely, with no loss of content. The proteomic data storage site previously used by the research community suffered from slow data transfers and some file loss.

Solution highlights

Human-centered design

Solution

Our team created a secure data portal for researchers by combining a web server, a database, a file storage system, and an IBM Aspera high-speed data transfer server. We also developed daily transfer logs to track and troubleshoot errors.

The portal allows as many researchers as possible to access this important proteogenomic data. We built quality control and security into data receipt by encrypting data in transit and then verifying it with a checksum file. Because of this focus on data integrity, researchers can trust that files correctly map back to the right sample and accurately capture the information associated with tumor acquisition. Our team also employs harmonization to ensure clinical data from many different sources are usable and can be compared across cancer programs.
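To illustrate the checksum step described above, here is a minimal Python sketch of how received files might be checked against a checksum file. It is not the portal's actual implementation: the SHA-256 algorithm, the manifest layout (one "<hex digest>  <relative path>" entry per line), and the function names `sha256_of_file` and `verify_manifest` are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path) -> list[str]:
    """Return the names of files that are missing or whose digest differs.

    Assumed (hypothetical) manifest format: "<hex digest>  <relative path>" per line,
    with paths resolved relative to the manifest's own directory.
    """
    failures = []
    base_dir = manifest.parent
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip()
        target = base_dir / name
        if not target.is_file() or sha256_of_file(target) != expected.lower():
            failures.append(name)
    return failures
```

Under these assumptions, an empty list from `verify_manifest` means every listed file arrived intact. Any returned names could be written to a transfer log and queued for re-transfer.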


Results

The CPTAC Data Coordinating Center and Proteomic Data Commons provide information about the cancer proteome to researchers around the world so they can use these data in their work. The site provides private areas for each research team to exchange data, as well as a public Proteomic Data Commons portal for distribution of data from the CPTAC program and from collaborators.

The portal regularly manages 29 terabytes, with 785 terabytes of data downloaded in 140 countries. The impact of the CPTAC has been showcased in 18 scholarly publications, which highlights the breadth of researchers using this technology and data resource to advance our understanding of proteogenomics across many cancers, including ovarian, breast, colon, lung, pediatric and adult brain cancer, and others.

Contributions to cancer research, Proteomic Data Commons (PDC)

“The technical savvy and personable staff provides tremendous value to projects involving multi-center coordination and high dimensional data management. They merge a fundamental understanding of biology with expertise in data quality control and data security. Integrating all of these factors is key to delivering a secure and fast data portal for the scientific community.”

Program manager

NCI

