Modernizing FDA’s Historical Documents Database

Opening a window on the history of FDA

ICF data scientists and engineers help make a century’s worth of agency documents searchable for researchers, journalists, and the public.

Since 2014, the Food and Drug Administration (FDA) has committed to new levels of transparency and accountability through the openFDA project, aiming to “educate the public and save lives.” Since its launch, openFDA has consistently made new datasets and resources available to researchers, journalists, and the public.

One example is a collection of news releases and public health alerts dating to the agency’s founding in 1913. The information in these historical documents sheds light on the responsibilities and activities of FDA, which has had an outsized impact on the lives of American citizens for more than a century. When the agency’s historian wanted to make this collection easier for users to navigate, FDA approached ICF, which had partnered with the agency on other aspects of the openFDA project, to develop a solution.

Challenge

The historical documents, which detail the history of medications, adverse reactions, agency responses to disease outbreaks, and more, had already been digitized, but they weren’t available in a machine-readable format. The documents also spanned a period of technological change, from handwriting to typewriting to word processing. The tool ICF used to convert the images to text therefore needed to be both powerful and flexible enough to interpret letters and words despite a lot of “noise” in the background, such as handwritten notes in margins and worn areas created by paper folds.

Solution highlights

AI
Open source
Human-centered design

Solution

ICF’s data scientists and engineers have extensive experience working with different AI tools, and they leveraged that knowledge to choose the right one for the FDA historical documents project. Our team considered a variety of optical character recognition (OCR) tools to help interpret the database’s words before settling on Tesseract. This open-source engine aligned with openFDA’s commitment to sharing code, examples, and ideas. It also delivered higher accuracy than many expensive OCR tools currently available.
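As a rough illustration of how a scanned page might be passed through Tesseract, the sketch below shells out to the `tesseract` command-line tool from Python. It assumes the Tesseract binary is installed and on the PATH. The function names, the file name, and the page segmentation mode are illustrative assumptions, not details of ICF's actual pipeline.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Build the argument list for a Tesseract CLI call that prints text to stdout.

    `psm` is Tesseract's page segmentation mode; 3 (fully automatic) is the
    CLI default. The values here are illustrative, not ICF's settings.
    """
    return [
        "tesseract",
        str(Path(image_path)),
        "stdout",            # write recognized text to standard output
        "-l", lang,          # language model to use
        "--psm", str(psm),   # page segmentation mode
    ]


def ocr_image(image_path, lang="eng", psm=3):
    """Run Tesseract on one image and return the recognized text.

    Requires the `tesseract` binary; raises CalledProcessError if the
    command exits with an error.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical file name, used only to show the resulting command.
command = build_tesseract_command("scan_0001.png")
print(command)
```

Calling `ocr_image("scan_0001.png")` would return the recognized text for that page. Scans with margin notes or fold marks may benefit from trying other `--psm` values or cleaning the image first, which this sketch does not attempt.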

We also created visualizations based on recommendations by FDA stakeholders. These highlight details about the documents, such as the most frequently reported side effects by decade. The team used known best practices for user experience when designing the database’s interface and visualizations.
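To make the "by decade" idea concrete, here is a minimal sketch of how term mentions could be tallied per decade once the documents are available as text. The record layout, the term list, and the sample sentences are all made up for illustration. The source does not describe how ICF computed its figures.

```python
from collections import Counter, defaultdict


def decade_of(year):
    """Return the first year of the decade containing `year` (1957 -> 1950)."""
    return (year // 10) * 10


def top_terms_by_decade(records, terms, top_n=3):
    """Count how many documents in each decade mention each term.

    `records` is an iterable of (year, text) pairs; `terms` is a list of
    lowercase phrases to look for. Matching is a plain case-insensitive
    substring test, which is crude but enough for a sketch.
    """
    counts = defaultdict(Counter)
    for year, text in records:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                counts[decade_of(year)][term] += 1
    return {
        decade: counter.most_common(top_n)
        for decade, counter in sorted(counts.items())
    }


# Made-up sample records, for illustration only (not real FDA text).
sample_records = [
    (1952, "Reports of nausea and dizziness followed use of the product."),
    (1958, "Patients described nausea after taking the tablets."),
    (1963, "The agency received reports of rash and dizziness."),
]
sample_terms = ["nausea", "dizziness", "rash"]

summary = top_terms_by_decade(sample_records, sample_terms)
print(summary)
```

A real analysis would need a curated vocabulary of side effects and more careful text matching. The dictionary returned here is simply the kind of structure a chart of terms per decade could be drawn from.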

Finally, the team created APIs on the database’s back end so that users could grab the data and pull it into their own tools and systems for research, reporting, and other purposes.
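For readers who want to pull the data into their own tools, the sketch below builds a query URL in the style of openFDA's `search=field:value` syntax. The endpoint path and the field names `year` and `doc_type` are assumptions that should be checked against the openFDA API documentation. The helper function is hypothetical and makes no network call.

```python
from urllib.parse import urlencode

# Assumed endpoint for the historical documents dataset; confirm the path
# and field names against the openFDA API documentation before relying on it.
BASE_URL = "https://api.fda.gov/other/historicaldocument.json"


def build_query_url(search_terms, limit=5):
    """Build an openFDA-style query URL.

    `search_terms` maps field names to values; the pairs are joined with
    AND, following openFDA's `search=field:value` query syntax.
    """
    search = " AND ".join(
        f"{field}:{value}" for field, value in search_terms.items()
    )
    # Keep ':' unescaped so the URL stays readable; spaces become '+'.
    query = urlencode({"search": search, "limit": limit}, safe=":")
    return f"{BASE_URL}?{query}"


url = build_query_url({"year": 1950, "doc_type": "pr"})
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client. If the endpoint and fields are as assumed, the response would be JSON containing the matching documents.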

Where we are now

The historical documents database, which comprises more than 8,500 documents, went live in late March 2024. The FDA historian and other stakeholders were thrilled to have such a powerful tool to share this valuable information with the public. With the openFDA site averaging 11 million viewers per month, these resources are sure to reach a wide audience and support openFDA's goal of educating the public and saving lives.

“At ICF we leverage the power of open source to enhance our data science projects. Open source allows us to innovate with transparency and collaborate globally, ensuring that our solutions are not only cutting-edge but also community-driven and adaptable.”

Alyssa Rolfe, Ph.D., Senior Data Scientist, ICF
