

Opening a window on the history of FDA

ICF data scientists and engineers help make a century’s worth of agency documents searchable for researchers, journalists, and the public.


Since 2014, the Food and Drug Administration (FDA) has committed to new levels of transparency and accountability through openFDA, aiming to “educate the public and save lives.” Since its launch, openFDA has consistently made new datasets and resources available to researchers, journalists, and the public.

One example is a collection of news releases and public health alerts dating to the agency’s founding in 1913. The information in these historical documents sheds light on the responsibilities and activities of FDA, which has had an outsized impact on the lives of American citizens for more than a century. When the agency’s historian wanted to make this collection easier for users to navigate, FDA approached ICF, which had partnered with the agency on other aspects of the openFDA project, to develop a solution.

Challenge

The documents, which detail the history of medications, adverse reactions, agency responses to disease outbreaks, and more, had already been digitized, but they weren’t available in a machine-readable format. They also spanned a period of technological change, from handwriting to typewriting to word processing. The tool ICF used to convert the images to text therefore needed to be both powerful and flexible enough to interpret letters and words despite a lot of “noise” in the background, such as handwritten notes in margins and worn areas created by paper folds.

Solution highlights
  • AI
  • Open source
  • Human-centered design

Solution

ICF’s data scientists and engineers have extensive experience working with different AI tools, and they leveraged that knowledge to choose the right one for the FDA historical documents project. Our team considered a variety of optical character recognition (OCR) tools to help interpret the database’s words before settling on Tesseract. This open-source engine aligned with openFDA’s commitment to sharing code, examples, and ideas. It also delivered higher accuracy than many expensive OCR tools currently available.

We also created visualizations based on recommendations by FDA stakeholders. These highlight details about the documents, such as the most frequently reported side effects by decade. The team used known best practices for user experience when designing the database’s interface and visualizations.
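As a rough illustration of the kind of by-decade summary described above, the sketch below counts the most frequent terms per decade. It is not ICF's implementation: the function name `top_terms_by_decade`, the `(year, terms)` record shape, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def top_terms_by_decade(records, top_n=3):
    """Return {decade: [(term, count), ...]} with the top_n terms per decade.

    `records` is an iterable of (year, terms) pairs. This shape is an
    assumption made for illustration, not the actual openFDA schema.
    """
    counts = defaultdict(Counter)
    for year, terms in records:
        decade = (year // 10) * 10  # e.g. 1957 -> 1950
        counts[decade].update(terms)
    return {
        decade: counter.most_common(top_n)
        for decade, counter in sorted(counts.items())
    }


# Made-up sample records, for illustration only.
sample = [
    (1952, ["nausea", "rash"]),
    (1957, ["nausea"]),
    (1963, ["headache", "nausea"]),
    (1968, ["headache"]),
]

result = top_terms_by_decade(sample, top_n=1)
print(result)
```

In practice the terms would come from the OCR text after some extraction step, which this sketch leaves out.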

Finally, the team created APIs on the database’s back end so that users could grab the data and pull it into their own tools and systems for research, reporting, and other purposes.
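To give a sense of how a user might pull this data into their own tools, here is a minimal sketch that builds a query URL and fetches JSON using only the Python standard library. The endpoint path, the `search` and `limit` parameters, and the `year` field are assumptions based on openFDA's general API conventions and should be checked against the openFDA documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the historical documents dataset; verify before use.
BASE_URL = "https://api.fda.gov/other/historicaldocument.json"


def build_query_url(search, limit=5):
    """Build a query URL from a search expression and a result limit."""
    params = {"search": search, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_documents(search, limit=5):
    """Fetch matching records as a list of dicts (requires network access)."""
    with urlopen(build_query_url(search, limit)) as response:
        payload = json.load(response)
    return payload.get("results", [])


# The `year` field name is an assumption made for illustration.
url = build_query_url("year:1950", limit=2)
print(url)
```

Only `build_query_url` runs here. Calling `fetch_documents` makes a live request, so it is left to the reader.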

Where we are now

The historical documents database, which comprises more than 8,500 documents, went live in late March 2024. The FDA historian and other stakeholders were thrilled to have such a powerful tool to share this valuable information with the public. With the openFDA site averaging 11 million viewers per month, these resources are sure to reach a wide audience and support openFDA's goal of educating the public and saving lives.

“At ICF we leverage the power of open source to enhance our data science projects. Open source allows us to innovate with transparency and collaborate globally, ensuring that our solutions are not only cutting-edge but also community-driven and adaptable.”

Alyssa Rolfe
Ph.D., Senior Data Scientist, ICF