A Chihuahua or a Muffin? FDA Announces Plans for Aggressive Use of Artificial Intelligence

May 30, 2025 | By Adrienne R. Lenz, Principal Medical Device Regulation Expert; Lisa M. Baumhardt, Principal Medical Device Regulatory Expert; and Gail H. Javitt

On May 8, 2025, FDA announced the successful completion of a generative artificial intelligence (AI) scientific review pilot program aimed at accelerating the review process, along with an “aggressive” timeline to roll out AI tools across the Agency. Extolling the “tremendous promise” of the new AI tools and their ability to reduce reviewer tasks that once took days to just minutes, FDA Commissioner Dr. Martin Makary directed all FDA centers to begin deployment immediately, with a goal of full integration by June 30, meaning that by that date all centers “will be operating on a common, secure generative AI system integrated with FDA’s internal data platforms.”

Notably absent from FDA’s announcement were any details about the technology deployed in the completed pilot program. According to the global consulting group ICF, the pilot used an ICF-developed Computerized Labeling Assessment Tool (CLAT) to read drug labels and pinpoint specific items for review, with the goal of improving the effectiveness of drug labeling review. As described here, CLAT processes images of carton and container labeling to verify that minimum labeling requirements are met and to flag issues such as missing objects on an image, inadequate color differentiation between strengths, missing or mis-oriented barcodes, incorrect or missing strength statements, error-prone abbreviations, look-alike labels, and insufficient text prominence. Another article about CLAT (available here) suggests FDA’s model will continually learn and improve over time.
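FDA has not published CLAT’s implementation, so the following is purely our own illustration of what rule-based label checks of this kind might look like once text has been extracted from a label image (e.g., by OCR); every name and rule below is hypothetical.

```python
import re

# Illustrative subset of ISMP-style error-prone abbreviations
ERROR_PRONE_ABBREVIATIONS = {"U", "IU", "QD", "QOD", "ug"}

def check_label_text(label_text: str) -> list[str]:
    """Flag common labeling issues in text extracted from a label image.

    A hypothetical, simplified stand-in for the kinds of checks the
    ICF article attributes to CLAT; real tools also analyze the image
    itself (barcodes, color differentiation, text prominence).
    """
    findings = []

    # 1. Error-prone abbreviations (e.g., "U" can be misread as "0")
    for token in re.findall(r"[A-Za-z]+", label_text):
        if token in ERROR_PRONE_ABBREVIATIONS:
            findings.append(f"error-prone abbreviation: {token!r}")

    # 2. Missing strength statement (e.g., "500 mg" or "10 mg/mL")
    if not re.search(r"\d+(\.\d+)?\s*(mcg|mg/mL|mg|mL|g)\b", label_text):
        findings.append("no strength statement found")

    # 3. Missing barcode data (real tools inspect the printed barcode)
    if "(01)" not in label_text:  # GS1 GTIN application identifier
        findings.append("no GS1 barcode data detected in text")

    return findings

if __name__ == "__main__":
    sample = "AmoxiCure 500 mg capsules, Take QD, (01)00312345678906"
    print(check_label_text(sample))  # flags the "QD" abbreviation
```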

Also not addressed in the announcement is whether the Agency-wide rollout will be limited to narrowly focused tasks, such as drug labeling review, or whether it will be applied to broader use cases, e.g., review of a full marketing application.

FDA has been working for years to understand the complexity of AI and how to ensure it functions as intended. As we recently blogged about here and here, FDA has issued guidance on lifecycle management for AI-enabled device software functions. FDA’s guidance discusses the use of a robust development process to ensure transparency and reduce bias, which can otherwise produce incorrect results in systematic but unforeseeable ways. With FDA’s aggressive timeline calling for Agency-wide implementation in less than two months, we wonder whether FDA will be able to apply the same lifecycle and data management practices it expects of developers of AI-enabled device software functions.

As FDA knows well, the quality of the data used to train and tune an AI model directly affects the quality of its output. In our experience with FDA review of 510(k) submissions, a single document may be updated several times over the course of the review, and a document submitted in response to an FDA information request may completely replace a previously submitted document. When developing AI models for use in reviewing 510(k)s and other applications where data can be updated throughout the review process, it will be important to clean the data to remove incorrect or duplicate information before training. That cleaning may be a manual process, one that could easily be overlooked under too aggressive a rollout.
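As a simplified illustration of that cleaning step (the data model and function below are our own invention, not anything FDA has described), deduplication might reduce each logical document to its latest non-superseded revision before it enters a training set:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SubmissionDocument:
    submission_id: str   # e.g., a 510(k) number
    doc_name: str        # logical document, e.g., "bench testing report"
    received: date       # date this revision reached FDA
    superseded: bool     # True if fully replaced by a later submission

def latest_revisions(docs: list[SubmissionDocument]) -> list[SubmissionDocument]:
    """Keep only the most recent, non-superseded revision of each document."""
    latest: dict[tuple[str, str], SubmissionDocument] = {}
    for doc in docs:
        if doc.superseded:
            continue  # replaced outright; exclude from training data
        key = (doc.submission_id, doc.doc_name)
        if key not in latest or doc.received > latest[key].received:
            latest[key] = doc
    return list(latest.values())
```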

FDA’s expectations for what is acceptable also change over time and differ between device types for a variety of reasons. When developing an AI model, it will be important to ensure that the training set reflects current expectations; training on data generated under an outdated standard or a now-obsolete guidance will likely make the model less useful.
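Continuing the hypothetical, one simple safeguard would be to screen training records against a registry of withdrawn standards and guidances before the model ever sees them (the registry entries below are invented for illustration):

```python
# Invented registry of superseded standards and withdrawn guidances
OBSOLETE_REFERENCES = {
    "ACME-STD-1:1999",         # superseded by a later edition
    "Draft Guidance X (2012)",  # withdrawn
}

def reflects_current_expectations(record: dict) -> bool:
    """Keep only records tested against currently recognized references."""
    return record["tested_to"] not in OBSOLETE_REFERENCES

raw_records = [
    {"device": "infusion pump", "tested_to": "ACME-STD-1:1999"},
    {"device": "infusion pump", "tested_to": "ACME-STD-1:2023"},
]
training_set = [r for r in raw_records if reflects_current_expectations(r)]
print(training_set)  # only the record tested to the current edition remains
```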

At a high level, and based on the reported outcome of the pilot, the use of AI for reviews across FDA sounds promising. After all, FDA has access to all of the data submitted in applications and all of the review information related to those data. It therefore seems reasonable that AI models could be trained on the data, provide useful insights to reviewers, and speed up review times. Another area where AI models may be well suited in the medical device space is postmarket data, such as Medical Device Reports (MDRs), which are submitted in a more standardized format from manufacturer to manufacturer. Applying AI models to large volumes of data from multiple manufacturers could help FDA identify early signals related to product quality and patient safety.
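As one concrete (and again hypothetical) example of such signal detection, a classic disproportionality measure like the proportional reporting ratio (PRR) from pharmacovigilance could flag device-event combinations reported more often than expected; nothing here reflects FDA’s actual methods:

```python
from collections import Counter

def proportional_reporting_ratio(reports, device, event):
    """PRR = (a / (a + b)) / (c / (c + d)), where
    a = reports of `event` for `device`,
    b = reports of other events for `device`,
    c = reports of `event` for all other devices,
    d = reports of other events for all other devices.
    `reports` is an iterable of (device, event) pairs."""
    counts = Counter(reports)
    a = counts[(device, event)]
    b = sum(n for (dev, ev), n in counts.items() if dev == device and ev != event)
    c = sum(n for (dev, ev), n in counts.items() if dev != device and ev == event)
    d = sum(n for (dev, ev), n in counts.items() if dev != device and ev != event)
    if a + b == 0 or c + d == 0 or c == 0:
        return None  # too little data for a stable estimate
    return (a / (a + b)) / (c / (c + d))

# Toy data: battery failures reported disproportionately for one device type
reports = ([("pump", "battery failure")] * 12 + [("pump", "alarm")] * 8
           + [("monitor", "battery failure")] * 3 + [("monitor", "alarm")] * 27)
print(proportional_reporting_ratio(reports, "pump", "battery failure"))  # 6.0
```

A PRR well above 1 suggests the event is over-represented for that device relative to others, which could prompt a closer look by the Agency.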

At the same time, the absence of details about the planned rollout, along with the aggressive timeline, raises potential concerns. We have all seen AI failures online, some amusing (e.g., is it a Chihuahua or a blueberry muffin?) and others with more serious implications. We hope that FDA’s quest for speed does not prevent the Agency from adhering to the “best practices” it expects of industry, so that any new tools are truly helpful to review teams and do not undermine the quality of reviews. We also hope that FDA provides transparency to industry about which documents in a submission are being reviewed by AI.

Categories: Medical Devices