LLMs may help draft radiology claim appeal letters
An Academic Radiology pilot study found that LLMs could generate useful appeal-letter templates for denied interventional radiology services, but hallucinations and fabricated references remained concerns.

Large language models may help radiology teams draft appeal letters for denied insurance coverage, according to a pilot study published in Academic Radiology.
The study evaluated whether LLMs could generate accurate, clinically valid, and usable letters for appealing insurance denials related to interventional radiology services. The work focused on a common administrative burden in imaging practices: preparing payer-facing appeals when requested procedures are not approved.
Researchers tested 4 LLMs: Claude 3.5, Nova Pro, Llama-3.1-70B, and ChatGPT-4o. The models were prompted to generate appeal letters for simulated clinical scenarios using 3 techniques: zero-shot prompting, few-shot prompting, and retrieval-augmented generation.
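The paper's actual prompts and model interfaces are not reproduced here, but the rough sketch below illustrates how the three prompting strategies differ. Everything in it is hypothetical: the generate() helper stands in for a call to whichever model is being tested, and the denial scenario, example letter, and retrieved excerpts are placeholders.

```python
# Illustrative sketch only; not the study's code. `generate()` is a hypothetical
# stand-in for a call to whichever LLM is being tested.

def generate(prompt: str) -> str:
    # Placeholder: in practice this would call the chosen model's API.
    return "[model output for prompt beginning: " + prompt[:40] + "...]"

denial = "Payer denied CPT 37243 (embolization) as not medically necessary."

# 1. Zero-shot prompting: only the task and the denial details are supplied.
zero_shot_letter = generate(
    "Write an insurance appeal letter for this denied interventional "
    "radiology service:\n" + denial
)

# 2. Few-shot prompting: one or more example appeal letters are prepended
#    so the model can imitate their structure and tone.
example_letter = "Dear Medical Director, ... [previously successful appeal] ..."
few_shot_letter = generate(
    "Example appeal letter:\n" + example_letter + "\n\n"
    "Now write a similar appeal letter for:\n" + denial
)

# 3. Retrieval-augmented generation: guideline or policy excerpts are retrieved
#    first and passed in as grounding context the model is told to rely on.
retrieved_context = "[society guideline excerpt]\n[payer policy excerpt]"
rag_letter = generate(
    "Using only the following sources:\n" + retrieved_context + "\n\n"
    "Write an appeal letter, citing the sources, for:\n" + denial
)

print(zero_shot_letter, few_shot_letter, rag_letter, sep="\n")
```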
A total of 12 appeal letters were generated and reviewed by 4 board-certified interventional radiologists. Reviewers were blinded to the model and prompting method used for each letter. They assessed content, grammar, structure, and usability, while references cited by the models were checked for accuracy.
Across the models, mean content scores were 3.9 out of 5, while mean grammar and structure scores were 4.3 out of 5. The letters were generally viewed as readable and usable, though reviewer agreement varied across scoring categories.
Usability was one of the stronger findings. Reviewers indicated that the LLM-generated letters would serve as helpful templates in 73% of cases. That suggests the tools may have near-term value as drafting aids rather than autonomous appeal systems.
Safety and accuracy concerns remained. Hallucinations were flagged in 16 of the 48 reviewer assessments (12 letters, each evaluated by the 4 radiologists). According to the reported results, ChatGPT-4o was more vulnerable to hallucinations than the offline models in the study.
Reference accuracy was also a limitation. Of the 44 references cited across the generated letters, 80% of those produced by the offline models were fabricated; by comparison, fabricated references appeared in 29% of the letters generated by ChatGPT-4o.
The findings point to a practical but limited role for LLMs in radiology administration. These tools may reduce the time needed to prepare first drafts, but the outputs still require human review before submission to insurers.
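The study does not describe an automated verification step, but one illustration of the kind of human-in-the-loop check this implies is sketched below: before a draft letter goes out, each cited DOI is looked up against the public Crossref API to confirm it resolves to a real record. The requests library and the Crossref endpoint are real; the workflow and the placeholder DOI are assumptions for the example, and a matching record still needs manual comparison against the cited claim.

```python
# Hypothetical pre-submission reference check; not part of the study.
# A 404 from Crossref means no record exists for that DOI, a common
# signature of a fabricated citation.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if Crossref has a record for this DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

cited_dois = [
    "10.1000/example.doi",  # placeholder DOI for illustration only
]
for doi in cited_dois:
    status = "found" if doi_exists(doi) else "NOT FOUND - verify or remove"
    print(f"{doi}: {status}")
```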
The authors concluded that generative AI may help reduce administrative burden related to prior authorizations or denials, but careful oversight remains necessary. That caution is especially relevant when letters include clinical reasoning, literature references, or payer-facing statements that could affect patient access to care.
The study adds to a growing body of work examining LLMs in radiology beyond image interpretation, including documentation, report summarization, patient communication, and workflow support.