AI & Imaging

LLMs may help draft radiology claim appeal letters

An Academic Radiology pilot study found that LLMs could generate useful appeal-letter templates for denied interventional radiology services, but hallucinations and fabricated references remained concerns.

Large language models may help radiology teams draft appeal letters for denied insurance coverage, according to a pilot study published in Academic Radiology.

The study evaluated whether LLMs could generate accurate, clinically valid, and usable letters for appealing insurance denials related to interventional radiology services. The work focused on a common administrative burden in imaging practices: preparing payer-facing appeals when requested procedures are not approved. 

Researchers tested 4 LLMs: Claude 3.5, Nova Pro, Llama-3.1-70B, and ChatGPT-4o. The models were prompted to generate appeal letters for simulated clinical scenarios using 3 techniques: zero-shot prompting, few-shot prompting, and retrieval-augmented generation.
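To make the three techniques concrete, here is a minimal sketch of how such prompts might be constructed. The function names, the denial scenario, and the toy word-overlap retriever are illustrative assumptions, not details from the study:

```python
# Illustrative sketch of the three prompting strategies tested in the
# study. The task wording, helper names, and toy retrieval step are
# hypothetical; the paper does not publish its prompts.

BASE_TASK = (
    "Draft a formal appeal letter for the denied interventional "
    "radiology service described below.\n\nScenario: {scenario}"
)

def zero_shot_prompt(scenario: str) -> str:
    """Zero-shot: the task instruction alone, with no examples."""
    return BASE_TASK.format(scenario=scenario)

def few_shot_prompt(scenario: str, examples: list[str]) -> str:
    """Few-shot: prepend one or more sample appeal letters."""
    shots = "\n\n".join(f"Example appeal letter:\n{e}" for e in examples)
    return f"{shots}\n\n{BASE_TASK.format(scenario=scenario)}"

def rag_prompt(scenario: str, corpus: list[str]) -> str:
    """Retrieval-augmented generation: attach the corpus passage that
    best matches the scenario. Real RAG systems use embedding search;
    simple word overlap stands in for it here."""
    words = set(scenario.lower().split())
    best = max(corpus, key=lambda doc: len(words & set(doc.lower().split())))
    return f"Reference material:\n{best}\n\n{BASE_TASK.format(scenario=scenario)}"
```

Each helper returns a prompt string that would then be sent to the model; the three differ only in how much supporting context travels with the task.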

A total of 12 appeal letters were generated and reviewed by 4 board-certified interventional radiologists. Reviewers were blinded to the model and prompting method used for each letter. They assessed content, grammar, structure, and usability, while references cited by the models were checked for accuracy.

Across the models, mean content scores were 3.9 out of 5, while mean grammar and structure scores were 4.3 out of 5. The letters were generally viewed as readable and usable, though reviewer agreement varied across scoring categories.

Usability was one of the stronger findings. Reviewers indicated that the LLM-generated letters would serve as helpful templates in 73% of cases. That suggests the tools may have near-term value as drafting aids rather than autonomous appeal systems.

Safety and accuracy concerns remained. Hallucinations were identified in 16 of 48 letters. ChatGPT-4o was more vulnerable to hallucinations than the offline models in the study, according to the reported results.

Reference accuracy was also a limitation. Of 44 references cited across the generated letters, 80% of those produced by the offline models were fabricated, while 29% of ChatGPT-4o-generated letters contained fabricated references.

The findings point to a practical but limited role for LLMs in radiology administration. These tools may reduce the time needed to prepare first drafts, but the outputs still require human review before submission to insurers.

The authors concluded that generative AI may help reduce administrative burden related to prior authorizations or denials, but careful oversight remains necessary. That caution is especially relevant when letters include clinical reasoning, literature references, or payer-facing statements that could affect patient access to care.

The study adds to a growing body of work examining LLMs in radiology beyond image interpretation, including documentation, report summarization, patient communication, and workflow support.

About the author

Editorial Team, RadiologySignal.com

Radiology Signal Staff covers developments across medical imaging, radiology AI, imaging informatics, clinical research, and radiology business. The team monitors primary sources, peer-reviewed studies, company announcements, society updates, and healthcare industry news to deliver concise reporting for imaging professionals.