A new study published on March 24 in Radiology, the journal of the Radiological Society of North America (RSNA), shows that both radiologists and multimodal large language models (LLMs) have difficulty telling real X-rays apart from artificial intelligence (AI)-generated “deepfake” images. The findings raise concerns about the risks posed by synthetic medical images and highlight the need for better tools and training to help protect the accuracy of medical imaging and prepare health care professionals to recognize deepfakes.
A “deepfake” is a video, image or audio file that appears authentic but has been created or altered using AI.
“Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present,” said lead study author Mickael Tordjman, M.D., a post-doctoral fellow at the Icahn School of Medicine at Mount Sinai in New York. “This creates a high-stakes vulnerability for fraudulent litigation if, for example, a fabricated fracture could be indistinguishable from a real one. There is also a significant cybersecurity risk if hackers were to gain access to a hospital’s network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos by undermining the fundamental reliability of the digital medical record.”
Study Details and Image Testing
The research included 17 radiologists from 12 institutions across six countries (United States, France, Germany, Turkey, United Kingdom and United Arab Emirates). Their experience ranged from beginners to specialists with up to 40 years in practice. In total, the study examined 264 X-ray images, split evenly between real scans and AI-generated ones.
Participants reviewed two separate image sets with no overlap. One set contained a mix of real images and ChatGPT-generated X-rays from different parts of the body. The second set focused on chest X-rays, with half real and half created using RoentGen, an open-source generative AI diffusion model developed by Stanford Medicine researchers.
Detection Accuracy for Radiologists and AI
When radiologists reviewed the images for technical quality without being told that fake images were included, they recognized only 41% of the AI-generated X-rays. Once they were informed that synthetic images were present, their average accuracy in distinguishing real from fake rose to 75%.
Performance varied widely among individuals. Radiologists correctly identified between 58% and 92% of the ChatGPT-generated images. AI systems showed similar limitations. Four multimodal LLMs were tested: GPT-4o (OpenAI), GPT-5 (OpenAI), Gemini 2.5 Pro (Google) and Llama 4 Maverick (Meta). Their accuracy rates ranged from 57% to 85%. Even GPT-4o, the same model used to generate the deepfake images, did not detect all of them, though it performed better than the other models.
For the RoentGen-generated chest X-rays, radiologists achieved accuracy rates between 62% and 78%, while the AI models ranged from 52% to 89%.
Experience Does Not Guarantee Detection
The study found no link between a radiologist’s years of experience and their ability to identify fake X-rays. However, musculoskeletal radiologists performed significantly better than other subspecialists.
Visual Clues in Deepfake X-Rays
Researchers identified several patterns that can appear in synthetic images.
“Deepfake medical images often look too perfect,” Dr. Tordjman said. “Bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures appear unusually clean and consistent, often limited to one side of the bone.”
Risks and Safeguards for Medical Imaging
The results highlight serious risks if deepfake X-rays are misused. Fabricated images could be used in legal cases or inserted into hospital systems to influence diagnoses and disrupt care.
To reduce these threats, researchers recommend stronger digital protections. These include invisible watermarks embedded directly into images and cryptographic signatures linked to the technologist at the time of image capture, which can help verify authenticity.
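As a concrete illustration of the second safeguard, the sketch below shows how a cryptographic signature could bind an image to the technologist who captured it, so that any later alteration or substitution with a synthetic image becomes detectable. This is a minimal sketch, not the study authors' implementation: it assumes Python with the third-party cryptography package, and the technologist ID, key handling and image bytes are hypothetical placeholders.

```python
# Minimal sketch: sign an X-ray at capture time so tampering or substitution
# can be detected later. Illustrative only; not the study authors' method.
# Requires the third-party "cryptography" package (pip install cryptography).

import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Key pair issued to the technologist (assumption: one key per operator,
# managed by the hospital's imaging infrastructure).
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()


def sign_image(image_bytes: bytes, technologist_id: str) -> bytes:
    """Sign a digest of the pixel data together with the technologist's ID."""
    payload = hashlib.sha256(image_bytes).digest() + technologist_id.encode()
    return private_key.sign(payload)


def verify_image(image_bytes: bytes, technologist_id: str, signature: bytes) -> bool:
    """Return True only if the image is byte-identical to what was signed."""
    payload = hashlib.sha256(image_bytes).digest() + technologist_id.encode()
    try:
        public_key.verify(signature, payload)
        return True
    except InvalidSignature:
        return False


# Any change to the image bytes invalidates the signature.
original = b"...raw X-ray pixel data..."  # hypothetical placeholder
sig = sign_image(original, "tech-0042")
assert verify_image(original, "tech-0042", sig)
assert not verify_image(original + b"x", "tech-0042", sig)
```

In a real deployment, the signature and the signer's identity would travel with the image (for example, in its metadata), and verification would check against a public key registered with the hospital's imaging systems rather than one generated in the same script.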
The Future of AI in Medical Imaging
“We are potentially only seeing the tip of the iceberg,” Dr. Tordjman said. “The logical next step in this evolution is AI generation of synthetic 3D images, such as CT and MRI. Establishing educational datasets and detection tools now is critical.”
To support education and awareness, the researchers have released a curated deepfake dataset that includes interactive quizzes for training purposes.
