JAMA Network Open November 02, 2018

Evaluation of Digital Breast Tomosynthesis as Replacement of Full-Field Digital Mammography Using an In Silico Imaging Trial (VICTRE)

Aldo Badano, Christian G. Graff, Andreu Badal, Diksha Sharma, Rongping Zeng, Frank W. Samuelson, Stephen J. Glick, Kyle J. Myers

Bottom Line

The Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) showed that a fully in silico imaging trial using synthetic virtual patients could reproduce the main finding of a comparative trial with human readers: digital breast tomosynthesis (DBT) detects lesions better than full-field digital mammography (DM).

Key Findings

1. Computational readers analyzed 31,055 digital mammography (DM) and 27,960 digital breast tomosynthesis (DBT) cases generated from 2,986 virtual patients with varying realistic breast densities.
2. The mean change in the area under the receiver operating characteristic curve (AUC) for overall lesion detection was 0.0587 (SE, 0.0062; P < .001), significantly favoring DBT over DM.
3. The AUC improvement with DBT was larger for masses (change in AUC, 0.0903; SE, 0.008) than for microcalcifications (change in AUC, 0.0268; SE, 0.004).
4. The in silico results were consistent with those of a comparative trial with human radiologists, which had reported a change in AUC of 0.065 (SE, 0.017) for masses, supporting the validity of the simulation approach.
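The arithmetic behind findings 2 and 4 can be sketched as follows. This is a minimal illustration, assuming a normal (Wald) approximation and treating the in silico and human-reader estimates as independent; the study's own statistical method may differ, and the function names are hypothetical.

```python
import math


def z_score(delta, se):
    """Wald z statistic for an estimate and its standard error."""
    return delta / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def compare_estimates(d1, se1, d2, se2):
    """z statistic for the difference of two estimates, assumed independent."""
    return (d1 - d2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Overall change in AUC reported above: 0.0587 (SE, 0.0062).
overall_z = z_score(0.0587, 0.0062)
overall_p = two_sided_p(overall_z)

# Masses: in silico 0.0903 (SE, 0.008) vs human-reader trial 0.065 (SE, 0.017).
mass_vs_human_z = compare_estimates(0.0903, 0.008, 0.065, 0.017)

print(f"overall z = {overall_z:.2f}, p < .001: {overall_p < 0.001}")
print(f"in silico vs human (masses) z = {mass_vs_human_z:.2f}")
```

Under these assumptions the overall effect lies more than 9 standard errors from zero, while the in silico and human-reader estimates for masses differ by well under 2 standard errors, which is what "consistent with" means here.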

Study Design

Design In silico clinical trial
Sample 2,986 virtual patients
Duration N/A
Setting In silico
Population 2,986 synthetic image-based virtual patients with breast sizes and radiographic densities representative of a screening population, with simulated compressed thicknesses ranging from 3.5 to 6 cm.
Intervention Simulation of digital breast tomosynthesis (DBT) evaluated by a computational reader for lesion detection.
Comparator Simulation of full-field digital mammography (DM) evaluated by a computational reader for lesion detection.
Outcome Difference in area under the receiver operating characteristic curve (delta-AUC) for lesion detection between the DBT and DM modalities.
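The outcome measure can be illustrated with a short sketch. It assumes the empirical (Mann-Whitney) estimate of AUC, which is one common choice and not necessarily the estimator used in the study; the reader scores below are invented for illustration only.

```python
def empirical_auc(lesion_scores, normal_scores):
    """Probability that a lesion case scores higher than a normal case.

    Ties count as half a win (Mann-Whitney estimate of the AUC).
    """
    wins = 0.0
    for s_pos in lesion_scores:
        for s_neg in normal_scores:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(lesion_scores) * len(normal_scores))


# Hypothetical reader scores for the same cases under each modality.
dm_lesion = [0.6, 0.4, 0.7, 0.5]
dm_normal = [0.3, 0.5, 0.2, 0.6]
dbt_lesion = [0.8, 0.6, 0.9, 0.7]
dbt_normal = [0.3, 0.5, 0.2, 0.6]

auc_dm = empirical_auc(dm_lesion, dm_normal)
auc_dbt = empirical_auc(dbt_lesion, dbt_normal)
delta_auc = auc_dbt - auc_dm  # positive values favor DBT

print(f"AUC DM = {auc_dm:.3f}, AUC DBT = {auc_dbt:.3f}, delta-AUC = {delta_auc:.3f}")
```

A positive delta-AUC means the reader separates lesion cases from normal cases better with DBT than with DM.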

Study Limitations

The study used computational (model-observer) readers rather than human radiologists; such readers cannot fully replicate human psychophysics, visual search patterns, or cognitive fatigue.
Simulated lesions (spiculated masses and microcalcification clusters) and synthetic breast tissue phantoms may not capture the total morphological and biological heterogeneity of actual breast cancers.
The computer models used for Monte Carlo x-ray transport and biomechanical tissue compression rely on physical approximations and may not perfectly emulate every real-world hardware artifact or patient positioning challenge.

Clinical Significance

This proof-of-concept study provides evidence that in silico (computer-simulated) imaging trials can serve as a faster, less costly alternative to some human clinical trials in the regulatory evaluation of medical imaging devices. Its results are also consistent with the advantage of DBT over DM for lesion detection reported in human-reader studies.

Historical Context

Historically, evaluating a new imaging modality required costly, lengthy multi-reader, multi-case (MRMC) clinical trials, often involving thousands of human subjects. To accelerate medical device innovation without compromising safety or efficacy, the FDA initiated the VICTRE project to test whether advanced computer simulations could replace traditional clinical trials. By reproducing the known real-world performance advantage of DBT, VICTRE offered a template for in silico evidence in regulatory science, laying the groundwork for digital twins and simulated evidence in future FDA evaluations.

Guided Discussion

High-yield insights from every perspective

Medical Student

What is the fundamental physical difference in image acquisition between Digital Breast Tomosynthesis (DBT) and Full-Field Digital Mammography (FFDM) that explains why DBT has superior lesion detection, particularly in dense breasts?

Key Response

FFDM takes a single 2D projection, leading to tissue superimposition which can obscure lesions. DBT acquires multiple low-dose projection images across an arc, which are reconstructed into 3D slices, significantly reducing overlapping tissue artifacts and improving mass visibility.

Resident

When ordering screening mammography for a patient with extremely dense breast tissue, how does the clinical choice between FFDM and DBT impact recall rates and the need for supplemental screening?

Key Response

DBT reduces the false-positive recall rate caused by overlapping normal tissue while simultaneously increasing the cancer detection rate compared to FFDM alone. While it improves detection, current practice often still considers supplemental screening (like ultrasound or MRI) for extremely dense breasts, though DBT is preferred as the baseline.

Fellow

While the VICTRE trial demonstrated the superiority of DBT for lesion detection using virtual patients, how might the detection of microcalcifications differ between DBT and FFDM, and what are the implications for interpreting synthesized 2D views?

Key Response

DBT is clearly better than FFDM for detecting masses and architectural distortion, but its advantage for microcalcifications is smaller; in VICTRE the change in AUC was 0.0268 for microcalcifications versus 0.0903 for masses. Early DBT systems were also reported to depict microcalcifications less clearly than FFDM. Because modern DBT protocols often use synthesized 2D views to reduce dose, readers should assess calcification morphology on both the synthesized 2D image and the reconstructed 3D stack.

Attending

If in silico trials like VICTRE can accurately replicate the outcomes of massive real-world screening trials, how should this paradigm shift influence our institutional approach to adopting and auditing new imaging technologies or AI algorithms?

Key Response

The success of VICTRE suggests that device procurement and AI validation might increasingly rely on simulated datasets. Attendings should recognize that in silico trials can demonstrate superiority only under modeled conditions, so post-market clinical auditing remains essential to verify outcomes against real-world factors such as technologist positioning and patient motion.

Scholarly Review

Critical appraisal through the lens of expert reviewers and guideline development

PhD

The VICTRE study relies heavily on the realism of its synthetic breast phantoms. What are the primary mathematical limitations in modeling the power spectrum of structured anatomical noise, and how might discrepancies affect the simulated ROC curves?

Key Response

The validity of an in silico trial hinges on how well the simulated structured noise mimics real breast tissue. If the power-law exponent of the synthetic tissue's noise power spectrum does not match the range seen in real breasts, the simulated observer may over- or underestimate the masking effect of breast density, which threatens the trial's external validity.
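One way to check for such a mismatch is to estimate the exponent from a measured spectrum. The sketch below assumes the common power-law model P(f) = K / f^beta and fits beta by least squares on a log-log scale; the frequencies, the constant K = 2.0, and the exponents 2.8 and 3.0 are illustrative values, not taken from the study.

```python
import math


def fit_power_law_exponent(freqs, power):
    """Estimate beta in P(f) = K / f**beta from a log-log least-squares fit."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx  # slope is -beta on the log-log plot


# Hypothetical radial spatial frequencies (arbitrary units).
freqs = [0.1 * k for k in range(1, 11)]

# Illustrative spectra: a phantom with beta = 2.8 and a reference with beta = 3.0.
phantom_nps = [2.0 / f ** 2.8 for f in freqs]
reference_nps = [2.0 / f ** 3.0 for f in freqs]

beta_phantom = fit_power_law_exponent(freqs, phantom_nps)
beta_reference = fit_power_law_exponent(freqs, reference_nps)
mismatch = beta_reference - beta_phantom

print(f"phantom beta = {beta_phantom:.2f}, reference beta = {beta_reference:.2f}")
print(f"exponent mismatch = {mismatch:.2f}")
```

A real comparison would use spectra measured from phantom and clinical images, which are noisy, so the fitted exponents would carry their own uncertainty.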

Journal Editor

As a reviewer, considering the use of mathematical model observers instead of human radiologists in the VICTRE trial, what specific validation steps must the authors provide to show that these algorithms are accurate surrogates for human mammographers?

Key Response

Computational observers do not suffer from fatigue and lack complex search heuristics. A critical reviewer would flag that unless the model observer is strictly calibrated to human performance using a subset of real-world Multi-Reader Multi-Case (MRMC) data, the trial measures theoretical device limits rather than true clinical efficacy.

Guideline Committee

Given that the ACR and USPSTF recognize improved cancer detection with DBT, should the FDA's acceptance of robust in silico evidence (like VICTRE) prompt committees to formally recommend DBT as the absolute minimum standard, fully replacing FFDM?

Key Response

While VICTRE provides evidence of DBT's superiority (a higher AUC for lesion detection), guideline committees must balance this against health equity, radiation dose, and equipment costs. Although in silico evidence reinforces existing clinical recommendations, fully retiring FFDM in guidelines requires considering global access and real-world implementation.

Clinical Landscape

Noteworthy Related Trials

2013

STORM Trial

n = 7,292 · Lancet Oncol

Tested

Digital breast tomosynthesis (DBT) plus 2D mammography

Population

Women aged 48 years and older attending population-based screening

Comparator

2D mammography alone

Endpoint

Cancer detection rate

Key result: DBT significantly increased the cancer detection rate compared to 2D mammography alone, while reducing false-positive recall rates.

2013

Oslo Tomosynthesis Screening Trial

n = 12,621 · Radiology

Tested

Digital breast tomosynthesis plus full-field digital mammography

Population

Women aged 50-69 years participating in population-based screening

Comparator

Full-field digital mammography (FFDM) alone

Endpoint

Cancer detection rate and recall rate

Key result: DBT resulted in a 27 percent increase in the detection of breast cancers and a 15 percent decrease in false-positive rates compared to FFDM alone.

2015

TOMMY Trial

n = 7,060 · Health Technol Assess

Tested

Digital breast tomosynthesis with 2D mammography

Population

Women aged 47-73 years recalled for further assessment after routine screening

Comparator

Standard 2D mammography

Endpoint

Diagnostic accuracy

Key result: DBT improved diagnostic accuracy and specificity for breast cancer detection, especially in women with radiographically dense breasts.
