Artificial intelligence-based prediction of esophageal adenocarcinoma risk in Barrett’s esophagus patients: a literature review
Introduction
Barrett’s esophagus (BE), characterized by metaplastic columnar epithelium replacing squamous esophageal epithelium due to chronic gastroesophageal reflux disease (GERD), is the main risk factor for esophageal adenocarcinoma (EAC) (1,2). EAC incidence has risen sharply, with projections of a 7-fold increase by 2030, and late-stage diagnosis yields 5-year survival rates below 20% (3). Early detection of high-grade dysplasia (HGD) or early EAC enables curative endoscopic therapies, improving outcomes (4). However, even with high-definition white-light endoscopy (WLE) and systematic four-quadrant biopsies every 1–2 cm per the Seattle protocol, subtle or flat lesions and sampling error can lead to missed early esophageal adenocarcinoma (EEAC), with reported miss rates up to 23% (3,5,6).
The lack of universal screening guidelines and variable progression risks among BE patients highlight the need for precise tools (1,2). AI, leveraging machine learning (ML) and deep learning (DL), is transforming BE management by improving detection and risk stratification (7,8). Beyond lesion detection, AI-based computer-aided diagnostic systems have been explored for assessing invasion depth and supporting therapeutic decision-making in Barrett’s-related neoplasia (9).
Recent image-enhanced endoscopy (IEE)-based artificial intelligence (AI) studies using narrow band imaging (NBI) and linked color imaging (LCI) further illustrate feasibility for BE detection, though broader real-time validation is pending (10,11).
This paper explores AI’s role, challenges, and future potential in predicting EAC risk in BE patients. We present this article in accordance with the Narrative Review reporting checklist (available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-92/rc).
Methods
We searched PubMed, Embase, Scopus, Web of Science, and Cochrane up to June 30, 2025 (English only). ClinicalTrials.gov and World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) were checked for ongoing trials. Google Scholar was used for citation chasing.
- Inclusion: human BE/EAC studies using AI for computer-aided detection (CADe), characterization/computer-aided diagnosis (CADx), risk prediction, surveillance triage, or workflow support.
- Exclusion: editorials, letters without data, single-case reports, animal studies, and preprints or abstracts without a peer-reviewed article.
Two reviewers screened records and full texts with consensus resolution. We extracted endpoint, data source, setting, sample size, imaging modality, model type, still vs. real-time testing, validation (internal/external), comparator, and performance metrics [area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV)]. Results were synthesized narratively by domain: vision-aided endoscopy, non-vision models, and multimodal approaches. These are summarized in Table S1.
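To make the extracted performance metrics concrete, the standard definitions derived from a 2×2 confusion matrix can be sketched as follows (the counts are illustrative only and do not come from any reviewed study):

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Illustrative example: 100 dysplastic and 400 non-dysplastic frames
m = diagnostic_metrics(tp=90, fp=56, tn=344, fn=10)
print({k: round(v, 3) for k, v in m.items()})
```

Note how PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the evaluated set; this is one reason still-image benchmarks can overstate real-world predictive values.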
Current screening and surveillance: guidelines and gaps
Major society guidelines do not recommend population-wide screening for BE. They advise targeted, risk-based screening of selected individuals, such as patients with chronic GERD and additional risk factors or a family history of BE/EAC; thresholds and implementation vary by society and region (1,2,12). GERD-based criteria alone have low sensitivity (38.6–43.2%), and broader risk models show modest discrimination (AUROC 0.50–0.60) (13). Over half of EAC patients report no significant GERD symptoms, limiting symptom-based strategies (3). Screening remains underused, so many BE cases go undiagnosed despite recent healthcare encounters (14). AI can combine clinical, endoscopic, pathology, and demographic features to flag higher-risk patients for targeted screening and triage (9).
Standard surveillance relies on WLE with random four-quadrant biopsies (Seattle protocol), which is invasive, costly, operator-dependent, and prone to sampling error, contributing to interval cancers (5). Dysplasia detection is challenging, even for experts, due to heterogeneous lesion patterns (6). Progression risk varies: non-dysplastic BE has a 0.3% annual rate, while HGD ranges from 6–19% (4,15). Endoscopic biopsies have limitations due to sampling variability and interobserver differences in pathology interpretation. Low-grade dysplasia (LGD) is frequently over-diagnosed, and inflammatory changes may obscure diagnosis, resulting in “indefinite for dysplasia” classifications (15). Current biomarkers (e.g., P53) lack predictive power, necessitating advanced tools (16). AI’s ability to analyze complex datasets and detect subtle endoscopic features enhances diagnostic accuracy and optimizes surveillance (8).
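A quick illustrative calculation shows what the annual progression rates quoted above imply over a surveillance horizon, under the simplifying assumption of a constant annual rate:

```python
def cumulative_risk(annual_rate, years):
    """Cumulative progression probability assuming a constant annual rate."""
    return 1 - (1 - annual_rate) ** years

# Rates quoted in the review: ~0.3%/year for non-dysplastic BE,
# 6-19%/year for HGD (lower bound used here)
ndbe_5y = cumulative_risk(0.003, 5)    # ~1.5% over 5 years
hgd_low_5y = cumulative_risk(0.06, 5)  # ~26.6% over 5 years
print(round(ndbe_5y, 4), round(hgd_low_5y, 4))
```

The roughly 20-fold spread between these two cumulative risks is what motivates risk-stratified surveillance intervals rather than a uniform schedule.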
Evidence of AI in BE
Vision-aided tools
AI-based CADe and CADx systems improve endoscopic surveillance by identifying dysplastic or neoplastic lesions in real time (8). Convolutional neural networks (CNNs) trained on endoscopic images achieve high performance. A study reported AI detecting EEAC in BE with 90% sensitivity [95% confidence interval (CI): 83–94%] and 86% specificity (95% CI: 78–91%) (5,8). The Augsburg AI system, trained on high-definition white-light endoscopy (HD-WLE) and NBI, detects Barrett’s esophagus-related neoplasia (BERN, including LGD/HGD). In a tandem randomized video trial, stand-alone AI achieved 92.2% sensitivity and 68.9% specificity, and it improved non-expert performance on second review (sensitivity 69.8% to 78.0%; specificity 67.3% to 72.7%) (17).
IEE AI is also emerging. An NBI-trained classifier detected BE on histology-confirmed images with accuracy 94.4%, sensitivity 94.3%, and specificity 94.4% (10). An LCI-based Vision Transformer for short-segment BE reached accuracy 90.5% with sensitivity 90.1% and specificity 91.1%. Both were single-center, still-image evaluations; real-time, multicenter validation is needed (11).
The American Society for Gastrointestinal Endoscopy (ASGE)’s Preservation and Incorporation of Valuable Endoscopic Innovations (PIVI) guidelines require 90% sensitivity, 98% NPV, and 80% specificity for replacing random biopsies, thresholds that AI systems are approaching (18).
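The PIVI benchmarks can be expressed as a simple pass/fail check. This is a minimal sketch, with thresholds taken from the PIVI statement cited above and hypothetical performance numbers:

```python
# ASGE PIVI thresholds for replacing random biopsies
PIVI = {"sensitivity": 0.90, "npv": 0.98, "specificity": 0.80}

def meets_pivi(sensitivity, npv, specificity):
    """Return which PIVI criteria a candidate system meets."""
    observed = {"sensitivity": sensitivity, "npv": npv, "specificity": specificity}
    return {k: observed[k] >= threshold for k, threshold in PIVI.items()}

# Hypothetical CADe performance
print(meets_pivi(sensitivity=0.92, npv=0.97, specificity=0.86))
# sensitivity and specificity pass; NPV falls just short of the 0.98 threshold
```

In practice, the binding constraint is usually the 98% NPV, which is sensitive to dysplasia prevalence in the target population, so a system can meet PIVI in one cohort and miss it in another.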
Non-vision-aided interpretation models
Non-vision-aided AI models predict BE and EAC risk by analyzing electronic health records (EHRs), clinical variables, biomarkers, and histopathological data (9). The Mayo Clinic’s transformer-based ML model, trained on over 6 million patient records, achieved an AUROC of 0.84 for predicting incident BE/EAC within 3 years (9). This model outperformed the Michigan Barrett’s Esophagus Risk Estimation Tool (M-BERET) (13).
TissueCypher employs a multiplex immunofluorescence assay with ML to generate a 5-year progression risk score for non-dysplastic BE and LGD, with reported sensitivity of approximately 88% for detecting progression to HGD or EAC (16,19).
AI-based histopathology models trained on whole-slide images have demonstrated high accuracy and reduced interobserver variability in dysplasia grading (20).
Emerging 3D pathology models using open-top light-sheet microscopy provide volumetric assessment of BE biopsies and show improved dysplasia detection compared with conventional 2D histology, though widespread adoption remains limited (21,22).
Layers of AI applications in BE
AI applications in BE management span multiple domains, creating a comprehensive approach to screening, detection, and treatment. Screening optimization is facilitated by models like the M-BERET, which uses logistic regression to predict BE presence based on clinical risk factors such as age, sex, smoking history, and GERD symptoms, achieving a sensitivity of 65% and specificity of 70% in a cohort of 2,000 patients, guiding endoscopic referrals in primary care settings (1). EHR-based models, such as the Mayo Clinic’s transformer-based ML, automate risk assessment by employing natural language processing (NLP) to analyze 6 million patient records, extracting features like diagnostic codes and medications to flag high-risk individuals for screening with an AUROC of 0.84, reducing unnecessary endoscopies (4). Lesion detection is enhanced by CADe systems, which improve real-time dysplasia/EAC identification, minimizing miss rates during surveillance (3).
CADx can help predict invasion depth. In a multicenter pilot using still HD-WLE images, an AI model differentiated T1a (mucosal) from T1b (submucosal) Barrett’s cancer with 77% sensitivity, 64% specificity, and 71% accuracy, comparable to expert endoscopists (9). Risk stratification is achieved through non-vision models that tailor surveillance intervals and identify candidates for endoscopic therapy, particularly for non-dysplastic BE (4). Prognosis and treatment planning are emerging areas where AI predicts treatment responses, though these applications are less developed in BE compared to other cancers (2). Together, these applications integrate clinical, endoscopic, and molecular data to enhance BE management (3).
AI enhances endoscopist skills in BE management by improving diagnostic accuracy and workflow efficiency, but its success hinges on optimizing the endoscopist-AI interaction (4). Studies show AI-assisted endoscopists improve performance yet fall short of standalone AI accuracy due to suboptimal interaction, as demonstrated in ex vivo trials where AI alone outperformed human-AI teams (3,4). To address this, AI systems must prioritize user-friendly interfaces for seamless integration into endoscopic practice, minimizing disruptions during procedures (4). Technical refinements, such as reducing lag time and false positives, are critical to prevent alert fatigue, which can erode trust and effectiveness; for instance, refining algorithms to lower false-positive rates in CADe systems enhances reliability (4). Comprehensive training is essential, covering AI system operation, development insights, and output interpretation, enabling endoscopists to understand limitations such as algorithm bias (4). For trainees, AI serves as an educational tool, with CADe systems providing real-time feedback to improve lesion detection skills, though overreliance may hinder independent learning, as seen in colonoscopy studies where trainees’ visual scanning patterns were disrupted (10). In risk prediction, AI models like TissueCypher forecast 5-year progression risk with 88% sensitivity, guiding surveillance intervals, while emerging models predict endoscopic therapy responses, though validation is needed (2,7). Critically, AI’s potential to standardize care risks deskilling if not balanced with ongoing training and clinical vigilance (4). A summary of available strategies is shown in Table 1.
Table 1
| AI strategy | Example tool & performance | Clinical benefit | Primary challenges |
|---|---|---|---|
| Prescope risk triage | Mayo EHR transformer AUROC 0.84; MARK-BE score Sens 70%/Spec 65% | Flags high-risk patients before endoscopy; spares low-risk patients | Needs multicenter validation |
| Non-invasive screening | Cytosponge-TFF3 + AI raised BE detection vs. usual care | Swallow test avoids initial scope; improves patient comfort | Adoption and payment uncertainty |
| Real-time lesion detection (CADe) | Augsburg CNN Sens 96%, Spec 92%; BOIA boosted sensitivity 74%→88% | Highlights subtle dysplasia; reduces missed lesions | False alerts may distract operators |
| Lesion characterization (CADx) | CNN for T1a vs. T1b Sens 77%/Acc 71% | Guides EMR vs. surgery decisions | Risk of understaging |
| AI-guided advanced imaging | VLE heatmap Sens 86%; acetic acid AI matched Seattle yield | Targets biopsies; reduces random sampling and procedure time | High hardware cost; staff training |
| Pathology risk stratification | TissueCypher Sens 88% (5-year progression); whole-slide DL AUROC 0.94 | Personalizes surveillance intervals; reduces overtreatment | Panel cost; requires lab access |
| 3D pathology triage | OTLS + DL AUROC 0.92; +15% dysplasia detection | Assesses full biopsy volume; improves early detection | Prototype hardware; limited availability |
3D, three-dimensional; Acc, accuracy; AI, artificial intelligence; AUROC, area under the receiver operating characteristic curve; BE, Barrett’s esophagus; BOIA, Barrett’s oesophagus imaging for artificial intelligence; CADe, computer-aided detection; CADx, computer-aided diagnosis; CNN, convolutional neural network; DL, deep learning; EHR, electronic health record; EMR, endoscopic mucosal resection; MARK-BE, machine learning risk prediction in Barrett’s oesophagus; OTLS, open-top light-sheet microscopy; Sens, sensitivity; Spec, specificity; VLE, volumetric laser endomicroscopy.
Challenges, limitations, ethical considerations and privacy
AI in BE management faces significant hurdles that must be addressed for effective implementation. Many AI algorithms lack transparency, making it hard for clinicians to trust their predictions or use them in decision-making, so explainable AI is needed to clarify how results are generated (2).
Data quality and bias pose further challenges, as models trained on limited or non-diverse datasets may underperform in real-world settings, requiring multicenter validation to ensure generalizability (3). Integration barriers, such as the need for infrastructure and training to incorporate AI into clinical workflows and EHR systems, are particularly pronounced in community settings with limited resources. Regulatory hurdles are substantial, as the ASGE’s PIVI thresholds (90% sensitivity, 98% NPV) are stringent, and few AI systems consistently meet them across populations (3,6). Additionally, the risk of overdiagnosis due to high sensitivity may lead to unnecessary interventions, increasing patient anxiety and healthcare costs. Critical examination reveals that publication bias and overoptimistic claims may exaggerate AI performance, underscoring the need for rigorous real-world validation.
Implementing AI in BE management incurs high short-term costs that pose barriers to adoption. Developing and validating AI systems, such as CADe or 3D pathology models, requires substantial investment in computational infrastructure, specialized imaging (e.g., open-top light-sheet microscopy), and clinical trials, often costing millions, as seen in the $3.65 million National Institutes of Health (NIH) grant for 3D pathology (23). Training clinicians and integrating AI into EHR systems demand additional resources, particularly in community hospitals with limited budgets. These costs may delay widespread use, especially in low-resource settings, potentially limiting access to advanced care (24). However, long-term savings from reduced unnecessary endoscopies and early interventions could offset initial expenses, necessitating cost-effectiveness studies to justify investment (11).
Ethical issues remain an important consideration in the use of AI for BE management. Algorithms trained on non-representative datasets may perform unevenly across populations, increasing the risk of misclassification in underrepresented groups and potentially widening existing health disparities (25,26). There is also concern that excessive reliance on AI outputs could undermine clinical judgment if automated predictions are accepted without appropriate contextual interpretation (25). Informed consent is another challenge, as patients may not clearly understand how AI contributes to diagnostic or management decisions, making transparent communication about its role and limitations essential. Access is equally important, since the cost and infrastructure required for AI deployment may limit availability in resource-constrained settings (25,26).
Data privacy is a parallel concern, given the reliance of AI systems on large volumes of sensitive EHR and imaging data. These datasets increase the risk of data breaches, re-identification, and improper secondary use if safeguards are inadequate (26). Cloud-based processing and cross-jurisdictional data storage further complicate regulatory compliance. Strong de-identification practices, encryption, controlled access, and clear data governance policies are therefore necessary to protect patient trust and support responsible clinical adoption of AI in BE care (26).
Future integration and research directions
AI can strengthen BE screening and surveillance across three fronts. First, EHR-embedded risk flags from NLP models can surface high-risk patients who lack GERD symptoms and prompt targeted screening (4). Second, real-time CADe/CADx can standardize exam quality across experience levels, cut miss rates, and guide targeted biopsies, reducing reliance on untargeted four-quadrant sampling (3). Third, risk-based surveillance can lengthen intervals for low-risk patients and intensify follow-up for high-risk patients, improving resource use; formal cost-effectiveness studies are emerging but are needed at scale (2,6,11). Successful adoption requires clear performance thresholds, clinician training, and seamless EHR integration. Collaborative efforts such as BE-AI consortia and society standards (for example, PIVI) can align validation and reporting.
Near-term advances center on multimodal AI that fuses endoscopic images, EHR data, and biomarkers (for example, p53, Ki-67), with tools like TissueCypher illustrating histopathology-plus-molecular risk stratification (21). Explainable models with interpretable outputs are in development to aid trust and regulatory review. Prospective, multicenter studies are testing real-time systems (for example, the Augsburg BERN algorithm) and clinical risk-prediction tools (for example, the Mayo models) to assess generalizability (17). Early-phase work is exploring wide-area transepithelial sampling with computer-assisted 3D analysis (WATS-3D) with AI, spectral imaging with CNNs, and non-endoscopic Cytosponge approaches; these are promising but require external validation before routine use (12-14).
Equity and implementation matter. Adapting AI for low-resource settings, running longitudinal cohorts to validate progression prediction, auditing for bias, and designing cost-aware deployment plans should be priorities. This will require tight collaboration among developers, clinicians, and regulators. The ASGE AI Institute (launched 2024) is supporting education, standards, and Food and Drug Administration (FDA) engagement to speed safe adoption (15).
Conclusions
AI is poised to transform BE management by enabling more precise risk stratification, earlier detection, and personalized intervention, with the potential to substantially reduce EAC mortality. Achieving this will require prioritizing multicenter validation across diverse populations to ensure global applicability, particularly for non-vision-based models that currently rely heavily on North American data, such as TissueCypher (7). Collaborative efforts, including initiatives led by the ASGE AI Institute, can support standardized training, ethical frameworks, and scalable implementation (14). Although early adoption is associated with high upfront costs, including investment in advanced technologies such as 3D pathology, these are likely to be offset by long-term savings through fewer unnecessary endoscopies and earlier, curative interventions (6,9). AI-driven personalization of surveillance and endoscopic therapy based on individual risk profiles represents a major advance, with emerging models beginning to inform treatment selection and response prediction (2). While challenges related to bias, generalizability, and equity remain, continued progress in explainable and multimodal AI offers a clear path forward, positioning AI as a central tool in redefining BE care and making EAC a more preventable and manageable disease worldwide (4).
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-92/rc
Peer Review File: Available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-92/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-92/coif). M.A. serves as an unpaid editorial board member of Translational Gastroenterology and Hepatology from September 2024 to December 2026. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Peters Y, Al-Kaabi A, Shaheen NJ, et al. Barrett oesophagus. Nat Rev Dis Primers 2019;5:35. [Crossref] [PubMed]
- Meinikheim M, Messmann H, Ebigbo A. Role of artificial intelligence in diagnosing Barrett’s esophagus-related neoplasia. Clin Endosc 2023;56:14-22. [Crossref] [PubMed]
- Parasa S, Repici A, Berzin T, et al. Framework and metrics for the clinical use and implementation of artificial intelligence algorithms into endoscopy practice: recommendations from the American Society for Gastrointestinal Endoscopy Artificial Intelligence Task Force. Gastrointestinal Endoscopy 2023;97:815-824.e1. [Crossref] [PubMed]
- Iyer PG, Sachdeva K, Leggett CL, et al. Development of Electronic Health Record-Based Machine Learning Models to Predict Barrett’s Esophagus and Esophageal Adenocarcinoma Risk. Clin Transl Gastroenterol 2023;14:e00637. [Crossref] [PubMed]
- van der Sommen F, Zinger S, Curvers WL, et al. Computer-aided detection of early neoplastic lesions in Barrett’s esophagus. Endoscopy 2016;48:617-24. [Crossref] [PubMed]
- Thosani N, Abu Dayyeh BK, Sharma P, et al. ASGE Technology Committee systematic review and meta-analysis: imaging in Barrett’s esophagus. Gastrointest Endosc 2016;83:867-879.e7. [PubMed]
- Davison JM, Goldblum J, Grewal US, et al. Independent Blinded Validation of a Tissue Systems Pathology Test to Predict Progression in Patients With Barrett’s Esophagus. Am J Gastroenterol 2020;115:843-52. [Crossref] [PubMed]
- Sharma P, Hassan C. Artificial Intelligence and Deep Learning for Upper Gastrointestinal Neoplasia. Gastroenterology 2022;162:1056-66. [Crossref] [PubMed]
- Ebigbo A, Mendel R, Probst A, et al. Computer-aided diagnosis using deep learning in the evaluation of early oesophageal adenocarcinoma. Gut 2019;68:1143-5. [Crossref] [PubMed]
- de Groof AJ, Struyvenberg MR, van der Putten J, et al. Deep-Learning System Detects Neoplasia in Patients With Barrett’s Esophagus With Higher Accuracy Than Endoscopists in a Multistep Training and Validation Study With Benchmarking. Gastroenterology 2020;158:915-29.e4. [Crossref] [PubMed]
- Jong MR, de Groof AJ, Struyvenberg MR, et al. Advancement of artificial intelligence systems for surveillance endoscopy of Barrett’s esophagus. Dig Liver Dis 2023;55:731-9. [PubMed]
- Fitzgerald RC, di Pietro M, O'Donovan M, et al. Cytosponge-trefoil factor 3 versus usual care to identify Barrett's oesophagus in a primary care setting: a multicentre, pragmatic, randomised controlled trial. Lancet 2020;396:333-44. [Crossref] [PubMed]
- Rubenstein JH, Morgenstern H, Appelman H, et al. Prediction of Barrett's esophagus among men. Am J Gastroenterol 2013;108:353-62. [Crossref] [PubMed]
- Jong MR, de Groof AJ. Advancement of artificial intelligence systems for surveillance endoscopy of Barrett's esophagus. Dig Liver Dis 2024;56:1126-30. [Crossref] [PubMed]
- Sharma P, Dent J, Armstrong D, et al. The development and validation of an endoscopic grading system for Barrett's esophagus: the Prague C & M criteria. Gastroenterology 2006;131:1392-9. [Crossref] [PubMed]
- Critchley-Thorne RJ, Duits LC, Prichard JW, et al. A Tissue Systems Pathology Assay for High-Risk Barrett’s Esophagus. Cancer Epidemiol Biomarkers Prev 2016;25:958-68. [Crossref] [PubMed]
- Ebigbo A, Mendel R, Probst A, et al. Real-time use of artificial intelligence in the evaluation of cancer in Barrett’s oesophagus. Gut 2020;69:615-6. [Crossref] [PubMed]
- Sharma P, Savides TJ, Canto MI, et al. The American Society for Gastrointestinal Endoscopy PIVI (Preservation and Incorporation of Valuable Endoscopic Innovations) on imaging in Barrett’s Esophagus. Gastrointest Endosc 2012;76:252-4. [Crossref] [PubMed]
- Davison JM, Goldblum J, Grewal US, et al. Independent Blinded Validation of a Tissue Systems Pathology Test to Predict Progression in Patients With Barrett’s Esophagus. Am J Gastroenterol 2020;115:843-52. [Crossref] [PubMed]
- Kather JN, Heij LR, Grabsch HI, et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer 2020;1:789-99. [Crossref] [PubMed]
- Erion Barner LA, Gao G, Reddi DM, et al. Artificial Intelligence-Triaged 3-Dimensional Pathology to Improve Detection of Esophageal Neoplasia While Reducing Pathologist Workloads. Mod Pathol 2023;36:100322. [Crossref] [PubMed]
- Reddi DM, Barner LA, Burke W, et al. Nondestructive 3D Pathology Image Atlas of Barrett Esophagus With Open-Top Light-Sheet Microscopy. Arch Pathol Lab Med 2023;147:1164-71. [Crossref] [PubMed]
- National Institutes of Health. 3D pathology of Barrett’s esophagus using light-sheet microscopy. NIH RePORTER Grant R01CA234589.
- Singer ME, Smith MS. Wide Area Transepithelial Sampling with Computer-Assisted Analysis (WATS3D) Is Cost-Effective in Barrett’s Esophagus Screening. Dig Dis Sci 2021;66:1572-9. [Crossref] [PubMed]
- Kelly CJ, Karthikesalingam A, Suleyman M, et al. Key challenges for delivering clinical impact with artificial intelligence. BMC Med 2019;17:195. [Crossref] [PubMed]
- Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44-56. [Crossref] [PubMed]
Cite this article as: Ahuja P, Gangwani MK, Ahuja N, Kamal F, Shah Y, Ali H, Aziz M, Ali MA, Inamdar S. Artificial intelligence-based prediction of esophageal adenocarcinoma risk in Barrett’s esophagus patients: a literature review. Transl Gastroenterol Hepatol 2026;11:50.

