Development of a preoperative prediction model for neoplastic polyps of the gallbladder based on a convolutional neural network model using ultrasonic images
Original Article

Yongyi Zhu1#, Yi Lu2#, Qingjin Zeng1, Ping Wang3, Meiqing Cheng4, Shaohong Wu4, Yanping Mo1, Yifei Wang1, Ziqi Zhu1, Yi Zhang2, Yong Ren5,6*, Yanling Zhang1*

1Department of Ultrasound, The Third Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; 2Department of Hepatobiliary Surgery, The Third Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; 3Department of Ultrasound, The Third Affiliated Hospital of Southern Medical University, Academy of Orthopedics, Guangzhou, China; 4Department of Medical Ultrasonics, Institute of Diagnostic and Interventional Ultrasound, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; 5Smart Healthcare Research Institute, Xunfei Healthcare Technology Co., Ltd., Hefei, China; 6Shensi Lab, Shenzhen Institute for Advanced Study, UESTC, Shenzhen, China

Contributions: (I) Conception and design: Y Zhu, Y Lu, Q Zeng, Yanling Zhang; (II) Administrative support: Y Lu, Yanling Zhang; (III) Provision of study materials or patients: Y Zhu, Y Lu, Q Zeng, Yanling Zhang; (IV) Collection and assembly of data: Y Zhu, P Wang, M Cheng, S Wu, Y Mo, Y Wang, Z Zhu, Yi Zhang; (V) Data analysis and interpretation: Y Zhu, Y Lu, Y Ren; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

*These authors contributed equally to this work.

Correspondence to: Yanling Zhang, MD. Department of Ultrasound, The Third Affiliated Hospital, Sun Yat-sen University, No. 600 Tianhe Road, Guangzhou 510630, China. Email: zhangylg@mail.sysu.edu.cn; Yong Ren, PhD. Smart Healthcare Research Institute, Xunfei Healthcare Technology Co., Ltd., No. 666, Wangjiang West Road, High-tech Zone, Hefei 230071, China; Shensi Lab, Shenzhen Institute for Advanced Study, UESTC, Shenzhen, China. Email: koalary@qq.com.

Background: Polypoid lesions of the gallbladder (PLGs) are common lesions that can be classified as either nonneoplastic or neoplastic polyps. Surgical resection is recommended for neoplastic polyps, and accurate preoperative identification of neoplastic polyps is needed to guide appropriate management. However, accurately distinguishing neoplastic polyps remains challenging. Therefore, the aim of this study is to establish a preoperative prediction model for neoplastic polyps on the basis of a convolutional neural network (CNN) model using ultrasound images and evaluate its reliability.

Methods: This was a multicentre retrospective study. All included cases were divided into a training set, an internal test set, and an external test set. A CNN model was established using the Inception-V3 model, and the ultrasound images from the training set were input into the CNN for feature processing. The internal and external test set images were subsequently used to assess the predictive performance of the CNN model, which was then compared with the diagnostic performance of three sonographers with different levels of experience and an ultrasound feature-based nomogram model.

Results: A total of 380 cases (921 images in total) were retrospectively collected, with 194 cases in the training set (547 images in total), 83 cases in the internal test set (234 images in total), and 103 cases in the external test set (140 images in total). The areas under the curves (AUCs) of the CNN model were 0.896 and 0.852 in the internal and external test sets, respectively. In addition, the CNN model outperformed the three sonographers with varying levels of experience (AUC =0.687, 0.703, and 0.803, respectively), but was comparable to the nomogram model (AUC =0.880) in terms of diagnostic efficacy.

Conclusions: The CNN model, which is based on ultrasound images, demonstrated relatively good predictive performance in preoperatively identifying neoplastic polyps and may assist in the selection of treatment methods for patients with PLGs.

Keywords: Polypoid lesion of the gallbladder (PLG); ultrasonic features; convolutional neural network (CNN); prediction model


Received: 15 June 2025; Accepted: 26 September 2025; Published online: 23 January 2026.

doi: 10.21037/tgh-25-84


Highlight box

Key findings

• In this multicentre study, a preoperative prediction model, which was established for distinguishing neoplastic polyps of the gallbladder on the basis of a convolutional neural network (CNN) model using ultrasonic images, demonstrated excellent predictive performance and outperformed the sonographers in diagnostic accuracy.

What is known and what is new?

• Accurate preoperative identification of neoplastic polyps among common polypoid lesions of the gallbladder (PLGs), which can be nonneoplastic or neoplastic, is crucial for management, yet distinguishing them remains challenging.

• The CNN model, which is based on ultrasound images, has demonstrated relatively good predictive performance in preoperatively identifying neoplastic polyps.

What is the implication, and what should change now?

• This study indicated that the CNN model, which is based on ultrasound images, is a valuable tool for assisting the selection of treatment methods for PLG patients and may reduce unnecessary cholecystectomies to some extent.


Introduction

Polypoid lesions of the gallbladder (PLGs) are common lesions with a prevalence rate of 5–10% (1). This lesion is characterized by a solid, immobile, and nonshadowing protrusion that originates from the gallbladder mucosa (2). Histopathologically, PLGs can be classified into nonneoplastic (including cholesterol polyps, adenomyomatosis, and inflammatory polyps) and neoplastic (including adenoma and adenocarcinoma) categories. Among these, cholesterol polyps are the most common, as evidenced by an incidence of 60–70% (3). However, cholesterol polyps lack malignant potential and thus surgical resection is not necessary. In contrast, a gallbladder adenoma is a precancerous lesion, and the prognosis of gallbladder adenocarcinoma is extremely poor (3-6). Therefore, cholecystectomy is necessary for neoplastic polyps. Accurate preoperative identification of the PLG is crucial for selecting the most appropriate clinical treatment.

According to current guidelines for the diagnosis and treatment of PLG, cholecystectomy is recommended for patients with a PLG with a maximum diameter greater than or equal to 10 mm (7-10). However, it has been reported that up to 40–50% of patients who are recommended to undergo cholecystectomy on the basis of these guidelines are ultimately diagnosed with nonneoplastic polyps following surgery (11,12). This means that a significant proportion of patients undergo unnecessary surgical treatment. Laparoscopic cholecystectomy, although a minimally invasive procedure, can still result in severe abdominal pain, bile duct injury, and other related complications. Therefore, it is necessary to effectively identify more features to distinguish between nonneoplastic and neoplastic polyps beyond just the largest diameter of the lesion.

PLGs are mainly diagnosed via imaging, including transabdominal ultrasonography (US), endoscopic ultrasonography (EUS), computed tomography (CT), magnetic resonance (MR), and positron emission tomography-computed tomography (PET-CT) (13-21). US is the primary imaging modality recommended because of its noninvasiveness, lack of ionizing radiation, ease of use, and cost-effectiveness (8,13,22). However, accurately identifying PLGs using US can be challenging, and there is currently no consensus on the most useful US features for differential diagnosis (23-32). Additionally, US examination is disadvantageous in that it is subjective and operator-dependent, both of which affect the accuracy of the results to some extent.

In recent years, artificial intelligence (AI) has been increasingly used in the auxiliary diagnosis of diseases in the lungs, liver, breast, thyroid, and neck blood vessels. Some studies have shown that the use of AI technology can improve doctors’ diagnostic accuracy (33-39). In the field of AI, deep learning based on convolutional neural networks (CNNs) is currently the most cutting-edge technology. In medical imaging, deep learning has been successfully applied to the automatic classification, segmentation, feature extraction and analysis of images and therefore has considerable potential for further development (33). However, few studies have used CNNs to diagnose neoplastic polyps from ultrasound images, and none of them are multicentre studies (40,41). Whether a CNN can effectively distinguish cholesterol polyps from neoplastic polyps on ultrasound images therefore still needs to be verified using external data. Accordingly, the aim of this study was to establish a preoperative prediction model for neoplastic polyps based on a CNN model using ultrasound images and to evaluate its reliability. We present this article in accordance with the TRIPOD reporting checklist (available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-84/rc).


Methods

Patients

This was a multicentre retrospective study. The training cohort and internal test cohort came from The Third Affiliated Hospital of Sun Yat-sen University, and the external test cohort came from The Third Affiliated Hospital of Southern Medical University and The First Affiliated Hospital of Sun Yat-sen University. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Institutional Ethics Committee of The Third Affiliated Hospital of Sun Yat-sen University (No. [2020]02-076-01) and informed consent was obtained from all the patients. The Third Affiliated Hospital of Southern Medical University and The First Affiliated Hospital of Sun Yat-sen University were also informed of the study and gave their consent.

The inclusion criteria were as follows: (I) patients who underwent cholecystectomy and preoperative abdominal US at one of the above three hospitals from November 2008 to March 2022; and (II) postoperative pathologically confirmed gallbladder cholesterol polyps or neoplastic polyps (including adenoma and adenocarcinoma). The exclusion criteria were as follows: (I) the largest diameter of the PLG was less than 5 mm; (II) thick-walled gallbladder adenocarcinoma appeared on ultrasound images; and (III) low-quality ultrasound images.

Ultrasonic images

The US equipment used in this study included Siemens (Erlangen, Germany; ACUSON X150/300, equipped with a 6C1 convex array probe), Esaote (Genoa, Italy; My Lab Twice, equipped with a CA421 convex array probe), HITACHI (Tokyo, Japan; ALOKA α5/α7, equipped with a USF9130 convex array probe), SonoScape (Shenzhen, China; S22/S59, equipped with a C344 convex array probe/C1-6 convex array probe), Toshiba (Otawara, Japan; Aplio300/500, equipped with a PVT-375BT convex array probe), Mindray (Shenzhen, China; DC-80/Resona 7/Resona 9, equipped with a 5CS-1E Convex array probe/SC6-1U convex array probe), Philips (Bothell, WA, USA; IE33/EPIQ7, equipped with a C5-1 convex array probe), Canon (Otawara, Japan; APLIO i900, equipped with an i8CX1 convex array probe), etc.

US images, including conventional two-dimensional ultrasound images and colour Doppler ultrasound images, were retained during gallbladder US examinations of patients who had fasted for more than 8 hours. Sonographers were responsible for image acquisition. For patients who underwent multiple US examinations, the latest preoperative US images were taken, and the interval between the time of the US examinations and the operations was not more than one month. US images that met the inclusion criteria were selected by a sonographer with more than 2 years of experience in ultrasound examination, ensuring that they were unmarked, clear, and fully displayed the PLG. All included images were resized to 512×512 pixels and saved in PNG format.

Model establishment and validation

As shown in Figure 1, the workflow comprised three stages: feature extraction from input images, model establishment, and model evaluation.

Figure 1 Process involved in the establishment and evaluation of the CNN model. Firstly, images labeled by postoperative pathology results were fed into the model for feature extraction and processing. Then the Inception-V3 architecture, a classic CNN architecture, was adopted to extract their high-dimensional features. Finally, the performance of the CNN model was evaluated in both internal and external test cohorts using confusion matrix, ROC curve, and Grad-CAMs. CNN, convolutional neural network; Grad-CAM, Gradient-weighted Class Activation Mapping; ROC, receiver operating characteristic.

Feature extraction

With postoperative pathology as the gold standard, the training cohort was divided into two groups: cholesterol polyps and neoplastic polyps. The corresponding ultrasound images were subsequently tagged with their respective classification labels. These labelled images were then fed into the model for feature extraction and processing.

Model establishment

In this study, the CNN model was built on the Inception-V3 architecture. Before training, the network structure, initial weights, and hyperparameters were configured. Once the images were fed into the CNN, features were extracted layer by layer: each layer, using its current parameters, extracted features from the output of the preceding layer and passed them to the next layer, which abstracted higher-dimensional characteristics. The CNN then alternated between forward feature extraction and backward parameter updates, according to the predefined loss function and learning rate, until the desired level of prediction accuracy was achieved on the test dataset.

Deep learning algorithms rely heavily on data, and a large amount is needed to train a high-performing CNN. However, acquiring enough medical data to train such a CNN from scratch is often challenging. Given the relatively limited sample size, this study employed transfer learning, a technique that takes the parameters of a model fully trained on a large database and fine-tunes them on a smaller dataset. We first loaded the pretrained weights obtained from the ImageNet dataset. The top fully connected layers were then replaced with a single layer of two neurons, tailored to the binary classification task.
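The role of the two-neuron output layer can be illustrated with a small, dependency-free sketch. It assumes a softmax activation over the two outputs, which the text does not specify. The feature vector, weights, bias, and class order below are made-up illustrative values rather than parameters of the trained model, and `two_class_head` is a hypothetical helper name.

```python
import math

def two_class_head(features, weights, bias):
    """Map a pooled feature vector to probabilities for two classes.

    weights has one row per output neuron (2 rows); bias has 2 entries.
    A softmax over the two logits is ASSUMED here for illustration.
    """
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    # Subtract the largest logit before exponentiating, for numerical stability.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy numbers only: a 3-dimensional feature vector stands in for the
# high-dimensional features produced by the convolutional layers.
features = [0.5, -1.0, 2.0]
weights = [
    [0.2, 0.1, -0.3],   # neuron 0: cholesterol polyp (class order assumed)
    [-0.1, 0.4, 0.5],   # neuron 1: neoplastic polyp (class order assumed)
]
bias = [0.0, 0.1]

probs = two_class_head(features, weights, bias)
print(probs)  # two probabilities that sum to 1
```

During fine-tuning, only the way these weights are learned changes; the mapping from features to two class probabilities stays the same.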

Model evaluation

To assess the predictive performance of the model, confusion matrices and receiver operating characteristic (ROC) curves were employed, which allowed for a quantitative analysis of the model’s classification accuracy, including the identification of true positives, true negatives, false positives, and false negatives. Within the test cohort, the diagnostic performance of the CNN model was compared with that of three sonographers: one junior (2 years of experience), one intermediate (5 years of experience), and one senior (more than 10 years of experience). Additionally, we compared the diagnostic performance of the CNN model with that of an ultrasound feature-based nomogram model (Figure S1) that requires manual extraction of ultrasonic features, including the maximum diameter, the number of lesions, and the presence of echogenic foci. The maximum diameter refers to the longest diameter of the PLG, which can be either the longitudinal diameter (parallel to the gallbladder wall) or the transverse diameter (perpendicular to the gallbladder wall) of the lesion, measured using the caliper tool in the ultrasound imaging system. The number of lesions was classified as single or multiple: a single lesion is defined as the presence of only one PLG in the gallbladder, whereas multiple lesions refer to the presence of two or more PLGs. Echogenic foci were defined as punctate hyperechoic structures with a diameter of 1–5 mm within the PLG, with the echo intensity of the adjacent liver parenchyma as the reference standard.

The CNN model automatically learns high-dimensional features from the pixel information of raw ultrasound images. These features may involve subtle pixel intensity changes, texture patterns, and spatial relationships that are difficult to capture through traditional visual observation. Thus, to make the model’s classification logic interpretable and to verify whether its judgment basis aligns with medical diagnostic standards, we adopted the Gradient-weighted Class Activation Mapping (Grad-CAM) visualization technique. Grad-CAM computes the gradients of the model’s output with respect to the feature maps in the final convolutional layer, assigns a weight to each feature map based on these gradients, and then generates a heatmap. This heatmap delineates the regions of the input image that are most relevant to the model’s prediction.
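The weighting step of Grad-CAM can be sketched in a few lines of plain Python. This is a minimal illustration of the standard Grad-CAM formulation (each channel weight is the spatial average of that channel's gradients, followed by a ReLU over the weighted sum), not the authors' implementation. The function name `grad_cam` and the toy arrays are hypothetical, and in practice the feature maps and gradients would come from the trained network.

```python
def grad_cam(feature_maps, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    feature_maps[k][i][j]: activation of channel k at position (i, j)
    gradients[k][i][j]:    d(class score) / d(feature_maps[k][i][j])
    Returns a 2-D heatmap scaled to [0, 1].
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # Channel weight = global average of that channel's gradients.
    alphas = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]

    # Weighted sum over channels, then ReLU to keep positive evidence only.
    heatmap = [
        [
            max(0.0, sum(a * fm[i][j] for a, fm in zip(alphas, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] so the map can be rendered as a colour overlay.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap

# Toy example with 2 channels of size 2x2 (made-up numbers).
feature_maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # mean +0.5: channel supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # mean -1.0: channel argues against it
]
heatmap = grad_cam(feature_maps, gradients)
print(heatmap)
```

For display, the coarse heatmap is typically upsampled to the size of the input image and overlaid on it, so that high values mark the regions that contributed most to the predicted class.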

Statistical analysis

Statistical analysis was conducted using SPSS 25.0, with a P value <0.05 indicating statistical significance.

Data for the continuous variables are presented as means and standard deviations, and data for the categorical variables are presented as frequencies and percentages. Normally distributed continuous variables were compared using the independent two-sample t-test, whereas nonnormally distributed continuous variables were compared using the Mann-Whitney U test. The χ2 test or Fisher’s exact test was applied for the comparison of categorical variables between groups.

The predictive performance of the models was evaluated using confusion matrices and ROC curves. The sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were calculated based on the confusion matrices, whereas the area under the curve (AUC) was derived from the ROC curves.
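As a worked illustration of these definitions, the sketch below computes the five confusion-matrix metrics and a rank-based AUC in plain Python. The analysis in this study was run in SPSS; the function names here are hypothetical. The example counts (21 true positives, 9 false negatives, 48 true negatives, 5 false positives) are not reported in the article; they are inferred as one set of counts consistent with the senior sonographer row of Table 2, given 30 neoplastic and 53 cholesterol polyps in the internal test cohort. The scores used for the AUC are toy values.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return the five metrics (as percentages) from confusion-matrix counts."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }

def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores above a negative one.

    Ties count as half a win; this equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Inferred (not reported) counts that match the senior sonographer row.
senior = confusion_metrics(tp=21, fn=9, tn=48, fp=5)
rounded = {name: round(value, 1) for name, value in senior.items()}
print(rounded)

# Toy predicted probabilities for neoplastic and cholesterol polyps.
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.4])
print(toy_auc)
```

Note that sensitivity and specificity depend only on the classifier, whereas PPV and NPV also depend on the proportion of neoplastic polyps in the cohort being evaluated.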


Results

Baseline characteristics

A total of 380 patients were included in this study, comprising 277 patients from The Third Affiliated Hospital of Sun Yat-sen University, 53 patients from The Third Affiliated Hospital of Southern Medical University, and 50 patients from The First Affiliated Hospital of Sun Yat-sen University. The inclusion and exclusion criteria are illustrated in Figure 2. In total, 194 patients (547 images) were included in the training cohort, 83 patients (234 images) in the internal test cohort, and 103 patients (140 images) in the external test cohort. As shown in Table 1, the three cohorts did not differ significantly in pathological diagnosis, age, or maximum diameter of the PLG (all P>0.05).

Figure 2 Flowchart for the study. Hospital A: The Third Affiliated Hospital of Sun Yat-sen University. Hospital B: The Third Affiliated Hospital of Southern Medical University. Hospital C: The First Affiliated Hospital of Sun Yat-sen University.
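Because each patient contributes several images, a split of this kind is usually made at the patient level so that images from one patient never appear in both the training and the test set. The sketch below shows one way to do that. The article does not describe how the 277 patients from Hospital A were allocated, so the random shuffle, the 70% training fraction, the seed, and the patient identifiers are all assumptions for illustration; a 70% fraction happens to reproduce the reported 194/83 division.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Split patient IDs (not images) into training and internal test sets."""
    ids = sorted(patient_ids)            # fixed order before shuffling
    random.Random(seed).shuffle(ids)     # reproducible shuffle
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers for the 277 patients from Hospital A.
patient_ids = [f"A{i:03d}" for i in range(277)]
train_ids, test_ids = split_patients(patient_ids)
print(len(train_ids), len(test_ids))
```

All images belonging to a patient then follow that patient's ID into the corresponding set.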

Table 1

Baseline clinical information of the CNN model’s training cohort and test cohort

| Characteristics | Training cohort (n=194) | Internal test cohort (n=83) | External test cohort (n=103) | P |
|---|---|---|---|---|
| Age (years) | 41±22 | 40±14 | 41±24 | 0.55 |
| Pathological diagnoses (cholesterol polyps/neoplastic polyps) | 124/70 | 53/30 | 61/42 | 0.70 |
| Maximum diameter of PLGs (mm) | 12±5 | 12±6 | 12±8 | 0.84 |

Data are presented as mean ± standard deviation or n. CNN, convolutional neural network; PLGs, polypoid lesions of gallbladder.

Predictive performance of the CNN model

The ROC curves of the CNN model for predicting neoplastic polyps are shown in Figure 3. The AUC values for the internal and external test sets were 0.896 and 0.852, respectively. The sensitivity, specificity, accuracy, PPV, and NPV of the CNN model in predicting neoplastic polyps are shown in the confusion matrices (Figure 4).

Figure 3 ROC curves of the CNN model in the internal and external test cohorts. AUC, area under the ROC curve; CNN, convolutional neural network; ROC, receiver operating characteristic.
Figure 4 Confusion matrix of the CNN model in the internal test cohort (A) and the external test cohort (B). Rows and columns represent true labels (pathological results) and predicted labels (CNN predicted results), respectively. Based on the confusion matrix, the sensitivity and specificity of the CNN model for predicting neoplastic polyps in the internal test set were 74% and 91%, while in the external test set, the sensitivity and specificity were 58% and 86%, respectively. CNN, convolutional neural network.

Figure 5 and Table 2 compare the diagnostic performance of the CNN model with that of the sonographers and the nomogram model. The CNN model (AUC =0.896) outperformed all three sonographers (AUC =0.803 for the senior, 0.703 for the intermediate, and 0.687 for the junior sonographer), whereas its performance was comparable to that of the nomogram model (AUC =0.880).

Figure 5 ROC curves of the CNN model, sonographers, and the feature-based nomogram model in the internal test cohort. AUC, area under the curve; CNN, convolutional neural network; ROC, receiver operating characteristic.

Table 2

Diagnostic performance of the CNN model, sonographers and the nomogram model

| Model/reader | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | NPV (%) | AUC |
|---|---|---|---|---|---|---|
| CNN model, internal test cohort | 74.3 | 91.1 | 84.3 | 84.2 | 85.2 | 0.896 |
| CNN model, external test cohort | 58.4 | 86.2 | 75.3 | 69.3 | 79.2 | 0.852 |
| Sonographer, senior | 70.0 | 90.6 | 83.1 | 80.8 | 84.2 | 0.803 |
| Sonographer, intermediate | 50.0 | 90.6 | 75.9 | 75.0 | 76.2 | 0.703 |
| Sonographer, junior | 60.0 | 77.4 | 71.1 | 60.0 | 77.4 | 0.687 |
| Nomogram model | 73.3 | 84.9 | 80.7 | 84.9 | 73.3 | 0.880 |

AUC, area under the curve; CNN, convolutional neural network; NPV, negative predictive value; PPV, positive predictive value.

Visual explanation of the CNN model

Figure 6 illustrates the application of Grad-CAM on the CNN model to generate heatmaps. Notably, the areas of interest identified by the CNN model closely align with the clinically relevant regions.

Figure 6 Examples of visualization using Grad-CAM on the CNN model. (A) Ultrasound images with PLG regions outlined. (B) Heatmaps highlighting regions identified by the CNN model. Red areas indicate strong contributions to diagnosis, while blue areas contribute less. CNN, convolutional neural network; Grad-CAM, Gradient-weighted Class Activation Mapping; PLG, polypoid lesion of the gallbladder.

Discussion

In this study, we established a preoperative prediction model for neoplastic polyps of the gallbladder based on a CNN model using ultrasonic images, which demonstrated good predictive performance in both the internal and external test sets (AUC =0.896 and 0.852, respectively). Grad-CAM heatmaps revealed that the focus areas of the CNN model roughly overlapped with those of clinical diagnosis, indicating a certain degree of interpretability in predicting neoplastic polyps. Furthermore, the diagnostic performance of the CNN model was compared with that of three sonographers with different levels of experience and with the nomogram model based on ultrasound features. The results showed that the CNN model outperformed the sonographers in diagnostic accuracy while achieving predictive performance comparable to that of the nomogram model, suggesting promising clinical applications for CNNs in predicting neoplastic polyps.

To our knowledge, while prior studies have explored the use of AI methods in analysing ultrasound images of PLGs, this is the first multicentre study to employ a CNN model for discriminating PLGs. Chen et al. developed a computer-assisted diagnosis system based on principal component analysis (PCA) and the AdaBoost algorithm, achieving an AUC of 0.862 for PLG diagnosis (42). Yuan et al. employed support vector machine (SVM) to extract spatial and morphological features of PLGs, achieving an AUC of 0.898 when combining the two types of features (43). These studies demonstrate the feasibility of using AI to aid in PLG diagnosis. However, owing to the limited number of cases, deep learning techniques, which are capable of multilayer representation learning of ultrasound images, were not used in these studies. Instead, they focused on learning only one or two layers of data representation. Subsequently, Jeong et al. established a decision support system based on deep learning (DL-DSS) for PLG discrimination, achieving an AUC of 0.92. Furthermore, they indicated that the DL-DSS could decrease the gap between reviewers and reduce the false positive rate (40). Nevertheless, this was a single-centre study, with both training and testing images sourced from the same institution. Recently, Kim et al. proposed an ensemble CNN model for PLG discrimination, achieving an AUC of 0.896 (41). Although their study was conducted as a dual-centre study, they employed fivefold cross-validation for model validation rather than an independent test set. Consequently, it remains unclear whether the model demonstrates similar diagnostic performance on external data. However, our current multicentre study addresses these limitations by establishing a CNN model that demonstrates good predictive performance in both internal and external test sets.

The CNN model demonstrated high sensitivity, specificity, and accuracy in predicting neoplastic polyps within the internal test set, with a particularly impressive specificity of 91.1%. A high specificity indicates that the CNN model has a low false positive rate in identifying neoplastic polyps, meaning that a greater number of cholesterol polyps are likely to be identified preoperatively, which holds significance in reducing unnecessary surgical procedures. Compared with the internal test set, the diagnostic efficacy of the CNN model in the external test set was slightly lower, potentially due to insufficient generalization capabilities of the model on the external test set data. In addition to the lack of generalizability, there may be differences in data noise and data distribution biases between the external and internal test sets, which can affect the model’s diagnostic efficacy. Nevertheless, with an AUC of 0.852 and a specificity of 86.2% in the external test set, the model still demonstrated good diagnostic performance, especially in reducing the false positive rate.

The diagnostic efficacy of the CNN model was superior to that of three sonographers with different levels of experience, indicating that clinically distinguishing neoplastic polyps from cholesterol polyps can be challenging, especially for junior sonographers. One of the limitations of US is its operator dependency, as sonographers differ in terms of medical knowledge, experience, and skill level, leading to different interpretations of the same patient’s results. In contrast, the CNN model can be trained on many standardized medical images and data, enabling it to learn more features and patterns and thereby enhancing its diagnostic efficacy. The ultrasound feature-based nomogram model demonstrated good diagnostic performance in the internal test set in this study, comparable to that of the CNN model. However, the ultrasound features required for the nomogram need to be manually extracted by sonographers, which introduces a degree of subjectivity, and manual feature extraction takes longer than computer processing of images. Chen et al. reported that the average time for a physician to diagnose PLG based on ultrasound images is 3 seconds, whereas a computer needs only 0.02 seconds (42). Overall, the application of CNNs in distinguishing neoplastic polyps from cholesterol polyps holds broad clinical prospects. However, the use of AI is not intended to replace clinicians but rather to assist them in making better decisions. Therefore, further research is needed to explore how to integrate this technology into medical practice and maximize its clinical impact, aiding sonographers in more accurate diagnosis.

In practical applications, there may be concerns about the interpretation of deep learning results, specifically the inability to explain the basis for a model’s judgement despite its excellent classification performance. However, the representations learned by CNNs are well suited for visualization, primarily because they represent visual concepts. Since the model’s classification is based on high-dimensional features learned from different pixels of the original ultrasound images, we utilize Grad-CAM visualization techniques to identify the most indicative regions in the ultrasound images for the CNN model’s judgement (44). By leveraging the heatmaps generated by Grad-CAM, we can determine whether the model’s classification basis aligns with medical diagnostic criteria. As shown in Figure 6, the feature regions used by the CNN model for diagnosing neoplastic polyps roughly overlap with the regions of clinical diagnostic interest, indicating that the model’s classification basis is clinically relevant. However, the highlighted regions in the heatmaps do not completely overlap with the regions of clinical diagnostic interest, which can be explained from multiple perspectives. Firstly, it could be due to the involvement of several parameters in the deep learning process, making it difficult to identify specific variables that contribute to each prediction, potentially leading to overfitting (44). Secondly, there is another possibility that the model identifies some features in regions outside the PLG, such as the gallbladder wall and liver parenchyma, which are hard for the naked eye to distinguish but help the CNN model in classification. However, this hypothesis still needs to be verified with a larger sample size.

This study has certain limitations. First, while it is a multicentre study with a relatively large sample size, the number of patients was still insufficient for training a CNN model. Second, as a retrospective study, even though poor-quality images have been screened out, differences in imaging resolution among various ultrasound instruments can still lead to variations in image quality, which may have some impact on the results. However, this reflects the reality of clinical applications. Finally, the study focused on static ultrasound images, which may not capture all the information about the lesion. Therefore, the next step in our research plan is to include dynamic ultrasound videos for training the CNN model.


Conclusions

In conclusion, we proposed a gallbladder neoplastic polyp prediction model based on a CNN architecture using ultrasound images and reported its relatively good predictive performance in both internal and external test sets. Notably, the CNN model outperformed three sonographers with different levels of experience, making it a valuable tool for assisting in the preoperative diagnosis of gallbladder neoplastic polyps that may reduce unnecessary cholecystectomies to some extent.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-84/rc

Data Sharing Statement: Available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-84/dss

Peer Review File: Available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-84/prf

Funding: This work was supported by the Shenzhen Science and Technology Program (No. JCYJ20220530145001002); the Science and Technology Projects in Guangzhou (No. 2023A04J1809); and the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515010178).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tgh.amegroups.com/article/view/10.21037/tgh-25-84/coif). Y.R. is an employee of Xunfei Healthcare Technology Co., Ltd., but all research work related to this manuscript was completed independently prior to his employment at this company. The company provided no support for the study, and there are no actual conflicts of interest to declare. The other authors have no conflicts of interest to disclose.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Institutional Ethics Committee of The Third Affiliated Hospital of Sun Yat-sen University (No. [2020]02-076-01), and informed consent was obtained from all the patients. The Third Affiliated Hospital of Southern Medical University and The First Affiliated Hospital of Sun Yat-sen University were also informed of the study and gave their consent.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Xu A, Hu H. The gallbladder polypoid-lesions conundrum: moving forward with controversy by looking back. Expert Rev Gastroenterol Hepatol 2017;11:1071-80.
  2. Kamaya A, Fung C, Szpakowski JL, et al. Management of Incidentally Detected Gallbladder Polyps: Society of Radiologists in Ultrasound Consensus Conference Recommendations. Radiology 2022;305:277-89.
  3. Mellnick VM, Menias CO, Sandrasegaran K, et al. Polypoid lesions of the gallbladder: disease spectrum with pathologic correlation. Radiographics 2015;35:387-99.
  4. Albores-Saavedra J, Chablé-Montero F, González-Romo MA, et al. Adenomas of the gallbladder. Morphologic features, expression of gastric and intestinal mucins, and incidence of high-grade dysplasia/carcinoma in situ and invasive carcinoma. Hum Pathol 2012;43:1506-13.
  5. Nagtegaal ID, Odze RD, Klimstra D, et al. The 2019 WHO classification of tumours of the digestive system. Histopathology 2020;76:182-8.
  6. Waller GC, Sarpel U. Gallbladder Cancer. Surg Clin North Am 2024;104:1263-80.
  7. National Comprehensive Cancer Network. NCCN Clinical Practice Guidelines in Oncology: Hepatobiliary Cancers, Version 4.2021, August 26, 2021. Available online: https://www.nccn.org/professionals/physician_gls/pdf/hepatobiliary.pdf
  8. Foley KG, Lahaye MJ, Thoeni RF, et al. Management and follow-up of gallbladder polyps: updated joint guidelines between the ESGAR, EAES, EFISDS and ESGE. Eur Radiol 2022;32:3358-68.
  9. Nagino M, Hirano S, Yoshitomi H, et al. Clinical practice guidelines for the management of biliary tract cancers 2019: The 3rd English edition. J Hepatobiliary Pancreat Sci 2021;28:26-54.
  10. Branch of Biliary Surgery, Chinese Surgical Society. Chinese Committee of Biliary Surgeons. Guideline for the diagnosis and treatment of gallbladder carcinoma (2019 edition). Zhonghua Wai Ke Za Zhi 2020;58:243-51.
  11. Patel K, Dajani K, Vickramarajah S, et al. Five year experience of gallbladder polyp surveillance and cost effective analysis against new European consensus guidelines. HPB (Oxford) 2019;21:636-42.
  12. Wennmacker SZ, van Dijk AH, Raessens JHJ, et al. Polyp size of 1 cm is insufficient to discriminate neoplastic and non-neoplastic gallbladder polyps. Surg Endosc 2019;33:1564-71.
  13. Chen T, Wang J. New progress of gallbladder polyps imaging diagnosis. Chinese Journal of Hepatobiliary Surgery 2019;25:394-7.
  14. Riddell ZC, Corallo C, Albazaz R, et al. Gallbladder polyps and adenomyomatosis. Br J Radiol 2023;96:20220115.
  15. Jang SI, Kim YJ, Kim EJ, et al. Diagnostic performance of endoscopic ultrasound-artificial intelligence using deep learning analysis of gallbladder polypoid lesions. J Gastroenterol Hepatol 2021;36:3548-55.
  16. Wennmacker SZ, Lamberts MP, Di Martino M, et al. Transabdominal ultrasound and endoscopic ultrasound for diagnosis of gallbladder polyps. Cochrane Database Syst Rev 2018;8:CD012233.
  17. Zhou W, Li G, Ren L. Triphasic Dynamic Contrast-Enhanced Computed Tomography in the Differentiation of Benign and Malignant Gallbladder Polypoid Lesions. J Am Coll Surg 2017;225:243-8.
  18. Bang SH, Lee JY, Woo H, et al. Differentiating between adenomyomatosis and gallbladder cancer: revisiting a comparative study of high-resolution ultrasound, multidetector CT, and MR imaging. Korean J Radiol 2014;15:226-34.
  19. Lee J, Yun M, Kim KS, et al. Risk stratification of gallbladder polyps (1-2 cm) for surgical intervention with 18F-FDG PET/CT. J Nucl Med 2012;53:353-8.
  20. Cho IR, Lee SH, Choi JH, et al. Diagnostic performance of EUS-guided elastography for differential diagnosis of gallbladder polyp. Gastrointest Endosc 2024;100:449-456.e1.
  21. Kozakai F, Ogawa T, Sakai T, et al. Plain Computed Tomography for Differentiating Neoplastic and Non-neoplastic Pedunculated Gallbladder Polyps. Intern Med 2024;63:3025-30.
  22. Bayram Kabaçam G, Akbıyık F, Livanelioğlu Z, et al. Decision for surgery in the management of a rare condition, childhood gallbladder polyps, and the role of ultrasonography. Turk J Gastroenterol 2013;24:556-60.
  23. Bhatt NR, Gillis A, Smoothey CO, et al. Evidence based management of polyps of the gall bladder: A systematic review of the risk factors of malignancy. Surgeon 2016;14:278-86.
  24. Zevallos Maldonado C, Ruiz Lopez MJ, Gonzalez Valverde FM, et al. Ultrasound findings associated to gallbladder carcinoma. Cir Esp 2014;92:348-55.
  25. Sun Y, Yang Z, Lan X, et al. Neoplastic polyps in gallbladder: a retrospective study to determine risk factors and treatment strategy for gallbladder polyps. Hepatobiliary Surg Nutr 2019;8:219-27.
  26. Liu XS, Chen T, Gu LH, et al. Ultrasound-based scoring system for differential diagnosis of polypoid lesions of the gallbladder. J Gastroenterol Hepatol 2018;33:1295-9.
  27. Yang JI, Lee JK, Ahn DG, et al. Predictive Model for Neoplastic Potential of Gallbladder Polyp. J Clin Gastroenterol 2018;52:273-6.
  28. Zhu L, Han P, Jiang B, et al. Value of Conventional Ultrasound-based Scoring System in Distinguishing Adenomatous Polyps From Cholesterol Polyps. J Clin Gastroenterol 2022;56:895-901.
  29. Zhang X, Wang J, Wu B, et al. A nomogram-based model and ultrasonic radiomic features for gallbladder polyp classification. J Gastroenterol Hepatol 2022;37:1380-8.
  30. Liu K, Lin N, You Y, et al. Risk factors to discriminate neoplastic polypoid lesions of gallbladder: A large-scale case-series study. Asian J Surg 2021;44:1515-9.
  31. Wang Y, Peng J, Liu K, et al. Preoperative prediction model for non-neoplastic and benign neoplastic polyps of the gallbladder. Eur J Surg Oncol 2024;50:107930.
  32. Li Q, Dou M, Liu H, et al. Prediction of neoplastic gallbladder polyps in patients with different age level based on preoperative ultrasound: a multi-center retrospective real-world study. BMC Gastroenterol 2024;24:146.
  33. Choy G, Khalilzadeh O, Michalski M, et al. Current Applications and Future Impact of Machine Learning in Radiology. Radiology 2018;288:318-28.
  34. Gao C, Wu L, Wu W, et al. Deep learning in pulmonary nodule detection and segmentation: a systematic review. Eur Radiol 2025;35:255-66.
  35. Nam JG, Park S, Hwang EJ, et al. Development and Validation of Deep Learning-based Automatic Detection Algorithm for Malignant Pulmonary Nodules on Chest Radiographs. Radiology 2019;290:218-28.
  36. Ghosh S, Zhao X, Alim M, et al. Artificial intelligence applied to 'omics data in liver disease: towards a personalised approach for diagnosis, prognosis and treatment. Gut 2025;74:295-311.
  37. Basurto-Hurtado JA, Cruz-Albarran IA, Toledano-Ayala M, et al. Diagnostic Strategies for Breast Cancer Detection: From Image Generation to Classification Strategies Using Artificial Intelligence Algorithms. Cancers (Basel) 2022;14:3442.
  38. Peng S, Liu Y, Lv W, et al. Deep learning-based artificial intelligence model to assist thyroid nodule diagnosis and management: a multicentre diagnostic study. Lancet Digit Health 2021;3:e250-9.
  39. Biswas M, Saba L, Omerzu T, et al. A Review on Joint Carotid Intima-Media Thickness and Plaque Area Measurement in Ultrasound for Cardiovascular/Stroke Risk Monitoring: Artificial Intelligence Framework. J Digit Imaging 2021;34:581-604.
  40. Jeong Y, Kim JH, Chae HD, et al. Deep learning-based decision support system for the diagnosis of neoplastic gallbladder polyps on ultrasonography: Preliminary results. Sci Rep 2020;10:7700.
  41. Kim T, Choi YH, Choi JH, et al. Gallbladder Polyp Classification in Ultrasound Images Using an Ensemble Convolutional Neural Network Model. J Clin Med 2021;10:3585.
  42. Chen T, Tu S, Wang H, et al. Computer-aided diagnosis of gallbladder polyps based on high resolution ultrasonography. Comput Methods Programs Biomed 2020;185:105118.
  43. Yuan HX, Yu QH, Zhang YQ, et al. Ultrasound Radiomics Effective for Preoperative Identification of True and Pseudo Gallbladder Polyps Based on Spatial and Morphological Features. Front Oncol 2020;10:1719.
  44. Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int J Comput Vis 2020;128:336-59.
doi: 10.21037/tgh-25-84
Cite this article as: Zhu Y, Lu Y, Zeng Q, Wang P, Cheng M, Wu S, Mo Y, Wang Y, Zhu Z, Zhang Y, Ren Y, Zhang Y. Development of a preoperative prediction model for neoplastic polyps of the gallbladder based on a convolutional neural network model using ultrasonic images. Transl Gastroenterol Hepatol 2026;11:13.
