
Machine Learning Applications of Surgical Imaging for the Diagnosis and Treatment of Spine Disorders: Current State of the Art



Recent developments in machine learning (ML) methods demonstrate unparalleled potential for application in the spine. The ability for ML to provide diagnostic faculty, produce novel insights from existing capabilities, and augment or accelerate elements of surgical planning and decision making at levels equivalent or superior to humans will tremendously benefit spine surgeons and patients alike. In this review, we aim to provide a clinically relevant outline of ML-based technology in the contexts of spinal deformity, degeneration, and trauma, as well as an overview of commercial-level and precommercial-level surgical assist systems and decisional support tools. Furthermore, we briefly discuss potential applications of generative networks before highlighting some of the limitations of ML applications. We conclude that ML in spine imaging represents a significant addition to the neurosurgeon’s armamentarium—it has the capacity to directly address and manifest clinical needs and improve diagnostic and procedural quality and safety—but is yet subject to challenges that must be addressed before widespread implementation.

Over the past 2 decades, we have witnessed an explosive expansion of the armamentarium of imaging technologies used to diagnose and guide interventions to treat spinal disorders. Structural and functional quantitative imaging plays a critical role in the evaluation of the spine and its contents, and advances in image analysis methods have resulted in more rapid and more specific diagnosis and therapy and in improved outcomes. More recently, the capabilities of computational modeling, as applied to clinical and spine research activities, present unparalleled opportunities to accelerate or automate routine imaging tasks such as segmentation of anatomy, object (eg, lesion or region of interest) detection, and object classification.

Building on many successes across various medical specialties,1–3 machine learning (ML) has been applied to improve the diagnosis of spinal pathologies, aid planning of surgical interventions, and refine treatment of complex spinal conditions. These developments are not limited to the research setting—ML-based tools are making their way into the clinic, with the first applications receiving Food and Drug Administration (FDA) approval.

This review provides an overview of ML applications for spine diagnostics and treatments with a special focus on developing an intuitive understanding of the application of this technology by the spine surgeon. Broadly divided into sections focusing on “discriminative” (ie, gleaning insights from supplied data) and “generative” (ie, producing a new object) applications, we survey ML research in spinal deformity, degeneration and trauma, surgical planning and intraoperative assist systems, and other quality and safety improvements (Figure 1). We conclude with a brief assessment of limitations and recapitulation of the potential benefit.


FIGURE 1. – Imaging-based ML benefits the physician and the patient. Advances in ML methods may represent significant improvements to the current workflow in spine surgery. Examples discussed in this review include increased diagnostic and characterization functionality to the existing picture archiving and communication system, prognostic support for preoperative/postoperative management decision making, and others. ML, machine learning.


TABLE 1. – Selected Literature for Discriminative ML Applications in Spine Surgery: Applications for Diagnosis and Characterization of Spinal Deformity

| Reference | ML method | Input | Determinant | Results: task | Results: index | Results: value | Context | Data set size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bergeron et al5 | SVM | Laser scan and XR (PA) | Curvature coefficient | | | | Scoliosis | 149 sets |
| Jaremko et al6 | ANN | Laser scan and XR (PA) | Cobb angle | | Se/Sp | 1.0/0.75 | Scoliosis | 65 sets |
| Ramirez et al7 | SVM, DT | Laser scan and XR (PA) | Curvature classification | SVM classification | Acc | 69%-85% | Idiopathic scoliosis | 111 sets |
| | | | | DT classification | Acc | | | |
| Komeili et al8 | DT | Laser scan and XR | Progression identification | Progression | Acc | 85.7% | Idiopathic scoliosis | 100 sets |
| | | | | Nonprogression | Acc | 71.6% | | |
| Watanabe et al9 | CNN | Moiré photograph and XR | Cobb angle | | MAE | 3.42° | Idiopathic scoliosis | 1996 sets |
| Zhang et al10 | FHT | XR | Cobb angle | | ICC | >0.95 | Scoliosis | 76 images |
| | | | | | MAD | <5° | | |
| Wu et al11 | CNN | XR | Landmark estimation | Estimation | MSE | 0.0046 | Scoliosis | 481 images |
| Wu et al12 | CNN | XR (AP, lat) | Cobb angle | AP ∡ | CMAE | 4.04° | Scoliosis | 526 images |
| | | | | Lat ∡ | | 4.07° | | |
| Pan et al13 | CNN | XR | Cobb angle | | ICC | 0.887 | Scoliosis | 248 images |
| | | | | | MAD | <3.5° | | |
| Horng et al14 | CNNs | XR (AP) | Cobb angle | | Acc | >96% | Scoliosis | 35 images |
| | | | | | Se/Sp | >0.98/0.945 | | |
| Zhang et al15 | DCNN | Model/spine XR | Cobb angle | Model ∡ | ICC | >0.912 | Scoliosis | 40 models and 65 images |
| | | | | Spine ∡ | | >0.771 | | |
| Cho et al16 | CNN | XR (lat) | Lordosis angle | Lordosis ∡ | Acc | 86.2% | Lordosis | 629 images |
| | | | | | MAE | 8.055° | | |
| Birtane et al17 | Fuzzy | Model/spine XR | King-Moe classification | Model ∡ | Acc | 80% | Scoliosis | 10 models and 25 images |
| | | | | Spine ∡ | | 50% | | |
| Yang et al18 | CNN | Photograph and XR | Detection and grading | Detection | AUC | 0.929 | Scoliosis | 1683 sets |
| Duong et al19 | Fuzzy | Reconstructed spine models | Classification patterns | | | | Scoliosis | 409 models |
| Galbusera et al20 | CNN | XR | Geometric parameters | Cobb ∡ | SE | 9.9° | Scoliosis/kyphosis/lordosis | 493 sets |
| | | | | Kyphosis ∡ | | 8.6° | | |
| | | | | Lordosis ∡ | | 11.5° | | |
∡, angle measurement; Acc, accuracy; ANN, artificial neural network; AP, anteroposterior; AUC, area under the curve; CMAE, circular mean angular error; CNN, convolutional neural network; DCNN, deep convolutional neural network; DT, decision tree; FHT, fuzzy hough transform; ICC, intraclass correlation; MAD, mean angular deviation; MAE, mean angular error; ML, machine learning; MSE, mean-squared error; PA, posteroanterior; SE, standard error; Se/Sp, sensitivity and specificity; SVM, support vector machine; XR, x-ray.


Detection and Characterization of Spinal Deformities

One of the most popular contexts for the application of discriminative neural networks (NNs) is the diagnosis and characterization of scoliosis and related spinal deformities. With a prevalence as high as 68% in adults older than 60 years, there is significant interest in developing more efficacious methods of characterizing the disease.4,5 The current clinical paradigm involves manual measurement of the Cobb angle, which is limited by dimensionality and observer bias. ML-based methods have attempted to address these limitations in different ways (Table 1).


TABLE 2. – Selected Literature for Discriminative ML Applications in Spine Surgery: Applications for Diagnosis and Characterization of Degenerative and Traumatic Spinal Pathologies

| Reference | ML method | Input | Determinant | Results: task | Results: index | Results: value | Context | Data set size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Michopoulou et al24 | FCMs | T2 MRI | DD detection | Degenerated disk ID | DSI | 89.2-91.7 | DDD | 170 IVD |
| | | | | Normal disk ID | | 84.4-88.6 | | |
| Oktay et al25 | SVM | T2 MRI | DD detection | Detection | Acc | 92.81% | DDD | 612 IVD |
| | | | | | Se/Sp | 0.946/0.898 | | |
| Ghosh et al26 | SVM, LDA, NBC, QDA, SVM | T2 MRI | DD detection | Detection | Acc | 94.86% | DDD | 175 IVD |
| | | | | | Se/Sp | 0.959/0.9245 | | |
| Ghosh et al27 | k-NN, SVM, NBC | T2-SPIR MRI | DD detection | ID by k-NN | Acc | 96.57% | DDD | 175 IVD |
| | | | | | Se/Sp | 0.981/0.959 | | |
| | | | | ID by SVM | Acc | 92.0% | | |
| | | | | | Se/Sp | 1.0/0.8852 | | |
| | | | | ID by NBC | Acc | 98.29% | | |
| | | | | | Se/Sp | 0.962/0.992 | | |
| Castro-Mateos et al28 | NN | T2 MRI | Pfirrmann grading of DD | Disk grades | Se/Sp | 0.955/0.873 | DDD | 240 IVD |
| Sundarsingh and Kesavan29 | RF | T2 MRI | DD classification | “Bulge” classification | Acc | 94% | DDD | 378 IVD |
| | | | | | Se/Sp | 0.89/0.96 | | |
| | | | | “Desiccated” classification | Acc | 92% | | |
| | | | | | Se/Sp | 0.97/0.87 | | |
| | | | | “Normal” classification | Acc | 97.3% | | |
| | | | | | Se/Sp | 0.71/1.0 | | |
| Al-Helo et al30 | NN/kM | CT (sag) | Fracture detection | Detection by NN | Acc | 93.2% | VF | 50 scans |
| | | | | Detection by kM | Se/Sp | 0.991/0.875 | | |
| Roth et al31 | DCNN | CT (ax/cor/sag) | Fracture detection | Detection | AUC | 0.857 | VF | 55 fractures in 23 patients |
| | | | | FPR = 5 | Se | 0.71 | | |
| | | | | FPR = 10 | Se | 0.91 | | |
| Burns et al32 | Unclear | CT | Fracture detection, Genant classification/grading, and bone density measurement | VF detection | Se/Sp | 0.957/0.71 | VF, OP | 210 fractures in 150 patients |
| | | | | Genant classification | Acc | 95% | | |
| | | | | | WCκ | 0.90 | | |
| | | | | Genant grading | Acc | 68% | | |
| | | | | | WCκ | 0.59 | | |
| Yousefi et al33 | SVM, k-NN | CT | Fracture detection and classification | Detection by SVM | Acc | 86.6% | VF | 25 scans |
| | | | | | Se/Sp | 0.914/0.85 | | |
| | | | | Detection by k-NN | Acc | 88.3% | | |
| | | | | | Se/Sp | 0.925/0.833 | | |
| Murata et al34 | DCNN | XR (AP) | Fracture detection | Detection | Acc | 86.0% | VF | 300 PTLR? |
| | | | | | Se/Sp | 0.847/0.873 | | |
| Veronezi et al35 | NN | XR (lat) | OA detection | Detection | Acc | 62.85% | Primary OA | 206 images |
| | | | | | Se/Sp | 0.657/0.60 | | |
| | | | | | Se/Sp | 0.85/0.679 | | |
| Ruiz-España et al36 | GVF | T2 MRI (ax/sag) | Pfirrmann grading of DD and SS detection | DD Pfirrmann classification | Se/Sp | 0.958/0.926 | DDD, SS | 295 IVD |
| | | | | SS detection | | 0.84 | | |
| Han et al37 | DCNN | T1/T2 MRI | Neural foraminal stenosis detection | Abnormal foramen detection | mAP | 0.837-0.876 | FS | 200 scans |
| Lee et al38 | DCNN | XR and DXA | Bone mineral density calculation | Density prediction | Acc | 71% | OP | 334 images |
| | | | | | Se/Sp | 0.81/0.60 | | |
∡, angle measurement; Acc, accuracy; AP, anteroposterior; AUC, area under the curve; Cκ, Cohen kappa; CMAE, circular mean angular error; CT, computed tomography; DCNN, deep convolutional neural network; DD, degenerated disk; DDD, degenerative disk disease; DSI, dice-squared index; DXA, dual-energy x-ray absorptiometry; FCMs, fuzzy c-means; FPR, false-positive rate per patient; FS, foraminal stenosis; GVF, gradient vector flow; ICC, intraclass correlation; ID, identification; IVD, intervertebral disk; kM, k-means; k-NN, k-nearest neighbor; LDA, linear discriminant analysis; MAD, mean angular deviation; MAE, mean angular error; mAP, mean average precision; MSE, mean-squared error; NBC, naive Bayesian classifier; NN, neural network; OA, osteoarthritis; OP, osteoporosis; QDA, quadratic discriminant analysis; RF, random forest; Se/Sp, sensitivity and specificity; SPIR, spectral presaturation with inversion recovery; SS, spinal stenosis; SVM, support vector machine; VF, vertebral fracture; WCκ, weighted Cohen kappa; XR, x-ray.

The earliest such attempts involved the estimation of spinal deformity from surface topography. Owing to the prohibitive cost and supine positional requirement of MRI-based methods, heterogeneity of results from extant “scoliometer” systems, and large doses of radiation required for radiography-based diagnosis and characterization, the ability to derive the same insights into skeletal structure from surface topography was urgently needed. Early efforts involved the use of artificial NNs,6 support vector machines (SVMs),5,7 and decision trees8 in the identification or characterization of scoliotic spines from laser scans. The authors pointed out that such methods could reduce radiation doses for individuals without scoliotic progression.8 Watanabe et al9 and Yang et al18 trained convolutional NN-based scoliosis classifiers based on photographic imagery and reported superhuman accuracy. These studies showed promising results in classifying and monitoring the severity of scoliosis based on surface topography. Of note, the results are heterogeneous and cannot be compared without validation studies. Nonetheless, improvements in imaging technology, ML development, and training on larger data sets could reduce misclassification errors and increase the accuracy and reproducibility of trained ML models. As such, radiation-independent identification of scoliosis and scoliotic progression may allow for implementation of such ML-based topographical analysis in a screening capacity.

Other approaches to the characterization of scoliosis have relied on ML-based extraction of spinal geometric features. Methods were trained on spinal radiographs10–14,17,19,20 or manufactured models15,17 and were either automatic or semiautomatic (ie, manually tagged region of interest). Approaches by Zhang et al,10 Wu et al,11,12 Pan et al,13 and Horng et al14 produced Cobb angle measurements without significant differences from expert physician measurements with the same interobserver variance found in manual measurement.21 A recent application of U-nets in the context of lordosis measurement achieved physician-level accuracy albeit with higher variability, demonstrating the promise of such methods in the context of other spinal deformities.16
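To make the target of these methods concrete, the Cobb angle is the angle between the endplate lines of the two most tilted (end) vertebrae of a curve. The following is a minimal sketch of that final geometric step only, assuming the endplate landmarks have already been located by some upstream method; the function names and example coordinates are hypothetical and do not reproduce any of the cited pipelines.

```python
import math

def endplate_tilt_deg(endplate):
    """Tilt of an endplate line relative to the image x-axis, in degrees.

    `endplate` is a pair of (x, y) landmarks. They are sorted left to right
    so that the tilt always falls within [-90, 90].
    """
    (x1, y1), (x2, y2) = sorted(endplate)
    return math.degrees(math.atan2(y2 - y1, x2 - x1))

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle between the two end-vertebra endplate lines, in degrees."""
    return abs(endplate_tilt_deg(upper_endplate) - endplate_tilt_deg(lower_endplate))

# Hypothetical landmark coordinates in pixels (not taken from any cited study).
upper = [(0.0, 0.0), (10.0, 2.0)]
lower = [(0.0, 50.0), (10.0, 47.0)]
print(round(cobb_angle_deg(upper, lower), 1))
```

In this framing, the ML component of a Cobb angle method is responsible for finding the landmarks, and errors in landmark placement translate directly into angular error.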

Diagnosis of Degenerative and Traumatic Spinal Pathologies

Degeneration and traumatic injury both represent major sources of healthcare burden around the world, particularly in countries with aging populations.22,23 Although significant effort has been put into developing automated methods of detecting and delineating clinically relevant spinal anatomy, few methods have achieved notable levels of accuracy in characterizing degenerative conditions. The lack of distinct radiographic intensity values for relevant anatomy poses an immense challenge: the annulus fibrosus and surrounding connective tissue are isointense, as are the nucleus of the intervertebral disk and the surrounding vertebral body. Furthermore, the 2-dimensional slices used in conventional MRI provide lower-resolution voxel data, resulting in partial volume effects that can cause blurring of boundaries.24 As such, applications of ML to radiographic interpretation tasks may represent a more streamlined approach to characterizing degenerative and traumatic spinal disease before treatment (Table 2).

ML methods applied to identifying degenerated disks have used fuzzy c-means,24 SVMs,25 sequential combinations of ML methods,26 or majority-voting systems in which the output of a variety of ML methods is weighted.27 Ghosh et al26,27 used the latter 2 methods to identify degenerative disk disease on T2-weighted midsagittal MRI scans, both achieving impressive results. Subsequent work by Oktay et al25 used an SVM on a substantially larger data set and found that their SVM method outperformed the 2 methods used by Ghosh et al.26,27
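The majority-voting idea can be sketched in a few lines: each classifier casts a vote for a label, optionally scaled by a weight, and the label with the largest total wins. This is an illustrative sketch only; the function name, labels, and weights are assumptions and do not reproduce the implementation of the cited work.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights=None):
    """Return the label with the largest total (weighted) vote.

    predictions: one predicted label per classifier.
    weights: optional per-classifier weights; equal weights if omitted.
    Ties are resolved in favor of the label that was voted for first.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if not predictions or len(weights) != len(predictions):
        raise ValueError("need one weight per prediction and at least one prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical outputs of three classifiers (eg, k-NN, SVM, NBC) for one disk.
votes = ["degenerated", "normal", "degenerated"]
print(weighted_majority_vote(votes))                   # degenerated
print(weighted_majority_vote(votes, [0.2, 0.9, 0.3]))  # normal
```

With equal weights the scheme reduces to a simple majority; unequal weights let a classifier that performs better on validation data dominate the decision.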

Later approaches used ML methods to classify degenerated disks along self-defined criteria29 or the “gold standard” Pfirrmann grading criteria.28,36 Sundarsingh and Kesavan29 developed a fully automated system that identified and classified intervertebral disks on T2 MRI as “normal,” “bulging,” or “desiccated” through a combination of feature detection methods and ML classifiers. Ruiz-España et al36 developed a semiautomatic system that used gradient vector flow to grade degenerated disks, achieving “almost-perfect” agreement with a human expert (Cohen kappa of 0.81). Similarly, Castro-Mateos et al28 used an NN to classify degenerated disks by the Pfirrmann criteria on T2 MRI with manually marked disk centers, achieving a sensitivity of 0.955 and a specificity of 0.873 (Table 2). As such, these methods demonstrate a trend of increasing automation and clinically relevant accuracy.
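Cohen kappa measures agreement between two raters after correcting for the agreement expected by chance; on the commonly used Landis and Koch scale, values of 0.81 to 1.00 are labeled “almost perfect.” The sketch below computes the unweighted statistic from two lists of grades. The function name and example grades are hypothetical and are not data from the cited studies.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa for two equal-length lists of categorical ratings."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical Pfirrmann grades for 10 disks: human expert vs automated system.
expert = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
model = [1, 2, 3, 4, 5, 1, 2, 3, 4, 4]
print(round(cohen_kappa(expert, model), 3))  # 0.875
```

Because grading scales such as Pfirrmann and Genant are ordinal, weighted variants of kappa that penalize large disagreements more than small ones are often reported instead, as in the weighted Cohen kappa values in Table 2.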

Such approaches have been implemented in computer-aided detection (CAD) systems for the automated detection of vertebral fractures. Methods used in such cases include NNs,30,33,34 deep CNNs (DCNNs),31 and SVMs33 applied to computed tomography (CT) scans and radiographs. Burns et al32 developed a CAD system capable of detecting, classifying, and Genant grading vertebral fractures with “almost-perfect” agreement with an expert radiologist. A remarkable method was developed by Murata et al,34 who trained and validated a DCNN on a data set of 300 anteroposterior radiographs. Murata et al34 compared the DCNN’s accuracy against 20 orthopedic residents, 24 board-certified orthopedic surgeons, and 9 board-certified spine surgeons. The DCNN outperformed or almost matched the residents and orthopedists for accuracy and sensitivity and caught 96% of the fractures missed by humans. This was despite the fact that the DCNN identifications were based entirely on radiographs, whereas the physicians could access nonradiological information in the electronic medical record (EMR). These works demonstrate the massive potential for ML methods to increase diagnostic accuracy and reduce the cost and radiation exposure involved: radiographs are a fraction of the cost of MRIs and require 5% to 33% as much radiation exposure as CT scans.34
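The accuracy, sensitivity, and specificity figures quoted for these detection systems (and listed in Table 2) all derive from the same four confusion-matrix counts. A minimal sketch follows, with a hypothetical function name and made-up counts that are not taken from any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: fractured cases that were detected/missed.
    tn/fp: non-fractured cases that were correctly cleared/wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),  # share of true fractures detected
        "specificity": tn / (tn + fp),  # share of normal cases cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for 100 fractured and 100 non-fractured radiographs.
print(diagnostic_metrics(tp=85, fp=13, tn=87, fn=15))
```

Accuracy alone can be misleading when fractures are rare in the test set, which is one reason studies typically report sensitivity and specificity alongside it.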

Although detection and classification of disk degeneration, disk herniation, and vertebral fractures represent the bulk of ML-based work in this space, ML-based CAD systems are being investigated for use with other structural spinal pathologies such as osteoarthritis,35 osteoporosis,38 and foraminal stenosis36,37,39 (Figure 2). Furthermore, “multivalent” CAD systems capable of identifying multiple pathologies are also under investigation.42 These applications indicate the tremendous potential of ML in the context of degenerative and traumatic spinal disease.


FIGURE 2. – Discriminative and generative machine learning. In general, ML models are divided into discriminative models and generative models. This illustration shows examples of ML applications in spine imaging and surgery. A, Discriminative modeling is a supervised learning task that involves training a model to compute the probability of output given an input. In this example, a convolutional neural network was trained for accurate vertebral segmentation and disk-level localization. After that, the model was able to automatically measure disk volumes for stenosis grading.40 B, Generative modeling is an unsupervised learning task that involves automatically discovering and learning the characteristics of input data in such a way that the model can be used to generate or output new data that plausibly could have been drawn from the original data set.41 Here, a generative adversarial network was used to generate a magnetic resonance image from a captured computed tomography image. Another task may involve enhancing the resolution of acquired radiology images. ML, machine learning.


Surgical Planning

In addition to identifying or characterizing pathologies, image analysis applications exemplify the capacity of ML methods to assist the spine surgeon in generating plans and during the procedure. Integrated directly into the surgical workflow, such methods can augment or automate otherwise time-consuming or fallible processes. These systems represent tremendous potential in the context of preoperative and intraoperative assistance for spinal applications.

Assistive planning software has enabled surgeons to generate 3-dimensional (3D) models from preoperative imaging, both to better understand complex anatomy and to facilitate better deformity correction by fabricating patient-specific instrumentation and implants through 3D printing. Such models have also been used to customize pedicle screws, guide osteotomies, and define stress on the spine through simulation of the planned intervention. This capability is enabled by ML algorithms that automate the segmentation of the spine and generation of 3D models.43 These models are trained on patient data to design patient-specific spinal cages in a manner that significantly reduces cage fitting time and improves correction quality for better outcomes.44

Moreover, NNs were trained to autonomously place pedicle screws with the correct length, diameter, and angulation for correction surgery. In a retrospective study on preoperative CT scans of 20 patients and 208 pedicle screws, Siemionow et al45 found that 99% of pedicle screw trajectories suggested by ARAI (Surgalign) were Ravi grade 1 and Gertzbein grade A, indicating no dural breach. Systems such as these may represent a very potent tool for the spine surgeon of the future.

Intraoperative Assist Systems

Other ML methods directly integrate with the operative workflow to assist the surgeon. One such system, the ClarifEye (Royal Philips), has already reached commercialization (Figure 3). This system combines ML image segmentation with an augmented reality display based on intraoperative cone beam CT (cbCT), enabling the surgeon to visualize both anatomy and equipment within the surgical site by way of a head-mounted display. This system has been tested for spinal fusion,47 bone biopsy and pedicle screw placement,48,49 and percutaneous vertebroplasty.50 Using a Hough transform trained on 20 cadavers, the ClarifEye system could automatically identify vertebral pedicles and suggest screw paths. When this trained model was applied to intraoperative cbCT scans, generated surgical plans had 86.1% accuracy, rising to 95.4% when cadavers with severe scoliosis, degeneration, or prior surgery were excluded.47 Elmi-Terander et al51 studied the accuracy of Jamshidi needle placement with ClarifEye assistance and found a mean error comparable with human placement alongside a reduction in angular deviation, which is associated with unfavorable outcomes. The authors further reported a 94.1% accuracy in screw placement in clinical trials.51 Auloge et al50 used the system for the planning of transpedicular approaches for minimally invasive percutaneous vertebroplasty in another clinical trial and found no significant differences in accuracy between computer-assisted and manual planning, although trocar deployment time doubled. Furthermore, Auloge et al50 and Elmi-Terander et al51 both noted significant reductions in radiation dose because of the less frequent use of intraoperative CT.


FIGURE 3. – Intraoperative assistance with computer vision. Imagery-based intraoperative support applications that can enable visualization of anatomy, instrumentation, and trajectory volumes in minimally invasive procedures have already begun small-scale clinical validation in spine surgery. (1) Preoperative/intraoperative imagery from computed tomography or MRI is acquired and input to an ML method. (2) The ML method identifies spinal anatomy, pathologies, and instrumentation, registering 3D models against the original imaging modality; (3) optimal trajectories for screws and instruments may be generated. (4) 3D models are used to generate an overlay of the surgical field through a head-mounted display (eg, Philips ClarifEye, xvision, and ZedView) or reflection onto a semitransparent mirror (eg, ARAI, Perk station used by Fritz et al46). 3D, three-dimensional; ML, machine learning.

While the ClarifEye system is pending FDA approval, the xvision (Augmedics) augmented reality spinal fusion assist system has already entered clinical use in the United States. The xvision system consists of a wireless head-mounted display that projects the position of anatomy and instrumentation onto the surgeon’s field of view by way of registration markers that are attached to surgical equipment and the spinal anatomy, such as the iliac crest or spinous processes. This system allows surgeons to visualize the anatomy and intraoperative navigation plans without taking their eyes or hands out of the field. When compared with existing computer-based navigation methods, xvision is able to match or exceed pedicle screw placement accuracy.49

Although the ClarifEye and xvision systems may be the vanguard of commercial ML-based surgical planning and assistance technologies, they are by no means alone. The ARAI (Surgalign) combines an NN-based pedicle screw trajectory planner with real-time spinal segmentation and registration.45 The resulting system is able to project the positions of anatomy and hardware onto a transparent screen between the surgeon and the patient and has been validated on cadavers.52 Development of such technologies is also taking place in academia. Fritz et al46 used the 3D Slicer platform with a Perk station to perform MRI-guided vertebroplasties; the Perk station superimposed real-time needle tracking on a registered MRI scan and projected it over the patient. Using such an approach, the authors reported accurate needle placement with no leakage of cement into the surrounding spaces. Similarly, Abe et al53 superimposed surgical plans generated by the ZedView (LEXI) preoperative planning system onto head-mounted displays for visualization of vertebroplasty needles and found significant reductions in insertion angle error relative to manually generated plans. Other ML-guided imagery-based methods are still in development and may broaden this admittedly limited scope; for example, technical validation of reinforcement learning systems and Markov decision process-based methods for automated planning of anterior approach spinal fusion surgery has already been demonstrated.54 Although these technologies are still in their infancy, it is important to note their potential in aiding the accuracy and efficiency of the surgeon.

Predictive and Prognostic Support

The application of ML to the interpretation of medical data with the intent of developing prognostic indicators for predictive and decisional support is an area of significant research interest across all medical specialties. However, most approaches use a combination of demographic and quantitative data from the EMR and lie outside the radiological and spinal focus of this review. The interested reader is directed to excellent reviews on such applications of ML to neurosurgical decision-making support by Malik and Khan,55 Damron and Mann,56 and Buchlak et al.57 In this section, we aim to provide an overview of ML-based predictive, prognostic, and decision-making support systems that primarily use spinal radiographic data.

Conventional bone-imaging techniques gather data along many osteological parameters with exquisitely high resolution. However, the relationship between these parameters and the osteological properties of bone is by and large poorly understood. ML methods have been applied to generate novel predictive insights from such parameter/property relationships. Atkinson et al58 used a Gradient Boosting Machine (GBM) to assess both vertebral and Colles fracture risks based on quantitative CT and dual-energy x-ray absorptiometry and found that the GBM outperformed a conventional regression model. More recent work by Muehlematter et al59 tasked junior and senior radiologists and an SVM with predicting vertebral fracture based solely on an initial CT scan and found that the SVM handily outperformed both radiologists. Another effort by Hopkins et al60 saw the application of a DCNN to the diagnosis and grading of cervical spondylotic myelopathy on MRI scans according to the modified Japanese Orthopedic Association clinical criteria, achieving strong results.

This unique insight generation capability has also been applied to the prediction of treatment outcomes. Lewandrowski et al61 reported tasking the Multus Radbot (Aptus Engineering) DCNN with predicting the likelihood of successful endoscopic decompression from preoperative MRIs and found nearly identical prediction accuracy between a radiologist and the DCNN. Similarly, Pasha et al62 used a random forest (RF) classifier to predict posterior spinal fusion outcomes with 75% accuracy based on preoperative radiographic and clinical measurements and surgical parameters. Furthermore, recent developments in deep NN-based data fusion methods have enabled the synthesis of both imaging-based and EMR-based ML methods.63 As such, routine or postinterventional imaging may yield powerful prognostic and predictive insight, particularly when combined with other data sources (Figure 4).


FIGURE 4. – Integrated data fusion workflows. Discriminative and generative data products from bespoke ML applications tailored toward radiographic analysis and demographic data can in turn be synthesized by “fusion” ML methods. As such, a large variety of data, such as preoperative or routine imaging, laboratory values, and demographic details, can be sequentially analyzed and integrated by ML methods to generate predictive and decisional insight based on the sum total of information available to the clinician or surgeon. ML, machine learning.


While discriminative networks represent the bulk of ongoing efforts in medical ML, generative adversarial networks (GANs), which are not yet a decade old at the time of writing, represent a vast and largely unexplored space for clinical applications.64 By definition, a GAN pits generator and discriminator networks against each other, ultimately producing objects that sufficiently imitate the training data to fool the discriminator network (Figure 2). As such, these systems can be tuned to focus on their generative or discriminative functions. While the latter largely resemble applications of NNs,37,65–67 the former can provide more efficient and/or accurate solutions to challenges in the methodology of imagery acquisition and analysis.41 GANs trained on radiographic imagery have been applied in the context of radiotherapy treatment plan generation,68 prediction of brain tumor growth patterns,69 and acceleration of image recompilation in an existing picture archiving and communication system.70 However, the greatest strength lies in the “quality-of-life” improvements that GANs represent. Owing to the scarcity of homogeneous training data, some have applied GANs to generate artificial medical imagery as a training set for other ML systems.71 Similarly, GAN-based imagery protocols have captured MRI and CT images with reduced scan times,72,73 radiation doses,74,75 and contrast required.76 Retrospective GANs can intelligently denoise low-quality imagery77–81 and correct for motion artefacts.82 The most fascinating application of GANs involves intermodality conversion of imaging: magnetic resonance imaging to CT,83–85 CT to magnetic resonance imaging,86 and CT to cbCT.87 This functionality may enable the surgeon and the physician to benefit from the diagnostic insight of either modality while reducing costs to the patient or minimizing radiation dose. As such, although GAN methods are still in their infancy, successful implementation could dramatically improve the medical imagery workflow. The interested reader is directed to the excellent review by Yi et al64 on the subject for further information.
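The adversarial relationship between generator and discriminator can be made concrete through their loss functions. The sketch below evaluates the standard binary cross-entropy GAN losses (with the commonly used non-saturating generator loss) from discriminator outputs alone; the networks themselves are omitted, and the function names and probability values are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)

def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and generated images score near 0.

    d_real: discriminator probabilities assigned to real images.
    d_fake: discriminator probabilities assigned to generated images.
    """
    real_term = sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Low when generated images are scored as real by the discriminator."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)

# Hypothetical scores early in training: the discriminator separates the two sets well.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]), generator_loss([0.1, 0.2]))
# Hypothetical scores once the discriminator can no longer tell them apart.
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]), generator_loss([0.5, 0.5]))
```

Training alternates between lowering each loss in turn. When the discriminator outputs 0.5 for every image it is effectively guessing, which is the sense in which the generator has learned to fool it.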


An ML model will only be as good as the data on which it is trained, a severe limitation for current applications based on spine imaging. Although large data sets exist for demographic or EMR-based applications, there are very few dedicated to neuroimaging.88 Without large data sets that comprehensively cover the characteristics of spinal pathologies, the development of truly generalizable applications will be severely hindered. Furthermore, the “ground truth” for many discriminative applications in imaging analysis and computer vision is typically the assessment of a human expert. Thus, evaluation of accuracy will always be limited by measurement variability intrinsic to the ground truth.9,89 Finally, many newer deep learning methods are functionally “black boxes”: the mechanism by which they draw conclusions is unknown.90 This poses new ethical and medicolegal considerations for applications of these systems in a clinical context.91,92 As such, ML applications in spine surgery are very much terra incognita, and these challenges must be carefully navigated to fully enjoy the benefits that ML can provide.


The increase in applications and accessibility of ML brings with it an unparalleled potential for application in the neurosurgeon’s armamentarium. The ability for ML to quickly approach human or superhuman levels of diagnostic faculty or provide novel insights from existing capabilities will tremendously benefit physicians and their patients across the board. Although some systems have reached the commercial level, significant work must be done before widespread implementation.


1. Papp L, Spielvogel CP, Rausch I, Hacker M, Beyer T. Personalizing medicine through hybrid imaging and medical big data analysis. Front Phys. 2018;6:51.

2. Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer. 2018;18(8):500-510.

3. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.

4. Schwab F, Dubey A, Gamez L, et al. Adult scoliosis: prevalence, SF-36, and nutritional parameters in an elderly volunteer population. Spine. 2005;30(9):1082-1085.

5. Bergeron C, Cheriet F, Ronsky J, Zernicke R, Labelle H. Prediction of anterior scoliotic spinal curve from trunk surface using support vector regression. Eng Appl Artif Intell. 2005;18(8):973-983.

6. Jaremko JL, Poncet P, Ronsky J, et al. Estimation of spinal deformity in scoliosis from torso surface cross sections. Spine. 2001;26(14):1583-1591.

7. Ramirez L, Durdle NG, Raso VJ, Hill DL. A support vector machines classifier to assess the severity of idiopathic scoliosis from surface topography. IEEE Trans Inf Technol Biomed. 2006;10(1):84-91.

8. Komeili A, Westover L, Parent EC, El-Rich M, Adeeb S. Monitoring for idiopathic scoliosis curve progression using surface topography asymmetry analysis of the torso in adolescents. Spine J. 2015;15(4):743-751.

9. Watanabe K, Aoki Y, Matsumoto M. An application of artificial intelligence to diagnostic imaging of spine disease: estimating spinal alignment from moiré images. Neurospine. 2019;16(4):697-702.

10. Zhang J, Lou E, Le LH, Hill DL, Raso JV, Wang Y. Automatic Cobb measurement of scoliosis based on fuzzy hough transform with vertebral shape prior. J Digit Imaging. 2009;22(5):463-472.

11. Wu H, Bailey C, Rasoulinejad P, Li S. Automatic landmark estimation for adolescent idiopathic scoliosis assessment using BoostNet. In: Descoteaux M, Meir-Hein L, Franz A, Jannin P, Collins D, Duchesne S, eds. Medical Image Computing and Computer Assisted Intervention—MICCAI 2017. Lecture Notes in Computer Science, 10433. Springer Verlag; 2017:127-135.

12. Wu H, Bailey C, Rasoulinejad P, Li S. Automated comprehensive adolescent idiopathic scoliosis assessment using MVC-net. Med Image Anal. 2018;48:1-11.

13. Pan Y, Chen Q, Chen T, et al. Evaluation of a computer-aided method for measuring the Cobb angle on chest X-rays. Eur Spine J. 2019;28(12):3035-3043.

14. Horng MH, Kuok CP, Fu MJ, Lin CJ, Sun YN. Cobb angle measurement of spine from X-ray images using convolutional neural network. Comput Math Methods Med. 2019;2019:6357171.

15. Zhang J, Li H, Lv L, Zhang Y. Computer-aided Cobb measurement based on automatic detection of vertebral slopes using deep neural network. Int J Biomed Imaging. 2017;2017:9083916.

16. Cho BH, Kaji D, Cheung ZB, et al. Automated measurement of lumbar lordosis on radiographs using machine learning and computer vision. Glob Spine J. 2020;10(5):611-618.

17. Birtane S, Korkmaz H. Rule-based fuzzy classifier for spinal deformities. Biomed Mater Eng. 2014;24(6):3311-3319.

18. Yang J, Zhang K, Fan H, et al. Development and validation of deep learning algorithms for scoliosis screening using back images. Commun Biol. 2019;2(1):390-398.

19. Duong L, Cheriet F, Labelle H. Three-dimensional classification of spinal deformities using fuzzy clustering. Spine. 2006;31(8):923-930.

20. Galbusera F, Niemeyer F, Wilke HJ, et al. Fully automated radiological analysis of spinal disorders and deformities: a deep learning approach. Eur Spine J. 2019;28(5):951-960.

21. Gstoettner M, Sekyra K, Walochnik N, Winter P, Wachter R, Bach CM. Inter- and intraobserver reliability assessment of the Cobb angle: manual versus digital measurement tools. Eur Spine J. 2007;16(10):1587-1592.

22. Ravindra VM, Senglaub SS, Rattani A, et al. Degenerative lumbar spine disease: estimating global incidence and worldwide volume. Glob Spine J. 2018;8(8):784-794.

23. Alexandru D, So W. Evaluation and management of vertebral compression fractures. Perm J. 2012;16(4):46-51.

24. Michopoulou SK, Costaridou L, Panagiotopoulos E, Speller R, Panayiotakis G, Todd-Pokropek A. Atlas-based segmentation of degenerated lumbar intervertebral discs from MR images of the spine. IEEE Trans Biomed Eng. 2009;56(9):2225-2231.

25. Oktay AB, Albayrak NB, Akgul YS. Computer aided diagnosis of degenerative intervertebral disc diseases from lumbar MR images. Comput Med Imaging Graph. 2014;38(7):613-619.

26. Ghosh S, Alomari RS, Chaudhary V, Dhillon G. Composite features for automatic diagnosis of intervertebral disc herniation from lumbar MRI. Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:5068-5071.

27. Ghosh S, Alomari RS, Chaudhary V, Dhillon G. Computer-aided diagnosis for lumbar MRI using heterogeneous classifiers. In: IEEE International Symposium on Biomedical Imaging: From Nano to Macro. IEEE; 2011:1179-1182.

28. Castro-Mateos I, Hua R, Pozo JM, Lazary A, Frangi AF. Intervertebral disc classification by its degree of degeneration from T2-weighted magnetic resonance images. Eur Spine J. 2016;25(9):2721-2727.

29. Sundarsingh S, Kesavan R. Diagnosis of disc bulge and disc desiccation in lumbar MRI using concatenated shape and texture features with random forest classifier. Int J Imaging Syst Technol. 2020;30(2):340-347.

30. Al-Helo S, Alomari RS, Ghosh S, et al. Compression fracture diagnosis in lumbar: a clinical CAD system. Int J Comput Assist Radiol Surg. 2013;8(3):461-469.

31. Roth HR, Wang Y, Yao J, Lu L, Burns JE, Summers RM. Deep convolutional networks for automated detection of posterior-element fractures on spine CT. In: Medical Imaging 2016: Computer-Aided Diagnosis, 9785. SPIE; 2016:97850P.

32. Burns JE, Yao J, Summers RM. Vertebral body compression fractures and bone density: automated detection and classification on CT images. Radiology. 2017;284(3):788-797.

33. Yousefi H, Salehi E, Sheyjani OS, Ghanaatti H. Lumbar spine vertebral compression fracture case diagnosis using machine learning methods on CT images. In: 4th International Conference on Pattern Recognition and Image Analysis, IPRIA 2019. Institute of Electrical and Electronics Engineers Inc.; 2019:179-184.

34. Murata K, Endo K, Aihara T, et al. Artificial intelligence for the detection of vertebral fractures on plain spinal radiography. Sci Rep. 2020;10(1):20031.

35. Veronezi CC, de Azevedo Simões PW, dos Santos RL, et al. Computational analysis based on artificial neural networks for aiding in diagnosing osteoarthritis of the lumbar spine. Rev Bras Ortop. 2011;46(2):195-199.

36. Ruiz-España S, Arana E, Moratal D. Semiautomatic computer-aided classification of degenerative lumbar spine disease in magnetic resonance imaging. Comput Biol Med. 2015;62:196-205.

37. Han Z, Wei B, Mercado A, Leung S, Li S. Spine-GAN: semantic segmentation of multiple spinal structures. Med Image Anal. 2018;50:23-35.

38. Lee S, Choe EK, Kang HY, Yoon JW, Kim HS. The exploration of feature extraction and machine learning for predicting bone density from simple spine X-ray images in a Korean population. Skeletal Radiol. 2020;49(4):613-618.

39. Lu J-T, Pedemonte S, Bizzo B, et al. DeepSPINE: automated lumbar vertebral segmentation, disc-level designation, and spinal stenosis grading using deep learning. arXiv. 2018. Accessed June 28, 2021.

40. Lu J-T, Pedemonte S, Bizzo B, et al. DeepSPINE: automated lumbar vertebral segmentation, disc-level designation, and spinal stenosis grading using deep learning. arXiv. 2018. Accessed June 25, 2021.

41. Shin Y, Yang J, Lee YH. Deep generative adversarial networks: applications in musculoskeletal imaging. Radiol Artif Intell. 2021;3(3):e200157.

42. Zukic D, Vlasák A, Egger J, Hořínek D, Nimsky C, Kolb A. Robust detection and segmentation for diagnosis of vertebral diseases using routine MR images. Comput Graph Forum. 2014;33(6):190-204.

43. Mikulka J, Chalupa D, Riha K, Filipovic M, Dostal M. Pediatric spine segmentation and modeling using machine learning. In: 11th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 12. IEEE; 2019:1-5.

44. McAfee PC, Cunningham B, Mullinex K, Dobbs E, Eiserman L. Middle-column gap balancing and middle-column mismatch in spinal reconstructive surgery. Int J Spine Surg. 2018;12(2):160-171.

45. Siemionow K, Forsthoefel C, Foy M, Gawel D, Luciano C. Autonomous lumbar spine pedicle screw planning using machine learning: a validation study. J Craniovertebr Junction Spine. 2021;12(3):223.

46. Fritz J, U-Thainual P, Ungi T, et al. MR-guided vertebroplasty with augmented reality image overlay navigation. Cardiovasc Intervent Radiol. 2014;37(6):1589-1596.

47. Burström G, Buerger C, Hoppenbrouwers J, et al. Machine learning for automated 3-dimensional segmentation of the spine and suggested placement of pedicle screws based on intraoperative cone-beam computer tomography. J Neurosurg Spine. 2019;31(1):147-154.

48. Elmi-Terander A, Burström G, Nachabe R, et al. Pedicle screw placement using augmented reality surgical navigation with intraoperative 3D imaging: a first in-human prospective cohort study. Spine. 2019;44(7):517-525.

49. Molina CA, Theodore N, Ahmed AK, et al. Augmented reality-assisted pedicle screw insertion: a cadaveric proof-of-concept study. J Neurosurg Spine. 2019;33(1):1-8.

50. Auloge P, Cazzato RL, Ramamurthy N, et al. Augmented reality and artificial intelligence-based navigation during percutaneous vertebroplasty: a pilot randomised clinical trial. Eur Spine J. 2020;29(7):1580-1589.

51. Elmi-Terander A, Nachabe R, Skulason H, et al. Feasibility and accuracy of thoracolumbar minimally invasive pedicle screw placement with augmented reality navigation technology. Spine. 2018;43(14):1018-1023.

52. Siemionow KB, Katchko KM, Lewicki P, Luciano CJ. Augmented reality and artificial intelligence-assisted surgical navigation: technique and cadaveric feasibility study. J Craniovertebr Junction Spine. 2020;11(2):81-85.

53. Abe Y, Sato S, Kato K, et al. A novel 3D guidance system using augmented reality for percutaneous vertebroplasty: technical note. J Neurosurg Spine. 2013;19(4):492-501.

54. Zhang Q, Li M, Qi X, Hu Y, Sun Y, Yu G. 3D path planning for anterior spinal surgery based on CT images and reinforcement learning. In: IEEE International Conference on Cyborg and Bionic Systems (CBS). IEEE; 2018:317-321.

55. Malik AT, Khan SN. Predictive modeling in spine surgery. Ann Transl Med. 2019;7(S5):S173-S173.

56. Damron TA, Mann KA. Fracture risk assessment and clinical decision making for patients with metastatic bone disease. J Orthop Res. 2020;38(6):1175-1190.

57. Buchlak QD, Esmaili N, Leveque JC, et al. Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review. Neurosurg Rev. 2020;43(5):1235-1253.

58. Atkinson EJ, Therneau TM, Melton LJ, et al. Assessing fracture risk using gradient boosting machine (GBM) models. J Bone Miner Res. 2012;27(6):1397-1404.

59. Muehlematter UJ, Mannil M, Becker AS, et al. Vertebral body insufficiency fractures: detection of vertebrae at risk on standard CT images using texture analysis and machine learning. Eur Radiol. 2019;29(5):2207-2217.

60. Hopkins BS, Weber KA, Kesavabhotla K, Paliwal M, Cantrell DR, Smith ZA. Machine learning for the prediction of cervical spondylotic myelopathy: a post hoc pilot study of 28 participants. World Neurosurg. 2019;127(5):e436-e442.

61. Lewandrowski KU, Muraleedharan N, Eddy SA, et al. Artificial intelligence comparison of the radiologist report with endoscopic predictors of successful transforaminal decompression for painful conditions of the lumber spine: application of deep learning algorithm interpretation of routine lumbar magnetic. Int J Spine Surg. 2020;14(s3):S75-S85.

62. Pasha S, Shah S, Newton P. Machine learning predicts the 3D outcomes of adolescent idiopathic scoliosis surgery using patient-surgeon specific parameters. Spine. 2021;46(9):579-587.

63. Huang SC, Pareek A, Zamanian R, Banerjee I, Lungren MP. Multimodal fusion with deep neural networks for leveraging CT imaging and electronic health record: a case-study in pulmonary embolism detection. Sci Rep. 2020;10(1):22147-22149.

64. Yi X, Walia E, Babyn P. Generative adversarial network in medical imaging: a review. Med Image Anal. 2019;58:101552.

65. Neff T, Payer C, Štern D, Urschler M. Generative adversarial network based synthesis for supervised medical image segmentation. In: Proceedings of the OAGM & ARW Joint Workshop 2017: Vision. Automation and Robotics; 2017:140-145.

66. Alex V, Safwan KPM, Chennamsetty SS, Krishnamurthi G. Generative adversarial networks for brain lesion detection. In: Styner MA, Angelini ED. eds. Medical Imaging 2017: Image Processing, 10133. SPIE; 2017:101330G.

67. Brion E, Léger J, Barragán-Montero AM, Meert N, Lee JA, Macq B. Domain adversarial networks and intensity-based data augmentation for male pelvic organ segmentation in cone beam CT. Comput Biol Med. 2021;131:104269.

68. Mahmood R, Babier A, McNiven A, Diamant A, Chan TCY. Automated treatment planning in radiation therapy using generative adversarial networks. arXiv. 2018:1-14.

69. Elazab A, Wang C, Gardezi SJS, et al. GP brain tumor growth prediction using stacked 3D generative adversarial networks from longitudinal MR images. Neural Networks. 2020;132:321-332.

70. Mardani M, Gong E, Cheng JY, et al. Deep generative adversarial networks for compressed sensing (GANCS) automates MRI. arXiv. 2017:1-12.

71. Iqbal T, Ali H. Generative adversarial network for medical images (MI-GAN). J Med Syst. 2018;42(11):231.

72. Oh G, Sim B, Chung HJ, Sunwoo L, Ye JC. Unpaired deep learning for accelerated MRI using optimal transport driven CycleGAN. IEEE Trans Comput Imaging. 2020;6:1285-1296.

73. Quan TM, Nguyen-Duc T, Jeong WK. Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss. IEEE Trans Med Imaging. 2018;37(6):1488-1497.

74. Dashtbani Moghari M, Zhou L, Yu B, et al. Efficient radiation dose reduction in whole-brain CT perfusion imaging using a 3D GAN: performance and clinical feasibility. Phys Med Biol. 2021;66(7):075008.

75. Kearney V, Chan JW, Wang T, et al. DoseGAN: a generative adversarial network for synthetic dose prediction using attention-gated discrimination and generation. Sci Rep. 2020;10(1):11073-11078.

76. Haubold J, Hosch R, Umutlu L, et al. Contrast agent dose reduction in computed tomography with deep learning using a conditional generative adversarial network. Eur Radiol. 2021;31(8):6087-6095.

77. Yang Q, Yan P, Zhang Y, et al. Low dose CT image denoising using a generative adversarial network with wasserstein distance and perceptual loss. IEEE Trans Med Imaging. 2018;37(6):1348-1357.

78. Gregory S, Cheng H, Newman S, Gan Y. HydraNet: a multi-branch convolutional neural network architecture for MRI denoising. Med Imaging. 2021;11596:1159638.

79. Lyu Q, You C, Shan H, Zhang Y, Wang G. Super-resolution MRI and CT through GAN-CIRCLE. In: Müller B, Wang G, eds. Developments in X-Ray Tomography XII. SPIE; 2019:30. doi: 10.1117/12.2530592.

80. You C, Li G, Zhang Y, et al. CT super-resolution GAN constrained by the identical, residual, and cycle learning ensemble (GAN-CIRCLE). arXiv. 2018;39(1):188-203.

81. Chen Y, Christodoulou AG, Zhou Z, et al. MRI super-resolution with GAN and 3D multi-level DenseNet: smaller, faster, and better. arXiv. 2020. Accessed June 27, 2021.

82. Jiang W, Liu Z, Lee K-H, et al. Respiratory motion correction in abdominal MRI using a densely connected U-net with GAN-guided training. arXiv. 2019. Accessed June 27, 2021.

83. Kearney V, Ziemer BP, Perry A, et al. Attention-aware discrimination for MR-to-CT image translation using cycle-consistent generative adversarial networks. Radiol Artif Intell. 2020;2(2):e190027.

84. Wu H, Jiang X, Jia F. UC-GAN for MR to CT image synthesis. In: Nguyen D, Xing L, Jiang S, eds. Artificial Intelligence in Radiation Therapy. AIRT 2019. Lecture Notes in Computer Science. vol 11850. Springer; 2019:146-153. doi: 10.1007/978-3-030-32486-5_18.

85. Staartjes VE, Seevinck PR, Vandertop WP, van Stralen M, Schröder ML. Magnetic resonance imaging–based synthetic computed tomography of the lumbar spine for surgical planning: a clinical proof-of-concept. Neurosurg Focus. 2021;50(1):1-7.

86. Lee JH, Han IH, Kim DH, et al. Spine computed tomography to magnetic resonance image synthesis using generative adversarial networks: a preliminary study. J Korean Neurosurg Soc. 2020;63(3):386-396.

87. Liang X, Chen L, Nguyen D, et al. Generating synthesized computed tomography (CT) from cone-beam computed tomography (CBCT) using cyclegan for adaptive radiation therapy. arXiv. 2018:1-14.

88. Ghogawala Z, Dunbar M, Essa I. Artificial intelligence for the treatment of lumbar spondylolisthesis. Neurosurg Clin N Am. 2019;30(3):383-389.

89. Langensiepen S, Semler O, Sobottke R, et al. Measuring procedures to determine the Cobb angle in idiopathic scoliosis: a systematic review. Eur Spine J. 2013;22(11):2360-2371.

90. Staartjes VE, Stienen MN. Data mining in spine surgery: leveraging electronic health records for machine learning and clinical research. Neurospine. 2019;16(4):654-656.

91. Watson DS, Krutzinna J, Bruce IN, et al. Clinical applications of machine learning algorithms: beyond the black box. BMJ. 2019;364:l886.

92. Martín Noguerol T, Paulano-Godino F, Martín-Valdivia MT, Menias CO, Luna A. Strengths, weaknesses, opportunities, and threats analysis of artificial intelligence and machine learning applications in radiology. J Am Coll Radiol. 2019;16(9):1239-1247.
