Introduction
Computer vision is an interdisciplinary field that focuses on automating and replicating tasks of the human visual system by processing, analyzing, and understanding visual data (images or videos). Computer vision and pattern recognition techniques have demonstrated great potential in many fields, including medicine.1-6 The field has a long history, spanning decades of research aimed at enabling machines to perceive visual stimuli meaningfully. Computer vision–based technologies have gained significant acceptance in radiology,7-11 where computer-aided diagnostic systems assist medical professionals in making accurate diagnoses and predictions.
Artificial intelligence (AI), or machine learning (ML), is the most discussed topic today in medical imaging research and has the potential to permeate all fields of medicine, significantly altering the way medicine is practiced. Taking advantage of the increasingly large amount of labeled medical data, AI can augment existing computer-based tools and assist medical professionals with certain repetitive and/or specialized tasks. For example, computed tomography (CT) and magnetic resonance imaging examinations contain hundreds of slices, making manual review a time-consuming and exhausting task. In some cases, the acquired images are also degraded by distortions (eg, noise or poor quality). AI might help achieve results in a short amount of time. Intelligent dental radiographic interpretation tools are being developed to assist dental health professionals and enhance oral health care. In 1978, Richard Bellman defined AI as the automation of activities associated with human thinking abilities, including learning, decision making, and problem solving.12 A dental professional combines the clinical and imaging information collected with accumulated knowledge to decide on prognosis and appropriate patient management. The impact and implications of imaging diagnosis in dentistry are accentuated by the fact that dentistry is one of the few health care fields that routinely uses imaging to screen for abnormality across all age groups. Additionally, multiple types of imaging of the same anatomical region of the same individual are acquired over time, spanning years, along with the corresponding non–image-based clinical data. The relatively low number of oral and maxillofacial radiologists/specialists available to interpret the high volume of diagnostic imaging performed in dentistry adds a further challenge that AI systems could help address.
Over the last decade, studies have evaluated the efficiency of AI systems in the detection of dental caries, vertical root fractures, apical lesions, salivary gland diseases, maxillary sinusitis, maxillofacial cysts, metastasis to cervical lymph nodes, and osteoporosis, and in orthodontic diagnosis and treatment, to name a few. Many of these studies have focused on CT, cone-beam computed tomography, bitewing, cephalometric, and panoramic radiographic diagnoses using neural networks (convolutional, artificial, and probabilistic) and concluded that the AI systems offered accurate diagnostic capabilities.13-24 Figure 1 demonstrates a basic AI system designed to identify periodontal bone loss (PBL) in panoramic radiographs.
AI in dentistry has demonstrated potential to assist clinicians in providing appropriate patient care: it can support early and accurate diagnosis, treatment planning, and the prediction and monitoring of outcomes while improving efficiency; serve as a second opinion; add value to forensic diagnosis and the measurement of clinical/treatment outcomes; and support objective management of appropriate insurance coverage. Ideally, with ongoing advances in imaging and technology, computer systems could advance to an economically viable state, enabling AI to serve as an adjunct to the clinical acumen of the radiologist.25
Despite this potential, AI solutions have not yet become the norm in routine medical practice. In dentistry specifically, imaging plays a vital role in screening and treatment, and dental conditions such as caries, apical lesions, and PBL are relatively prevalent, making it comparatively easy to build datasets to train and optimize AI systems. Despite these well-suited conditions, systems based on convolutional neural networks (CNNs) have only recently been adopted in dental radiograph research, and applications based on these technologies are only now entering the clinical arena.26
Related art and challenges of AI
Lee et al27 proposed an end-to-end CNN-based deep learning system to detect 19 cephalometric landmarks in dental x-ray images. Multiple CNN-based regression systems were created to predict the coordinate variables from the images. A total of 38 regression systems with the same CNN structure were trained to compute 38 coordinate variables. Finally, 19 landmarks were extracted by pairing the regressed coordinates.
Song et al28 proposed a 2-step method to detect cephalometric landmarks on skeletal x-ray images. The first step involves extracting the regions of interest by cropping patches and registering the test images to the training images. The second step uses the state-of-the-art CNN-based ResNet5029 model to detect the landmarks in the extracted patches.
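Neither implementation is reproduced here, but the shared idea behind both approaches can be illustrated with a minimal PyTorch sketch: regressing 38 coordinate values (19 paired x, y landmarks) from a radiograph, using a pretrained ResNet50 backbone as in Song et al.28 Note that, unlike Lee et al's 38 separate regressors, this sketch uses a single network with 38 outputs; all module names, sizes, and details are illustrative assumptions.

```python
# Minimal sketch: CNN-based regression of 19 cephalometric landmarks
# (38 coordinate values) with a pretrained ResNet50 backbone.
# All names and sizes are illustrative, not from the cited papers.
import torch
import torch.nn as nn
from torchvision import models

class LandmarkRegressor(nn.Module):
    def __init__(self, num_landmarks: int = 19):
        super().__init__()
        self.num_landmarks = num_landmarks
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Replace the 1000-way classifier head with a coordinate regressor.
        backbone.fc = nn.Linear(backbone.fc.in_features, num_landmarks * 2)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        coords = self.backbone(x)                      # (batch, 2 * num_landmarks)
        return coords.view(-1, self.num_landmarks, 2)  # paired (x, y) per landmark

model = LandmarkRegressor()
radiograph = torch.randn(1, 3, 224, 224)               # dummy 3-channel input
landmarks = model(radiograph)                          # shape: (1, 19, 2)
```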
Bouchahma et al30 proposed an automated system to detect decay and predict required treatment from dental x-ray images. The CNN-based system is designed to predict 3 types of treatment, namely, fluoride, filling, and root canal. The CNN architecture was trained on 200 images and tested on 35 images. The system was able to achieve a total success rate of 86% for treatment prediction.
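As a rough illustration of such a treatment classifier, the sketch below maps a grayscale x-ray crop to logits over the 3 treatment classes. The architecture, input size, and class order are invented for illustration and are not taken from Bouchahma et al.30

```python
# Minimal sketch: a small CNN mapping a dental x-ray crop to one of
# 3 treatment classes (fluoride, filling, root canal). Architecture
# and input size are illustrative assumptions, not the authors' model.
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
    nn.Linear(64, 3),                      # logits for the 3 treatments
)

x = torch.randn(8, 1, 64, 64)              # batch of grayscale crops
probs = classifier(x).softmax(dim=1)       # per-class probabilities
```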
Muresan et al31 developed a novel approach using a CNN to detect teeth and classify 14 different dental issues in panoramic x-ray images. The classes consisted of healthy tooth, missing tooth, dental restoration, implant, fixed prosthetics work, mobile prosthetics work (dentures), root canal device, fixed prosthetic work and root canal device, fixed prosthetic work and implant, fixed prosthetic work and devitalized tooth, devitalized tooth and restoration, dental inclusion, polished tooth, another problem, and background. The CNN system was trained using 1000 panoramic images and reached an accuracy of 89% in detecting the teeth. Finally, a label was generated for each tooth, identifying the problem affecting it by using a histogram-based majority voting system.
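The voting step can be illustrated with a short NumPy sketch: given a dense map of per-pixel class predictions and a binary mask for one detected tooth, the tooth's label is the class with the highest histogram count inside the mask. The function name and dummy data below are assumptions for illustration.

```python
# Minimal sketch of histogram-based majority voting: each detected tooth
# region collects per-pixel class predictions, and the most frequent
# class becomes the tooth's label. Purely illustrative.
import numpy as np

def majority_vote(pred_map: np.ndarray, tooth_mask: np.ndarray) -> int:
    """Return the most frequent predicted class inside a tooth mask."""
    votes = pred_map[tooth_mask]                 # class ids inside the tooth
    hist = np.bincount(votes, minlength=15)      # 14 issue classes + background
    return int(hist.argmax())

pred_map = np.random.randint(0, 15, size=(256, 512))   # dummy dense predictions
tooth_mask = np.zeros((256, 512), dtype=bool)
tooth_mask[100:150, 200:260] = True                     # dummy tooth region
label = majority_vote(pred_map, tooth_mask)
```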
Kim et al32 proposed a fully automated network using CNNs to detect PBL in panoramic dental radiographs. The overall framework consists of multiple stages: the first stage was trained to extract the region of interest (the teeth region); the second stage trained a network to segment and predict the PBL lesion; the third stage used the pretrained weights of the second-stage encoder to create a classification network that predicts the existence of PBL in each tooth; and the final stage consisted of a classification network that predicts the existence of a PBL lesion specifically for the molar and premolar teeth. The network was trained and tested on panoramic dental radiographs. In a separate study, Kim et al32 demonstrated the utility of a deep learning–based CNN algorithm for chronological age estimation. A total of 9435 panoramic dental x-rays were used, with ages ranging from 2 to 98 years. The authors employed a curriculum learning strategy together with the state-of-the-art DenseNet33-based CNN model to predict chronological age from these images.
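As a hedged sketch of the age-estimation setup, the code below uses a torchvision DenseNet-121 backbone with a single regression output and a toy easy-to-hard curriculum ordering. The specific DenseNet variant, difficulty heuristic, and loss are assumptions, not details reported in the study.

```python
# Minimal sketch: DenseNet-based age regression from a panoramic
# radiograph, with a simple curriculum ordering (easy-to-hard by an
# assumed per-sample difficulty score). Illustrative only.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
backbone.classifier = nn.Linear(backbone.classifier.in_features, 1)  # age output

def curriculum_order(difficulties):
    """Return sample indices sorted easy-to-hard for staged training."""
    return sorted(range(len(difficulties)), key=lambda i: difficulties[i])

x = torch.randn(4, 3, 224, 224)          # dummy batch of radiographs
ages = backbone(x).squeeze(1)            # predicted ages, shape (4,)
loss = nn.L1Loss()(ages, torch.tensor([34., 8., 61., 27.]))
```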
As suggested in Schwendicke et al,34 some of the main reasons AI technologies have not been fully adopted in dentistry are that (1) dental data are not readily accessible because of data protection and privacy concerns; (2) datasets lack structure, are complex and multidimensional, and are often biased toward overly sick, overly healthy, or overly affluent data points; (3) datasets are relatively small compared with other image-based datasets for AI; (4) there is a lack of "hard" gold standards, and labeling requires an expert; and (5) there is a lack of trust in AI, as it does not provide feedback on how or why it arrived at a prediction. Furthermore, when trained AI models are tested on data never encountered in the training phase, they may produce irrelevant results that could lead to misdiagnosis.
Eye tracking and its applications in AI
Eye tracking has been used extensively for research in the fields of marketing, image interpretation, and psychology.35-37 In particular, its role in understanding diagnostic interpretation in medicine has been evolving and expanding.38-47 In dentistry, the interpretation of radiographs interweaves the processes of perception (visually scanning the image) and cognition (decision making and diagnostic reasoning).48 Eye tracking technology can be employed to determine precisely what the observer is focusing on within the image and to illustrate patterns in the scanning process. The application of eye tracking in dentistry has provided novel opportunities to study the interpretive process and elucidate the differences in decision making, perception, misinterpretation, and misdiagnosis between novices and experts.
There is a lack of annotated data and a need for a framework for incorporating eye tracking data into AI systems. Annotating data typically involves manually tracing out regions or objects in the images. This approach is time consuming when dealing with a large dataset and requires an expert, such as a radiologist, to perform the annotations. Moreover, most annotation work is performed outside clinical hours and by novice clinicians, possibly reducing the accuracy of the annotations. With the help of AI-based systems and eye tracking technology, the process of annotating data can be automated and performed by experts during clinical hours. Advances are being made to improve disease classification by integrating eye tracking data with deep learning techniques.42,46,49,50 Figure 2 demonstrates the automated annotation process: the radiographs are first presented to the expert on a screen, the eye tracking information is captured while the expert analyzes the radiograph, and the captured eye tracking data are then fed into the AI system to generate the annotations. In some cases, the findings of the radiographs are transcribed and can be used within the AI to highlight the type of abnormality. Such a system can tremendously speed up the annotation process and help ensure accuracy.
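One plausible way to turn raw gaze data into annotations, sketched below under assumed data formats (fixations as x, y, dwell-time triples), is to splat a duration-weighted Gaussian at each fixation to form a soft annotation heatmap.

```python
# Minimal sketch: converting eye tracking fixations into a soft
# annotation map by placing Gaussians at fixation points, weighted
# by dwell time. Field names and sigma are illustrative assumptions.
import numpy as np

def gaze_heatmap(fixations, shape, sigma=25.0):
    """fixations: list of (x, y, duration_ms) on the radiograph grid."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    heat = np.zeros(shape, dtype=np.float64)
    for x, y, dur in fixations:
        heat += dur * np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    return heat / heat.max() if heat.max() > 0 else heat

fixations = [(120, 80, 400), (300, 210, 900)]   # dummy expert fixations
annotation = gaze_heatmap(fixations, shape=(512, 1024))
```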
Researchers are striving to create AI models that can match or even surpass human capabilities. To meet this expectation, it is crucial to have accurate data with which to develop models that mimic the behavior of a human expert. AI has demonstrated a particularly impressive ability to recognize patterns in data through correlation, but such models are fundamentally incapable of capturing cause and effect. For example, unlike a real doctor, AI algorithms cannot explain why a particular image may suggest a disease. Furthermore, although human involvement is pivotal in medicine, current state-of-the-art systems are limited to visual imagery alone, reducing humans to passive observers. To truly mimic an expert, the AI system must consider the visual perception and cognition of the human. Figure 3 illustrates an AI system trained to perceive radiographs as humans do. In the first pass, the radiograph is fed into the AI system, which is trained to predict what the expert looked at. The second pass combines the learned eye-gaze map with the radiograph to classify the type of abnormality. Fusing these different aspects to train an AI model paves the way for a new generation of AI systems that can highlight the identified features and explain the reasons for the outputs they generate. Such a trained AI system could significantly reduce errors and serve as a second opinion for the radiologist. It can further help bridge gaps in understanding how different radiologists perceive similar radiographs and reveal the patterns or shortcuts used by an expert.
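A minimal sketch of this two-pass design, with all module names and sizes assumed for illustration, might look as follows: the first network predicts a gaze map from the radiograph, and the second classifies the abnormality from the radiograph concatenated with that map.

```python
# Minimal sketch of the two-pass idea in Figure 3: pass 1 predicts an
# expert gaze map from the radiograph; pass 2 classifies the abnormality
# from the radiograph concatenated with the predicted gaze map.
# Module names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GazeGuidedClassifier(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        # Pass 1: radiograph -> predicted gaze map (same spatial size).
        self.gaze_net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid(),
        )
        # Pass 2: radiograph + gaze map -> abnormality logits.
        self.cls_net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, x):
        gaze = self.gaze_net(x)                              # pass 1
        return self.cls_net(torch.cat([x, gaze], 1)), gaze   # pass 2

model = GazeGuidedClassifier()
logits, gaze_map = model(torch.randn(2, 1, 128, 128))
```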
The trained AI system can further be used to teach and assist novice clinicians in interpreting radiographs. Instructors and clinicians can use the trained AI to demonstrate and assess gaze patterns during medical training and education, in turn accelerating the transition from novice to expert. Figure 4 illustrates an example training tool that evaluates the findings of novice clinicians. First, both the AI and the student are presented with a radiograph. The eye tracker records the novice's eye movements and the areas on which they concentrate. This acquired gaze map is then compared with the gaze map generated by the "expert AI," and the results are presented to the novice.
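For the comparison step, one simple option (an assumption on our part, not a method specified in the article) is to score the novice's gaze heatmap against the expert AI's map with a Pearson correlation coefficient, a standard saliency-comparison metric.

```python
# Minimal sketch: scoring a novice's gaze map against the "expert AI"
# gaze map via Pearson correlation. Threshold and feedback strings
# are illustrative assumptions.
import numpy as np

def gaze_similarity(novice: np.ndarray, expert: np.ndarray) -> float:
    """Pearson correlation between two gaze heatmaps of equal shape."""
    n = (novice - novice.mean()) / (novice.std() + 1e-8)
    e = (expert - expert.mean()) / (expert.std() + 1e-8)
    return float((n * e).mean())

score = gaze_similarity(np.random.rand(64, 64), np.random.rand(64, 64))
feedback = "on track" if score > 0.5 else "review the highlighted regions"
```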