Robot-assisted surgery is steadily gaining popularity due to its clinical advantages. Meanwhile, artificial intelligence (AI) and augmented reality (AR) in robotic surgery are developing rapidly and attracting considerable attention.
However, existing work has not addressed the coherent integration of AI and AR in robotic surgery. In this work, we develop a novel system that seamlessly integrates an AI module with AR visualization to automatically generate surgical guidance for robot-assisted surgery teaching.
Specifically, we first leverage reinforcement learning to learn from expert demonstrations and then generate a 3D guidance trajectory, providing prior context about the surgical procedure. Together with other information such as text hints, the 3D trajectory is overlaid on the stereo view of the da Vinci Research Kit (dVRK), where the user can perceive the 3D guidance and understand the intended action.
The proposed system is evaluated through a pilot study on the peg transfer surgical education task, demonstrating its feasibility and potential as a next-generation solution for robot-assisted surgery education.
Artificial intelligence (AI) and augmented reality (AR) are two increasingly important technologies for the next generation of robotic surgery. To date, however, AI and AR have largely been developed separately, each focusing on a different perspective.
In particular, AI focuses on recognizing and planning surgical activities much as a surgeon would, based on the analysis of collected sensory data such as endoscopic video and robotic kinematics.
Recent advances in AI have greatly improved a number of tasks, such as surgical situation awareness and the automation of certain procedures with surgical robots.
Meanwhile, AR aims to enhance the surgical environment in a way that supports the surgeon's workflow and decision-making, by visualizing and integrating additional information computed offline or in real time.
Augmented reality, delivered through the immersive view of the surgical robotic console, has shown its effectiveness in teaching novice surgeons, and it is envisaged that it would be highly useful if adopted during surgical procedures.
Unfortunately, the benefits of AI and AR have not yet been jointly exploited in robotic surgery. The combination of AI and AR is emerging as a versatile topic and has been explored in a number of application scenarios, such as gaming, driver training, and virtual patients.
Endowing AR with intelligence not only enhances the virtual experience, but also exploits the power of learning-based algorithms in challenging tasks such as surgical education. However, there have been only a few attempts to combine AI and AR in surgical robotics.
Some authors have proposed using computer vision models to identify anatomical regions of interest and then overlaying the results onto the camera view.
However, these solutions merely render existing cues, without modeling human-like decision-making behaviors. Meanwhile, reinforcement learning (RL) is widely recognized as an effective way to learn skills, but its potential has not been fully exploited in surgical robotics.
One interesting scenario to explore these issues is surgical education, where an intelligent RL-based agent is expected to reason about surgical tasks and develop constructive guidelines for novices.
Such embodied intelligence promises to significantly increase accessibility and reduce the cost of surgical training. Delivering intelligent guidance as augmented reality visualization on surgical robotic platforms can further improve ease of use and user experience, yet how to achieve this goal remains unclear.
The application of artificial intelligence in robotic surgery has been intensively studied over the past decade since the da Vinci Surgical System was introduced clinically in 2000.
Many important topics have been identified and widely studied, such as surgical instrument segmentation, gesture recognition, workflow recognition, and surgical scene reconstruction.
These methods can support intraoperative decision-making and provide valuable databases for surgical training and assessment.
Although these works are promising, they only provide supplementary information and do not visualize surgical plans, such as predicted instrument paths.
Recently, the advent of reinforcement learning has opened the door to a new set of policy-based learning strategies, whose effectiveness has been demonstrated in tasks such as surgical gesture classification and surgical scene understanding.
By learning from expert demonstrations, the RL agent can automatically generate meaningful solutions depending on the task at hand.
Some authors have proposed using deep deterministic policy gradient (DDPG) combined with behavior cloning (BC) to perform surgical tasks such as needle regrasping and autonomous blood suction.
Both show encouraging results, demonstrating that an RL-based framework can relax the requirement for expert supervision.
Beyond the above, some works apply AI to surgical education, for example providing metrics and performance feedback based on training records, or differentiating between experience levels while accounting for stylistic characteristics.
However, few of them consider its application to AR-based surgical education, where there is a strong demand for the precise integration of AR and AI to guide trainees clearly and automatically.
Given the strengths of the learning-based methods introduced above, it is desirable to integrate AI as a core component of the AR system, for example as a decision maker, enabling the AR system to generate guidance content automatically.
The entire system is built on the da Vinci Research Kit (dVRK), the first open research platform based on the da Vinci surgical system, released publicly and made available to other researchers to explore.
The platform has been widely used in research on surgical imaging and perception, instrument design and control, system simulation, and surgical task automation, demonstrating its reliability and flexibility.
In this work, we take full advantage of the dVRK platform to combine augmented reality and artificial intelligence into an automated surgical teaching system.
The dVRK platform has a stereo endoscope mounted on the ECM (endoscopic camera manipulator), which captures a binocular video stream for 3D detection and perception. We capture the video signal from the dVRK stereo endoscope with a video capture card that converts it into a USB video stream retrievable by a computer. Through the application programming interface (API) provided by the dVRK ROS packages, users can read the motion information of the two PSMs (patient-side manipulators), including tool tip position, velocity, and rotation, and send motion commands to control the movement of the PSMs. In addition, the PSMs can be teleoperated directly by the user via the hand controllers of the dVRK console.
To obtain preliminary results for the proposed method, we chose the peg transfer task as the experimental task in our work; it is one of the Fundamentals of Laparoscopic Surgery (FLS) tasks for teaching surgical skills.
In this experiment, we move a peg from one pin to another using a single PSM on the dVRK platform. The whole process consists of three steps: 1) lifting the peg, 2) moving the peg, and 3) placing the peg.
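The paper does not detail how the lift/move/place structure maps to a waypoint path, so the following is only a minimal sketch: it linearly interpolates hypothetical keyframes for the three steps into roughly 100 waypoints (the order of magnitude reported later for the RL policy's output). All coordinates and the `interpolate_path` helper are illustrative assumptions, not the actual policy output.

```python
import numpy as np

def interpolate_path(keyframes, points_per_segment=33):
    """Linearly interpolate between successive 3D keyframes.

    Hypothetical sketch: in the real system the waypoints come from the
    trained RL policy; here we only illustrate the lift/move/place structure.
    """
    segments = []
    for start, end in zip(keyframes[:-1], keyframes[1:]):
        t = np.linspace(0.0, 1.0, points_per_segment, endpoint=False)
        segments.append(start + t[:, None] * (end - start))
    segments.append(keyframes[-1][None, :])  # include the final pose
    return np.vstack(segments)

# Illustrative keyframes (meters, robot base frame, made-up values):
keyframes = np.array([
    [0.05, 0.02, 0.00],   # grasp the peg at the source pin
    [0.05, 0.02, 0.03],   # 1) lift
    [0.10, 0.04, 0.03],   # 2) move above the target pin
    [0.10, 0.04, 0.00],   # 3) place
])
path = interpolate_path(keyframes)
print(path.shape)  # (100, 3)
```

With three segments of 33 interior points plus the final keyframe, the sketch yields 100 waypoints, comparable in size to the paths discussed in the timing experiment below.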
We train a reinforcement learning (RL) policy on this task and generate a peg transfer path from the trained model. We then overlay the peg transfer path as an augmented reality (AR) trajectory in the 3D viewer for visualization.
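The overlay step requires projecting each 3D waypoint into the left and right endoscope images. The paper does not give its projection code, so below is a plain pinhole-camera sketch under assumed, placeholder intrinsics (`fx`, `fy`, `cx`, `cy`) and a small stereo baseline; in the real system these values would come from stereo calibration of the dVRK endoscope.

```python
import numpy as np

def project_points(points_cam, fx, fy, cx, cy, baseline=0.0):
    """Project 3D points (camera frame, meters) to pixel coordinates.

    `baseline` shifts the points into the right-camera frame of a
    rectified stereo pair; use 0 for the left view. Intrinsics are
    placeholders, not calibrated dVRK values.
    """
    p = points_cam.copy()
    p[:, 0] -= baseline                 # translate into the right-camera frame
    u = fx * p[:, 0] / p[:, 2] + cx     # standard pinhole projection
    v = fy * p[:, 1] / p[:, 2] + cy
    return np.stack([u, v], axis=1)

# Placeholder intrinsics for a 640x480 stereo endoscope.
fx = fy = 500.0
cx, cy = 320.0, 240.0
waypoints = np.array([[0.00, 0.00, 0.10],
                      [0.01, 0.00, 0.10]])
left = project_points(waypoints, fx, fy, cx, cy)
right = project_points(waypoints, fx, fy, cx, cy, baseline=0.005)
print(left[0])   # [320. 240.] — a point on the optical axis hits the principal point
```

Drawing the projected pixels onto each video frame (e.g., as a polyline) then produces the stereoscopic AR trajectory seen in the 3D viewer.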
We ran the experiments on a computer running Ubuntu 18.04, with an 8-core Intel Xeon W-2123 processor at 3.60 GHz, an NVIDIA TITAN RTX graphics card, and 16 GB of RAM. All algorithms are written in Python. The image resolution of the stereo cameras is 640 × 480.
To better evaluate the overlay delay, we conducted an experiment measuring the computation time of the overlay process (projecting and rendering 3D waypoints onto video frames). We evaluated the time cost for different numbers of path points (200-5000), randomly varying the locations of these 3D points and repeating the experiment 300 times.
Finally, we report the mean and standard deviation of the time cost. As the graph shows, the computation cost increases as more points are provided, rising sharply for large point sets.
When using roughly 2,600 points or fewer, real-time processing (>30 Hz) is achieved. For our peg transfer task, the RL policy generates about 100 waypoints, which take only about 1 ms to project and overlay; this is easily sufficient for real-time visualization.
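The timing protocol above can be sketched as follows. This is a simplified stand-in that times only the projection arithmetic over randomly placed 3D points (rendering onto the frame is not included), so absolute numbers will differ from the paper's; the point-count range and 300 repetitions mirror the described experiment.

```python
import time
import numpy as np

def overlay_time(n_points, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Time one projection pass over n random 3D waypoints.

    Simplified stand-in for the paper's overlay step: only the
    pinhole projection is timed, with placeholder intrinsics.
    """
    pts = np.random.uniform([-0.05, -0.05, 0.05], [0.05, 0.05, 0.20],
                            size=(n_points, 3))
    t0 = time.perf_counter()
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    np.stack([u, v], axis=1)          # assembled pixel coordinates
    return time.perf_counter() - t0

for n in (200, 1000, 5000):
    trials = [overlay_time(n) for _ in range(300)]
    print(f"{n} points: {1e3 * np.mean(trials):.3f} ms "
          f"+/- {1e3 * np.std(trials):.3f} ms")
```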
Based on these results, the proposed system can efficiently overlay trajectories on video frames to form a real-time augmented reality visualization, showing great potential for more complex, realistic robotic surgery teaching scenarios.
CONCLUSIONS AND FUTURE WORK
In this work, we endow AR with intelligence by integrating an AI module as a decision maker for path generation. The AI module employs reinforcement learning to learn a task policy and dynamically generate a route based on the user's task.
To provide immersive visualization and facilitate intuitive learning, the planned route is projected and overlaid onto the 3D video, helping the trainee follow the intelligent instructions in augmented reality.
A simple and efficient calibration procedure with human-computer interaction is also designed, achieving coordinate calibration through the user interface.
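The paper does not specify the calibration algorithm behind this procedure. One standard way to align robot and camera coordinates from user-selected corresponding points is a least-squares rigid transform via SVD (the Kabsch method); the sketch below, including the `rigid_transform` helper and the example values, is an assumption about how such a step could be implemented, not the paper's actual method.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (Kabsch): find R, t such that
    dst ~= R @ src + t, given >= 3 corresponding 3D points (rows)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Example: recover a known rotation about z plus a translation.
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.01, -0.02, 0.05])
src = np.random.rand(5, 3)                 # e.g., user-clicked landmarks
dst = src @ R_true.T + t_true              # same landmarks in the other frame
R, t = rigid_transform(src, dst)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```

Once R and t are estimated, waypoints generated in the robot frame can be mapped into the camera frame before projection and overlay.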
We experimented with surgical training on a peg transfer task, demonstrating the feasibility and potential of AI-powered augmented reality in robotic surgery education.
In future work, we will explore in more detail how to design and integrate additional AI and AR modules into the system, such as virtual instrument guidance, overlay of surgical workflow recognition results, and instrument segmentation.
Moreover, we will integrate network communication to enable remote surgical education. We also plan to measure the trainee's performance against an established "expert" path and to conduct a user study evaluating the effectiveness of the proposed system.