Developing Expert Gaze Pattern in Laparoscopic Surgery Requires More than Behavioral Training

Expertise in laparoscopic surgery is realized through both manual dexterity and efficient eye movement patterns, creating opportunities to use gaze information in the educational process. To better understand how expert gaze behaviors are acquired through deliberate practice of technical skills, three surgeons were assessed and five novices were trained and assessed in a 5-visit protocol on the Fundamentals of Laparoscopic Surgery peg transfer task. The task was adjusted to have a fixed action sequence to allow recordings of dwell durations based on pre-defined areas of interest (AOIs). Trained novices were shown to reach more than 98% (M = 98.62%, SD = 1.06%) of their behavioral learning plateaus, leading to equivalent behavioral performance to that of surgeons. Despite this equivalence in behavioral performance, surgeons continued to show significantly shorter dwell durations at visual targets of current actions and longer dwell durations at future steps in the action sequence than trained novices (ps ≤ .03, Cohen’s ds > 2). This study demonstrates that, while novices can train to match surgeons on behavioral performance, their gaze pattern is still less efficient than that of surgeons, motivating surgical training programs to involve eye tracking technology in their design and evaluation.


Introduction
Laparoscopic surgery is a type of minimally invasive surgery in which narrow tubes are inserted into the body through small incisions, allowing surgeons to manipulate, cut, and sew tissue with relatively less trauma, leading to faster patient recovery and lower morbidity compared to open surgical techniques (Fuchs, 2002). Although ultimately beneficial for the patient, laparoscopic surgery can create challenges for surgeons. One such challenge is that laparoscopic surgeons cannot directly view the tissue they are operating on but instead must view 2-dimensional video, captured by the laparoscope inserted inside the body and projected to a display at eye level through a closedcircuit camera. In addition, laparoscopic surgeons must deal with the "fulcrum effect", whereby the tips of the surgical tools move in the opposite direction of tool handles with little tactile feedback (Gallagher et al., 2001). Therefore, successful laparoscopic surgery entails expertise in depth perception from 2-dimensional images, as well as complex visual-motor coordination and transformation, among other important skills. Even though all these skills can be reasonably trained in existing and validated simulation programs , such training takes a substantial amount of time from surgical trainees who are regularly fatigued from other professional commitments and face restricted working hours (Ahmed et al., 2014). An important need therefore exists to further improve the efficiency of training programs in laparoscopic surgery, motivating research to characterize the gaps between experts and non-experts and innovation to create effective interventions to minimize such gaps.
One line of research that has showed promise for elucidating surgical expertise is the use of eye tracking technology (Hermens et al., 2013;Tien et al., 2014). Eye tracking is particularly well-suited as a research tool in laparoscopic surgery as it takes full advantage of the range of attentional focus defined by the monitor that displays monocular images at approximately eye level. Past research comparing gaze patterns between experts and non-experts has revealed that laparoscopic expertise can not only be detected with behavioral metrics, such as task completion times, but also on "eye" metrics in simulation tasks. Relative to non-experts, laparoscopic surgery experts were shown to gaze more at surgical targets (i.e., elements to be manipulated) than surgical tools (i.e., instruments used to manipulate) in mostly one-handed simulation tasks (Law et al., 2004;Wilson et al., 2010; and when watching surgical recordings (Khan et al., 2012). Because gazing at upcoming surgical targets represents a feedforward sampling strategy, whereas gazing at currently engaged targets or surgical tools represents an online sampling strategy, such a finding suggests a proactive attentional profile among experts. However, tasks that have been tested thus far are limited to relatively simple surgery simulations that focus on short-duration, singlehand movements, which cannot fully capture the task demands required for laparoscopic surgery. In addition, because laparoscopic surgery trainees are mostly provided with behavioral simulation opportunities in their training, it is still unknown whether expert gaze pattern in laparoscopic surgery are concurrently developed through highvolume behavioral training.
Further insights in the area can be gleaned by exploring gaze behaviors in complex and dynamic bimanual laparoscopic tasks. One such task is the peg transfer task from the Fundamentals of Laparoscopic Surgery (FLS) training program, whose criterion-based completion is required for all surgery residents by the American Board of Surgery (Peters et al., 2004). The peg transfer task emphasizes bimanual coordination by using two Maryland Dissectors in a procedure of moving six plastic objects, in turn, from the left to the right side of a flat pegboard, and later reversing the entire process to move the objects back to the left side.
To be successful at this task, one must attend to the transfer of the object between the two dissectors, then the placement of the object onto the target peg to make sure that it arrives flush on the pegboard, before looking ahead to the next object that will be moved and engaging it. This validated task simulates the critical action of transferring and positioning a needle between needle holders in suturing ). An additional strength of the peg transfer task is that behavioral learning curves on this task are fit well by an inverse function model that is capable of estimating one's "learning plateau", the theoretical best score one can reach with an unlimited amount of practice (Feldman et al., 2009). Therefore, given its strengths of ecological validity and the strong model-based explanation of behavioral learning, the peg transfer task offers the opportunity to explore differences in gaze patterns between experts and non-experts and understand how gaze patterns evolve in individuals as they practice and gain proficiency.
The present study attempted to accomplish these goals by examining gaze and behavioral patterns during the peg transfer task. Comparisons were made between experienced surgeons and novice participants with no surgical experience, prior to, and after, they completed a training paradigm designed to meet a theoretically derived behavioral learning plateau. For this purpose, dwell-based gaze metrics, which can be regarded as one's perceived area of importance (Tien et al., 2014), were extracted within a set of pre-defined areas-of-interest (AOIs) that helped quantify online and feedforward sampling during timed task performance. Based on previous research, greater feedforward sampling control was expected in laparoscopic surgery experts, relative to novices, prior to practice. As novices learned and approached their behavioral learning plateau, however, it was expected that they would show similar gaze patterns to experts by demonstrating greater feedforward sampling control.

Participants
Three experienced surgeons (2 female) and five novices (3 female) participated in this study. Sample size was determined based on previous eye tracking and behavioral research (Khan et al., 2012;Ritter & Scott, 2007), which has demonstrated consistent eye tracking and behavioral performance with small numbers of laparoscopic experts on laparoscopic simulation tasks, resulting in large effect sizes (i.e., Cohen's ds > 1) when compared to those of nonexperts. All three surgeons held faculty positions in the Department of Surgery within the Duke University School of Medicine with specialties in surgical oncology and bariatric surgery, with seven, two, and eight years of post-fellowship independent surgical practice, individually. All surgeons reported prior performance with the FLS peg transfer task and reported right hand dominance in their surgical activities. Their average age was 41 (SD = 1.73) years. All novice participants were right-handed with an average age of 26.4 (SD = 3.43) and none had prior experience with laparoscopic surgery or the FLS curriculum. Informed consent was given by all participants at the start of participation, and the research protocol was reviewed and approved by the Duke Health Institutional Review Board (Pro00078782) and abided by the ethical standards in the Declaration of Helsinki. All participants were compensated at the rate of $15/hr.

Task & Apparatus
As illustrated in Figure 1A, the peg transfer task was conducted on a cart mounted FLS trainer box (Limbs & Things Ltd., Savannah, GA). In order to provide a standardized procedure for peg transfers, the objects were labeled one through six, as was the pegboard at the base of the pegs on the left and right sides of the board ( Figure  1B). Consistent with instructions provided in the FLS curriculum tutorial video, participants were instructed to move the objects, one at a time, from the numbered peg on the left side, to the corresponding numbered peg on the right side. Once all six objects were placed on the right side, they then reversed this procedure and returned the objects to the left, always in order from one to six. For each of these transfers, the objective was to be picked up with the dissector on the same side, and passed in midair to the other dissector, before placing it down on the appropriate peg. As such, one repetition of the task is divided into 12 transfers, each of which begins when the dissector touches the object at the starting peg and ends when the object is lowered on the target peg and is flat on the surface of the pegboard. The Argus ETMobile system (Argus Science, Tyngsborough, MA) was used to track foveal vision at 30 Hz. The eye-tracker featured eye and scene cameras mounted on a pair of light-weight glasses that compute gaze locations using both camera recordings via the pupil-to-corneal-reflection technique. This technique relies on modeling spatial relationships between the black pupil and mirror reflections of three infrared lights from the cornea front surface. The gaze point is represented by a circular cursor spanning 1° of visual angle on the scene camera recordings. The eye tracking system was further set up so that recordings from the closed-circuit camera installed inside the FLS trainer box was recorded to the same eye tracking software system. Post processing allowed spatial and temporal alignment of both scene and trainer box camera recordings so that gaze locations could be calculated by transferring coordinates from the scene camera recording to the trainer box recording using the stimulus tracking algorithm from the ETAnalysis software (Argus Science, Tyngsborough, MA). Such coordinate transfer enables control for head movements during the task recording. The gaze locations on the trainer box recording were subsequently used to calculate fixations, defined as a single gaze of at least 100 ms within 1° of visual angle.

Measures
The primary eye-tracking metric used in this study was the Percent Dwell Duration. This normalized dwell measure was calculated by dividing the dwell duration in which consecutive fixations remain in a given AOI by the total dwell duration recorded in the corresponding transfer. The adoption of percent dwell duration helps control for individual differences regarding the overall time spent on the task, and the use of AOIs surrounding the static pegs allows for quantification of sequential steps in this stereotyped action sequence. To quantify gaze behavior and feedforward sampling tendencies in participants' eye movements, square-shaped AOIs were defined ( Figure  1C) and interpreted for each transfer as: 1. AOITP represents the region including and surrounding the Target Peg (TP) on each transfer. Fixations here reflect 1-step, feedforward sampling control.
2. AOINO represents the region including and surrounding the Next Object (NO) to be transferred. Fixations here reflect 2-step, feedforward sampling control.
3. AOIOutside represents the region excluding the AOITP and AOINO. Fixations in this AOI are outside of the other regions and primarily correspond to gaze on the currently moving object, reflecting online sampling control during movement.
Because the 12 th transfer is the end of the repetition and does not have an AOINO, it is not included in the calculation of scores. To test the consistency of the AOI definitions, the AOI sizes (pixel 2 ) for both AOITP and AOINO were extracted and tested between the surgeon and novice groups, resulting in non-significant differences (ps > .39).
The behavioral Performance Score followed previous research (Cox et al., 2020) and was computed by accounting for both task completion time and errors. Errors included drops within the field-of-view, drops outside of the field-of-view, and improper transfers (e.g., using the wrong dissectors to move an object or resting the object on a peg during transfer), and were penalized by increased task completion time (seconds). Specifically, drops within the field-of-view and improper transfers added one second each to the completion time, while drops outside the fieldof-view added three seconds each to the completion time, reflecting a heavier penalty for this more serious error. The final task completion time was converted into a performance score, by dividing, if applicable, the penalized task completion time by 12 which is the number of objects transferred. The performance score thus adopts the unit of "seconds per object" transferred with lower scores corresponding to better performance.

Procedure
All participants underwent identical acclimation procedures at the start of the study. Specifically, they heard a brief verbal description of the procedures, gave informed consent, and completed a demographic survey, prior to watching an instructional video of the FLS peg transfer task and familiarization with the instruments. After this acclimation, the study activities differed between the two groups ( Figure 2).
During the remainder of surgeons' only experimental visit, they were asked to complete 10 repetitions of the task as quickly and accurately as possible while wearing the eye tracker. These repetitions were timed and constituted their performance test. Novices, however, proceeded to complete both testing (with the eye tracker) and training (without the eye-tracker) over five visits that occurred within two weeks. Following introduction and acclimation in Visit 1, novices completed a pre-test in which they performed six timed repetitions of the peg transfer task as quickly and accurately as possible while the eye tracker recorded their gaze behaviors. The eye tracker was then removed and novices were given a 5-minute break prior to completing two 20-minute practice blocks with a 5-minute break between the blocks. During both Visit 2 and 4, break, but did not perform any testing with the eye tracking system on. Visit 3 began with a mid-test in which they again completed six timed repetitions as quickly and accurately as possible as eye tracker was worn to record their gaze. This was followed by approximately 33 minutes of practice without the eye tracker to round out a total of 40 minutes of exposure to the peg transfer task on this visit. Visit 5 consisted of only a post-test during which novices performed six timed repetitions of the task as quickly and accurately as possible with the eye tracker to record their gaze.

Analysis
R and JASP (v0.11.1) were used for statistical analyses. For novices, the effect of training was evaluated using both individualized learning plateaus estimated with an inverse function model (Feldman et al., 2009) and changes across testing sessions using ANOVAs. Specifically, a 6 (Repetitions: 1 to 6) by 3 (Session: pre-test, mid-test, posttest) ANOVA was run with performance scores, and a 3 (AOI: outside, target peg, next object]) by 3 (Session) ANOVA was performed with percent dwell duration. To compare surgeons to novices, both prior to and after training, group differences on performance score were tested using 6 (Repetition) by 2 (Expertise: surgeons, novices) ANOVAs, and group differences on percent dwell duration were tested using 3 (AOI) by 2 (Expertise) ANOVA. In both of these analyses, only the first six repetitions of the surgeons' test were used in order to match the six repetitions collected with the novices during their tests, given that no statistical differences were identified across all the 10 repetitions for the surgeons. In order to further investigate gaze differences and determine if subtle differences in the quantification of dwell patterns influenced the findings, identical ANOVA analyses were also performed using an alternative dwell metric, Percent Dwell Count, whose results can be found in the Appendix. The Greenhouse-Geisser correction on degree-of-freedom was used when Mauchly's test for sphericity reached statistical significance. When post-hoc pairwise comparison was needed, the familywise alpha level was controlled using the Holm-Bonferroni method. The alpha level was set at .05.

Individual Learning Curves
The total amount of time spent performing the peg transfer task (including time spent testing) ranged between 171 and 177 (M = 173, SD = 2.71) minutes among the five novices. As shown in Figure 3, the inverse function model fit well to the behavioral performance data from the five novices, ps ≤ .02, R 2 s ≥ .30. Specifically, the results indicated that all of the novices spent less than 60 (M = 24.40, SD = 18.88) minutes performing the task to reach 90% of their behavioral performance plateaus, and all reached approximately their estimated plateau by the end of training (M = 98.62%, SD = 1.06%).

Discussion
This study aimed to extend previous research applying eye tracking technology to understand expertise and skill acquisition in laparoscopic surgery. Here, both behavioral and eye tracking metrics were compared between experienced laparoscopic surgeons and novices, as novices underwent a multi-visit training protocol aimed at providing high-volume behavioral training. The peg transfer task was selected to highlight bimanual coordination in laparoscopic surgery, while offering a validated approach to model individualized skill acquisition process through training. To adapt to eye-tracking constraints, the FLS peg transfer task was adjusted to fix the action sequence into a constant ordering, allowing objective AOIs to be defined. Results revealed that, although all the novices reached post-training behavioral performance that approximated their individualized learning plateaus and was statistically indistinguishable from that of the laparoscopic surgeons, their training experiences did not lead to the development of equivalent expert gaze behaviors. In particular, whereas novices focused more on the AOIOutside, implying online visual sampling of object currently being moved, surgeons focused more on the AOITP and AOINO, suggesting greater focus on feedforward sampling control. The evidence thereby supports the hypothesis that surgeons used a more proactive gaze strategies than novices, but not the hypothesis that training novices to expert-level behavioral performance would be accompanied by expert-like gaze pattern in a complex bimanual laparoscopic surgery task. The following discussion, therefore, addresses the implication of these findings, the strengths and weaknesses in this design, and future directions for this research.
The current study utilized the validated FLS peg transfer task Peters et al., 2004) to characterize learning and explore group differences between novices and experienced surgeons. Each novice spent approximately three hours practicing the task, resulting in saturated behavioral learning according to both individual-and group-based results. As illustrated by the inverse model (Figure 3), on average novices were able to improve to 90% of their behavioral plateaus in 24 minutes, with 8% more improvement over the remaining training, leading to a learning rate ratio of 70 (i.e., [90%/8%]×[150 min/24 min]). This decelerated learning rate across training time is unsurprising given the skill learning literature (Schmidt et al., 2018), especially when the design is aimed to maximize the training volume (~3 hours) for possible alteration in corresponding gaze behaviors during task performance. However, the opportunity cost in offering this approximately saturated behavioral training volume becomes concerning when little change is observed in novices' gaze patterns that consistently differ from that of surgeons in the laparoscopic simulation and in light of the restricted training hours available to surgical trainees (Ahmed et al., 2014). The finding thus implies that surgical training programs focused on manual coordination alone cannot result in the complete development of eye-hand coordination patterns produced by surgeons.
One interesting observation from the current findings is that the surgeons showed highly consistent patterns of behavioral and eye-tracking results across individuals, as evidenced by relatively small standard deviations within the group. This illustrates a learned template with relatively greater feedforward visual sampling, where 40.4% of total dwell duration is distributed to AOIs that capture the upcoming targets in the action sequence for a given transfer. This finding is consistent with previous research and supports the view that visual expertise in surgery features flexibility in attentional distribution, accurate prediction of ensuing actions, and skillful use of parafoveal vision in controlling surgical tools (Law et al., 2004;Wilson et al., 2010;. Such a feedforward, externally oriented, and autonomous attentional style has been shown to reduce electromyographical noise (Vance et al., 2004), enhance short-loop reflexes in motor control (Liu et al., 2015;Wulf, 2007), and produce greater neural efficiency (Yarrow et al., 2009), which are earmarks of perceptual-motor expertise and may account for expert performance in surgery. Future research is thus encouraged to explore the mechanisms underlying expert gaze pattern along these directions.
A close examination of the eye tracking results indicates that the observed gaze pattern may bear different meanings when comparing surgeons to novices prior to and after training. Although novices always showed longer dwell duration in AOIOutside than surgeons, their gaze in the AOINO and AOITP produced different profiles across the training. Specifically, relative to surgeons, novices demonstrated equivalent dwell duration on the target peg and less dwell duration on the next object at pre-test, whereas they showed shorter dwell duration on the target peg and equivalent dwell duration on the next object at post-test. Such an evidentiary pattern suggested a tradeoff in dwell duration between the two AOIs that were supposed to gauge feedforward visual sampling. One possible explanation from the novice's standpoint is that, at pre-test, novices required greater monitoring of the dissectors and target objects in order to complete the transfer and successfully drop the object on the target peg. This challenge may have increased the proportion of gaze in AOITP, which does not reflect feedforward visual sampling per se. As novices gained proficiency at the task, however, they may have been able to divert attention earlier from the AOITP to the AOINO, reflecting an intention to work on the next object for faster task completion. The fact that AOINO did not differ between novices and surgeons after training may indicate the ability to shift, with practice, from sampling one step ahead on the target peg, to two steps ahead to view the next object in the sequence. A second possibility from the surgeon's standpoint is that, because dropping the target object to the target peg in the task is designed to simulate the starting actions of suturing in laparoscopic surgery, surgeons may show the "cognitive slowing down", reflecting a refocusing effort to increase attention towards a critical location in the surgical task due to professional experience (Harvey et al., 2014;Moulton et al., 2007). Yet another possibility is that the truth lies in a combination of factors from both novices' training and surgeons' experience. The clarification of such subtle findings would merit future study. Finally, while the mean age of the novices (26.4 years) and surgeons (41 years) differed, the relative stability of motor control and learning across these age ranges implies that such an age difference is not likely a strong determinant in the observed effects (Shea et al., 2006), though it is still of interest for future research to examine age-related performance questions with larger samples in the surgical context.
The current findings have several implications for laparoscopic surgery research using eye-tracking. First, given the current evidence that novices do not completely develop the expert gaze pattern from manual practice alone, it may be advisable to also implement eye-tracking technology in surgical training programs so that the expertise gap can be closed not only on behavioral criteria but also on gaze pattern. Preliminary evidence has shown that using pre-recorded (Vine et al., 2012(Vine et al., , 2013 or simultaneous (Chetwood et al., 2012) gaze behaviors from experts to help non-experts recognize and learn the "gaze template" during practice can lead to faster learning and better performance. Principles of such a training program may include facilitating (a) awareness of the expert gaze pattern, (b) contrast between one's own and expert gaze pattern, (c) recognition of targets in surgical actions, and (d) the self-regulatory "cognitive slow down". Secondly, eye tracking measures, demonstrate greater sensitivity to surgical expertise than behavioral measures. This advantage of eye tracking metrics has been observed in other relevant research (Lebeau et al., 2016;Wilson et al., 2010;, and may be explained by the greater temporal sensitivity of eye tracking that is able to measure dozens/hundreds of fixations for each single peg transfer repetition. Therefore, through aggregation of data into normalized metrics, such as the current percent dwell duration, it is possible to increase the signal/noise ratio in data collected during simulation tasks. Finally, the current research made an effort to capture visual attention by defining AOIs that captured the sequential steps in stereotyped peg transfer action sequence. Such a paradigm of AOI definition helped explore means to investigating relatively complex and dynamic tasks in surgical settings.

Conclusions
Novices achieved profound behavioral learning through training, leading to performance scores equivalent to those of experienced surgeons. Despite this, surgeons continued to demonstrate more feedforward visual sampling control than novices by gazing at surgical targets of ensuing actions with such differences persisting even after training of novices. It can thus be inferred that, while traditional simulation-based laparoscopic skill training can improve novices' behavioral task performance to a level similar to surgeons, differences in gaze pattern remain, motivating future surgical training programs to involve eye tracking technology in its design and evaluation. Future studies may look to replicate these findings with larger samples, while working towards implementing these gaze patterns into training programs aimed at improving surgical education.

Ethics and Conflict of Interest
The author(s) declare(s) that the contents of the article are in agreement with the ethics described in http://biblio.unibe.ch/portale/elibrary/BOP/jemr/ethics.html and that there is no conflict of interest regarding the publication of this paper.