Music sight-reading expertise, visually disrupted score and eye movements

All cultures enjoy music, but only some write it down (Huron, 2001). In order to 'read' the music, one has to be able to see it, contextualize it and then reproduce it on an instrument of choice, not unlike reading text. Successful text reading requires efficient eye movements, and the relationship between eye movement patterns and cognitive processing is well documented (Ashby, Rayner, & Clifton, 2005; Balota, Pollatsek, & Rayner, 1985; Binder, Pollatsek, & Rayner, 1999; Dee-Lucas, Just, Carpenter, & Daneman, 1982; Ehrlich & Rayner, 1981; Fleisher, 1986; Gobet, Lane, Croker, Cheng, Jones, Oliver, & Pine, 2001; Juhasz & Rayner, 2003; Just & Carpenter, 1976; Kennison & Clifton, 1995; Meseguer, Carreiras, & Clifton, 2002; Miellet & Sparrow, 2004; Underwood, Hubbard, & Wilkinson, 1990). As a consequence, eye movement patterns can expose difficulties in reading comprehension (Rayner, Chace, Slattery, & Ashby, 2006; Underwood, Hubbard, & Wilkinson, 1990).

Similarly, eye movements can reveal visual processing expertise. Object details can be clumped or 'chunked' into recognizable groups or patterns. An 'expert' in a particular domain, reading text for example, is characterized by the ability to chunk elements of that domain into smaller units for more efficient processing (Ashby et al., 2005; Gobet et al., 2001; Heller, 1982; Kowler, 2011; Legge, 2007; Meseguer et al., 2002; Rayner, 1998; Rayner et al., 2006; Truitt et al., 1997; Underwood et al., 1990). This ability is a direct result of extensive, structured domain knowledge and is achieved by employing fewer fixations of shorter duration relative to less expert readers (Underwood et al., 1990). The resultant increase in speed of performance is characteristic of expertise generally (Bilalic, Langner, Ulrich, & Grodd, 2011; Ericsson, Krampe, & Tesch-Romer, 1993; Ericsson, Roring, & Nandagopal, 2007; Farrington-Darby & Wilson, 2006; Gauthier & Bukach, 2007). Similar patterns have been found when researching the eye movements of musicians as they read music (Furneaux & Land, 1999; Goolsby, 1987; Kinsler & Carpenter, 1995; Schmidt, 1981; Sloboda, 1974, 1977; Truitt et al., 1997; Wolf, 1976; Wurtz, Mueri, & Wiesendanger, 2009).
Sight-reading is a subset of music-reading skills in which a piece of music must be performed promptly when read directly from notated music, known as the score. Sight-reading differs from reading music in other practice or performance contexts as it requires the musician to reproduce the music with little to no prior experience of the piece to be played. It is an invaluable skill for repetiteurs and accompanists, and a very useful one for piano teachers, performers and musicians generally. Researchers and musicians alike vary in their definition of sight-reading. Some describe sight-reading as occurring only the first time an unfamiliar piece of music is played, while others consider that familiarization with a piece before playing would also constitute a sight-reading task (Lehmann & McArthur, 2002). Musicians exhibit a vast range of ability in sight-reading.
Past research has shown that some musicians attain 'expert' status in this domain. It is suggested that sight-reading is a matter of skilled pattern recognition (Wolf, 1976), so that the better one becomes at recognizing the patterns, the more 'expertise' one should demonstrate in performing a sight-reading task (Kinsler & Carpenter, 1995). Evidence for music sight-reading expertise has been found in the eye movement patterns of musicians reading score, and these are similar to those found in text reading. Specifically, this evidence includes the ability to 'chunk' groups of notes into a single unit for processing rather than reading each note individually (Sloboda, 1974; Wolf, 1976; Truitt et al., 1997; Furneaux & Land, 1999) and the disruption of fixation patterns when unexpected harmonic notes are introduced (Sloboda, 1987).
It is also known that more expert sight-readers can exhibit what has become known as 'proof-reader error'. This occurs when a note that is not part of the harmonic language of a piece is presented in the score, but an incorrect note is played that conforms to the overall harmonic structure rather than the note that is written (Sloboda, 1976; Wolf, 1976). This provides further evidence of the expert's ability to 'chunk' groups of notes rather than reading each note individually. Less experienced sight-readers tend not to make the same errors.
When musical score was simplified, the eye movement (EM) pattern changed (Servant & Baccino, 1999). However, in that study the changes were not visual manipulations of the score, but simplifications. As the original version, known as 'first pass', had already been performed, it might be expected that the EMs would be different, as it is the 'second pass' which is known to be different (Goolsby, 1994), even without the additional variation of a simplified score. Also, there was no guarantee that the subjects were not already familiar with the piece. Consequently, the results of this and many of the earlier studies are far from conclusive (Madell & Hebert, 2008).
More recently, researchers have shown that when dynamic markings on the score are mismatched with the music being played, different EM patterns are deployed by experts and non-experts (Drai-Zerbib & Baccino, 2014). However, this study did not deal strictly with sight-reading in that the subjects were not required to perform the music on an instrument; rather, it was a test of domain-specific cross-modal match/mismatch competence. Nevertheless, it does demonstrate that visual aspects of the score could result in EM differences based on expertise.
What has yet to be investigated in detail is how musicians' EMs respond when visual features of the music other than harmonic ones are disrupted in a simple sight-reading task. For example, the physical appearance of the notes on the score page contains temporal information: notes are spaced roughly according to their duration, and bar lines demarcate groups of notes according to the duration and number qualities indicated in the time signature at the beginning of the piece. fMRI studies have shown that an area of the brain involved in spatial processing, the left occipital cortex, is activated when reading music but not when reading text, which is thought to suggest that the distances between the notes are a relevant part of music reading processing in relation to pitch (Fourie, 2004). It is not unreasonable to suggest that spacing between the notes is also involved in the temporal processing of music notation. In addition, the note stems or beams frequently encode pitch information by being directed above or below the note depending on their position on the stave. Therefore, the alteration of these expected parameters, within the context of conventional music notation, may be expected to alter fixation patterns in a similar way to unexpected harmonic structures.
In relation to text reading, it is known that saccadic latency increases with uncertainty, with an average value of 250 ms (Cameron, 1995). These findings were for adults with text reading expertise and did not examine novice readers under the same conditions. Given that expert text readers fixate less frequently and for shorter periods of time (Underwood et al., 1990), with the opposite being true for non-experts (Rayner et al., 2006), creating uncertainty in music score might be expected to alter fixation patterns, as fixations lengthen when targets are disrupted for text reading (Staub, 2013). Changing spacing in text affects word identification and manifests in saccade programming as increased latency, resulting in shorter and/or cancelled saccades and longer fixations (Perea & Acha, 2009). Decreased reading rate is also associated with spacing manipulations, as is an increase in regressive saccades (Rayner, Fischer, & Pollatsek, 1998). Not only is spacing important, but phrasing has also been shown to aid comprehension in speech and syntax (Restle, 1972). What is yet to be investigated is whether a similar eye movement response occurs when sight-reading music and whether this may be modulated by expertise.
This study investigated the eye movement patterns observed in expert and non-expert music sight-readers when features of the music's notational structure are unexpectedly changed by the removal of bar lines, alteration of stem direction and variation of the inter-note spacing. The study findings help to expand our understanding of the role that visual expectations play in visual processing expertise in music sight-reading in terms of working memory capacity, cross-modal integration of sensory information and peripheral crowding of visual stimuli.

Hypothesis
That both groups will show some disruption to their eye movement patterns when sight-reading a visually disrupted score, but such changes will be specific to their level of visual processing expertise.

Participants
Ethics approval was granted by an Australian University Advisory Committee. Participants were drawn from that university's student body and reimbursed for their time. Study inclusion was based on the ability to play a short musical excerpt as it appeared on the recruitment poster (see Figure 1a). The participants self-selected based on this criterion. All subjects were able to resolve N5 print at a distance of 60 cm. An expert music sight-reader was defined as being able to perfectly or near perfectly perform a 6th Grade AMEB sight-reading examination piece on piano. This level had previously been shown to elicit expertise in eye movements (Waters et al., 1998). A total of 22 people participated in the study: 9 were assigned to the expert sight-reader group and 13 to the non-expert sight-reader group according to the 6th Grade criterion. All participants were aged between 18 and 21 years.

Stimulus
The current study adopted a sight-reading definition based on having no familiarization of the music to be played and with pre-reading actively discouraged. Ten 4-bar melodies were individually composed (see Figure 1a). Each melody was written in the treble clef, to be played by the right hand and limited to white notes only. Identical rhythmic components were used for each in largely non-identical combinations and differing melodic content. These comprised minims, crotchets, quavers, dotted quavers, semiquavers and crotchet and quaver rests.
Four pianists were questioned to elicit an approximate viewing distance of music when placed at a standard upright piano. A distance of 60 cm was then chosen as the testing distance: the range was from 30 to 60 cm, with 3 values between 50 and 65 cm. In order to examine only the effect of the disruption of the score and other visual cues on eye movement patterns, the music stimulus was presented with no blur and at a size equivalent to an optotype of N10. This size has been shown to fall comfortably within the Critical Print Size (CPS) for text reading, a range of letter sizes for which eye movements can be executed at their most efficient (Legge, 2007), and is approximately equivalent to a 10/72" (3.5 mm) letter when viewed at 14" (35.5 cm). The note head size was adjusted to yield the same angular subtense at the eye when viewed at 60 cm, that is, approximately 5.9 mm.
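The note head scaling described above can be checked with a short calculation. The following is a minimal sketch, assuming the stimulus is scaled so that its angular subtense at the eye is preserved across viewing distances; the function names are illustrative and are not taken from the study's own software.

```python
import math


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Angle (degrees) subtended at the eye by an object of size_mm at distance_mm."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


def scaled_size_mm(size_mm: float, ref_distance_mm: float, test_distance_mm: float) -> float:
    """Size at test_distance_mm that subtends the same angle as size_mm at ref_distance_mm."""
    half_angle = math.atan(size_mm / (2 * ref_distance_mm))
    return 2 * test_distance_mm * math.tan(half_angle)


# Values from the text: a 3.5 mm letter at 35.5 cm, re-scaled for viewing at 60 cm.
note_head_mm = scaled_size_mm(3.5, 355.0, 600.0)
print(round(note_head_mm, 1))  # prints 5.9
```

Under this assumption the required size grows in direct proportion to viewing distance (3.5 mm x 600/355), which agrees with the approximate 5.9 mm reported.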

Procedure
Eye movement data were collected using the Arrington Research 'ViewPoint' USB220 eye tracker, with a sampling rate of 220 frames/second. The images were generated using a custom-written programme for MATLAB (Version 2014b, Image Processing Toolbox™) and presented on a linearized 27-inch Mitsubishi Diamond Pro monitor driven at a frame rate of 80 Hz. The tracker was driven by a Hewlett Packard 'Elitebook 8470p' PC (Intel Core i5 2.60 GHz processor/8.00 GB RAM/16-bit Operating System). The apparatus consisted of a single infrared camera mounted on a chin and headrest assembly that was mounted on an instrument table. The table was set so that the viewing distance to the screen was 60 cm. The participant's height was carefully aligned using a canthus mark that was level with the centre of the computer screen. The camera was then calibrated according to the manufacturer's instructions.
Once calibration was successfully performed, a practice session was run in order for the participant to become familiar and comfortable with the testing process: 4 seconds after a tone sounded, the music stimulus would appear on the computer screen. Participants were instructed to start playing the piece as soon as it appeared on the screen, as quickly and as accurately as possible, without looking down at the hand, without pre-reading and without stopping regardless of errors. After the participant finished playing, a visual noise patch was presented on the screen. The participant was instructed to fixate on it to eliminate any afterimages that may have been generated by the test stimulus. Sufficient time was given to re-orientate the hands into position by touch between presentations. After 6 practice trials, the full procedure was undertaken, following the same procedure as the practice session. While it is known that preventing visual feedback in a sight-reading task can increase errors in performance (Banton, 1995), the pieces in the present study were written in the treble clef for white keys only, within an octave span, and the subjects were permitted to reposition their hands correctly between trials. Therefore, the task was not considered to be difficult to complete without visual reinforcement, particularly as the subjects had self-selected their participation in the study based on their ability to sight-read the reference piece on the recruitment poster.
It was necessary to ensure that only EMs involved in the reading of the music were included in the analysis. Variations in the time to start playing after seeing the stimulus, differences in the cessation of relevant EMs towards the end of the piece and inconsistent times in ending the recording sequence after playing had ceased all needed to be eliminated. Therefore, the time that playing commenced, T1, through to the time that playing ceased at the end of bar 3, T2, was used as the sound window for analysis. The locations of T1 and T2 were determined using Fleximusic™ Audio Editor. The sound files were imported and the points on the wave file for T1 and T2 were determined by first filtering for noise and then manually marking the locations of T1 and T2. This process was found to be repeatable to within 0.05 second. Once T1 and T2 were known in relation to the length of the sound file, it was then possible to calculate the number of samples between points T1 and T2. Therefore, EM parameters calculated between T1 and T2 pertain only to the time period of interest: when the music was being read.
Participants sight-read the 9 specifically composed musical excerpts of 4 bars duration (see Figure 1a). In order to minimize any possible familiarity with the pieces, each normal piece and its disrupted counterpart were not presented consecutively. Rather, all 9 pieces in the normal form were played first, followed by the 9 disrupted forms of the score (see Figure 1b). Fixation and saccade characteristics were measured and compared between the normal and disrupted score condition performances, each from T1 to T2.
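The conversion from the T1 to T2 sound window into eye tracker samples can be sketched as follows. This is an illustration only: it assumes that the sound recording and the eye tracker recording share a common start time and that the tracker samples at a constant 220 Hz, neither of which is stated explicitly in the text. The function name and the example times are hypothetical.

```python
SAMPLING_RATE_HZ = 220  # eye tracker sampling rate given in the text


def sample_window(t1_s: float, t2_s: float, rate_hz: int = SAMPLING_RATE_HZ) -> tuple[int, int]:
    """Return the (first, last) sample indices that bracket the window T1..T2."""
    if t2_s <= t1_s:
        raise ValueError("T2 must be later than T1")
    first = round(t1_s * rate_hz)
    last = round(t2_s * rate_hz)
    return first, last


# Hypothetical trial: playing starts 1.50 s and bar 3 ends 7.25 s into the recording.
first, last = sample_window(1.50, 7.25)
n_samples = last - first  # number of eye tracker samples inside the analysis window
```

Under the same assumptions, the reported marking repeatability of 0.05 second corresponds to roughly 11 samples at 220 Hz.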

Results
Separate 2-way ANOVAs were performed to determine whether specific effects existed between expert and non-expert music sight-readers when the music score was disrupted, and significance was assigned at the 0.05 level. The results are summarized in Figure 2. Normal and Disrupted Score were plotted against Total Time (Figure 2a), Number of Fixations (Figure 2b), Total Fixation Duration (Figure 2c), Fixation Duration minus Saccadic Latency (Figure 2d), Saccadic Latency (Figure 2d), Number of Forward Saccades (Figure 2e), Forward Saccade Speed (Figure 2g), Number of Regressive Saccades (Figure 2h) and Regressive Saccade Speed (Figure 2i) when performing musical excerpts from T1 to T2 for expert and non-expert music sight-readers. Error bars = SEM.
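For readers unfamiliar with the error bars, the standard error of the mean (SEM) can be computed as in the minimal sketch below. The data values are hypothetical and are not taken from the study; the sketch assumes the usual definition of SEM as the sample standard deviation divided by the square root of the number of observations.

```python
import math
import statistics


def sem(values: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical total times (seconds) for one group in one score condition.
example_times = [10.0, 12.0, 11.0, 13.0]
error_bar = sem(example_times)
```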

Total Time
The results revealed a significant effect of expertise: F (1,40) = 28.16, p < 0.0001. Expert sight-readers performed significantly faster than non-experts over both conditions. There was no significant interaction between score disruption and expertise for time: F (1,40) = 0.025, p = 0.88.

Saccadic latency
No overall expertise effect was found for score disruption: F (1,39) = 0.48, p = 0.49. However, the disruption in score caused the expert group to have a significant increase in saccadic latency: F (1,7) = 2.82, p = 0.03, while the non-experts showed little change.

Fixation Duration
The saccadic latencies were subtracted from the duration measure from the eye tracker. No general expertise effects were found: F (1,39) = 0.27, p = 0.61 and no interaction between expertise and score disruption was found for fixation duration: F (1,39) = 0.06, p = 0.80.

Number of Forward Saccades
No general expertise effects were found: F (1,38) = 1.49, p = 0.23 and no interaction between expertise and score disruption was found for the number of forward saccades: F (1,38) = 0.15, p = 0.70.

Number of Regressive Saccades
Regressive saccades behaved in a similar fashion to forward saccades showing no significant expertise effects: F (1,38) = 2.28, p = 0.14 and no interaction between expertise and score disruption was found: F (1,38) = 0.27, p = 0.60.

Forward Saccade Speed
No general expertise effects were found: F (1,38) = 1.13, p = 0.29 and no interaction between expertise and score disruption was found for forward saccade speed: F (1,38) = 0.01, p = 0.92.

Regressive Saccade Speed
In a similar fashion, no general expertise effects were found: F (1,38) = 2.24, p = 0.14 and no interaction between expertise and score disruption was found for regressive saccade speed: F (1,38) = 9.45, p = 0.99.
In summary, expert sight-readers performed significantly faster than non-experts: p < 0.0001. Score disruption had no significant effect on the Total Time within either group. Saccadic latency was the only other measure to reach significance, and this was for experts only: when they encountered the disrupted score, the latency increased significantly (p = 0.03).

Discussion
The Total Time taken to perform from T1 to T2 was unaffected by disruption to the music score for either group. However, the expert group was significantly faster overall and the two groups employed different strategies in order to maintain their speed of performance despite the disruption of the score. This may be explained by comparing sight-readers with typists. Typists' self-selected speed to ensure accuracy has been found to be somewhat conservative, approximately 10-20% below potential (Ericsson et al., 2007). Therefore, each group may have been performing well below their absolute limit in the initial playing, and the score disruption was insufficient to impact upon their total time.
Previous research has found that, when reading text, Fixation Duration increases when targets are visually disrupted (Staub, 2013). The Total Fixation Duration was found to increase in this study, though not significantly for either group in either condition (see Figure 2c). However, when the saccadic latency is taken into account, the results tell a different story. The expert music sight-readers were affected by the disruption and this was shown by their significant increase in saccadic latency: p = 0.03 (see Figure 2d). This finding is in agreement with previous studies on text reading (Cameron, 1995) but did not appear to be the case for non-expert music sight-readers. This may be due to the fact that when eye movements are tested in text readers, the participants are adults who exhibit expertise for text reading. The Cameron (1995) study did not explore saccadic latency in novice text readers to facilitate comparison with text reading experts. The current study participants were all adults, but only some had expertise in sight-reading. The only significant result was the saccadic latency change in the experts. Perhaps an easier reading task for the non-experts that is then disrupted might show a significant latency change.
However, opinion is divided on the relationship between fixations and saccades. Some researchers have suggested that uncertainty causes saccade cancellation and increased fixation duration (Perea & Acha, 2009; Yang & McConkie, 2001). Others advocate that the response to uncertainty is longer latencies with fewer and shorter saccades (Cameron, 1995; Kowler & Anton, 1987). The results from the current study appear to agree with the latter model for the experts, as a significant increase in latency was found. However, no firm conclusions can be drawn regarding the non-expert group due to the large within-group variability. Nevertheless, the non-experts do not appear to adopt the saccade cancellation strategy as an alternative to the experts' response. Rather, they appear to consolidate their 'novice' actions by increasing the number of saccades with shorter duration fixations, although none of these parameters reached significance.
Neither group showed an increase in regressive saccades. This was a notable variation from the text reading literature, where experts increased regressive saccades with uncertainty in the text (Rayner et al., 2006). This may be due to the task not being difficult enough to elicit such a response, or to a fundamental difference in how meaning is assigned when reading text as compared with reading music score. For example, punctuation in text is essential for the reader to understand meaning. Removing key elements of punctuation completely alters the meaning of a sentence, even though the same words may be used in exactly the same order. For example, the sentence "Go and eat, Grandma" does not mean the same thing as "Go and eat Grandma". The absence of the comma between 'eat' and 'Grandma' changes the sentence from an appeal for Grandma to have a meal to a request for someone to have Grandma as the meal! In the context of music generally, the bar line acts as a temporal marker: notes and rests between bar lines must conform to the note count indicated by the time signature at the beginning of the music. Removing the bar line does not alter the 'meaning' of the group of notes, unlike the sentence above. A minim followed by a crotchet followed by two quavers is a group whether or not they are separated by bar lines.
Altering the space between the notes or the direction of the beam also does not alter note pitch or duration; the music is merely incorrectly notated in the context of a single melodic line. This may in turn impact upon a musician's automaticity of processing, as the note spacing does not correspond with the notated duration, nor the beam with the pitch of the notes on the stave. It is not unreasonable, therefore, that the experts were able to play relatively unperturbed by such visual disruptions, except for an increase in saccadic latency because the notes were not grouped in a predictable manner. Similarly, the non-experts became more 'non-expert' by showing a general increase in eye movement activity. Regardless, having shown that there is a different eye movement response between the two groups, can this be attributed to the key delineator of expertise, Working Memory Capacity (Hambrick, Altmann, Oswald, Meinz, Gobet, & Campitelli, 2014; Meinz & Hambrick, 2010), and the cross-modal nature of music reading (Drai-Zerbib & Baccino, 2014; Drai-Zerbib, Baccino, & Bigand, 2012; Meyer & Wuerger, 2001; Wong & Gauthier, 2009)?
The working memory model developed by Baddeley and Hitch (1974) proposes that short-term visual and short-term auditory information are held in the 'visuospatial sketch-pad' and the 'phonological loop' memory stores, respectively. It is from these stores that information can be processed by the 'central executive'. This differed from other theories of the time, which held that there was a Short Term Memory facility that fed directly into Long Term Memory (Atkinson & Shiffrin, 1968). Baddeley and Hitch's model differed by the addition of a 'working memory' that could process information from these short-term stores without necessarily involving long-term memory. Their key findings related to the capacity of, and interactions between, these two storage systems and the central executive.
Regarding cross-modal integration of music stimuli, it has been shown that musicians initially convert a visual stimulus into an auditory modality for retrieval (Simoens & Tervaniemi, 2013). This may involve the phonological loop, and Baddeley has suggested that maintaining information in the phonological loop requires fewer attentional reserves (Baddeley, 2007). Baddeley further suggests, on page 19 of chapter 11, that very familiar objects are subject to being 'cleaned up' during storage in the loop by accessing stored knowledge. This is unlike visuospatial short-term memory, which is likely to be more involved with the processing of novel material and requires more conscious attention (Baddeley, 2007). Therefore, disruption of the expected patterns of music score may be impeding the conversion of visual stimuli to auditory storage and/or confounding the 'clean up' process because the visual presentations do not conform to existing knowledge. That is, the central executive is required to devote more conscious attention to the disrupted score because information is more difficult to store in the phonological loop. This may explain the increase in latency observed in the expert group: there is a disruption to working memory due to the uncertainty that has been created (Cameron, 1995). Another study found that experts gazed longer at the score when there was a mismatch between the auditory and visual stimuli, suggesting an interruption to cross-modal integration (Drai-Zerbib & Baccino, 2014). Perhaps experts' less crowded peripheral vision for musical notes (Wong & Gauthier, 2012) was somehow sabotaged by the unexpected spaces and inappropriate structures in the field of view. Irrespective of the exact etiology of the problem, the chunking mechanisms were interrupted for the experts and this was evidenced by an increase in saccadic latency.
Future inquiries introducing greater visual complexity and disruption to the normality of the score, such as the inclusion of unexpected, non-musical symbols and the individual review of space, bar lines and beaming, might shine further light on the nature of interference effects in relation to expertise and music reading processes in general. The more extreme the disruption, the more the expert sight-readers may not be able to maintain their normal processing strategies. They may begin to show more of a note-by-note approach in order to maintain an effectual performance. This was shown to be the case for expert violinists when the score was visually complex rather than predictable (Wurtz et al., 2009). In addition, assessing the specific effects that note spacing, the removal of bar lines or changing the beam direction might individually have on EM patterns may more precisely demonstrate the nature of the visual interference responsible for affecting processing efficiency. Such findings may help to further understand the cognitive relationships between text and music reading.
The disrupted condition found the non-experts executing more forward saccades at a faster speed than the experts. While this result was not significant, it is a somewhat counterintuitive finding considering that experts are said to look ahead more when sight-reading music. The act of looking ahead has been shown to be more efficient for expert sight-readers and has more to do with 'chunking' a greater amount of visual information into a single fixation than consciously looking ahead as an attempt to gather more information (Sloboda, 1985).
Previous studies have shown that object identification can be attained following a fixation of as little as 80-100 ms duration (Salthouse & Ellis, 1980). As technology and/or the ability to measure and account for the noise in the system has improved, the figure has diminished to 50 ms (Rayner, 1998) and 40 ms (Nystrom & Holmqvist, 2010). The current study was not sufficiently sensitive to detect such small fixation durations. The role of these micro-fixations has yet to be determined in relation to visual processing expertise and may yield valuable insights into differing processing strategies. Utilizing greater amounts of visual disruption in order to generate more visual processing uncertainty, along with more sensitive settings to detect variations in the durations of fixations, would be necessary to investigate the differences between expert and non-expert music sight-readers.
The aim of the current study was to detect differences in eye movement patterns when unexpected visual presentations of music score are read by expert and non-expert music sight-readers. The increase in saccadic latency between conditions for the expert group was the only significant finding from this study. It is a particularly interesting result as it occurred on their second pass reading. This might not be expected if the first pass reading is so important in this context, as the visual disruptions seemed to affect the experts more than any familiarity that might have resulted from the first pass reading. Measuring first and second pass EMs for each group and each condition separately in future studies would help to clarify this point. Aspects of working memory, cross-modal integration and peripheral visual processing have been proposed as possible mechanisms to account for this. Whether these eye movement responses involve similar cognitive processes to those related to unexpected harmonic structures is an interesting subject for future investigation.

Conclusion
Visual disruption of the music score, as expected, significantly affected the eye movement patterns of expert sight-readers. This was demonstrated by a significant increase in saccadic latency, showing that their ability to recognize note grouping had been compromised by the unexpected and unusual patterns in the notation.
The non-experts showed some generalized disturbance of their eye movement patterns: mainly more frequent fixations of shorter duration. None of these changes reached significance, which suggests that the non-expert group maintained their more note-by-note visual processing strategy in this study.