<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">

<article article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
 <front>
    <journal-meta>
	<journal-id journal-id-type="publisher-id">Jemr</journal-id>
      <journal-title-group>
        <journal-title>Journal of Eye Movement Research</journal-title>
      </journal-title-group>
      <issn pub-type="epub">1995-8692</issn>
	  <publisher>								
	  <publisher-name>Bern Open Publishing</publisher-name>
	  <publisher-loc>Bern, Switzerland</publisher-loc>
	</publisher>
    </journal-meta>
    <article-meta>
	<article-id pub-id-type="doi">10.16910/jemr.11.4.5</article-id> 
	  <article-categories>								
				<subj-group subj-group-type="heading">
					<subject>Research Article</subject>
				</subj-group>
		</article-categories>
      <title-group>
        <article-title>A single-camera gaze tracking system under natural light</article-title>
      </title-group>
	   <contrib-group> 
				<contrib contrib-type="author">
					<name>
						<surname>Xiao</surname>
						<given-names>Feng</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Zheng</surname>
						<given-names>Dandan</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Huang</surname>
						<given-names>Kejie</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Qiu</surname>
						<given-names>Yue</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Shen</surname>
						<given-names>Haibin</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>                        				
        <aff id="aff1">
		<institution>Institute of VLSI Design, Zhejiang University</institution>,   <country>China</country>
        </aff>
		</contrib-group>   

		
	  <pub-date date-type="pub" publication-format="electronic"> 
		<day>20</day>  
		<month>10</month>
        <year>2018</year>
      </pub-date>
	  <pub-date date-type="collection" publication-format="electronic"> 
	  <year>2018</year>
	</pub-date>
      <volume>11</volume>
      <issue>4</issue>
	 <elocation-id>10.16910/jemr.11.4.5</elocation-id> 
	<permissions> 
	<copyright-year>2018</copyright-year>
	<copyright-holder>Xiao, F., Zheng, D., Huang, K., Qiu, Y., &#x26; Shen, H.</copyright-holder>
	<license license-type="open-access">
  <license-p>This work is licensed under a Creative Commons Attribution 4.0 International License, 
  (<ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">
    https://creativecommons.org/licenses/by/4.0/</ext-link>), which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p>
</license>
	</permissions>
      <abstract>
<p>Gaze tracking is a human-computer interaction technology, and it has been widely studied
in academia and industry. However, constrained by the performance of specific sensors and
algorithms, it has not yet been popularized for everyone. This paper proposes a
single-camera gaze tracking system under natural light to enable its versatility.
The iris center and the anchor point are the most crucial factors for the accuracy of the system.
The iris center is accurately detected by the simple active contour snakuscule, which is
initialized using prior knowledge of eye anatomical dimensions. After that, a novel anchor
point is computed from stable facial landmarks. Next, second-order mapping functions
use the eye vectors and the head pose to estimate the points of regard. Finally, the
gaze errors are reduced by applying a weight coefficient to the points of regard of
the left and right eyes. Iris center localization achieves an accuracy of
98.87% on the GI4E database when the normalized error is lower than 0.05. The accuracy
of the gaze tracking method is superior to state-of-the-art appearance-based and
feature-based methods on the EYEDIAP database.</p>
      </abstract>
      <kwd-group>
        <kwd>Eye movement</kwd>
        <kwd>eye tracking</kwd>
        <kwd>gaze</kwd>
        <kwd>usability</kwd>
        <kwd>single-camera</kwd>
        <kwd>facial landmark</kwd>
        <kwd>iris center</kwd>
        <kwd>anchor point</kwd>
        <kwd>head pose</kwd>
        <kwd>mapping functions</kwd>
      </kwd-group>
    </article-meta>
  </front>	
  <body>

    <sec id="S1">
      <title>Introduction</title>

<p>Gaze tracking is a human-computer interaction technology that enables
easy and effective interaction for assisting people with disabilities, learning,
entertainment, etc. It is also a research tool that has
been widely used in marketing studies (<xref ref-type="bibr" rid="b1">1</xref>), reading research (<xref ref-type="bibr" rid="b2">2</xref>), and so
forth. Gaze tracking techniques can be divided into
electrooculography-based, coil-based, and video-based (infrared and
natural light) techniques (<xref ref-type="bibr" rid="b3">3</xref>). The last is less
intrusive than the first two, which require physical contact sensors
such as electrodes and scleral coils.</p>

<p>Today, a variety of remote video-based gaze tracking systems under
infrared (IR) light, in both academia and industry, achieve
accurate results. For instance, the Dual-Purkinje-Image (DPI) gaze
tracker (<xref ref-type="bibr" rid="b4">4</xref>) achieves an accuracy better than 0.1° (<xref ref-type="bibr" rid="b5">5</xref>). The EyeLink 1000
system performs at an accuracy below 0.5° with a white background (<xref ref-type="bibr" rid="b6">6</xref>).
However, infrared sources are sensitive to ambient light, and IR gaze
trackers suffer from reflection problems when users wear glasses.
Therefore, the development of gaze tracking systems under natural light has
become an increasingly important field of research.</p>

<p>Recently, video-based gaze tracking systems under natural light have
become capable of tracking the gaze. However, some of them rely on multiple
cameras (<xref ref-type="bibr" rid="b7">7</xref>), High-Definition (HD) cameras (<xref ref-type="bibr" rid="b8">8</xref>) or RGB-D cameras (<xref ref-type="bibr" rid="b9 b10">9, 10</xref>),
which limits their applications. With the ubiquity of cameras, gaze
tracking with a single camera under natural light has become a research
hotspot, but one of the major challenges is the requirement for an accurate
gaze tracking algorithm. Therefore, this paper concentrates on a regression-based
gaze tracking system with a single camera under natural light.</p>

<p>The accuracy of regression-based gaze tracking is directly influenced
by the eye vectors, which are derived from the iris centers and a stable
facial point (the anchor or reference point). However, the performance
of various iris/pupil center localization methods degrades significantly
in low-resolution images because of interference such as glasses/iris
reflections and eyelid occlusion. In addition, anchor points placed at the eye
corners vibrate with eye rotation (<xref ref-type="bibr" rid="b11">11</xref>) and can be blocked by large
head movements. Therefore, an accurate localization method for low-resolution
images (<xref ref-type="bibr" rid="b12">12</xref>) is improved to detect the iris center. Then, a
novel anchor point is proposed to overcome the drawbacks mentioned
above. Finally, the Points of Regard (POR) of the left and right eyes
are combined to improve the accuracy of the system. Compared with other
regression-based methods, the main contributions of this paper are
the following:</p>

<p>(1) An accurate localization method for the iris
center is implemented in low-resolution images by combining facial
landmarks, prior knowledge of eye anatomical dimensions, and the
simple active contour snakuscule (<xref ref-type="bibr" rid="b13">13</xref>).</p>

<p>(2) A novel anchor point is computed by averaging the stable facial
landmarks, which improves the accuracy of the gaze tracking system.</p>

<p>(3) A weight coefficient is applied to the POR of the left and right
eyes to revise the final POR, which reduces the gaze tracking
error.</p>

<p>The rest of the paper is structured as follows: In the next section,
the related work is presented. The details of the proposed method are
covered in the <italic>Methods</italic> section. The evaluation of the
proposed scheme and statistical results on public databases are shown in
the <italic>Evaluation</italic> section. The discussion is presented in
the final section.</p>

    <sec id="S2">
      <title>Related work</title>

<p>This section overviews gaze tracking systems under natural light. The
systems can be classified into feature-based and appearance-based
methods (<xref ref-type="bibr" rid="b14">14</xref>).</p>
    </sec>
	
    <sec id="S2a">
      <title>Feature-based methods</title>

<p>Feature-based methods extract features such as the iris/pupil center,
eye corners and iris/pupil contours. Then, model-based and
regression-based methods use the features to track the gaze. Model-based
methods (<xref ref-type="bibr" rid="b10 b15">10, 15</xref>) use a geometric eye model to compute the gaze direction
from the features. Regression-based methods (<xref ref-type="bibr" rid="b16 b17">16, 17</xref>) compute a mapping
function between the gaze direction and eye vectors.</p>

<p>The performance of model-based methods relies on accurate
detection of the iris center. In (<xref ref-type="bibr" rid="b10">10</xref>), the iris center was obtained by
an ellipse fitting algorithm, where the ellipse of the iris in the image
was described by the yaw and pitch angles. J. Li and S. Li (<xref ref-type="bibr" rid="b10">10</xref>) achieved
7.6° and 6.7° in the horizontal and vertical directions on the public
EYEDIAP database (<xref ref-type="bibr" rid="b18">18</xref>) with an execution speed of 3 frames per second
(fps) on a 2.5-GHz Intel(R) Core(TM) i5-2400S processor. In (<xref ref-type="bibr" rid="b15">15</xref>), the
shape of the iris was estimated by ellipse fitting, and the gaze direction
was then inferred with an accuracy of 7° based on the observation that the
shape of the iris deforms from circular to elliptical as the iris
orientation changes. Wood and Bulling (<xref ref-type="bibr" rid="b15">15</xref>) achieved an execution speed
of 12 fps on a commodity tablet computer with a quad-core 2 GHz
processor. Ellipse fitting has low consistency and reliability because
iris edges or points cannot be accurately extracted in low-resolution
images.</p>

<p>In addition to the iris center, the anchor point is one of the key
features influencing the accuracy of regression-based methods. In
(<xref ref-type="bibr" rid="b16">16</xref>), the eye corner was used as the anchor point. Instead of detecting
the eye corners, the anchor point in (<xref ref-type="bibr" rid="b17">17</xref>) was set as the center
of an image patch containing the inner eye corners and eyebrow
edges. The proposed system yielded a mean accuracy of 2.33° and 1.8° in
the horizontal and vertical directions on their self-built database and
7.53° on the public UulmHPG database (<xref ref-type="bibr" rid="b19">19</xref>). However, eye corners or the
patch center cannot be accurately detected in low-resolution
images with large head rotations.</p>
    </sec>
	
    <sec id="S2b">
      <title>Appearance-based methods</title>

<p>Appearance-based methods do not extract specific features and usually
learn a mapping function from eye images to gaze directions. In (<xref ref-type="bibr" rid="b20">20</xref>),
gaze estimation was learned by random regression forests with a
significantly larger dataset, which reduced the error of the
work in (<xref ref-type="bibr" rid="b21">21</xref>), previously larger than 10°, by 50%. In (<xref ref-type="bibr" rid="b9">9</xref>), k-nearest neighbor
regression and adaptive linear regression were used to learn mapping
functions between eye images and gaze directions, achieving a mean
accuracy of 7.2° (head kept still) and 8.9° (head movement) on
the EYEDIAP database. With the development of deep learning,
convolutional neural networks (CNNs) have been used to estimate the gaze
from millions of eye images in (<xref ref-type="bibr" rid="b22">22</xref>). The authors showed that a large-scale
dataset with a large variety of data could improve the accuracy of the
appearance-based model for gaze tracking, achieving errors of 1.71
cm and 2.53 cm without calibration on mobile phones and tablets,
respectively. Krafka, et al. (<xref ref-type="bibr" rid="b22">22</xref>) achieved a processing rate of 10–15 fps
on a typical mobile device. One of the main drawbacks of
appearance-based methods is that the appearance of the eyes is
significantly affected by the head pose (<xref ref-type="bibr" rid="b17">17</xref>). In addition, compared with
feature-based methods, appearance-based methods generally require larger
numbers of training images.</p>
    </sec>
    </sec>

    <sec id="S3">
      <title>Methods</title>

<p>The flow chart of the gaze tracking system is depicted in Figure 1.
The system includes a calibration phase and a testing phase. In the calibration
phase, mapping functions are regressed from the head pose, eye vectors and
gaze directions. Afterwards, the head pose, eye vectors and regressed
mapping functions are used to track the gaze in the testing phase.
Feature extraction consists of three parts: the iris center, anchor
point and head pose calculations.</p>

<fig id="fig01" fig-type="figure" position="float">
					<label>Figure. 1</label>
					<caption>
						<p>Flow chart of the gaze tracking system</p>
					</caption>
					<graphic id="graph01" xlink:href="jemr-11-04-e-figure-01.png"/>
				</fig>

<p>First, the eye Regions of Interest (ROIs) are extracted from twelve
points around the eyes, which are tracked by the facial landmark
algorithm in (<xref ref-type="bibr" rid="b23">23</xref>). The eye ROIs are resized to twice their original sizes.
Then, grayscale erosion is applied. After that, the snakuscule is used
to locate the iris centers.</p>

<p>Second, thirty-six stable facial landmarks are used to compute the
anchor point. Thereafter, the left and right eye vectors are computed from
the iris centers and the anchor point, respectively.</p>

<p>Third, the head pose is estimated from six facial landmarks (the eye
corners, nose tip, mouth corners and chin) by using the OpenCV
(<xref ref-type="bibr" rid="b24">24</xref>) iterative algorithm.</p>

<p>Details of the overall system are discussed in the following
subsections.</p>

    <sec id="S3a">
      <title>Iris center localization</title>

<p>The eye ROI should be detected before locating the iris center.
Facial landmarks provide precise positions of the mouth,
eyes, nose, etc. Therefore, an ensemble of regression trees algorithm
(<xref ref-type="bibr" rid="b23">23</xref>) is used to detect 68 facial landmarks; it estimates their
positions from intensity differences between pixels. The locations of the
68 facial landmarks are shown in Figure 2.</p>

<fig id="fig02" fig-type="figure" position="float">
					<label>Figure. 2</label>
					<caption>
						<p>The locations of 68 facial landmarks</p>
					</caption>
					<graphic id="graph02" xlink:href="jemr-11-04-e-figure-02.png"/>
				</fig>

<p>The rectangular eye ROIs are next extracted from the twelve points
around the eyes. The boundary coordinates of the eye ROIs are computed
by the equations in Table 1.</p>

<table-wrap id="t01" position="float">
					<label>Table 1.</label>
					<caption>
						<p>The boundary coordinates of eye ROIs.</p>
					</caption>
					<table frame="hsides" rules="groups" cellpadding="3">

    <thead>
      <tr>
        <th>Left eye</th>
        <th>Right eye</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td><italic>X<sub>l</sub> = P<sub>43x</sub></italic></td>
        <td><italic>X<sub>l</sub> = P<sub>37x</sub></italic></td>
      </tr>
      <tr>
        <td><italic>X<sub>r</sub> = P<sub>46x</sub></italic></td>
        <td><italic>X<sub>r</sub> = P<sub>40x</sub></italic></td>
      </tr>
      <tr>
        <td><italic>Y<sub>t</sub> = min{P<sub>44y</sub>
        ,P<sub>45y</sub>}-3</italic></td>
        <td><italic>Y<sub>t</sub> = min{P<sub>38y</sub>
        ,P<sub>39y</sub>}-3</italic></td>
      </tr>
      <tr>
        <td><italic>Y<sub>b</sub> = max{P<sub>47y</sub>
        ,P<sub>48y</sub>}+3</italic></td>
        <td><italic>Y<sub>b</sub> = max{P<sub>41y</sub>
        ,P<sub>42y</sub>}+3</italic></td>
      </tr>
    </tbody>
  </table>
					<table-wrap-foot>
						<fn id="FN1">
						<p>Note: X<sub>l</sub> , X<sub>r</sub> , Y<sub>t</sub> and Y<sub>b</sub>
are the left, right, top and bottom coordinates of the eye ROIs.
P<sub>ix</sub> and P<sub>iy</sub> are respectively the x and y
coordinates of the i<sup>th</sup> facial landmark. max{,} and min{,}
denote taking the maximum and minimum values respectively among the two
values. The coordinate origin is in the top left corner of the
image.</p>
						</fn>
					</table-wrap-foot>  
</table-wrap>
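<p>The Table 1 equations translate directly into code. The sketch below is illustrative, not the authors' implementation: the function name is hypothetical, and the landmarks are assumed to be supplied as (x, y) pairs numbered 1&#8211;68 as in Figure 2.</p>

```python
def eye_roi(landmarks, eye="left"):
    """Eye ROI bounds from 68 facial landmarks (Table 1).

    `landmarks` is a sequence of (x, y) pairs for landmarks 1..68
    (Figure 2 numbering); a dict gives them 1-based indices.
    """
    p = {i: pt for i, pt in enumerate(landmarks, start=1)}
    if eye == "left":
        ids = dict(xl=43, xr=46, top=(44, 45), bottom=(47, 48))
    else:
        ids = dict(xl=37, xr=40, top=(38, 39), bottom=(41, 42))
    x_l = p[ids["xl"]][0]
    x_r = p[ids["xr"]][0]
    y_t = min(p[i][1] for i in ids["top"]) - 3     # 3-pixel top margin
    y_b = max(p[i][1] for i in ids["bottom"]) + 3  # 3-pixel bottom margin
    return x_l, x_r, y_t, y_b
```

<p>The origin is the top-left corner of the image, as in the table note, so the top bound is the smaller y value.</p>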

<p>Results in Figure 3 show that accurate eye ROIs can be extracted even
when large head rotations occur. The eye ROI dimensions are then
magnified by a factor of two, and grayscale erosion with a 1-pixel disk
structuring element is applied to the eye ROIs to remove possible noise.
Finally, the simple active contour snakuscule is used to locate the iris
centers.</p>

<fig id="fig03" fig-type="figure" position="float">
					<label>Figure. 3</label>
					<caption>
						<p>Eye ROIs extraction on the EYEDIAP database</p>
					</caption>
					<graphic id="graph03" xlink:href="jemr-11-04-e-figure-03.png"/>
				</fig>

<p>As shown in Figure 4, a snakuscule is an area-based circular snake
consisting of an outer annulus and an inner disk. It performs well in
detecting circular regions by maximizing the gray difference between the outer
annulus and the inner disk. <italic>β</italic> (Figure 4) is defined as
the ratio of the outer to the inner radius.</p>

<fig id="fig04" fig-type="figure" position="float">
					<label>Figure. 4</label>
					<caption>
						<p>Structure of a snakuscule</p>
					</caption>
					<graphic id="graph04" xlink:href="jemr-11-04-e-figure-04.png"/>
				</fig>

<p>Because the sclera and the iris exhibit the maximum gray difference,
the snakuscule can expand or shrink to maximize the gray difference between
the outer annulus and the inner disk. However, uncontrolled expansion or shrinkage
of the snakuscule requires numerous iterations before its final
convergence. To overcome this shortcoming, the snakuscule’s inner radius
is initialized from eye anatomical dimensions: the radius of the
eyeball is in the range of 12-13 mm (<xref ref-type="bibr" rid="b25">25</xref>) and the radius of the iris is
approximately an anatomical constant (about 7 mm) (<xref ref-type="bibr" rid="b26">26</xref>)
for most people. In addition, the method works well because the width of the
eye ROI extracted by facial landmarks is close to the diameter of the
eyeball in the image. Therefore, the snakuscule inner radius is
initialized by</p>

<fig id="eq01" fig-type="equation" position="anchor">
					<label>(1)</label>
					<graphic id="equation01" xlink:href="jemr-11-04-e-equation-01.png"/>
				</fig>

<p>where <italic>N</italic> is the width of the eye ROI and
<italic>α</italic> is a constant reflecting the ratio of the iris radius
to the eye ROI width.</p>

<p>Using the initialized snakuscule, the gray difference between the outer
annulus and the inner disk is calculated by formula (2), as
suggested in (<xref ref-type="bibr" rid="b12">12</xref>).</p>

<fig id="eq02" fig-type="equation" position="anchor">
					<label>(2)</label>
					<graphic id="equation02" xlink:href="jemr-11-04-e-equation-02.png"/>
				</fig>

<p>where <italic>f(p)</italic> denotes the image gray value at
position <italic>p</italic>. Formula (2) computes the gray
difference <italic>G(p<sub>i</sub>)</italic>, where
<italic>p<sub>i</sub></italic> = (<italic>x<sub>i</sub></italic>,
<italic>y<sub>i</sub></italic>), <italic>x<sub>i </sub></italic>∈
[<italic>βr</italic>, <italic>N</italic>-<italic>βr</italic>] takes
integer values, <italic>y<sub>i</sub></italic> = [<italic>M</italic>/2] and
<italic>M</italic> is the height of the eye ROI. In other words, the
snakuscule computes gray differences from left to right along
the horizontal centerline of the eye ROI. The location
<italic>p<sub>rc</sub></italic>(<italic>x<sub>rc</sub></italic>,
<italic>y<sub>rc</sub></italic>) with the maximum
<italic>G</italic>(<italic>p<sub>i</sub></italic>) is the rough iris
center.</p>

<p>As shown in Figure 5, (2<italic>δ</italic>+1)×(2<italic>δ</italic>+1)
iris center candidate points are determined by the rough iris center
<italic>(x<sub>rc</sub>, y<sub>rc</sub>)</italic> in the eye ROI, where
<italic>δ</italic> is measured in pixels. These candidate points are used
to locate the iris center accurately: in the range
[<italic>x<sub>rc</sub></italic>±<italic>δ</italic>,
<italic>y<sub>rc</sub></italic>±<italic>δ</italic>], the
(2<italic>δ</italic>+1)×(2<italic>δ</italic>+1) gray differences
<italic>G</italic> are calculated by formula (2), and the location
<italic>p<sub>c</sub>(x<sub>c</sub>, y<sub>c</sub>)</italic> with the
maximum <italic>G</italic> is taken as the final iris center.</p>

<fig id="fig05" fig-type="figure" position="float">
					<label>Figure. 5</label>
					<caption>
						<p>(2δ+1)×(2δ+1) iris center candidate points
determined by the rough iris center (x<sub>rc</sub>,
y<sub>rc</sub>)</p>
					</caption>
					<graphic id="graph05" xlink:href="jemr-11-04-e-figure-05.png"/>
				</fig>
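<p>The coarse-to-fine search above can be sketched as follows. This is a minimal illustration, not the authors' code: since formula (2) itself is given in the image, the gray difference <italic>G</italic> is assumed here to be the mean gray of the outer annulus minus that of the inner disk (consistent with a dark iris on a bright sclera), and the eye ROI is a plain list of pixel rows.</p>

```python
import math

def gray_difference(img, cx, cy, r, beta):
    """G(p): mean gray of the outer annulus minus that of the inner disk
    (assumed form of formula (2))."""
    inner, outer = [], []
    R = beta * r                      # outer radius; beta = outer/inner ratio
    h, w = len(img), len(img[0])
    for y in range(max(0, int(cy - R)), min(h, int(cy + R) + 1)):
        for x in range(max(0, int(cx - R)), min(w, int(cx + R) + 1)):
            d = math.hypot(x - cx, y - cy)
            if d <= r:
                inner.append(img[y][x])
            elif d <= R:
                outer.append(img[y][x])
    if not inner or not outer:
        return float("-inf")
    return sum(outer) / len(outer) - sum(inner) / len(inner)

def iris_center(img, alpha=0.25, beta=1.4, delta=2):
    """Coarse scan along the horizontal centerline, then a
    (2*delta+1) x (2*delta+1) refinement around the rough center."""
    M, N = len(img), len(img[0])
    r = alpha * N                     # formula (1): inner radius from ROI width
    yc = M // 2
    # rough center: best G for x in [beta*r, N - beta*r] on the centerline
    xs = range(int(beta * r), N - int(beta * r) + 1)
    x_rc = max(xs, key=lambda x: gray_difference(img, x, yc, r, beta))
    # refinement over the candidate grid around (x_rc, yc)
    cands = [(x_rc + dx, yc + dy)
             for dx in range(-delta, delta + 1)
             for dy in range(-delta, delta + 1)]
    return max(cands, key=lambda c: gray_difference(img, c[0], c[1], r, beta))
```

<p>On a synthetic ROI with a dark disk on a bright background, the search recovers the disk center; the optimal parameter values reported later (<italic>α</italic> = 0.25, <italic>β</italic> = 1.4, <italic>δ</italic> = 1 or 2) are used as defaults.</p>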

    </sec>

    <sec id="S3b">
      <title>Anchor point</title>

<p>The anchor point is used as a reference point to compute the eye vector.
Using the inner or outer eye corners as the anchor points is the
common approach (<xref ref-type="bibr" rid="b11 b16">11, 16</xref>) among regression-based methods for gaze tracking
under natural light. However, Sesma et al. (<xref ref-type="bibr" rid="b11">11</xref>) showed that the eye
corners vibrate with eye rotations, which introduces errors into the gaze
tracking. In (<xref ref-type="bibr" rid="b17">17</xref>), the anchor point was set as the center of an image
patch tracked by the Lucas–Kanade inverse affine transform. However,
eye corners or the patch center sometimes cannot be accurately
tracked because they may be blocked or deformed in images with large
head rotations. Therefore, a novel anchor point is designed as the
reference point in this paper. The anchor point
<italic>p<sub>a</sub></italic>(<italic>x<sub>a</sub></italic>,
<italic>y<sub>a</sub></italic>) is computed by</p>

<fig id="eq03" fig-type="equation" position="anchor">
					<label>(3)</label>
					<graphic id="equation03" xlink:href="jemr-11-04-e-equation-03.png"/>
				</fig>

<p>where <italic>n</italic> is the number of facial landmarks and
(<italic>x<sub>i</sub></italic>, <italic>y<sub>i</sub></italic>) is the
coordinate of the <italic>i<sup>th</sup></italic> facial landmark.</p>

<p>There are two advantages to this anchor point: (1) computed from
stable facial landmarks, it does not vibrate with eye rotations
and is not blocked by large head movements; (2) averaging the facial
landmarks reduces the error compared with using a single feature point
near the eye area.</p>
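<p>Formula (3) is a plain centroid of the chosen landmark coordinates, and each eye vector then runs from this anchor to an iris center. A minimal sketch (the function names and the eye-vector sign convention are illustrative assumptions; which thirty-six landmarks count as stable is the authors' selection):</p>

```python
def anchor_point(landmarks):
    """Anchor point p_a = mean of the stable facial landmark
    coordinates (formula (3)); `landmarks` is a list of (x, y) pairs."""
    n = len(landmarks)
    return (sum(x for x, _ in landmarks) / n,
            sum(y for _, y in landmarks) / n)

def eye_vector(iris, anchor):
    """Eye vector from the anchor point to the iris center
    (assumed sign convention)."""
    return (iris[0] - anchor[0], iris[1] - anchor[1])
```
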
    </sec>

    <sec id="S3c">
      <title>Head pose</title>

<p>Looking at an object usually involves moving the head towards it
and rotating the eyes to focus on it. Fridman et
al. (<xref ref-type="bibr" rid="b27">27</xref>) used the head pose to track the gaze. However, Kennedy et al.
(<xref ref-type="bibr" rid="b28">28</xref>) found that gaze tracking based merely on the head pose is neither
accurate nor consistent in human-robot interactions. Therefore, gaze
tracking should combine eye rotation and head movement.</p>

<p>Over the past several years, different methods for head pose
estimation have been developed. The 2D-3D point correspondence methods
achieve robust performance and can handle large head movements.
Therefore, the OpenCV iterative (Levenberg-Marquardt optimization)
algorithm is used to estimate the head pose.</p>
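<p>For illustration, once the rotation has been solved (with <monospace>cv2.solvePnP</monospace> in an OpenCV pipeline, whose rotation vector <monospace>cv2.Rodrigues</monospace> expands into a matrix), the pitch, yaw and roll angles used later in the mapping functions can be read off the rotation matrix. The sketch below is not the authors' code; it assumes the common composition R = R<sub>z</sub>(roll)·R<sub>y</sub>(yaw)·R<sub>x</sub>(pitch).</p>

```python
import math

def euler_angles(R):
    """Pitch, yaw and roll (radians) from a 3x3 rotation matrix,
    assuming the composition R = Rz(roll) @ Ry(yaw) @ Rx(pitch)."""
    pitch = math.atan2(R[2][1], R[2][2])            # rotation about x
    yaw = math.asin(max(-1.0, min(1.0, -R[2][0])))  # rotation about y
    roll = math.atan2(R[1][0], R[0][0])             # rotation about z
    return pitch, yaw, roll
```
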
    </sec>

    <sec id="S3d">
      <title>Mapping functions</title>

<p>After the eye vectors, head pose and screen coordinates have been
obtained, regression is used to establish the mapping
function between them. The linear, squared and cubic terms and their
interactions summarized in (<xref ref-type="bibr" rid="b29">29</xref>) are widely used for mapping eye vectors
to screen coordinates. Unlike (<xref ref-type="bibr" rid="b30">30</xref>), where the head pose was used to
refine the eye vectors, here it is introduced directly into the mapping
functions. The mapping functions of n points with a polynomial of n or
fewer terms can be expressed by</p>

<fig id="eq04" fig-type="equation" position="anchor">
					<label>(4)</label>
					<graphic id="equation04" xlink:href="jemr-11-04-e-equation-04.png"/>
				</fig>

<p>where <italic>g<sub>h</sub></italic> and
<italic>g<sub>v</sub></italic> are the POR in the horizontal and
vertical directions, the coefficients <italic>a<sub>k</sub></italic> and
<italic>b<sub>k</sub></italic> are determined in the calibration phase,
<italic>e<sub>h</sub></italic> and <italic>e<sub>v</sub></italic> are
the eye vectors in the horizontal and vertical directions, and
<italic>h<sub>p</sub></italic>, <italic>h<sub>y</sub></italic> and
<italic>h<sub>r</sub></italic> are the head pose angles of pitch,
yaw and roll, respectively. In this paper, six mapping functions derived
from formula (4) are used to estimate the gaze. As shown in Table 2,
mapping functions No.1 and No.2 use the linear and squared terms of the
eye vectors; No.3, No.4, No.5 and No.6 mix the linear and squared terms
of the eye vectors and the head pose.</p>

<table-wrap id="t02" position="float">
					<label>Table 2.</label>
					<caption>
						<p>Six mapping functions derived by formula (4), where
the subscript “i” denotes “h” or “v”.</p>
					</caption>
					<graphic id="graph10" xlink:href="jemr-11-04-e-figure-10.png"/>
					</table-wrap>
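<p>The calibration phase then reduces to an ordinary least-squares fit of the coefficients. The exact term sets of the six functions are given in Table 2 (in the image); the sketch below is an illustrative stand-in that fits any term expansion by solving the normal equations, with an assumed second-order eye-vector expansion as an example.</p>

```python
def fit_mapping(rows, targets):
    """Least-squares coefficients a for g = sum_k a_k * t_k, via the
    normal equations (X^T X) a = X^T g solved by Gaussian elimination.
    `rows` holds the feature vectors already expanded into terms."""
    m = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    b = [sum(r[i] * g for r, g in zip(rows, targets)) for i in range(m)]
    for col in range(m):                       # forward elimination w/ pivoting
        piv = max(range(col, m), key=lambda k: abs(A[k][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for k in range(col + 1, m):
            f = A[k][col] / A[col][col]
            for c in range(col, m):
                A[k][c] -= f * A[col][c]
            b[k] -= f * b[col]
    a = [0.0] * m
    for k in range(m - 1, -1, -1):             # back substitution
        a[k] = (b[k] - sum(A[k][c] * a[c] for c in range(k + 1, m))) / A[k][k]
    return a

def second_order_terms(eh, ev):
    """Assumed second-order eye-vector terms: 1, eh, ev, eh*ev, eh^2, ev^2."""
    return [1.0, eh, ev, eh * ev, eh * eh, ev * ev]
```

<p>The head-pose variants (No.3&#8211;No.6) would simply append <italic>h<sub>p</sub></italic>, <italic>h<sub>y</sub></italic> and <italic>h<sub>r</sub></italic> terms to the expansion before fitting.</p>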

<p>Ocular dominance theory is common and long-standing (<xref ref-type="bibr" rid="b31">31</xref>). In (<xref ref-type="bibr" rid="b32">32</xref>), the
dominant eye was shown to be more accurate on SMI HiSpeed 500-Hz eye
tracker systems. In addition, Quartley and Firth (<xref ref-type="bibr" rid="b33">33</xref>) found that
observers favor the left eye for leftward targets and the right eye for
rightward targets. Furthermore, for relatively small eye-in-head
rotations, Cui and Hondzinski (<xref ref-type="bibr" rid="b34">34</xref>) showed that averaging the POR of
the two eyes is more accurate than using only one eye
on a remote eye tracker. Moreover, one of the eyes may be blocked during
a large head movement. To unify these situations, a weight
coefficient is applied to the POR of the left and right eyes to revise the
final POR in the horizontal <italic>g<sub>fh</sub></italic> and vertical
<italic>g<sub>fv</sub></italic> directions.</p>

<fig id="eq05" fig-type="equation" position="anchor">
					<label>(5)</label>
					<graphic id="equation05" xlink:href="jemr-11-04-e-equation-05.png"/>
				</fig>

<p>where <italic>w</italic> is the weight coefficient and
<italic>g<sub>lh</sub></italic>, <italic>g<sub>rh</sub></italic>,
<italic>g<sub>lv</sub></italic> and <italic>g<sub>rv</sub></italic> are
POR of the horizontal and vertical directions of the left and right
eyes, respectively.</p>
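<p>Formula (5) itself is given in the image; assuming it is the convex combination its description suggests, with <italic>w</italic> weighting the left eye, the per-axis combination can be sketched as:</p>

```python
def combined_por(g_left, g_right, w=0.5):
    """Final POR (g_fh, g_fv) from the left/right monocular PORs,
    assumed form of formula (5): g_f = w*g_left + (1-w)*g_right."""
    (glh, glv), (grh, grv) = g_left, g_right
    return (w * glh + (1 - w) * grh, w * glv + (1 - w) * grv)
```

<p>Here <italic>w</italic> = 0.5 averages the two eyes, while <italic>w</italic> = 1 or 0 falls back to a single eye, e.g. when the other eye is blocked by a large head movement.</p>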
    </sec>
    </sec>

    <sec id="S4">
      <title>Evaluation</title>
    <sec id="S4a">
      <title>Databases</title>

<p>The GI4E database (<xref ref-type="bibr" rid="b35">35</xref>) consists of 1236 images (800×600) from 103
different participants. Each participant has 12 images in which they
gazed at different points on the screen. The large number of
participants and the low resolution of the images make it suitable for
evaluating the performance of the proposed iris center localization method.</p>

<p>The EYEDIAP database contains RGB (640×480), RGB-D and HD (1920×1080)
video clips from 16 participants. Continuous Screen (CS), Discrete
Screen (DS) and 3D Floating Target (FT) are the stimuli the participants
gazed at. As shown in Figure 6, on the computer screen the DS target was
drawn every 1.1 seconds at random locations, and the CS target was
programmed to move along a random trajectory for 2 s. The participants
were asked to keep their heads approximately Static (S) or to perform
head Movements (M) while they gazed at the visual target. Each
participant was recorded for 2 to 3 minutes. The proposed method was
implemented on the RGB video clips, which contain the Discrete Screen with
Static (DSS), Discrete Screen with head Movements (DSM), Continuous
Screen with Static (CSS) and Continuous Screen with head Movements (CSM)
conditions.</p>

<fig id="fig06" fig-type="figure" position="float">
					<label>Figure. 6</label>
					<caption>
<p>Example of screen coordinates for video clips using (a) the Discrete
Screen target and (b) the Continuous Screen target on the EYEDIAP
database.</p>
					</caption>
					<graphic id="graph06" xlink:href="jemr-11-04-e-figure-06.png"/>
				</fig>

<p>On the EYEDIAP database, the frame-by-frame screen target
coordinates, head pose tracking states and eye tracking states,
including the eyeballs’ 3D coordinates, are provided in the files
&quot;screen_coordinates.txt&quot;, &quot;head_pose.txt&quot; and
&quot;eyes tracking.txt&quot;, respectively. Note that a total
of 52 RGB video clips from 13 participants were used to estimate the gaze
in this paper, because the 12<sup>th</sup> and 13<sup>th</sup>
participants recorded video clips only for the 3D FT condition, and the
7<sup>th</sup> participant’s facial landmarks could be tracked in only a
small fraction of the RGB video clips due to poor contrast.</p>
    </sec>

    <sec id="S4b">
      <title>Evaluation of iris center localization</title>

<p>The normalized error of the estimated eye centers is computed by
formula (6), as suggested in (<xref ref-type="bibr" rid="b36">36</xref>).</p>

<fig id="eq06" fig-type="equation" position="anchor">
					<label>(6)</label>
					<graphic id="equation06" xlink:href="jemr-11-04-e-equation-06.png"/>
				</fig>


<p>where <italic>d<sub>left</sub></italic> and
<italic>d<sub>right</sub></italic> are the distances between the
estimated and labeled iris centers of the left and right eyes, and
<italic>d</italic> is the distance between the labeled left and right
iris centers. Estimated eye centers with a normalized error
<italic>e</italic> ≤ 0.05, which corresponds to a location within the
pupil, are accurate enough for gaze tracking applications (<xref ref-type="bibr" rid="b37">37</xref>). Therefore,
<italic>e</italic> ≤ 0.05 is used as the benchmark to determine the
optimal parameters of the iris center localization method in this
paper.</p>
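<p>For illustration, the normalized error can be computed as in the
minimal sketch below. It assumes formula (6) takes the common form
<italic>e</italic> = max(<italic>d<sub>left</sub></italic>, <italic>d<sub>right</sub></italic>)/<italic>d</italic>,
with coordinates in pixels; the point values in the example are
hypothetical.</p>

```python
import math

def normalized_error(est_left, est_right, gt_left, gt_right):
    """Normalized eye-center error: the worst per-eye localization
    distance divided by the distance between the labeled iris centers."""
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    d_left = dist(est_left, gt_left)     # left-eye localization error
    d_right = dist(est_right, gt_right)  # right-eye localization error
    d = dist(gt_left, gt_right)          # distance between labeled centers
    return max(d_left, d_right) / d

# Hypothetical estimates one or two pixels off the labeled centers:
e = normalized_error((101, 200), (162, 201), (100, 200), (160, 200))
# e is about 0.037, within the 0.05 benchmark
```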

<p>For iris center localization, different values of
<italic>α</italic>, <italic>β</italic> and <italic>δ</italic> were
assessed on the GI4E database. The optimal values are those that
maximize the number of eyes with a normalized error <italic>e</italic>
≤ 0.05. Therefore, values of <italic>α</italic> from 0.21 to 0.25 in
steps of 0.01, <italic>β</italic> from 1.32 to 1.52 in steps of 0.04
and <italic>δ</italic> from 1 to 4 in steps of 1 were assessed in this
paper.</p>

<p>As shown in Figure 7, the maximum number of images with a normalized
error <italic>e</italic> ≤ 0.05 is 1222 when <italic>α</italic> = 0.25,
<italic>β</italic> = 1.4 and <italic>δ</italic> = 1 or 2.</p>

<fig id="fig07" fig-type="figure" position="float">
					<label>Figure 7.</label>
					<caption>
						<p>The number of images from the GI4E database with a
normalized error e ≤ 0.05 for different values of α, β and δ, where
(a) δ = 1, (b) δ = 2, (c) δ = 3 and (d) δ = 4.</p>
					</caption>
					<graphic id="graph07" xlink:href="jemr-11-04-e-figure-07.png"/>
				</fig>
    </sec>

    <sec id="S4c">
      <title>Evaluation of different mapping functions</title>

<p>The performance of different mapping functions was compared on the
DSS, CSS, DSM and CSM RGB video clips. Iris centers were detected with
the optimal parameters <italic>α</italic> = 0.25, <italic>β</italic> =
1.4 and <italic>δ</italic> = 1 or 2. The anchor point was computed by
formula (3), where the unstable facial landmarks around the mouth and
eye areas were removed; therefore, the parameter <italic>n</italic>
equals 36. Then, the eye vectors for the horizontal and vertical
directions were computed from the iris centers and the anchor point.
Head pose was estimated by the iterative method with six landmarks
(points 9, 31, 37, 46, 49 and 55 in Figure 2). Finally, the six mapping
functions listed in Table 2 were used to estimate the gaze from the eye
vectors and the head pose.</p>
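<p>The regression step can be sketched as follows. This is a generic
least-squares fit of a second-order polynomial mapping from calibration
data; the exact terms of mapping functions No.1 to No.6 are those
listed in Table 2 (some include head pose terms), so this sketch only
illustrates the mechanism.</p>

```python
import numpy as np

def design_matrix(ex, ey):
    # Generic second-order terms; the actual mapping functions in
    # Table 2 may use a different subset and add head pose terms.
    return np.column_stack([np.ones_like(ex), ex, ey, ex * ey, ex**2, ey**2])

def fit_mapping(ex, ey, screen_coord):
    """Least-squares regression of one screen coordinate (horizontal
    or vertical) from the calibration eye vectors (ex, ey)."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(ex, ey), screen_coord, rcond=None)
    return coeffs

def apply_mapping(coeffs, ex, ey):
    """Predict the screen coordinate for new eye vectors."""
    return design_matrix(ex, ey) @ coeffs
```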

<p>For each RGB video clip on the EYEDIAP database, the first 1000
frames that the faces could be detected were used as calibration frames,
and the remaining frames were used as testing frames. The gaze tracking
errors for the 13 participants were computed by averaging the results of
the participants’ testing frames. The gaze tracking error of each frame
is computed by the POR of the left eye, the original 3D coordinate of
the eye gaze screen point and the original 3D coordinate of the left
eyeball.</p>
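<p>The per-frame angular error can be computed as the angle between two
rays from the eyeball center: one through the estimated POR and one
through the true on-screen target. A minimal sketch follows; the 3D
coordinates in the comment are hypothetical.</p>

```python
import numpy as np

def angular_error_deg(eyeball, por_est, target):
    """Angle in degrees between the ray to the estimated POR and the
    ray to the true gaze target, both taken from the eyeball center."""
    v1 = np.asarray(por_est, dtype=float) - np.asarray(eyeball, dtype=float)
    v2 = np.asarray(target, dtype=float) - np.asarray(eyeball, dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# e.g. a screen 600 mm in front of the eye, estimate slightly off-target
err = angular_error_deg((0, 0, 0), (0, 52.5, 600), (0, 0, 600))
```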

<p>The average gaze tracking errors of the 52 RGB video clips that were
used to evaluate the optimal mapping functions are shown in Table 3.
Overall, the mapping functions No.4 and No.2 achieved the best results
in the horizontal and vertical directions, respectively. In addition,
the gaze tracking errors show that <italic>δ</italic> = 2 performs
better than <italic>δ</italic> = 1. Therefore, the following
experimental results that involve iris center localization were
obtained with <italic>δ</italic> = 2. Meanwhile, the horizontal and
vertical gaze tracking errors in the following experiments are
regressed by the mapping functions No.4 and No.2, respectively.</p>

<table-wrap id="t03" position="float">
					<label>Table 3.</label>
					<caption>
						<p>The average gaze tracking errors (degrees) of 52 RGB
video clips on the EYEDIAP database computed by six mapping
functions.</p>
					</caption>
					<table frame="hsides" rules="groups" cellpadding="3">

    <thead>
      <tr>
        <th>No.</th>
        <th colspan="3"><italic>δ</italic> = 1</th>
        <th colspan="3"><italic>δ</italic> = 2</th>
      </tr>
      <tr>
        <td></td>
        <td>H</td>
        <td>V</td>
        <td>C</td>
        <td>H</td>
        <td>V</td>
        <td>C</td>
      </tr>
     </thead>
    <tbody>     
      <tr>
        <td>1</td>
        <td>6.4</td>
        <td>4.0</td>
        <td>7.5</td>
        <td>6.3</td>
        <td>3.9</td>
        <td>7.4</td>
      </tr>
      <tr>
        <td>2</td>
        <td>6.4</td>
        <td>3.9</td>
        <td>7.5</td>
        <td>6.2</td>
        <td><bold>3.8</bold></td>
        <td>7.3</td>
      </tr>
      <tr>
        <td>3</td>
        <td>6.0</td>
        <td>4.1</td>
        <td>7.3</td>
        <td>5.8</td>
        <td>4.1</td>
        <td>7.1</td>
      </tr>
      <tr>
        <td>4</td>
        <td>5.8</td>
        <td>4.0</td>
        <td>7.0</td>
        <td><bold>5.7</bold></td>
        <td>4.0</td>
        <td>7.0</td>
      </tr>
      <tr>
        <td>5</td>
        <td>6.2</td>
        <td>4.6</td>
        <td>7.7</td>
        <td>6.1</td>
        <td>4.6</td>
        <td>7.6</td>
      </tr>
      <tr>
        <td>6</td>
        <td>6.1</td>
        <td>4.5</td>
        <td>7.6</td>
        <td>5.9</td>
        <td>4.4</td>
        <td>7.4</td>
      </tr>
    </tbody>
  </table>
					<table-wrap-foot>
						<fn id="FN3">
						<p>Note: No. denotes the mapping functions in Table 2. H, V and C are
the horizontal, vertical and combined gaze tracking errors,
respectively. C is the square root of the sum of the squares of H and
V. The minimum errors are marked in bold.</p>
						</fn>
					</table-wrap-foot>  
</table-wrap>
    </sec>

    <sec id="S4d">
      <title>Evaluation of the weight coefficient <italic>w</italic></title>

<p>In this paper, the weight coefficient <italic>w</italic> ∈ [0,1],
sampled in steps of 0.1, was used to compute the gaze tracking errors
on the EYEDIAP database. As shown in Figure 8, <italic>w</italic> = 0.5,
0.6 and 0.5 yield the lowest horizontal, vertical and combined gaze
tracking errors, respectively. For simplicity, <italic>w</italic> = 0.5
was used in this paper.</p>
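<p>The weighted combination itself is a convex blend of the two
monocular PORs, as in the minimal sketch below; the normalized screen
coordinates in the example are hypothetical.</p>

```python
def combine_por(por_left, por_right, w=0.5):
    """Convex combination of the two eyes' points of regard:
    w = 1 uses the left eye only, w = 0 the right eye only."""
    return tuple(w * l + (1.0 - w) * r for l, r in zip(por_left, por_right))

# Hypothetical monocular PORs on a normalized screen:
por = combine_por((0.20, 0.40), (0.40, 0.60), w=0.5)  # (0.30, 0.50)
```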

<fig id="fig08" fig-type="figure" position="float">
					<label>Figure 8.</label>
					<caption>
						<p>The H, V and C gaze tracking errors computed with
different values of <italic>w</italic> on the EYEDIAP database.</p>
					</caption>
					<graphic id="graph08" xlink:href="jemr-11-04-e-figure-08.png"/>
				</fig>

<p>To show the improvement provided by <italic>w</italic> in more
detail for the 13 participants of the EYEDIAP database, the horizontal,
vertical and combined gaze tracking errors on the DSS and CSS RGB video
clips are shown in Figure 9, and the results on the DSM and CSM RGB
video clips are shown in Table 4. L (<italic>w</italic> = 1), R
(<italic>w</italic> = 0) and L+R (<italic>w</italic> = 0.5) denote the
gaze tracking errors computed from the PORs of the left eye, the right
eye and their weighted combination, respectively, using the provided 3D
coordinates of the gaze target on the screen and of the left and right
eyeballs. Meanwhile, the Total Frames (TF) and the Detected Frames
(DF), i.e., the frames in which the face could be detected, are
presented in the table. Compared with the average face detection rate
(DF/TF) of 86.9% in (<xref ref-type="bibr" rid="b10">10</xref>) on the CSM RGB
video clips, an average of 97.2% was obtained in this paper.
Considering the low quality frames and large head pose variations in
the video clips, we believe the face detection rate is robust. As shown
in Figure 9 and Table 4, most of the single eye gaze tracking errors
are improved by averaging the PORs of both eyes. The method achieved
average combined gaze tracking errors of 5.5°, 4.6°, 7.2° and 7.6° on
the DSS, CSS, DSM and CSM RGB video clips, respectively. These errors
are higher than the 2.9° obtained under natural light in (<xref ref-type="bibr" rid="b17">17</xref>)
because the EYEDIAP database has lower quality eye images (iris radius
≈ 4.5 pixels) than the self-built database of (<xref ref-type="bibr" rid="b17">17</xref>) (iris radius
≈ 9 pixels).</p>

<fig id="fig09" fig-type="figure" position="float">
					<label>Figure 9.</label>
					<caption>
						<p>The H, V and C gaze tracking errors on the EYEDIAP
(a) DSS and (b) CSS RGB video clips.</p>
					</caption>
					<graphic id="graph09" xlink:href="jemr-11-04-e-figure-09.png"/>
				</fig>

<table-wrap id="t04" position="float">
					<label>Table 4.</label>
					<caption>
						<p>The gaze tracking errors (degrees) on the EYEDIAP
DSM and CSM RGB video clips, where No. denotes the participant number
in the EYEDIAP database.</p>
					</caption>
					<table frame="hsides" rules="groups" cellpadding="3">

    <thead>
      <tr>
        <th></th>
        <th colspan="12">DSM</th>
      </tr>

      <tr>
        <td></td>
        <td></td>
        <td></td>
        <td></td>
        <td colspan="3">H</td>
        <td colspan="3">V</td>
        <td colspan="3">C</td>
      </tr>
      <tr>
        <td>No.</td>
        <td>TF</td>
        <td>DF</td>
        <td>DF/TF</td>
        <td>L</td>
        <td>R</td>
        <td>L+R</td>
        <td>L</td>
        <td>R</td>
        <td>L+R</td>
        <td>L</td>
        <td>R</td>
        <td>L+R</td>
      </tr>
    </thead>
    <tbody>      
      <tr>
        <td>1</td>
        <td>4465</td>
        <td>3893</td>
        <td>87.2%</td>
        <td>7.9</td>
        <td>8.0</td>
        <td>7.7</td>
        <td>4.5</td>
        <td>4.5</td>
        <td>4.5</td>
        <td>9.1</td>
        <td>9.2</td>
        <td>8.9</td>
      </tr>
      <tr>
        <td>2</td>
        <td>4464</td>
        <td>4335</td>
        <td>97.1%</td>
        <td>7.6</td>
        <td>6.5</td>
        <td>6.8</td>
        <td>3.7</td>
        <td>3.8</td>
        <td>3.7</td>
        <td>8.4</td>
        <td>7.6</td>
        <td>7.7</td>
      </tr>
      <tr>
        <td>3</td>
        <td>4433</td>
        <td>4322</td>
        <td>97.5%</td>
        <td>5.9</td>
        <td>6.9</td>
        <td>5.9</td>
        <td>5.7</td>
        <td>5.5</td>
        <td>5.3</td>
        <td>8.2</td>
        <td>8.8</td>
        <td>7.9</td>
      </tr>
      <tr>
        <td>4</td>
        <td>4464</td>
        <td>4402</td>
        <td>98.6%</td>
        <td>7.7</td>
        <td>7.2</td>
        <td>5.7</td>
        <td>5.0</td>
        <td>4.8</td>
        <td>4.5</td>
        <td>9.2</td>
        <td>8.7</td>
        <td>7.3</td>
      </tr>
      <tr>
        <td>5</td>
        <td>4465</td>
        <td>4465</td>
        <td>100%</td>
        <td>5.9</td>
        <td>5.2</td>
        <td>4.5</td>
        <td>4.5</td>
        <td>4.1</td>
        <td>4.0</td>
        <td>7.4</td>
        <td>6.7</td>
        <td>6.0</td>
      </tr>
      <tr>
        <td>6</td>
        <td>4464</td>
        <td>4464</td>
        <td>100%</td>
        <td>5.3</td>
        <td>5.5</td>
        <td>5.1</td>
        <td>4.6</td>
        <td>4.6</td>
        <td>4.5</td>
        <td>7.0</td>
        <td>7.2</td>
        <td>6.8</td>
      </tr>
      <tr>
        <td>8</td>
        <td>4465</td>
        <td>4020</td>
        <td>90.0%</td>
        <td>9.2</td>
        <td>10.1</td>
        <td>8.7</td>
        <td>4.8</td>
        <td>4.5</td>
        <td>4.4</td>
        <td>10.4</td>
        <td>11.0</td>
        <td>9.8</td>
      </tr>
      <tr>
        <td>9</td>
        <td>4464</td>
        <td>4362</td>
        <td>97.7%</td>
        <td>10.1</td>
        <td>6.8</td>
        <td>7.5</td>
        <td>3.7</td>
        <td>3.7</td>
        <td>3.6</td>
        <td>10.8</td>
        <td>7.8</td>
        <td>8.3</td>
      </tr>
      <tr>
        <td>10</td>
        <td>4464</td>
        <td>4450</td>
        <td>99.7%</td>
        <td>9.1</td>
        <td>9.5</td>
        <td>9.2</td>
        <td>4.5</td>
        <td>4.8</td>
        <td>4.6</td>
        <td>10.2</td>
        <td>10.7</td>
        <td>10.3</td>
      </tr>
      <tr>
        <td>11</td>
        <td>4465</td>
        <td>4465</td>
        <td>100%</td>
        <td>4.7</td>
        <td>3.7</td>
        <td>3.5</td>
        <td>4.5</td>
        <td>6.4</td>
        <td>3.5</td>
        <td>6.5</td>
        <td>7.4</td>
        <td>4.9</td>
      </tr>
      <tr>
        <td>14</td>
        <td>4465</td>
        <td>4464</td>
        <td>100%</td>
        <td>4.7</td>
        <td>4.3</td>
        <td>3.6</td>
        <td>3.7</td>
        <td>3.6</td>
        <td>3.3</td>
        <td>6.0</td>
        <td>5.5</td>
        <td>4.9</td>
      </tr>
      <tr>
        <td>15</td>
        <td>4465</td>
        <td>4465</td>
        <td>100%</td>
        <td>3.8</td>
        <td>3.4</td>
        <td>3.2</td>
        <td>4.1</td>
        <td>4.2</td>
        <td>4.2</td>
        <td>5.6</td>
        <td>5.4</td>
        <td>5.3</td>
      </tr>
      <tr>
        <td>16</td>
        <td>4465</td>
        <td>4286</td>
        <td>96.0%</td>
        <td>6.1</td>
        <td>6.6</td>
        <td>4.9</td>
        <td>4.0</td>
        <td>4.1</td>
        <td>3.9</td>
        <td>7.3</td>
        <td>7.8</td>
        <td>6.2</td>
      </tr>
      <tr>
        <td>Avg.</td>
        <td>4462</td>
        <td>4338</td>
        <td>97.2%</td>
        <td>6.8</td>
        <td>6.5</td>
        <td>5.9</td>
        <td>4.4</td>
        <td>4.5</td>
        <td>4.2</td>
        <td>8.1</td>
        <td>7.9</td>
        <td>7.2</td>
      </tr>
      <tr>
        <td></td>
        <td colspan="12">CSM</td>

      </tr>
      <tr>
        <td>1</td>
        <td>4457</td>
        <td>3370</td>
        <td>75.6%</td>
        <td>9.7</td>
        <td>11.2</td>
        <td>10.3</td>
        <td>4.0</td>
        <td>4.1</td>
        <td>4.0</td>
        <td>10.5</td>
        <td>12.0</td>
        <td>11.0</td>
      </tr>
      <tr>
        <td>2</td>
        <td>4457</td>
        <td>4360</td>
        <td>97.8%</td>
        <td>7.9</td>
        <td>7.6</td>
        <td>7.6</td>
        <td>3.2</td>
        <td>3.2</td>
        <td>3.1</td>
        <td>8.5</td>
        <td>8.2</td>
        <td>8.2</td>
      </tr>
      <tr>
        <td>3</td>
        <td>4458</td>
        <td>3962</td>
        <td>88.9%</td>
        <td>6.0</td>
        <td>7.0</td>
        <td>5.9</td>
        <td>4.1</td>
        <td>3.8</td>
        <td>4.0</td>
        <td>7.3</td>
        <td>8.0</td>
        <td>7.1</td>
      </tr>
      <tr>
        <td>4</td>
        <td>4494</td>
        <td>4333</td>
        <td>96.4%</td>
        <td>8.1</td>
        <td>7.0</td>
        <td>6.7</td>
        <td>3.7</td>
        <td>3.8</td>
        <td>3.7</td>
        <td>8.9</td>
        <td>8.0</td>
        <td>7.7</td>
      </tr>
      <tr>
        <td>5</td>
        <td>4458</td>
        <td>4394</td>
        <td>98.6%</td>
        <td>5.3</td>
        <td>6.1</td>
        <td>5.1</td>
        <td>3.8</td>
        <td>3.6</td>
        <td>3.7</td>
        <td>6.5</td>
        <td>7.1</td>
        <td>6.3</td>
      </tr>
      <tr>
        <td>6</td>
        <td>4458</td>
        <td>4458</td>
        <td>100%</td>
        <td>7.7</td>
        <td>8.4</td>
        <td>7.6</td>
        <td>4.4</td>
        <td>4.5</td>
        <td>4.1</td>
        <td>8.8</td>
        <td>9.6</td>
        <td>8.6</td>
      </tr>
      <tr>
        <td>8</td>
        <td>4458</td>
        <td>3510</td>
        <td>78.7%</td>
        <td>10.6</td>
        <td>9.3</td>
        <td>9.6</td>
        <td>4.1</td>
        <td>4.5</td>
        <td>4.3</td>
        <td>11.4</td>
        <td>10.3</td>
        <td>10.5</td>
      </tr>
      <tr>
        <td>9</td>
        <td>4457</td>
        <td>4199</td>
        <td>94.2%</td>
        <td>7.4</td>
        <td>7.2</td>
        <td>7.2</td>
        <td>4.0</td>
        <td>3.8</td>
        <td>3.8</td>
        <td>8.4</td>
        <td>8.1</td>
        <td>8.1</td>
      </tr>
      <tr>
        <td>10</td>
        <td>4492</td>
        <td>4492</td>
        <td>100%</td>
        <td>6.5</td>
        <td>7.3</td>
        <td>6.6</td>
        <td>5.0</td>
        <td>4.9</td>
        <td>4.9</td>
        <td>8.2</td>
        <td>8.8</td>
        <td>8.3</td>
      </tr>
      <tr>
        <td>11</td>
        <td>4458</td>
        <td>4360</td>
        <td>97.8%</td>
        <td>6.0</td>
        <td>6.5</td>
        <td>6.2</td>
        <td>3.6</td>
        <td>3.6</td>
        <td>3.6</td>
        <td>7.0</td>
        <td>7.4</td>
        <td>7.1</td>
      </tr>
      <tr>
        <td>14</td>
        <td>4458</td>
        <td>4439</td>
        <td>99.6%</td>
        <td>4.1</td>
        <td>3.3</td>
        <td>3.4</td>
        <td>3.1</td>
        <td>4.1</td>
        <td>3.4</td>
        <td>5.2</td>
        <td>5.2</td>
        <td>4.9</td>
      </tr>
      <tr>
        <td>15</td>
        <td>4458</td>
        <td>4458</td>
        <td>100%</td>
        <td>3.6</td>
        <td>3.5</td>
        <td>2.9</td>
        <td>3.2</td>
        <td>3.2</td>
        <td>3.1</td>
        <td>4.8</td>
        <td>4.8</td>
        <td>4.2</td>
      </tr>
      <tr>
        <td>16</td>
        <td>4458</td>
        <td>4293</td>
        <td>96.3%</td>
        <td>5.5</td>
        <td>9.7</td>
        <td>5.5</td>
        <td>4.2</td>
        <td>5.8</td>
        <td>3.8</td>
        <td>6.9</td>
        <td>11.3</td>
        <td>6.6</td>
      </tr>
      <tr>
        <td>Avg.</td>
        <td>4463</td>
        <td>4202</td>
        <td>94.2%</td>
        <td>6.8</td>
        <td>7.2</td>
        <td>6.5</td>
        <td>3.9</td>
        <td>4.1</td>
        <td>3.8</td>
        <td>7.9</td>
        <td>8.4</td>
        <td>7.6</td>
      </tr>
    </tbody>
  </table>
</table-wrap>

    </sec>

    <sec id="S4e">
      <title>Computational cost</title>

<p>The method was implemented in C++ with Microsoft Visual Studio 2017,
OpenCV and the dlib (<xref ref-type="bibr" rid="b38">38</xref>) library on a laptop with a
2.7-GHz Intel(R) Core(TM) i7-7500 processor and 8-GB RAM. Data from the
EYEDIAP database and the laptop camera were used to measure the
execution time, which was computed by averaging the processing time over
all testing frames. The execution times of the proposed method are shown
in Table 5. Note that facial landmark detection includes both face
detection and landmark detection, and the experimental results show
that face detection consumes most of the processing time. Therefore,
frames at the original resolution (640×480) were downscaled to improve
the efficiency of face detection. Facial landmarks are then tracked on
the raw frames, using the rectangular face ROI detected in the
downscaled frames. Unfortunately, the face detection rate on the
EYEDIAP database decreased when the resolution was lower than 512×380
because the RGB video clips contain small faces. Therefore, the
execution speed is 22 fps for the EYEDIAP database. The laptop camera
configuration is closer to a practical system and achieved an execution
speed of 35 fps. Compared to IR trackers with execution speeds in
excess of 100 fps, natural light trackers still have a long way to go
before they are usable in practice.</p>
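<p>The downscale-then-rescale strategy can be sketched as follows. The
helpers below only illustrate the assumed bookkeeping (the actual face
detector is supplied by OpenCV/dlib): a rectangle found in the
downscaled frame is mapped back to raw-frame coordinates for landmark
tracking.</p>

```python
def downscale_size(frame_w, frame_h, target_w):
    """Size of the smaller frame passed to the face detector,
    plus the scale factor applied to the raw frame."""
    scale = target_w / frame_w
    return int(round(frame_w * scale)), int(round(frame_h * scale)), scale

def upscale_roi(roi, scale):
    """Map a face rectangle (x, y, w, h) detected in the downscaled
    frame back to full-resolution coordinates for landmark tracking."""
    return tuple(int(round(v / scale)) for v in roi)

# e.g. a 640x480 camera frame downscaled to width 320 (scale 0.5):
w, h, s = downscale_size(640, 480, 320)          # (320, 240, 0.5)
face_full = upscale_roi((80, 60, 128, 128), s)   # (160, 120, 256, 256)
```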

<table-wrap id="t05" position="float">
					<label>Table 5.</label>
					<caption>
						<p>The execution time of the gaze tracking
system.</p>
					</caption>
					<table frame="hsides" rules="groups" cellpadding="3">

    <thead>
      <tr>
        <th></th>
        <th></th>
        <th colspan="2">Execution time (milliseconds)</th>
        <th></th>

      </tr>

      <tr>
        <td>Data</td>
        <td>Resolution</td>
        <td>Facial landmarks detection</td>
        <td>Gaze tracking</td>
        <td>fps</td>
      </tr>
    </thead>
    <tbody>      
      <tr>
        <td>EYEDIAP</td>
        <td>512×380</td>
        <td>44.5</td>
        <td>0.7</td>
        <td>22</td>
      </tr>
      <tr>
        <td>Camera</td>
        <td>320×240</td>
        <td>27.4</td>
        <td>1.2</td>
        <td>35</td>
      </tr>
    </tbody>
  </table>
</table-wrap>

    </sec>
    </sec>

    <sec id="S5">
      <title>Discussion</title>

<p>This paper aims to provide a gaze tracking system with a single
camera under natural light to extend its generality. Accurate iris
centers and a stable anchor point yield more applicable eye vectors.
Using the eye vectors and the estimated head pose, second-order
polynomial mapping functions compute the POR in the horizontal and
vertical directions on the screen. By applying a weight coefficient to
the PORs of the left and right eyes, the final gaze errors are further
reduced. The iris center localization method has been shown to be
accurate on the GI4E database, which consists of low resolution images
of 103 participants under realistic conditions. With a normalized error
<italic>e</italic> ≤ 0.05, the iris center localization achieves an
error rate as low as 1.13%.</p>

<p>Compared with the accuracy of 93.92% in (<xref ref-type="bibr" rid="b35">35</xref>), the proposed iris
center localization method achieves a more accurate result of 98.87%
for the feature position of the iris center, and it also outperforms
all previous iris center localization methods on the same database.
Compared with the average combined gaze tracking errors of 7.2° and 8.9°
on the EYEDIAP CSS and CSM RGB video clips in (<xref ref-type="bibr" rid="b9">9</xref>), the proposed gaze
tracking method reduced the errors by 36% and 14.6%, respectively.
Compared with the average gaze tracking errors of 7.6° and 6.7° in the
horizontal and vertical directions on the EYEDIAP CSM RGB video clips
in (<xref ref-type="bibr" rid="b10">10</xref>), the proposed method reduced the errors by 1.1° and 2.9°,
respectively. Furthermore, both RGB and RGB-D video clips were used as
inputs in (<xref ref-type="bibr" rid="b9 b10">9, 10</xref>).</p>

<p>The gaze tracking errors are significantly better than those of the
appearance-based and model-based methods, indicating the effectiveness
of the regression-based gaze tracking method on low quality images.
However, limited by the random gaze trajectories/points on the screen
from which the EYEDIAP database was built, 1000 detected frames from
the RGB video clips are used in the calibration phase, which is
equivalent to using approximately 34 seconds of the 2 to 3 minutes of
each RGB video clip. In a practical application system, the calibration
time could be reduced by the calibration strategies summarized in
(<xref ref-type="bibr" rid="b17">17</xref>), and the gaze tracking errors could be reduced by the
post-calibration regression in (<xref ref-type="bibr" rid="b39">39</xref>). In addition, considering the
average gaze tracking errors shown in Table 3, introducing the head
pose into the mapping functions does not improve the accuracy in the
vertical direction but reduces the errors in the horizontal direction.
The reason is that the eye vectors derived from the iris centers and
the anchor point already contain some information about the head
pose.</p>

<p>Although the algorithm in (<xref ref-type="bibr" rid="b23">23</xref>) is robust and accurate, facial
landmarks still cannot be tracked in some images, especially in the CSM
and DSM RGB video clips. Hence, in future work, the facial landmark
algorithm should be improved for low quality images with large head
movements. The results in Figure 9 and Table 4 show that most of the
single eye gaze tracking errors are improved by averaging the PORs of
both eyes. However, when one eye is occluded due to a large head
movement, the <italic>w</italic> of the occluded eye should be
decreased or set to 0. Meanwhile, <italic>w</italic> may be affected by
the dominant eye, which changes with the direction of the gaze (<xref ref-type="bibr" rid="b32">32</xref>).
Therefore, in the future, a dedicated database with large head pose
variations and various gaze directions could be built to study the
choice of <italic>w</italic>. In addition, the mapping functions are
regressed from person-specific eye vectors, which results in a
person-dependent gaze tracking system. A person-independent gaze
tracking system could be explored by normalizing different people’s eye
vectors in a feature space.</p>

<p>A gaze tracking method with a non-intrusive sensor under natural
light renders the system suitable for universal use on smartphones,
laptops or tablets with a camera. The system, with an accuracy of
approximately 6°, can be used for secure biometric authentication
(<xref ref-type="bibr" rid="b40">40</xref>) and gaze-based password entry that reduces shoulder-surfing
(<xref ref-type="bibr" rid="b41">41</xref>). The proposed gaze tracking method further bridges the interaction
gap between humans and machines.</p>
    </sec>

    <sec id="S6" sec-type="COI-statement">
      <title>Ethics and Conflict of Interest</title>

<p>The author(s) declare(s) that the contents of the article are in
agreement with the ethics described in
<ext-link ext-link-type="uri" xlink:href="http://biblio.unibe.ch/portale/elibrary/BOP/jemr/ethics.html" xlink:show="new">http://biblio.unibe.ch/portale/elibrary/BOP/jemr/ethics.html</ext-link>
and that there is no conflict of interest regarding the publication of
this paper.</p>
    </sec>
</body>
<back>
<ref-list>
<ref id="b29"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Blignaut</surname>, <given-names>P.</given-names></name></person-group> (<year>2016</year>). <article-title>Idiosyncratic Feature-Based Gaze Mapping.</article-title> <source>Journal of Eye Movement Research</source>, <volume>9</volume>(<issue>3</issue>), <fpage>1</fpage>. <pub-id pub-id-type="doi" specific-use="author">10.16910/jemr.9.3.2</pub-id><issn>1995-8692</issn></mixed-citation></ref>
<ref id="b39"><mixed-citation publication-type="book-chapter" specific-use="linked"><person-group person-group-type="author"><name><surname>Blignaut</surname> <given-names>P</given-names></name>, <name><surname>Holmqvist</surname> <given-names>K</given-names></name>, <name><surname>Nystr&#246;m</surname> <given-names>M</given-names></name>, <name><surname>Dewhurst</surname> <given-names>R</given-names></name></person-group>. <chapter-title>Improving the accuracy of video-based eye tracking in real time through post-calibration regression.</chapter-title> In: Horsley M, Eliot M, Knight B, Reilly R, editors. Current Trends in Eye Tracking Research. Cham: Springer; <year>2014</year> p. 77-100. doi: <pub-id pub-id-type="doi" specific-use="author">10.1007/978-3-319-02868-2_5</pub-id></mixed-citation></ref>
<ref id="b40"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Boehm</surname> <given-names>A</given-names></name>, <name><surname>Chen</surname> <given-names>D</given-names></name>, <name><surname>Frank</surname> <given-names>M</given-names></name>, <name><surname>Huang</surname> <given-names>L</given-names></name>, <name><surname>Kuo</surname> <given-names>C</given-names></name>, <name><surname>Lolic</surname> <given-names>T</given-names></name>, <name><surname>Martinovic</surname> <given-names>I</given-names></name>, <name><surname>Song</surname>, <given-names>D.</given-names></name></person-group> <article-title>Safe: Secure authentication with face and eyes.</article-title> <source>2013 International Conference on Privacy and Security in Mobile Systems (PRISMS)</source>; <conf-date>2013 Jun 1-8</conf-date>; <conf-loc>Atlantic City, NJ, USA</conf-loc>.<publisher-loc>Piscataway (NJ)</publisher-loc>: <publisher-name>IEEE</publisher-name>; 2013 p. <fpage>1</fpage>-<lpage>8</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/PRISMS.2013.6927175</pub-id></mixed-citation></ref>
<ref id="b30"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Cheung</surname>, <given-names>Y. M.</given-names></name>, &amp; <name><surname>Peng</surname>, <given-names>Q.</given-names></name></person-group> (<year>2015</year>). <article-title>Eye gaze tracking with a web camera in a desktop environment.</article-title> <source>IEEE Transactions on Human-Machine Systems</source>, <volume>45</volume>(<issue>4</issue>), <fpage>419</fpage>&#8211;<lpage>430</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1109/THMS.2015.2400442</pub-id><issn>2168-2291</issn></mixed-citation></ref>
<ref id="b4"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Crane</surname>, <given-names>H. D.</given-names></name>, &amp; <name><surname>Steele</surname>, <given-names>C. M.</given-names></name></person-group> (<year>1985</year>). <article-title>Generation-V dual-Purkinje-image eyetracker.</article-title> <source>Applied Optics</source>, <volume>24</volume>(<issue>4</issue>), <fpage>527</fpage>&#8211;<lpage>537</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1364/AO.24.000527</pub-id><pub-id pub-id-type="pmid">18216982</pub-id><issn>0003-6935</issn></mixed-citation></ref>
<ref id="b34"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Cui</surname>, <given-names>Y.</given-names></name>, &amp; <name><surname>Hondzinski</surname>, <given-names>J. M.</given-names></name></person-group> (<year>2006</year>). <article-title>Gaze tracking accuracy in humans: Two eyes are better than one.</article-title> <source>Neuroscience Letters</source>, <volume>396</volume>(<issue>3</issue>), <fpage>257</fpage>&#8211;<lpage>262</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1016/j.neulet.2005.11.071</pub-id><pub-id pub-id-type="pmid">16423465</pub-id><issn>0304-3940</issn></mixed-citation></ref>
<ref id="b5"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Deubel</surname>, <given-names>H.</given-names></name>, &amp; <name><surname>Schneider</surname>, <given-names>W. X.</given-names></name></person-group> (<year>1996</year>). <article-title>Saccade target selection and object recognition: Evidence for a common attentional mechanism.</article-title> <source>Vision Research</source>, <volume>36</volume>(<issue>12</issue>), <fpage>1827</fpage>&#8211;<lpage>1837</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1016/0042-6989(95)00294-4</pub-id><pub-id pub-id-type="pmid">8759451</pub-id><issn>0042-6989</issn></mixed-citation></ref>
<ref id="b6"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Drewes</surname>, <given-names>J.</given-names></name>, <name><surname>Zhu</surname>, <given-names>W.</given-names></name>, <name><surname>Hu</surname>, <given-names>Y.</given-names></name>, &amp; <name><surname>Hu</surname>, <given-names>X.</given-names></name></person-group> (<year>2014</year>). <article-title>Smaller is better: Drift in gaze measurements due to pupil dynamics.</article-title> <source>PLoS One</source>, <volume>9</volume>(<issue>10</issue>), <fpage>e111197</fpage>. <pub-id pub-id-type="doi" specific-use="author">10.1371/journal.pone.0111197</pub-id><pub-id pub-id-type="pmid">25338168</pub-id><issn>1932-6203</issn></mixed-citation></ref>
<ref id="b8"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>El-Hafi</surname> <given-names>L</given-names></name>, <name><surname>Ding</surname> <given-names>M</given-names></name>, <name><surname>Takamatsu</surname> <given-names>J</given-names></name>, <name><surname>Ogasawara</surname> <given-names>T.</given-names></name></person-group> <article-title>Gaze Tracking and Object Recognition from Eye Images.</article-title> In <source>IEEE International Conference on Robotic Computing (IRC)</source>; <conf-date>2017 Apr 10-12</conf-date>; <conf-loc>Taichung, Taiwan</conf-loc>. <publisher-loc>Piscataway (NJ)</publisher-loc>: <publisher-name>IEEE</publisher-name>; 2017 p. <fpage>310</fpage>-<lpage>315</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/IRC.2017.44</pub-id></mixed-citation></ref>
<ref id="b27"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Fridman</surname>, <given-names>L.</given-names></name>, <name><surname>Langhans</surname>, <given-names>P.</given-names></name>, <name><surname>Lee</surname>, <given-names>J.</given-names></name>, &amp; <name><surname>Reimer</surname>, <given-names>B.</given-names></name></person-group> (<year>2016</year>). <article-title>Driver Gaze Region Estimation without Use of Eye Movement.</article-title> <source>IEEE Intelligent Systems</source>, <volume>31</volume>(<issue>3</issue>), <fpage>49</fpage>&#8211;<lpage>56</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1109/MIS.2016.47</pub-id><issn>1541-1672</issn></mixed-citation></ref>
<ref id="b9"><mixed-citation publication-type="web-page" specific-use="unparsed"><person-group person-group-type="author"><name><surname>Ghiass</surname> <given-names>RS</given-names></name>, <name><surname>Arandjelovic</surname> <given-names>O</given-names></name></person-group>. <article-title>Highly accurate gaze estimation using a consumer RGB-D sensor.</article-title> In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence; <year>2016</year> <month>July</month> <day>09-15</day>; New York, USA. CA: AAAI Press; 2016 p. 3368-3374. URL: <ext-link ext-link-type="uri" xlink:href="https://dl.acm.org/citation.cfm?id=3061092">https://dl.acm.org/citation.cfm?id=3061092</ext-link></mixed-citation></ref>
<ref id="b14"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Hansen</surname>, <given-names>D. W.</given-names></name>, &amp; <name><surname>Ji</surname>, <given-names>Q.</given-names></name></person-group> (<year>2010</year>). <article-title>In the eye of the beholder: A survey of models for eyes and gaze.</article-title> <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, <volume>32</volume>(<issue>3</issue>), <fpage>478</fpage>&#8211;<lpage>500</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1109/TPAMI.2009.30</pub-id><pub-id pub-id-type="pmid">20075473</pub-id><issn>0162-8828</issn></mixed-citation></ref>
<ref id="b3"><mixed-citation publication-type="book" specific-use="restruct"><person-group person-group-type="author"><name><surname>Holmqvist</surname>, <given-names>K.</given-names></name>, &amp; <name><surname>Andersson</surname>, <given-names>R.</given-names></name></person-group> (<year>2017</year>). <source>Eye tracking: A comprehensive guide to methods, paradigms, and measures</source> (<edition>2nd ed.</edition>). <publisher-loc>Lund, Sweden</publisher-loc>: <publisher-name>Lund Eye-Tracking Research Institute</publisher-name>.</mixed-citation></ref>
<ref id="b2"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Huck</surname>, <given-names>A.</given-names></name>, <name><surname>Thompson</surname>, <given-names>R. L.</given-names></name>, <name><surname>Cruice</surname>, <given-names>M.</given-names></name>, &amp; <name><surname>Marshall</surname>, <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Effects of word frequency and contextual predictability on sentence reading in aphasia: An eye movement analysis.</article-title> <source>Aphasiology</source>, <volume>31</volume>(<issue>11</issue>), <fpage>1307</fpage>&#8211;<lpage>1332</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1080/02687038.2017.1278741</pub-id><issn>0268-7038</issn></mixed-citation></ref>
<ref id="b36"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Jesorsky</surname> <given-names>O</given-names></name>, <name><surname>Kirchberg</surname> <given-names>KJ</given-names></name>, <name><surname>Frischholz</surname> <given-names>RW</given-names></name></person-group>. <article-title>Robust face detection using the hausdorff distance.</article-title> In: <person-group person-group-type="editor"><name><surname>Bigun</surname> <given-names>J.</given-names></name>, <name><surname>Smeraldi</surname> <given-names>F</given-names></name><role>, editors</role></person-group>. <source>International Conference on Audio-and Video-Based Biometric Person Authentication</source>; <conf-date>2001 June 6-8</conf-date>; <conf-loc>Halmstad, Sweden</conf-loc>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>; <year>2001</year>. p. <fpage>90</fpage>-<lpage>95</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1007/3-540-45344-X_14</pub-id></mixed-citation></ref>
<ref id="b24"><mixed-citation publication-type="book" specific-use="restruct"><person-group person-group-type="author"><name><surname>Kaehler</surname>, <given-names>A.</given-names></name>, &amp; <name><surname>Bradski</surname>, <given-names>G.</given-names></name></person-group> (<year>2016</year>). <source>Learning OpenCV 3: computer vision in C++ with the OpenCV library</source>. <publisher-loc>CA</publisher-loc>: <publisher-name>O'Reilly Media, Inc.</publisher-name></mixed-citation></ref>
<ref id="b23"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Kazemi</surname> <given-names>V</given-names></name>, <name><surname>Sullivan</surname> <given-names>J</given-names></name></person-group>. <article-title>One millisecond face alignment with an ensemble of regression trees.</article-title> In <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>; <conf-date>2014 Jun 24-27</conf-date>; <conf-loc>Columbus, OH, USA</conf-loc>. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>; <year>2014</year>. p. <fpage>1867</fpage>-<lpage>1874</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/CVPR.2014.241</pub-id></mixed-citation></ref>
<ref id="b28"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Kennedy</surname> <given-names>J</given-names></name>, <name><surname>Baxter</surname> <given-names>P</given-names></name>, <name><surname>Belpaeme</surname> <given-names>T</given-names></name></person-group>. <article-title>Head pose estimation is an inadequate replacement for eye gaze in child-robot interaction.</article-title> In <source>Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction Extended Abstracts</source>; <conf-date>2015 Mar 02-05</conf-date>; <conf-loc>Portland, Oregon, USA</conf-loc>. <publisher-loc>New York</publisher-loc>: <publisher-name>ACM</publisher-name>; 2015 p. <fpage>35</fpage>-<lpage>36</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/2701973.2701988</pub-id></mixed-citation></ref>
<ref id="b25"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>KN</given-names></name>, <name><surname>Ramakrishna</surname> <given-names>RS</given-names></name></person-group>. <article-title>Vision-based eye-gaze tracking for human computer interface.</article-title> IEEE SMC'99 Conference on Systems, Man, and Cybernetics Proceedings; <year>1999</year> <month>Oct</month> <day>12-15</day>; Tokyo, Japan. Piscataway (NJ): IEEE; 1999 p. 324-329. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/ICSMC.1999.825279</pub-id></mixed-citation></ref>
<ref id="b38"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>King</surname>, <given-names>D. E.</given-names></name></person-group> (<year>2009</year>). <article-title>Dlib-ml: A machine learning toolkit.</article-title> <source>Journal of Machine Learning Research</source>, <volume>10</volume>, <fpage>1755</fpage>&#8211;<lpage>1758</lpage>.<issn>1532-4435</issn></mixed-citation></ref>
<ref id="b22"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Krafka</surname> <given-names>K</given-names></name>, <name><surname>Khosla</surname> <given-names>A</given-names></name>, <name><surname>Kellnhofer</surname> <given-names>P</given-names></name>, <name><surname>Kannan</surname> <given-names>H</given-names></name>, <name><surname>Bhandarkar</surname> <given-names>S</given-names></name>, <name><surname>Matusik</surname> <given-names>W</given-names></name>, <name><surname>Torralba</surname> <given-names>A</given-names></name></person-group>. <article-title>Eye tracking for everyone.</article-title> In <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>; <conf-date>2016 Jun 26-Jul 1</conf-date>; <conf-loc>Las Vegas, NV, USA</conf-loc>. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>; 2016 p. <fpage>2176</fpage>-<lpage>2184</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/CVPR.2016.239</pub-id></mixed-citation></ref>
<ref id="b41"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Kumar</surname> <given-names>M</given-names></name>, <name><surname>Garfinkel</surname> <given-names>T</given-names></name>, <name><surname>Boneh</surname> <given-names>D</given-names></name>, <name><surname>Winograd</surname> <given-names>T</given-names></name></person-group>. <article-title>Reducing shoulder-surfing by using gaze-based password entry.</article-title> In Proceedings of the 3rd Symposium on Usable Privacy and Security; <year>2007</year> <month>Jul</month> <day>18-20</day>; Pittsburgh, Pennsylvania, USA. New York: ACM; 2007 p. 13-19. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/1280680.1280683</pub-id></mixed-citation></ref>
<ref id="b1"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Lahey</surname>, <given-names>J. N.</given-names></name>, &amp; <name><surname>Oxley</surname>, <given-names>D.</given-names></name></person-group> (<year>2016</year>). <article-title>The power of eye tracking in economics experiments.</article-title> <source>The American Economic Review</source>, <volume>106</volume>(<issue>5</issue>), <fpage>309</fpage>&#8211;<lpage>313</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1257/aer.p20161009</pub-id><issn>0002-8282</issn></mixed-citation></ref>
<ref id="b10"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Li</surname>, <given-names>J.</given-names></name>, &amp; <name><surname>Li</surname>, <given-names>S.</given-names></name></person-group> (<year>2016</year>). <article-title>Gaze estimation from color image based on the eye model with known head pose.</article-title> <source>IEEE Transactions on Human-Machine Systems</source>, <volume>46</volume>(<issue>3</issue>), <fpage>414</fpage>&#8211;<lpage>423</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1109/THMS.2015.2477507</pub-id><issn>2168-2291</issn></mixed-citation></ref>
<ref id="b31"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Miles</surname>, <given-names>W. R.</given-names></name></person-group> (<year>1930</year>). <article-title>Ocular dominance in human adults.</article-title> <source>The Journal of General Psychology</source>, <volume>3</volume>(<issue>3</issue>), <fpage>412</fpage>&#8211;<lpage>430</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1080/00221309.1930.9918218</pub-id><issn>0022-1309</issn></mixed-citation></ref>
<ref id="b18"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Mora</surname> <given-names>KAF</given-names></name>, <name><surname>Monay</surname> <given-names>F</given-names></name>, <name><surname>Odobez</surname> <given-names>JM</given-names></name></person-group>. <article-title>Eyediap: A database for the development and evaluation of gaze estimation algorithms from rgb and rgb-d cameras.</article-title> In <source>Proceedings of the Symposium on Eye Tracking Research and Applications</source>; <conf-date>2014 Mar 26-28</conf-date>; <conf-loc>Safety Harbor, Florida</conf-loc>. <publisher-loc>New York</publisher-loc>: <publisher-name>ACM</publisher-name>; 2014 p. <fpage>255</fpage>-<lpage>258</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/2578153.2578190</pub-id></mixed-citation></ref>
<ref id="b21"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Mora</surname> <given-names>KAF</given-names></name>, <name><surname>Odobez</surname> <given-names>JM</given-names></name></person-group>. <article-title>Gaze estimation from multimodal kinect data.</article-title> IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); <year>2012</year> <month>Jun</month> <day>16-21</day>; Providence, RI, USA. Washington, DC: IEEE Computer Society; 2012 p. 25-30. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/CVPRW.2012.6239182</pub-id></mixed-citation></ref>
<ref id="b26"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Newman</surname> <given-names>R</given-names></name>, <name><surname>Matsumoto</surname> <given-names>Y</given-names></name>, <name><surname>Rougeaux</surname> <given-names>S</given-names></name>, <name><surname>Zelinsky</surname> <given-names>A</given-names></name></person-group>. <article-title>Real-time stereo tracking for head pose and gaze estimation.</article-title> <source>Fourth IEEE International Conference on Automatic Face and Gesture Recognition</source>; <conf-date>2000 Mar 28-30</conf-date>; <conf-loc>Grenoble, France</conf-loc>. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>; 2000 p. <fpage>122</fpage>-<lpage>128</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/AFGR.2000.840622</pub-id></mixed-citation></ref>
<ref id="b32"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Nystr&#246;m</surname>, <given-names>M.</given-names></name>, <name><surname>Andersson</surname>, <given-names>R.</given-names></name>, <name><surname>Holmqvist</surname>, <given-names>K.</given-names></name>, &amp; <name><surname>van de Weijer</surname>, <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>The influence of calibration method and eye physiology on eyetracking data quality.</article-title> <source>Behavior Research Methods</source>, <volume>45</volume>(<issue>1</issue>), <fpage>272</fpage>&#8211;<lpage>288</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.3758/s13428-012-0247-4</pub-id><pub-id pub-id-type="pmid">22956394</pub-id><issn>1554-351X</issn></mixed-citation></ref>
<ref id="b7"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Pan</surname> <given-names>Y</given-names></name>, <name><surname>Steed</surname> <given-names>A.</given-names></name></person-group> <article-title>A gaze-preserving situated multiview telepresence system.</article-title> In <source>Proceedings of the 32nd annual ACM conference on Human factors in computing systems</source>; <conf-date>2014 Apr 26-May 01</conf-date>; <conf-loc>Toronto, Ontario, Canada</conf-loc>. <publisher-loc>New York</publisher-loc>: <publisher-name>ACM</publisher-name>; 2014 p. <fpage>2173</fpage>-<lpage>2176</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/2556288.2557320</pub-id></mixed-citation></ref>
<ref id="b33"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Quartley</surname>, <given-names>J.</given-names></name>, &amp; <name><surname>Firth</surname>, <given-names>A. Y.</given-names></name></person-group> (<year>2004</year>). <article-title>Binocular sighting ocular dominance changes with different angles of horizontal gaze.</article-title> <source>Binocular Vision &amp; Strabismus Quarterly</source>, <volume>19</volume>(<issue>1</issue>), <fpage>25</fpage>&#8211;<lpage>30</lpage>.<pub-id pub-id-type="pmid">14998366</pub-id><issn>1088-6281</issn></mixed-citation></ref>
<ref id="b11"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Sesma</surname> <given-names>L</given-names></name>, <name><surname>Villanueva</surname> <given-names>A</given-names></name>, <name><surname>Cabeza</surname> <given-names>R</given-names></name></person-group>. <article-title>Evaluation of pupil center-eye corner vector for gaze estimation using a web cam.</article-title> In <source>Proceedings of the symposium on eye tracking research and applications</source>; <conf-date>2012 Mar 28-30</conf-date>; <conf-loc>Santa Barbara, California</conf-loc>. <publisher-loc>New York</publisher-loc>: <publisher-name>ACM</publisher-name>; 2012 p. <fpage>217</fpage>-<lpage>220</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/2168556.2168598</pub-id></mixed-citation></ref>
<ref id="b17"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Skodras</surname>, <given-names>E.</given-names></name>, <name><surname>Kanas</surname>, <given-names>V. G.</given-names></name>, &amp; <name><surname>Fakotakis</surname>, <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>On visual gaze tracking based on a single low cost camera.</article-title> <source>Signal Processing Image Communication</source>, <volume>36</volume>, <fpage>29</fpage>&#8211;<lpage>42</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1016/j.image.2015.05.007</pub-id><issn>0923-5965</issn></mixed-citation></ref>
<ref id="b20"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Sugano</surname> <given-names>Y</given-names></name>, <name><surname>Matsushita</surname> <given-names>Y</given-names></name>, <name><surname>Sato</surname> <given-names>Y.</given-names></name></person-group> <article-title>Learning-by-synthesis for appearance-based 3d gaze estimation.</article-title> In <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>; <conf-date>2014 Jun 24-27</conf-date>; Columbus, OH, USA. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>; <year>2014</year>. p. <fpage>1821</fpage>-<lpage>1828</lpage>. doi: <pub-id pub-id-type="doi" specific-use="author">10.1109/CVPR.2014.235</pub-id></mixed-citation></ref>
<ref id="b13"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Thevenaz</surname>, <given-names>P.</given-names></name>, &amp; <name><surname>Unser</surname>, <given-names>M.</given-names></name></person-group> (<year>2008</year>). <article-title>Snakuscules.</article-title> <source>IEEE Transactions on Image Processing</source>, <volume>17</volume>(<issue>4</issue>), <fpage>585</fpage>&#8211;<lpage>593</lpage>. <pub-id pub-id-type="doi" specific-use="author">10.1109/TIP.2007.914742</pub-id><pub-id pub-id-type="pmid">18390366</pub-id><issn>1057-7149</issn></mixed-citation></ref>
<ref id="b16"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Valenti</surname> <given-names>R</given-names></name>, <name><surname>Staiano</surname> <given-names>J</given-names></name>, <name><surname>Sebe</surname> <given-names>N</given-names></name>, <name><surname>Gevers</surname> <given-names>T</given-names></name></person-group>. <article-title>Webcam-based visual gaze estimation.</article-title> In: Foggia P, Sansone C, Vento M, editors. Image Analysis and Processing&#8211;ICIAP <year>2009</year>; 2009 Sept 8-11; Vietri sul Mare, Italy. Berlin: Springer; 2009. p. 662-671. doi: <pub-id pub-id-type="doi" specific-use="author">10.1007/978-3-642-04146-4_71</pub-id></mixed-citation></ref>
<ref id="b37"><mixed-citation publication-type="book" specific-use="restruct"><person-group person-group-type="author"><name><surname>Timm</surname>, <given-names>F.</given-names></name>, &amp; <name><surname>Barth</surname>, <given-names>E.</given-names></name></person-group> (<year>2011</year>). <source>Accurate Eye Centre Localisation by Means of Gradients. VISAPP 2011 - Proceedings of the Sixth International Conference on Computer Vision Theory and Applications; 2011 Mar 5-7; Vilamoura, Algarve, Portugal</source> (pp. <fpage>125</fpage>–<lpage>130</lpage>). <publisher-loc>Portugal</publisher-loc>: <publisher-name>Inst. for Syst. and Technol. of Inf. Control and Commun</publisher-name>; URL <ext-link ext-link-type="uri" xlink:href="https://pdfs.semanticscholar.org/c931/1a0c5045d86a617bd05a5cc269f44e81508d.pdf">https://pdfs.semanticscholar.org/c931/1a0c5045d86a617bd05a5cc269f44e81508d.pdf</ext-link></mixed-citation></ref>
<ref id="b35"><mixed-citation publication-type="journal" specific-use="restruct"><person-group person-group-type="author"><name><surname>Villanueva</surname>, <given-names>A.</given-names></name>, <name><surname>Ponz</surname>, <given-names>V.</given-names></name>, <name><surname>Sesma-Sanchez</surname>, <given-names>L.</given-names></name>, <name><surname>Ariz</surname>, <given-names>M.</given-names></name>, <name><surname>Porta</surname>, <given-names>S.</given-names></name>, &amp; <name><surname>Cabeza</surname>, <given-names>R.</given-names></name></person-group> (<year>2013</year>). <article-title>Hybrid method based on topography for robust detection of iris center and eye corners.</article-title> <source>ACM Transactions on Multimedia Computing Communications and Applications</source>, <volume>9</volume>(<issue>4</issue>), <fpage>25</fpage>. <pub-id pub-id-type="doi" specific-use="author">10.1145/2501643.2501647</pub-id><issn>1551-6857</issn></mixed-citation></ref>
<ref id="b19"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Weidenbacher</surname> <given-names>U</given-names></name>, <name><surname>Layher</surname> <given-names>G</given-names></name>, <name><surname>Strauss</surname> <given-names>PM</given-names></name>, <name><surname>Neumann</surname> <given-names>H</given-names></name></person-group>. <article-title>A comprehensive head pose and gaze database.</article-title> <source>3rd IET International Conference on Intelligent Environments (IE 07)</source>; <year>2007</year> <month>Sept</month> <day>24-25</day>; Ulm, Germany. United Kingdom: Institution of Engineering and Technology; 2007. p. 455-458. doi: <pub-id pub-id-type="doi" specific-use="author">10.1049/cp:20070407</pub-id></mixed-citation></ref>
<ref id="b15"><mixed-citation publication-type="conference" specific-use="linked"><person-group person-group-type="author"><name><surname>Wood</surname> <given-names>E</given-names></name>, <name><surname>Bulling</surname> <given-names>A.</given-names></name></person-group> <article-title>Eyetab: Model-based gaze estimation on unmodified tablet computers.</article-title> In <source>Proceedings of the Symposium on Eye Tracking Research and Applications</source>; <year>2014</year>, <month>Mar</month> <day>26-28</day>; Safety Harbor, Florida. New York: ACM; 2014. p. 207-210. doi: <pub-id pub-id-type="doi" specific-use="author">10.1145/2578153.2578185</pub-id></mixed-citation></ref>
<ref id="b12"><mixed-citation publication-type="unknown" specific-use="linked"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>F</given-names></name>, <name><surname>Huang</surname> <given-names>K</given-names></name>, <name><surname>Qiu</surname> <given-names>Y</given-names></name>, <name><surname>Shen</surname> <given-names>H</given-names></name></person-group>. <article-title>Accurate iris center localization method using facial landmark, snakuscule, circle fitting and binary connected component.</article-title> <source>Multimedia Tools and Applications</source>. <year>2018</year> <month>Oct</month>; 77(19):25333&#8211;25353. Epub 2018 Feb 23. doi: <pub-id pub-id-type="doi" specific-use="author">10.1007/s11042-018-5787-x</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>
