Abstract
Gaze estimation methods typically regress gaze directions directly from images using a deep network. We show that equipping a deep network with an explicit 3D shape model can: i) improve gaze estimation accuracy, ii) perform well with lower-resolution inputs at high frame rates and, importantly, iii) provide a much richer understanding of the eye region and its constituent gaze system, thus lending itself to a wider range of applications. We use an `eyes and nose' 3D Morphable Model (3DMM) to capture relevant local 3D facial geometry and appearance, and we equip this with a geometric vergence model of gaze to give an `active-gaze 3DMM'. Latent codes express eye-region shape, appearance, pose, scale and gaze directions, and are regressed using a tiny Swin transformer. We achieve real-time performance at 89 fps without fitted-model rendering and 34 fps with rendering. Our system shows state-of-the-art results on the Eyediap dataset, which provides 3D training supervision, and highly competitive results on ETH-XGaze, despite the lack of 3D supervision and without modelling the kappa angle. Indeed, our method can learn with only the ground-truth gaze target point and the camera parameters, without access to the ground-truth gaze origin points, thus significantly widening applicability.
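The geometric vergence idea described in the abstract, where both eyes' gaze rays converge on a single 3D target, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name and the example eye-origin/target coordinates are assumptions chosen for demonstration.

```python
import numpy as np

def gaze_directions_from_target(left_eye_origin, right_eye_origin, target):
    """Vergence model sketch: both eyes' gaze rays converge at one 3D target.

    Given per-eye 3D origin points and a shared 3D gaze target (all in the
    same camera coordinate frame), return a unit gaze direction per eye.
    All names and coordinates here are illustrative assumptions.
    """
    directions = []
    for origin in (left_eye_origin, right_eye_origin):
        d = np.asarray(target, dtype=float) - np.asarray(origin, dtype=float)
        directions.append(d / np.linalg.norm(d))  # normalise to a unit vector
    return directions

# Illustrative example: eyes ~6 cm apart, target 0.5 m straight ahead.
left = np.array([-0.03, 0.0, 0.0])
right = np.array([0.03, 0.0, 0.0])
target = np.array([0.0, 0.0, 0.5])
l_dir, r_dir = gaze_directions_from_target(left, right, target)
```

Because the two rays are constrained to meet at the target, supervision on the target point alone (plus camera parameters) constrains both gaze directions, which is consistent with the abstract's claim that ground-truth gaze origins are not required.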
Original language | English |
---|---|
Title of host publication | The 18th IEEE International Conference on Automatic Face and Gesture Recognition |
Publisher | IEEE Computer Society |
Number of pages | 9 |
ISBN (Print) | 9798350394948 |
Publication status | Published - 27 May 2024 |
Event | The 18th IEEE International Conference on Automatic Face and Gesture Recognition - ITU campus, Istanbul, Turkey |
Duration | 27 May 2024 → 31 May 2024 |
Event website | https://fg2024.ieee-biometrics.org/ |
Conference
Conference | The 18th IEEE International Conference on Automatic Face and Gesture Recognition |
---|---|
Abbreviated title | FG 2024 |
Country/Territory | Turkey |
City | Istanbul |
Period | 27/05/24 → 31/05/24 |
Internet address |
Bibliographical note
©2024 IEEE. This is an author-produced version of the published paper. Uploaded in accordance with the University’s Research Publications and Open Access policy.
Keywords
- Gaze estimation, 3D gaze, self-supervised learning