In everyday life, people often need to track moving objects. A recent topic of discussion is whether people rely solely on the locations of tracked objects or also take their directions into account in multiple object tracking (MOT). In the current paper, we pose a related question: do people utilise extrapolation in their gaze behaviour, or, in more practical terms, should mathematical models of gaze behaviour in an MOT task be based on objects’ current, past or anticipated positions? We used a data-driven approach with no a priori assumption about the underlying gaze model. We repeatedly presented the same MOT trials forward and backward and collected gaze data. After reversing the data from the backward trials, we tested a range of time adjustments to find the one that maximised the similarity between forward and reversed gaze data. In a series of four experiments, we showed that the gaze position lagged approximately 110 ms behind the scene content. We observed the lag in all subjects (Experiment 1). We then tested whether tracking workload or the predictability of movements affects the size of the lag. Low workload led only to a small, non-significant shortening of the lag (Experiment 2). Impairing the predictability of objects’ trajectories increased the lag (Experiments 3a and 3b). We compared our observations with the predictions of a centroid model and observed a better fit for a model based on the locations of objects 110 ms earlier. We conclude that mathematical models of gaze behaviour in MOT should account for these lags.
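The forward/backward logic can be illustrated with a minimal sketch. If gaze lags the scene by τ, the forward gaze follows s(t − τ), while the time-reversed gaze from the backward trial follows s(t + τ), so the two are offset by 2τ and align when each is corrected by the same candidate lag. Everything in the code below is an assumption made for illustration: the function names `estimate_lag` and `scene`, the use of negative mean Euclidean distance as the similarity measure, the 10 ms sampling step, and the noise-free synthetic trajectory. It is not the analysis pipeline used in the experiments.

```python
import numpy as np

def estimate_lag(gaze_fwd, gaze_bwd, dt_ms, max_lag_ms=300):
    """Estimate gaze lag (ms) from a forward trial and its backward replay.

    gaze_fwd, gaze_bwd: arrays of shape (n_samples, 2), sampled every dt_ms.
    Similarity is the negative mean Euclidean distance (an assumed metric).
    """
    reversed_bwd = gaze_bwd[::-1]        # put backward gaze on the forward timeline
    max_k = int(max_lag_ms / dt_ms)
    n = len(gaze_fwd)
    best_lag, best_sim = None, -np.inf
    for k in range(-max_k, max_k + 1):   # candidate lag of k samples
        # Undo a lag of k samples in both recordings: compare forward gaze
        # at t + k with reversed backward gaze at t - k.
        lo, hi = abs(k), n - abs(k)
        a = gaze_fwd[lo + k: hi + k]
        b = reversed_bwd[lo - k: hi - k]
        sim = -np.mean(np.linalg.norm(a - b, axis=1))
        if sim > best_sim:
            best_lag, best_sim = k * dt_ms, sim
    return best_lag

def scene(t_s):
    """Hypothetical smooth 2D scene signal (e.g. a centroid path), t in seconds."""
    x = np.sin(2 * np.pi * 0.3 * t_s) + 0.5 * np.sin(2 * np.pi * 0.7 * t_s + 1.0)
    y = np.cos(2 * np.pi * 0.2 * t_s) + 0.5 * np.sin(2 * np.pi * 0.5 * t_s)
    return np.stack([x, y], axis=1)

# Synthetic check: simulate gaze that trails the scene by a known lag.
dt_ms, true_lag_ms = 10, 110
t = np.arange(800) * dt_ms / 1000.0      # 8 s trial, time in seconds
T = t[-1]
lag_s = true_lag_ms / 1000.0
gaze_fwd = scene(t - lag_s)              # forward trial: gaze shows scene at t - lag
gaze_bwd = scene(T - t + lag_s)          # backward trial: scene runs as s(T - t)

print(estimate_lag(gaze_fwd, gaze_bwd, dt_ms))
```

On this synthetic input the sweep should recover the simulated lag. With real gaze data the similarity curve would be noisy, so the peak would be less sharp than in this idealised case.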