Data scaling for navigation in unknown environments

Lauri Suomela 1 · Naoki Takahata 2 · Sasanka Kuruppu Arachchige 1

Harry Edelman 3 · Joni-Kristian Kämäräinen 1

1 Tampere University · 2 Tohoku University · 3 Turku University of Applied Sciences


Generalization of imitation-learned navigation policies to environments unseen in training remains a major challenge. We address this by conducting the first large-scale study of how data quantity and data diversity affect real-world generalization in end-to-end, map-free visual navigation. Using a curated 4,565-hour crowd-sourced dataset collected across 161 locations in 35 countries, we train policies for point goal navigation and evaluate their closed-loop control performance on sidewalk robots operating in four countries, covering 125 km of autonomous driving.

Our results show that large-scale training data enables zero-shot navigation in unknown environments, approaching the performance of policies trained with environment-specific demonstrations. Critically, we find that data diversity is far more important than data quantity. Doubling the number of geographical locations in a training set decreases navigation errors by ~15%, while the benefit of adding more data from existing locations saturates with very little data. We also observe that, with noisy crowd-sourced data, simple regression-based models outperform generative and sequence-based architectures.

Training data


We extract clean demonstrations from a raw 8,000-hour crowd-sourced dataset and cluster the resulting 4,565 hours of navigation data into 161 distinct locations around the world. The locations span 35 countries and diverse environments, including urban, suburban, park, campus, and rural settings.

Examples from a subset of the training locations.

Test locations


Wuhan, China
Kisumu, Kenya
Port Louis, Mauritius
Selebi-Phikwe, Botswana

We test our policies in four distinct locations around the world. None of these locations appear in the training data, which lets us evaluate the generalization capabilities of the navigation policies.

Policy deployment examples


The policies were deployed on the Earth Rover Zero robots. The robots streamed sensor observations to a remote desktop machine running policy inference, and the computed control commands were sent back to the robots over a 4G connection.

Data scaling results


We trained policies with subsets of the total dataset, varying both the number of training locations and the amount of training data per location.
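One way to build such subsets is sketched below. This is only an illustration under stated assumptions, not the sampling procedure used in the study: the function `sample_subset`, the `clips_by_location` layout, and the toy clip lengths are all hypothetical.

```python
import random

def sample_subset(clips_by_location, n_locations, hours_per_location, seed=0):
    """Pick n_locations at random, then fill each one's hour budget with random clips.

    clips_by_location maps a location name to a list of (clip_id, hours) pairs.
    This layout is an assumption made for the sketch.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clips_by_location), n_locations)
    subset = {}
    for loc in chosen:
        clips = list(clips_by_location[loc])
        rng.shuffle(clips)
        picked, total = [], 0.0
        for clip_id, hours in clips:
            if total + hours > hours_per_location:
                continue  # skip clips that would exceed the per-location budget
            picked.append(clip_id)
            total += hours
        subset[loc] = picked
    return subset

# Toy data: 8 hypothetical locations, each with 40 half-hour clips (20 h per location).
toy = {f"loc{i}": [(f"loc{i}_clip{j}", 0.5) for j in range(40)] for i in range(8)}

# 4 locations x 16 h per location = 64 h of total training data.
subset = sample_subset(toy, n_locations=4, hours_per_location=16)
hours = {loc: 0.5 * len(ids) for loc, ids in subset.items()}
```

Holding `n_locations * hours_per_location` constant while varying `n_locations` gives the fixed-total-data comparison; holding `n_locations` constant while varying `hours_per_location` gives the fixed-diversity comparison.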

Success rate ($\uparrow$) as a function of the number of training locations, for fixed amounts of total training data.
Success rate ($\uparrow$) as a function of training data per location, for fixed numbers of training locations.

The results indicate that navigation performance scales well with the number of training locations, while the benefit from increasing the amount of data per location saturates early, below the range we considered in our experiments. This highlights the importance of data diversity for generalization in visual navigation.

Power law fits and correlation coefficients $r$ for Total Dataset Size = 128H and Hours per Location = 16H on a log-log scale.

Fitting a model $Y = \beta \cdot X^{\alpha}$ to the results, we find that the navigation Failure Rate = 1 - Success Rate follows a power law in the number of training locations $X$. Substituting the exponent $\alpha = -0.229$ from the left panel, doubling the number of locations decreases failures by $1-\frac{Y(2X)}{Y(X)}=1-2^{-0.229}\approx15\%$. The prefactor $\beta$ cancels in the ratio, so the relative reduction depends only on $\alpha$.
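The fit and the doubling calculation can be reproduced with a few lines of Python. The sketch below fits the power law by ordinary least squares in log-log space, which is one common way to do it and not necessarily the exact procedure used for the figure. The data points are synthetic placeholders generated from the reported exponent with an assumed prefactor of 0.8; they are not measured results.

```python
import math

def fit_power_law(xs, ys):
    """Fit Y = beta * X**alpha via least squares on log Y = log(beta) + alpha * log X.

    Returns (alpha, beta, r), where r is the Pearson correlation in log-log space.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    alpha = sxy / sxx
    beta = math.exp(my - alpha * mx)
    r = sxy / math.sqrt(sxx * syy)
    return alpha, beta, r

def reduction_when_doubling(alpha):
    """Relative drop in Y when X doubles: 1 - Y(2X)/Y(X) = 1 - 2**alpha."""
    return 1.0 - 2.0 ** alpha

# Synthetic placeholder data (NOT measured results): exact power law with
# the reported exponent -0.229 and an assumed prefactor beta = 0.8.
n_locations = [2, 4, 8, 16, 32, 64]
failure_rate = [0.8 * x ** -0.229 for x in n_locations]

alpha, beta, r = fit_power_law(n_locations, failure_rate)
drop = reduction_when_doubling(alpha)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, r={r:.3f}, drop per doubling={drop:.1%}")
```

Because the relative reduction is $1-2^{\alpha}$, it is the same at every dataset size: going from 2 to 4 locations gives the same proportional improvement as going from 64 to 128.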

Training with the full dataset


We also experimented with training on the full dataset. We compared three policies: one trained with the full training set (Zero-Shot), one trained with both the full training set and additional in-domain data from the test locations (Scale + in-domain), and environment-specific policies trained only with data from each test location (In-domain only).

Policies trained with large-scale data achieve higher success rates than environment-specific policies. The zero-shot policy almost matches the policy that incorporates both large-scale data and data from the test environments.

Full runs


The videos demonstrate one complete navigation run from each test location.

BibTeX

@misc{suomela2026data_scaling,
  title={Data Scaling for Navigation in Unknown Environments},
  author={Suomela, Lauri and Takahata, Naoki and Kuruppu Arachchige, Sasanka and Edelman, Harry and Kämäräinen, Joni-Kristian},
  journal={arXiv:2601.09444},
  year={2026},
}