ELLIS Delft Talk by Julian Kooij: Vehicle localization without SLAM: Learning to find your camera’s pose in an aerial image

03 December 2024 16:00 till 03 November 2024 17:00 - Location: Hybrid: Building 28, Room Hilbert (2.W510) / Zoom - By: ELLIS Delft

by Julian Kooij | Delft University of Technology

Abstract

Localizing an outdoor vehicle is a key task for self-driving, but GPS can be inaccurate in urban environments with tall buildings. This talk discusses the complementary use of vision-based localization, namely determining the 3-Degrees-of-Freedom camera pose of a given ground-level image. Ideally, such localization improves on the rough estimates from GPS and scales accurately to large areas without the need for expensive SLAM or HD maps. One of the discussed directions is “fine-grained cross-view localization”, an emerging field that aims to estimate the pose of the camera relative to only an aerial image of the local area (think Google Maps).

In this talk, I will present two techniques we proposed for this task: Convolutional Cross-View Pose Estimation (T-PAMI’23)[1] and SliceMatch (CVPR’23)[2]. Our most recent work has also looked at closing the domain gap when applying such data-driven models to new geographic areas (ECCV’24)[3]. Finally, if time permits, we shall look at the related task of Visual Place Recognition (VPR), where the “map” consists of a collection of geo-referenced ground images. For this we proposed SUE, a new simple approach for estimating VPR matching uncertainty (CVPR’24, poster highlight)[4].

[1]: CCVPE, T-PAMI’23: https://arxiv.org/abs/2303.05915

[2]: SliceMatch, CVPR’23: https://arxiv.org/abs/2211.14651

[3]: Adapting Fine-Grained Cross-View Localization, ECCV’24: https://arxiv.org/abs/2406.00474

[4]: Estimation of Image-matching Uncertainty on VPR, CVPR’24: https://arxiv.org/abs/2404.00546