The Wallpaper is Ugly: Indoor Localization using Vision and Language

Published 4 Oct 2024 in cs.CV | (2410.03900v1)

Abstract: We study the task of locating a user in a mapped indoor environment using natural language queries and images from the environment. Building on recent pretrained vision-language models, we learn a similarity score between text descriptions and images of locations in the environment. This score allows us to identify locations that best match the language query, estimating the user's location. Our approach is capable of localizing in environments, and on text and images, that were not seen during training. One model, finetuned CLIP, outperformed humans in our evaluation.
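
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the text query and the location images have already been embedded into a shared vector space (for example by a CLIP-style encoder); the toy vectors, the function names, the use of cosine similarity, and the choice to score a location by its best-matching image are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def cosine_similarity(text_vec, image_vecs):
    """Cosine similarity between one text embedding and each image embedding."""
    text_unit = text_vec / np.linalg.norm(text_vec)
    image_units = image_vecs / np.linalg.norm(image_vecs, axis=1, keepdims=True)
    return image_units @ text_unit


def localize(text_vec, image_vecs, image_locations):
    """Rank locations by the best similarity among their images (assumed aggregation)."""
    scores = cosine_similarity(text_vec, image_vecs)
    best = {}
    for location, score in zip(image_locations, scores):
        best[location] = max(best.get(location, float("-inf")), float(score))
    # Highest-scoring location first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy embeddings standing in for encoder outputs (hypothetical values).
text_vec = np.array([1.0, 0.0, 0.0])
image_vecs = np.array([
    [0.9, 0.1, 0.0],  # kitchen, view 1
    [0.2, 0.9, 0.0],  # kitchen, view 2
    [0.0, 1.0, 0.0],  # hallway
    [0.5, 0.5, 0.5],  # office
])
image_locations = ["kitchen", "kitchen", "hallway", "office"]

ranked = localize(text_vec, image_vecs, image_locations)
print(ranked[0][0])  # location whose images best match the query
```

In this sketch the top-ranked location serves as the estimate of the user's position. A learned or finetuned similarity score, as the abstract describes, would replace the plain cosine similarity used here.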

Authors (2)
