Abstract: In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the corresponding region ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果一些您可能无法访问的结果已被隐去。
显示无法访问的结果