This AI paper proposes "SuperGlue", a graph neural network that simultaneously performs context aggregation, matching, and filtering of local features for wide-baseline pose estimation.

Imagine you have two photos of the same scene taken from different angles. Most of the objects in both images are the same; you are just looking at them from different viewpoints. In computer vision, objects are assumed to have distinctive features such as edges and corners. Matching these features is critical for some applications. But what does it take to match features between two images?

Finding correspondences between images is a prerequisite for estimating 3D structure and camera poses in computer vision tasks such as simultaneous localization and mapping (SLAM) and structure from motion (SfM). This is done by matching local features, which is difficult because of changes in lighting conditions, occlusion, blur, and so on.

Traditionally, feature matching is done through a two-step approach. First, the front-end step extracts visual features from the images. Second, the back-end step applies bundle adjustment and pose estimation to help match the extracted visual features. Once this is done, the features are ready and feature matching is modeled as a linear assignment problem.
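To make the linear assignment formulation concrete, here is a minimal sketch in plain Python. The cost values are made up for illustration, and `match_by_linear_assignment` is a hypothetical helper, not something taken from the paper. It brute-forces every one-to-one pairing, which only works for a handful of features; a practical system would use a dedicated solver such as the Hungarian algorithm.

```python
from itertools import permutations


def match_by_linear_assignment(cost):
    """Return the one-to-one pairing with the lowest total cost.

    cost[i][j] is a hypothetical descriptor distance between feature i of
    image A and feature j of image B (square matrix). Brute force over all
    permutations, so only suitable for very small inputs.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return list(enumerate(best))


# Features 0 and 1 of image A both look most like feature 0 of image B,
# so per-feature nearest neighbors would clash. The assignment resolves it.
cost = [
    [0.10, 0.20, 0.90],
    [0.15, 0.90, 0.80],
    [0.70, 0.80, 0.30],
]
matches = match_by_linear_assignment(cost)
print(matches)  # [(0, 1), (1, 0), (2, 2)]
```

Note that this formulation forces every feature to find a partner, even when its true counterpart is occluded or was never detected in the other image.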

As in many other fields, deep neural networks have played a crucial role in feature matching in recent years. They have been used to learn better sparse detectors and local descriptors from data using convolutional neural networks (CNNs).

However, they were usually one component of the feature-matching pipeline rather than an end-to-end solution. What if a single neural network could perform context aggregation, matching, and filtering in a single architecture? It’s time to introduce SuperGlue.

SuperGlue approaches the matching problem in a different way. It learns the matching process from pre-existing local features using a graph neural network structure. This replaces existing approaches in which task-independent features are learned first and then matched using simple heuristics. Being an end-to-end approach gives SuperGlue a strong advantage over existing methods. SuperGlue is a learnable middle-end that can be used to improve existing approaches.

So how does SuperGlue achieve this? It views the feature matching problem as a partial assignment between two sets of local features. Instead of solving a linear assignment problem to match features, it treats matching as an optimal transport problem. SuperGlue uses a graph neural network (GNN) that predicts the cost function of this transport optimization.
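The optimal transport view of partial assignment can be sketched with a few Sinkhorn iterations. This is an illustration under stated assumptions, not the authors' implementation: the scores are invented (in SuperGlue they would come from the network), the extra "dustbin" row and column are one common way to let features stay unmatched, and `sinkhorn_with_dustbin` and `dustbin_score` are hypothetical names.

```python
import math


def sinkhorn_with_dustbin(scores, dustbin_score=0.0, iterations=100):
    """Soft partial assignment between two feature sets via Sinkhorn iterations.

    scores[i][j] is a hypothetical matching score (higher is better) between
    feature i of image A and feature j of image B. An extra "dustbin" row and
    column absorb features that have no counterpart in the other image.
    """
    m, n = len(scores), len(scores[0])

    # Append the dustbin column to every row, then add the dustbin row.
    augmented = [list(row) + [dustbin_score] for row in scores]
    augmented.append([dustbin_score] * (n + 1))

    # Exponentiate the scores into a strictly positive kernel.
    kernel = [[math.exp(s) for s in row] for row in augmented]

    # Each real feature carries mass 1; each dustbin can absorb every feature
    # of the other image, so both marginals sum to m + n.
    row_mass = [1.0] * m + [float(n)]
    col_mass = [1.0] * n + [float(m)]

    u = [1.0] * (m + 1)
    v = [1.0] * (n + 1)
    for _ in range(iterations):
        # Alternately rescale rows and columns toward the target marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(n + 1))
             for i in range(m + 1)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(m + 1))
             for j in range(n + 1)]

    return [[u[i] * kernel[i][j] * v[j] for j in range(n + 1)]
            for i in range(m + 1)]


# Two features in image A, three in image B: feature 2 of B has no good partner.
scores = [
    [2.0, 0.1, 0.0],
    [0.0, 1.8, 0.2],
]
plan = sinkhorn_with_dustbin(scores)
for row in plan:
    print([round(p, 3) for p in row])
```

In the printed plan, the last row and last column are the dustbins. Feature 0 of A puts most of its mass on feature 0 of B, feature 1 on feature 1, and the unmatched feature 2 of B lands mostly in the dustbin row. With large scores, `math.exp` can overflow, so a log-domain version of the same iteration would be the more robust choice.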

Transformers have achieved tremendous success in natural language processing and, more recently, in computer vision tasks. SuperGlue uses transformer-style attention to take advantage of both the spatial relationships of keypoints and their visual appearance.
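As a rough illustration of how attention can mix keypoint position and appearance, the sketch below runs scaled dot-product attention within one image and across two images. It is heavily simplified and rests on assumptions: the keypoints and descriptors are made up, `encode` fuses position and appearance by plain concatenation as a stand-in for a learned encoder, and there are no learned projections or multiple heads as a real transformer layer would have.

```python
import math


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def encode(keypoints, descriptors):
    """Fuse position and appearance (concatenation stands in for a learned encoder)."""
    return [list(desc) + [x, y] for (x, y), desc in zip(keypoints, descriptors)]


def attend(queries, keys, values):
    """Scaled dot-product attention without learned projections."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # How strongly this query keypoint attends to every key keypoint.
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        # Weighted average of the value vectors.
        outputs.append([sum(w * v[c] for w, v in zip(weights, values))
                        for c in range(len(values[0]))])
    return outputs


# Hypothetical keypoints (normalized x, y) and 2-D descriptors for two images.
feats_a = encode([(0.1, 0.2), (0.8, 0.7)], [(1.0, 0.0), (0.0, 1.0)])
feats_b = encode([(0.15, 0.25), (0.75, 0.65), (0.5, 0.9)],
                 [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)])

self_context = attend(feats_a, feats_a, feats_a)   # keypoints within image A
cross_context = attend(feats_a, feats_b, feats_b)  # image A attends to image B
print(cross_context)
```

Each output vector is a weighted average of the other keypoints' encodings, so a keypoint's representation ends up reflecting both where its neighbors are and what they look like.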

SuperGlue is trained end to end, with image pairs used as training data. Priors for pose estimation are learned from a large set of labeled data, which gives SuperGlue an understanding of the 3D scene.

SuperGlue can be applied to a variety of multi-view geometry problems that require high-quality feature matching. It runs in real time on commodity hardware and can be applied to both classical and learned features. You can find more information about SuperGlue at the links below.


Check out the paper, project, and code. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page and Discord channel, where we share the latest AI research news, cool AI projects, and more.


Ekrem Çetinkaya obtained his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis on image denoising using deep convolutional networks. He is currently pursuing a Ph.D. degree at the University of Klagenfurt, Austria, and working as a researcher on the ATHENA project. His research interests include deep learning, computer vision and multimedia networks.

