Rethinking Point Cloud Registration as Masking and Reconstruction (Paper Reading Notes)


Rethinking Point Cloud Registration as Masking and Reconstruction

2023 ICCV

*Guangyan Chen, Meiling Wang, Li Yuan, Yi Yang, Yufeng Yue*; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 17717-17727

  • paper: Rethinking Point Cloud Registration as Masking and Reconstruction (thecvf.com)
  • code: CGuangyan-BIT/MRA (github.com)


The paper's title is very attractive, but after careful reading, the authors essentially just use an MAE-style structure to predict the aligned point clouds, thereby improving the consistency of the feature descriptions of corresponding points across the two point clouds and assisting the training of the feature-extraction network.

Abstract

The paper's core premise: the invisible parts of each point cloud can serve as inherent masks, whereas the aligned point cloud pair can be treated as the reconstruction objective.

  • Point cloud registration is treated as a masking-and-reconstruction process; with Point-MAE as the basic idea, the authors propose MRA (the Masked Reconstruction Auxiliary network).
  • MRA can easily be embedded into other methods to further improve registration accuracy.
  • Based on MRA, a novel, standard-transformer-based method, MRT (the Masked Reconstruction Transformer), is proposed.

encode features -> infer the contextual features and overall structures of the point cloud pair -> the deviation correction module corrects the spatial deviations in the putative corresponding point pairs

Description

  • input:
    • source point cloud \(X = \{x_1, x_2, …,x_M\} \subseteq \mathbb{R}^3\)
    • target point cloud \(Y = \{y_1, y_2, …, y_N\} \subseteq \mathbb{R}^3\)
  • output: the rigid transformation \(\{\hat{R} \in SO(3), \hat{t} \in \mathbb{R}^3\}\) that aligns the source point cloud with the target point cloud.
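Once predicted, the transformation acts on the source cloud as \(\hat{R}X + \hat{t}\); a minimal NumPy sketch of applying it (the toy rotation and translation values are made up for illustration):

```python
import numpy as np

def apply_rigid_transform(X, R, t):
    """Apply a rigid transformation {R in SO(3), t in R^3} to an (M, 3) point cloud."""
    return X @ R.T + t

# Toy example: a 90-degree rotation about the z-axis plus a translation.
R = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 1.]])
t = np.array([1., 0., 0.])
X = np.array([[1., 0., 0.]])
print(apply_rigid_transform(X, R, t))  # [[1. 1. 0.]]
```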

(MRT is used to extract features, presumably also as dense descriptions; MRA is used to assist in training MRT.)


  1. MRT step: the input point cloud pair \(X\), \(Y\) is densely described with KPConv, yielding superpoints \([\widetilde{X}:F^{\widetilde{X}}]\), \([\widetilde{Y}:F^{\widetilde{Y}}]\). The feature descriptions \(F^{\widetilde{X}}\), \(F^{\widetilde{Y}}\) are then passed through the Transformer Encoder module, which extracts contextual information and the overall structure to re-encode each feature description.
  2. Auxiliary network step: two modules are used in parallel to train MRT.
    1. MRA: the MRA separately receives the encoded features of each point cloud and predicts the other, aligned point cloud, reconstructing the complete point cloud.
    2. Registration network: predicts point correspondences \(\hat{y},\ \hat{x}\) and overlap scores \(\hat{o}^{\widetilde{X}},\ \hat{o}^{\widetilde{Y}}\) in the Deviation Correction module, then uses the Weighted Procrustes module to regress the transformation.

MRT Step

  1. KPConv network:
    1. input downsampled point clouds: \(X \in \mathbb{R}^{M \times 3}\), \(Y \in \mathbb{R}^{N \times 3}\)
    2. obtain superpoints and features: \([\widetilde{X} \in \mathbb{R}^{M' \times 3}:F^{\widetilde{X}}\in \mathbb{R}^{M' \times D}]\) , \([\widetilde{Y} \in \mathbb{R}^{N' \times 3}:F^{\widetilde{Y}}\in \mathbb{R}^{N' \times D}]\)
  2. Transformer Encoder:
    1. input the superpoints and features into the \(L_e\)-layer transformer encoder (cross-attention and sinusoidal positional encodings)
    2. output the encoded features \(\mathcal{F}^{\widetilde{X}}\) , \(\mathcal{F}^{\widetilde{Y}}\) .

Cross-attention helps the two point clouds extract consistent features.
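A single-head cross-attention between the two feature sets can be sketched in NumPy as follows (the dimensions M' = 128, N' = 100, D = 64 are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def cross_attention(Q_feat, K_feat, V_feat):
    """Single-head cross-attention: queries from one cloud attend to the other."""
    d = Q_feat.shape[-1]
    scores = Q_feat @ K_feat.T / np.sqrt(d)      # (M', N') attention logits
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)            # softmax over the other cloud
    return w @ V_feat                            # (M', d) aggregated features

F_x = np.random.randn(128, 64)  # superpoint features of X (M' = 128, D = 64)
F_y = np.random.randn(100, 64)  # superpoint features of Y (N' = 100)

# Each feature of X aggregates information from all of Y, and vice versa,
# which encourages consistent descriptions for corresponding points.
F_x_cross = cross_attention(F_x, F_y, F_y)
F_y_cross = cross_attention(F_y, F_x, F_x)
print(F_x_cross.shape, F_y_cross.shape)  # (128, 64) (100, 64)
```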

MRA Step


A pure MAE-style network structure: the mask tokens represent the corresponding aligned point cloud patches. The inputs are the pre-alignment point cloud patches, their tokens, position embeddings generated from the GT rigid transformation, and the mask tokens; the output is the predicted aligned point cloud patches, on which a chamfer loss is computed against the GT alignment.

Although on the surface there are many transformation-related operations here, on closer inspection all of the transformation information is built on the GT. So my reading is that this, together with the cross-attention in MRT, improves the feature-level representation consistency of corresponding point pairs, and it certainly also improves the semantic completeness of the feature representations.

  1. input -> MRT outputs: superpoint pair and corresponding features: \([\widetilde{X} \in \mathbb{R}^{M' \times 3}:\mathcal{F}^{\widetilde{X}}\in \mathbb{R}^{M' \times D}]\) , \([\widetilde{Y} \in \mathbb{R}^{N' \times 3}:\mathcal{F}^{\widetilde{Y}}\in \mathbb{R}^{N' \times D}]\)
  2. output: the chamfer L2 loss between the predicted aligned point cloud patches and the ground-truth aligned ones.

Steps:

  1. Use FPS to extract center points \(\widetilde{X}_c\), \(\widetilde{Y}_c\) from the superpoints, and use KNN to generate the point cloud patches; obtain the tokens \(T^{\widetilde{X}}\), \(T^{\widetilde{Y}}\) by composing the encoded features \(\mathcal{F}^{\widetilde{X}}\), \(\mathcal{F}^{\widetilde{Y}}\). Use the mask tokens \(T^{\widetilde{X}}_m \in \mathbb{R}^{g \times D}\), \(T^{\widetilde{Y}}_m \in \mathbb{R}^{g \times D}\) to correspond to the aligned point cloud patches in the decoder output.
  2. Use the ground-truth transformations from \(Y\) to \(X\) and from \(X\) to \(Y\) to generate the position embeddings for each decoder layer.
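Step 1's farthest point sampling and KNN patch grouping can be sketched as follows (the patch count g and patch size k are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def farthest_point_sampling(points, g):
    """Greedily pick g center indices that maximize the minimum pairwise distance."""
    centers = [0]  # start from an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(g - 1):
        idx = int(dist.argmax())                 # farthest from all chosen centers
        centers.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return np.array(centers)

def knn_patches(points, center_idx, k):
    """Group the k nearest neighbors of each center into a patch."""
    d = np.linalg.norm(points[None, :, :] - points[center_idx][:, None, :], axis=-1)
    return points[np.argsort(d, axis=1)[:, :k]]  # (g, k, 3)

pts = np.random.rand(256, 3)
centers = farthest_point_sampling(pts, g=8)
patches = knn_patches(pts, centers, k=16)
print(patches.shape)  # (8, 16, 3)
```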


  3. A transformer decoder (self-attention and a two-layer FC) reconstructs the mask tokens into tokens representing the aligned point cloud patches.
  4. A two-layer MLP (two FC layers with ReLU) predicts the aligned point cloud patches corresponding to the decoded mask tokens.
  5. Chamfer loss between the ground-truth aligned point cloud patches and the predicted ones.
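The chamfer L2 loss in the last step averages, for each point, the squared distance to its nearest neighbor in the other set, in both directions; a minimal NumPy sketch:

```python
import numpy as np

def chamfer_l2(P, Q):
    """Symmetric chamfer distance between point sets P (M, 3) and Q (N, 3)."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)  # (M, N) squared distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Identical sets have zero chamfer distance.
P = np.array([[0., 0., 0.], [1., 0., 0.]])
Q = np.array([[0., 0., 0.], [1., 0., 0.]])
print(chamfer_l2(P, Q))  # 0.0
```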


Coarse Registration Step


Because the features extracted by MRT strongly aggregate semantic information across the two point clouds (thanks to the cross-attention), soft correspondence weights are computed from cosine similarity, and a weighted sum gives the correspondence point pairs. The features and the coordinates of the corresponding points are then concatenated and fed to an MLP that fits the deviation between the weighted-sum coordinates and the true positions, building more robust matches. (This way of predicting a bias is quite common.) Finally, the Weighted Procrustes module predicts the rigid transformation.

I would rather describe it this way: the coordinates obtained by a plain weighted sum will most likely deviate from the true coordinates; introducing another learnable component to adjust the weighted prediction makes the result more robust, more stable, and possibly even more accurate, and this adjustment manifests itself as the deviation value. In addition, the cosine similarity here can, to some extent, increase the separability between non-corresponding points.

  1. input: the features \(\mathcal{F}^{\widetilde{X}}\) , \(\mathcal{F}^{\widetilde{Y}}\) extracted by MRT
  2. output: predicted rigid transformation: \([\hat{R}; \hat{t}]\)

Steps:

  1. Predict the corresponding points \(\mathcal{Y}\) for each superpoint in \(\widetilde{X}\):

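Step 1's cosine-similarity-weighted sum can be sketched as follows (the softmax formulation and the temperature \(\tau\) are assumed details, not necessarily the paper's exact formula):

```python
import numpy as np

def soft_correspondences(F_x, F_y, Y_tilde, tau=0.1):
    """For each superpoint of X, predict a corresponding point as a
    cosine-similarity-weighted sum of the superpoints of Y."""
    Fx = F_x / np.linalg.norm(F_x, axis=1, keepdims=True)
    Fy = F_y / np.linalg.norm(F_y, axis=1, keepdims=True)
    sim = Fx @ Fy.T / tau                   # (M', N') scaled cosine similarities
    sim -= sim.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(sim)
    w /= w.sum(axis=1, keepdims=True)       # soft correspondence weights over Y
    return w @ Y_tilde                      # (M', 3) predicted corresponding points

F_x, F_y = np.random.randn(50, 64), np.random.randn(40, 64)
Y_tilde = np.random.rand(40, 3)
print(soft_correspondences(F_x, F_y, Y_tilde).shape)  # (50, 3)
```

Since the weights are a convex combination, each predicted point lies inside the convex hull of \(\widetilde{Y}\), which is exactly why the deviation-correction MLP of the next step is useful.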

  2. Use the features and an MLP to predict the deviations that need to be added to the predicted corresponding points:


  3. Predict the overlap score for each point, which indicates the probability of the point lying in the overlap region:


  4. Use Weighted Procrustes to predict the rigid transformation and compute the loss against the GT.
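The Weighted Procrustes step has a closed-form solution via a weighted SVD (the Kabsch algorithm); a minimal NumPy sketch, using uniform weights in place of the predicted overlap scores:

```python
import numpy as np

def weighted_procrustes(X, Y, w):
    """Solve min sum_i w_i ||R x_i + t - y_i||^2 for R in SO(3), t in R^3."""
    w = w / w.sum()
    mu_x = (w[:, None] * X).sum(0)  # weighted centroids
    mu_y = (w[:, None] * Y).sum(0)
    H = (X - mu_x).T @ (w[:, None] * (Y - mu_y))  # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1., 1., np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_y - R @ mu_x
    return R, t

# Sanity check: recover a known rotation/translation from noiseless pairs.
rng = np.random.default_rng(0)
X = rng.random((30, 3))
theta = 0.5
R_gt = np.array([[np.cos(theta), -np.sin(theta), 0.],
                 [np.sin(theta),  np.cos(theta), 0.],
                 [0., 0., 1.]])
t_gt = np.array([0.2, -0.1, 0.3])
Y = X @ R_gt.T + t_gt
R, t = weighted_procrustes(X, Y, np.ones(30))
print(np.allclose(R, R_gt), np.allclose(t, t_gt))  # True True
```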

Experiment


MRA's plug-and-play claim does hold: embedding it into other methods indeed improves their registration accuracy.
