CVPR 2023 Papers and Open-Source Projects Collection

[CVPR 2023 Papers with Open-Source Code: Table of Contents]

  • Backbone

  • CLIP

  • MAE

  • GAN

  • GNN

  • MLP

  • NAS

  • OCR

  • NeRF

  • DETR

  • Diffusion Models

  • Avatars

  • ReID (Re-Identification)

  • Long-Tail Distribution

  • Vision Transformer

  • Vision-Language

  • Self-supervised Learning

  • Data Augmentation

  • Object Detection

  • Visual Tracking

  • Semantic Segmentation

  • Instance Segmentation

  • Panoptic Segmentation

  • Medical Image Segmentation

  • Video Object Segmentation

  • Referring Image Segmentation

  • Image Matting

  • Image Editing

  • Low-level Vision

  • Super-Resolution

  • Deblur

  • 3D Point Cloud

  • 3D Object Detection

  • 3D Semantic Segmentation

  • 3D Object Tracking

  • 3D Human Pose Estimation

  • 3D Semantic Scene Completion

  • Medical Image

  • Image Generation

  • Video Generation

  • Video Understanding

  • Action Detection

  • Text Detection

  • Knowledge Distillation

  • Model Pruning

  • Image Compression

  • Anomaly Detection

  • 3D Reconstruction

  • Depth Estimation

  • Trajectory Prediction

  • Image Captioning

  • Visual Question Answering

  • Sign Language Recognition

  • Video Prediction

  • Novel View Synthesis

  • Zero-Shot Learning

  • Stereo Matching

  • Scene Graph Generation

  • Datasets

  • New Tasks

  • Others

Backbone

Integrally Pre-Trained Transformer Pyramid Networks

  • Paper: https://arxiv.org/abs/2211.12735

  • Code: https://github.com/sunsmarterjie/iTPN

Stitchable Neural Networks

  • Homepage: https://snnet.github.io/

  • Paper: https://arxiv.org/abs/2302.06586

  • Code: https://github.com/ziplab/SN-Net

Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks

  • Paper: https://arxiv.org/abs/2303.03667

  • Code: https://github.com/JierunChen/FasterNet

BiFormer: Vision Transformer with Bi-Level Routing Attention

  • Paper: https://arxiv.org/abs/2303.08810

  • Code: https://github.com/rayleizhu/BiFormer

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

  • Paper: https://arxiv.org/abs/2303.02165

  • Code: https://github.com/alibaba/lightweight-neural-architecture-search

Vision Transformer with Super Token Sampling

  • Paper: https://arxiv.org/abs/2211.11167

  • Code: https://github.com/hhb072/SViT

Hard Patches Mining for Masked Image Modeling

  • Paper: None

  • Code: None

CLIP

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

  • Paper: https://arxiv.org/abs/2301.12959

  • Code: https://github.com/tobran/GALIP

DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation

  • Paper: https://arxiv.org/abs/2303.06285

  • Code: https://github.com/Yueming6568/DeltaEdit

MAE

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

  • Paper: https://arxiv.org/abs/2212.06785

  • Code: https://github.com/ZrrSkywalker/I2P-MAE

Generic-to-Specific Distillation of Masked Autoencoders

  • Paper: https://arxiv.org/abs/2302.14771

  • Code: https://github.com/pengzhiliang/G2SD

GAN

DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation

  • Paper: https://arxiv.org/abs/2303.06285

  • Code: https://github.com/Yueming6568/DeltaEdit

NeRF

NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior

  • Homepage: https://nope-nerf.active.vision/

  • Paper: https://arxiv.org/abs/2212.07388

  • Code: None

Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

  • Paper: https://arxiv.org/abs/2211.07600

  • Code: https://github.com/eladrich/latent-nerf

NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via Novel-View Synthesis

  • Paper: https://arxiv.org/abs/2301.08556

  • Code: None

Panoptic Lifting for 3D Scene Understanding with Neural Fields

  • Homepage: https://nihalsid.github.io/panoptic-lifting/

  • Paper: https://arxiv.org/abs/2212.09802

  • Code: None

NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

  • Homepage: https://redrock303.github.io/nerflix/

  • Paper: https://arxiv.org/abs/2303.06919

  • Code: None

DETR

DETRs with Hybrid Matching

  • Paper: https://arxiv.org/abs/2207.13080

  • Code: https://github.com/HDETR

NAS

PA&DA: Jointly Sampling PAth and DAta for Consistent NAS

  • Paper: https://arxiv.org/abs/2302.14772

  • Code: https://github.com/ShunLu91/PA-DA

Avatars

Structured 3D Features for Reconstructing Relightable and Animatable Avatars

  • Homepage: https://enriccorona.github.io/s3f/

  • Paper: https://arxiv.org/abs/2212.06820

  • Code: None

  • Demo: https://www.youtube.com/watch?v=mcZGcQ6L-2s

ReID (Re-Identification)

Clothing-Change Feature Augmentation for Person Re-Identification

  • Paper: None

  • Code: None

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID

  • Paper: https://arxiv.org/abs/2303.07065

  • Code: https://github.com/vimar-gu/MSINet

Diffusion Models

Video Probabilistic Diffusion Models in Projected Latent Space

  • Homepage: https://sihyun.me/PVDM/

  • Paper: https://arxiv.org/abs/2302.07685

  • Code: https://github.com/sihyun-yu/PVDM

Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models

  • Paper: https://arxiv.org/abs/2211.10655

  • Code: None

Imagic: Text-Based Real Image Editing with Diffusion Models

  • Homepage: https://imagic-editing.github.io/

  • Paper: https://arxiv.org/abs/2210.09276

  • Code: None

Parallel Diffusion Models of Operator and Image for Blind Inverse Problems

  • Paper: https://arxiv.org/abs/2211.10656

  • Code: None

DiffRF: Rendering-guided 3D Radiance Field Diffusion

  • Homepage: https://sirwyver.github.io/DiffRF/

  • Paper: https://arxiv.org/abs/2212.01206

  • Code: None

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

  • Paper: https://arxiv.org/abs/2212.09478

  • Code: https://github.com/researchmm/MM-Diffusion

HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising

  • Homepage: https://aminshabani.github.io/housediffusion/

  • Paper: https://arxiv.org/abs/2211.13287

  • Code: https://github.com/aminshabani/house_diffusion

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

  • Paper: https://arxiv.org/abs/2303.05762

  • Code: https://github.com/chenweixin107/TrojDiff

Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption

  • Paper: https://arxiv.org/abs/2207.03442

  • Code: https://github.com/shiyegao/DDA

DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

  • Paper: https://arxiv.org/abs/2303.06885

  • Code: None

Vision Transformer

Integrally Pre-Trained Transformer Pyramid Networks

  • Paper: https://arxiv.org/abs/2211.12735

  • Code: https://github.com/sunsmarterjie/iTPN

Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors

  • Homepage: https://niessnerlab.org/projects/hou2023mask3d.html

  • Paper: https://arxiv.org/abs/2302.14746

  • Code: None

Learning Trajectory-Aware Transformer for Video Super-Resolution

  • Paper: https://arxiv.org/abs/2204.04216

  • Code: https://github.com/researchmm/TTVSR

Vision Transformers are Parameter-Efficient Audio-Visual Learners

  • Homepage: https://yanbo.ml/project_page/LAVISH/

  • Code: https://github.com/GenjiB/LAVISH

Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes

  • Paper: https://arxiv.org/abs/2303.04249

  • Code: None

DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets

  • Paper: https://arxiv.org/abs/2301.06051

  • Code: https://github.com/Haiyang-W/DSVT

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

  • Paper: https://arxiv.org/abs/2211.10772

  • Code: https://github.com/ViTAE-Transformer/DeepSolo

BiFormer: Vision Transformer with Bi-Level Routing Attention

  • Paper: https://arxiv.org/abs/2303.08810

  • Code: https://github.com/rayleizhu/BiFormer

Vision Transformer with Super Token Sampling

  • Paper: https://arxiv.org/abs/2211.11167

  • Code: https://github.com/hhb072/SViT

BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

  • Paper: https://arxiv.org/abs/2211.10439

  • Code: None

BAEFormer: Bi-directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation

  • Paper: None

  • Code: None

Vision-Language

GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods

  • Paper: https://arxiv.org/abs/2301.01893

  • Code: None

Teaching Structured Vision&Language Concepts to Vision&Language Models

  • Paper: https://arxiv.org/abs/2211.11733

  • Code: None

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

  • Paper: https://arxiv.org/abs/2211.09808

  • Code: https://github.com/fundamentalvision/Uni-Perceiver

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training

  • Paper: https://arxiv.org/abs/2303.00040

  • Code: None

Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding

  • Homepage: https://rllab-snu.github.io/projects/Meta-Explore/doc.html

  • Paper: https://arxiv.org/abs/2303.04077

  • Code: None

All in One: Exploring Unified Video-Language Pre-training

  • Paper: https://arxiv.org/abs/2203.07303

  • Code: https://github.com/showlab/all-in-one

Position-guided Text Prompt for Vision Language Pre-training

  • Paper: https://arxiv.org/abs/2212.09737

  • Code: https://github.com/sail-sg/ptp

EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding

  • Paper: https://arxiv.org/abs/2209.14941

  • Code: https://github.com/yanmin-wu/EDA

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

  • Paper: https://arxiv.org/abs/2303.02489

  • Code: None

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks

  • Paper: https://arxiv.org/abs/2303.02483

  • Code: https://github.com/BrandonHanx/FAME-ViL

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

  • Homepage: https://boheumd.github.io/A2Summ/

  • Paper: https://arxiv.org/abs/2303.07284

  • Code: https://github.com/boheumd/A2Summ

Object Detection

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

  • Paper: https://arxiv.org/abs/2207.02696

  • Code: https://github.com/WongKinYiu/yolov7

DETRs with Hybrid Matching

  • Paper: https://arxiv.org/abs/2207.13080

  • Code: https://github.com/HDETR

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

  • Paper: https://arxiv.org/abs/2212.07593

  • Code: https://github.com/Fangyi-Chen/SQR

Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

  • Paper: https://arxiv.org/abs/2303.05892

  • Code: https://github.com/LutingWang/OADP

Object Tracking

Simple Cues Lead to a Strong Multi-Object Tracker

  • Paper: https://arxiv.org/abs/2206.04656

  • Code: None

Semantic Segmentation

Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

  • Paper: https://arxiv.org/abs/2303.07224

  • Code: https://github.com/THU-LYJ-Lab/AR-Seg

Medical Image Segmentation

Label-Free Liver Tumor Segmentation

  • Paper: https://arxiv.org/abs/2210.14845

  • Code: https://github.com/MrGiovanni/SyntheticTumors

Video Object Segmentation

Two-shot Video Object Segmentation

  • Paper: https://arxiv.org/abs/2303.12078

  • Code: https://github.com/yk-pku/Two-shot-Video-Object-Segmentation

Referring Image Segmentation

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

  • Paper: https://arxiv.org/abs/2302.07387

  • Code: None

3D Point Cloud

Physical-World Optical Adversarial Attacks on 3D Face Recognition

  • Paper: https://arxiv.org/abs/2205.13412

  • Code: https://github.com/PolyLiYJ/SLAttack.git

3D Object Detection

DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets

  • Paper: https://arxiv.org/abs/2301.06051

  • Code: https://github.com/Haiyang-W/DSVT

FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection

  • Paper: https://arxiv.org/abs/2301.04467

  • Code: None

3D Video Object Detection with Learnable Object-Centric Global Optimization

  • Paper: None

  • Code: None

3D Semantic Segmentation

Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation

  • Paper: https://arxiv.org/abs/2303.11203

  • Code: https://github.com/l1997i/lim3d

3D Semantic Scene Completion

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

  • Paper: https://arxiv.org/abs/2302.12251

  • Code: https://github.com/NVlabs/VoxFormer

Low-level Vision

Causal-IR: Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective

  • Paper: https://arxiv.org/abs/2303.06859

  • Code: https://github.com/lixinustc/Casual-IR-DIL

Super-Resolution

Super-Resolution Neural Operator

  • Paper: https://arxiv.org/abs/2303.02584

  • Code: https://github.com/2y7c3/Super-Resolution-Neural-Operator

Video Super-Resolution

Learning Trajectory-Aware Transformer for Video Super-Resolution

  • Paper: https://arxiv.org/abs/2204.04216

  • Code: https://github.com/researchmm/TTVSR

Image Generation

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

  • Paper: https://arxiv.org/abs/2301.12959

  • Code: https://github.com/tobran/GALIP

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

  • Paper: https://arxiv.org/abs/2211.09117

  • Code: https://github.com/LTH14/mage

Video Generation

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

  • Paper: https://arxiv.org/abs/2212.09478

  • Code: https://github.com/researchmm/MM-Diffusion

Video Understanding

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

  • Paper: https://arxiv.org/abs/2209.15280

  • Code: https://github.com/TencentARC/TVTS

Action Detection

TriDet: Temporal Action Detection with Relative Boundary Modeling

  • Paper: https://arxiv.org/abs/2303.07347

  • Code: https://github.com/dingfengshi/TriDet

Text Detection

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

  • Paper: https://arxiv.org/abs/2211.10772

  • Code: https://github.com/ViTAE-Transformer/DeepSolo

Knowledge Distillation

Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation

  • Paper: https://arxiv.org/abs/2302.14290

  • Code: None

Generic-to-Specific Distillation of Masked Autoencoders

  • Paper: https://arxiv.org/abs/2302.14771

  • Code: https://github.com/pengzhiliang/G2SD

Model Pruning

DepGraph: Towards Any Structural Pruning

  • Paper: https://arxiv.org/abs/2301.12900

  • Code: https://github.com/VainF/Torch-Pruning

Image Compression

Context-Based Trit-Plane Coding for Progressive Image Compression

  • Paper: https://arxiv.org/abs/2303.05715

  • Code: https://github.com/seungminjeon-github/CTC

Anomaly Detection

Deep Feature In-painting for Unsupervised Anomaly Detection in X-ray Images

  • Paper: https://arxiv.org/abs/2111.13495

  • Code: https://github.com/tiangexiang/SQUID

3D Reconstruction

OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields

  • Paper: https://arxiv.org/abs/2211.12886

  • Code: None

SparsePose: Sparse-View Camera Pose Regression and Refinement

  • Paper: https://arxiv.org/abs/2211.16991

  • Code: None

NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction

  • Paper: https://arxiv.org/abs/2303.02375

  • Code: None

Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition

  • Homepage: https://moygcc.github.io/vid2avatar/

  • Paper: https://arxiv.org/abs/2302.11566

  • Code: https://github.com/MoyGcc/vid2avatar

  • Demo: https://youtu.be/EGi47YeIeGQ

To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision

  • Paper: https://arxiv.org/abs/2106.09614

  • Code: https://github.com/unibas-gravis/Occlusion-Robust-MoFA

Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction

  • Paper: https://arxiv.org/abs/2303.05937

  • Code: None

3D Cinemagraphy from a Single Image

  • Homepage: https://xingyi-li.github.io/3d-cinemagraphy/

  • Paper: https://arxiv.org/abs/2303.05724

  • Code: https://github.com/xingyi-li/3d-cinemagraphy

Revisiting Rotation Averaging: Uncertainties and Robust Losses

  • Paper: https://arxiv.org/abs/2303.05195

  • Code: https://github.com/zhangganlin/GlobalSfMpy

FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction

  • Paper: https://arxiv.org/abs/2211.13874

  • Code: https://github.com/csbhr/FFHQ-UV

Depth Estimation

Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

  • Paper: https://arxiv.org/abs/2211.13202

  • Code: https://github.com/noahzn/Lite-Mono

Trajectory Prediction

IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction

  • Paper: https://arxiv.org/abs/2303.00575

  • Code: None

Image Captioning

ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing

  • Paper: https://arxiv.org/abs/2303.02437

  • Code: None

Visual Question Answering

MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering

  • Paper: https://arxiv.org/abs/2303.01239

  • Code: https://github.com/jingjing12110/MixPHM

Sign Language Recognition

Continuous Sign Language Recognition with Correlation Network

  • Paper: https://arxiv.org/abs/2303.03202

  • Code: https://github.com/hulianyuyy/CorrNet

Video Prediction

MOSO: Decomposing MOtion, Scene and Object for Video Prediction

  • Paper: https://arxiv.org/abs/2303.03684

  • Code: https://github.com/anonymous202203/MOSO

Novel View Synthesis

3D Video Loops from Asynchronous Input

  • Homepage: https://limacv.github.io/VideoLoop3D_web/

  • Paper: https://arxiv.org/abs/2303.05312

  • Code: https://github.com/limacv/VideoLoop3D

Zero-Shot Learning

Bi-directional Distribution Alignment for Transductive Zero-Shot Learning

  • Paper: https://arxiv.org/abs/2303.08698

  • Code: https://github.com/Zhicaiwww/Bi-VAEGAN

Semantic Prompt for Few-Shot Learning

  • Paper: None

  • Code: None

Stereo Matching

Iterative Geometry Encoding Volume for Stereo Matching

  • Paper: https://arxiv.org/abs/2303.06615

  • Code: https://github.com/gangweiX/IGEV

Scene Graph Generation

Prototype-based Embedding Network for Scene Graph Generation

  • Paper: https://arxiv.org/abs/2303.07096

  • Code: None

Datasets

Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

  • Paper: https://arxiv.org/abs/2303.02760

  • Code: None

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

  • Homepage: https://boheumd.github.io/A2Summ/

  • Paper: https://arxiv.org/abs/2303.07284

  • Code: https://github.com/boheumd/A2Summ

Others

Interactive Segmentation as Gaussian Process Classification

  • Paper: https://arxiv.org/abs/2302.14578

  • Code: None

Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

  • Paper: https://arxiv.org/abs/2302.14677

  • Code: None

SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries

  • Homepage: http://bit.ly/splinecam

  • Paper: https://arxiv.org/abs/2302.12828

  • Code: None

SCOTCH and SODA: A Transformer Video Shadow Detection Framework

  • Paper: https://arxiv.org/abs/2211.06885

  • Code: None

DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization

  • Homepage: https://ai4ce.github.io/DeepMapping2/

  • Paper: https://arxiv.org/abs/2212.06331

  • Code: https://github.com/ai4ce/DeepMapping2

Token Turing Machines

  • Paper: https://arxiv.org/abs/2211.09119

  • Code: None

Single Image Backdoor Inversion via Robust Smoothed Classifiers

  • Paper: https://arxiv.org/abs/2303.00215

  • Code: https://github.com/locuslab/smoothinv

To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision

  • Paper: https://arxiv.org/abs/2106.09614

  • Code: https://github.com/unibas-gravis/Occlusion-Robust-MoFA

HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics

  • Homepage: https://dolorousrtur.github.io/hood/

  • Paper: https://arxiv.org/abs/2212.07242

  • Code: https://github.com/dolorousrtur/hood

  • Demo: https://www.youtube.com/watch?v=cBttMDPrUYY

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

  • Paper: https://arxiv.org/abs/2212.04825

  • Code: https://github.com/facebookresearch/Whac-A-Mole.git

RelightableHands: Efficient Neural Relighting of Articulated Hand Models

  • Homepage: https://sh8.io/#/relightable_hands

  • Paper: https://arxiv.org/abs/2302.04866

  • Code: None

  • Demo: https://sh8.io/static/media/teacher_video.923d87957fe0610730c2.mp4

Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation

  • Paper: https://arxiv.org/abs/2303.00914

  • Code: None

Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression

  • Paper: https://arxiv.org/abs/2303.01052

  • Code: None

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

  • Paper: https://arxiv.org/abs/2303.00938

  • Code: None

Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness

  • Paper: https://arxiv.org/abs/2303.00971

  • Code: https://github.com/zhijieshen-bjtu/DOPNet

Learning Neural Parametric Head Models

  • Homepage: https://simongiebenhain.github.io/NPHM

  • Paper: https://arxiv.org/abs/2212.02761

  • Code: None

A Meta-Learning Approach to Predicting Performance and Data Requirements

  • Paper: https://arxiv.org/abs/2303.01598

  • Code: None

MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision

  • Homepage: https://imagine.enpc.fr/~guedona/MACARONS/

  • Paper: https://arxiv.org/abs/2303.03315

  • Code: None

Masked Images Are Counterfactual Samples for Robust Fine-tuning

  • Paper: https://arxiv.org/abs/2303.03052

  • Code: None

HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

  • Paper: https://arxiv.org/abs/2303.02700

  • Code: None

Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization

  • Paper: https://arxiv.org/abs/2303.02328

  • Code: None

Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization

  • Paper: https://arxiv.org/abs/2303.03108

  • Code: None

Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples

  • Paper: https://arxiv.org/abs/2301.01217

  • Code: https://github.com/jiamingzhang94/Unlearnable-Clusters

Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes

  • Paper: https://arxiv.org/abs/2303.04249

  • Code: None

UniHCP: A Unified Model for Human-Centric Perceptions

  • Paper: https://arxiv.org/abs/2303.02936

  • Code: https://github.com/OpenGVLab/UniHCP

CUDA: Convolution-based Unlearnable Datasets

  • Paper: https://arxiv.org/abs/2303.04278

  • Code: https://github.com/vinusankars/Convolution-based-Unlearnability

AdaptiveMix: Robust Feature Representation via Shrinking Feature Space

  • Paper: https://arxiv.org/abs/2303.01559

  • Code: https://github.com/WentianZhang-ML/AdaptiveMix

Physical-World Optical Adversarial Attacks on 3D Face Recognition

  • Paper: https://arxiv.org/abs/2205.13412

  • Code: https://github.com/PolyLiYJ/SLAttack.git

DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

  • Paper: https://arxiv.org/abs/2301.06281

  • Code: https://carlyx.github.io/DPE/

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

  • Paper: https://arxiv.org/abs/2211.12194

  • Code: https://github.com/Winfredy/SadTalker

Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models

  • Paper: None

  • Code: None

Sharpness-Aware Gradient Matching for Domain Generalization

  • Paper: None

  • Code: https://github.com/Wang-pengfei/SAGM

Mind the Label-shift for Augmentation-based Graph Out-of-distribution Generalization

  • Paper: None

  • Code: None

Blind Video Deflickering by Neural Filtering with a Flawed Atlas

  • Homepage: https://chenyanglei.github.io/deflicker

  • Paper: None

  • Code: None

RiDDLE: Reversible and Diversified De-identification with Latent Encryptor

  • Paper: None

  • Code: https://github.com/ldz666666/RiDDLE

PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation

  • Paper: https://arxiv.org/abs/2303.07337

  • Code: None

Upcycling Models under Domain and Category Shift

  • Paper: https://arxiv.org/abs/2303.07110

  • Code: https://github.com/ispc-lab/GLC

Modality-Agnostic Debiasing for Single Domain Generalization

  • Paper: https://arxiv.org/abs/2303.07123

  • Code: None

Progressive Open Space Expansion for Open-Set Model Attribution

  • Paper: https://arxiv.org/abs/2303.06877

  • Code: None

Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies

  • Paper: https://arxiv.org/abs/2303.06856

  • Code: None

GFPose: Learning 3D Human Pose Prior with Gradient Fields

  • Paper: https://arxiv.org/abs/2212.08641

  • Code: https://github.com/Embracing/GFPose

PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment

  • Paper: https://arxiv.org/abs/2303.11526

  • Code: https://github.com/Zhang-VISLab

Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings

  • Paper: https://arxiv.org/abs/2303.11502

  • Code: None

Boundary Unlearning

  • Paper: https://arxiv.org/abs/2303.11570
