在做3D
分割任务中,多数的方法多采用整体缩放,或裁剪成一个个小的patch
操作,这样做的一个主要原因是内存问题。还有就是有些目标太小,比如分割结节,用整图直接输入网络,正负样本的不均衡是非常大的。
相较于整体缩放,采用裁剪成patch
的方法,对于小目标会更加的鲁棒,这也是大多数3D
分割任务中常选取的方式。尤其是针对医学影像的器官分割任务,CT
结节诊断等等,对于细节的要求是非常高的。采用缩小的方式,反而会使得目标的像素区域在输入阶段,就损失较多。
本文,就针对2D
、3D
的图像和MR
数据进行有重叠的crop
和merge
操作,帮助对其中的细节进行理解。通过本文的学习,对于下节推理阶段的理解,有较大的帮助。
一、2D crop and merge
对于一个[10, 10]
大小的示例图像,采用patch大小为[3, 3]
的进行裁剪,每次patch
与patch之间,在x和y方向重叠1
个像素,无法构成一个patch的部分,选择丢弃,如下所示:
这里有一个共识,那就是当输入图像的大小,patch
的尺寸,以及overlap
的size
确定后,怎么裁剪,和能裁剪出多少个patch
,就已经确定了。下面我们就分析下,这个过程是什么样子的。
- 首先,决定行列可以有多少个
patch
,是左上角第一个patch
中右下角的那个红色点,因为他是第一个在水平和竖直方向都会需要重叠的点; - 反应在行和列上面,可以移动的区域也就是
width - patch_width + 1
,和height - patch_height + 1
,因为对于左上角第一个patch
,只有右下角的坐标,是参与到遍历里面的,可参与遍历的区域就是红色曲线区域; - 对于行、列的每一步,都会重叠
overlap_size
个像素区域,所以可走的步长,是patch_width - overlap_width
,和patch_height - overlap_height
; - 最后以左上角的坐标点为准,
x:x+patch_width
和y:y+patch_height
就表示了一个区块patch
的像素范围,被裁剪下来。
1.1、crop 操作
下面是实现上述步骤的代码,主要就是要理解几个点:
- 在水平和数值方向,能取到多少个
patch
? -
patch
的滑动选取,一次可以移动多大的步长? - 最后取像素块,就简单了许多。
实操代码如下:
import numpy as np
def crop_volume(volume, patch_size=[96, 96], overlap_size=[4, 4]):
"""
Crop a 2D volume into patches with specified size and overlap.
Args:
volume (np.ndarray): the 3D volume to be cropped, with shape [width, height]
patch_size (tuple or list): the size of patch, with format [patch_width, patch_height]
overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height]
Returns:
np.ndarray: the cropped patches, with shape [num_patches, patch_width, patch_height]
"""
width, height = volume.shape
patch_width, patch_height = patch_size
overlap_width, overlap_height = overlap_size
patches = []
# 不够一个patch,就丢弃
for x in range(0, width - patch_width + 1, patch_width - overlap_width):
for y in range(0, height - patch_height + 1, patch_height - overlap_height):
print(x, y)
patch = volume[x:x+patch_width, y:y+patch_height]
patches.append(patch)
print('\n')
patches = np.asarray(patches)
return patches
# 生成一个[10, 10]大小的示例图像
imgs = np.random.rand(10, 10)
patch_size=[3, 3]
overlap_size = [1, 1]
# print('img shape:', imgs.shape)
patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)
验证了前面我们的猜想,后面我们直接取一个图片,来验证下我们的思路。如果上述的思路是对的,那么,在裁剪后保存的图像,就该是一个具体部分重叠区域,但是,还能够反映全貌的一个个小图。下面就是:
import os
import itk
import cv2
from matplotlib import pylab as plt
from PIL import Image
path = os.path.join(r'F:\tmp\results2', '10.png')
imgs = cv2.imread(path, 0)
volume_size = imgs.shape
patch_size=[96, 96]
overlap_size = [4, 4]
print('img shape:', imgs.shape)
patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)
for i in range(0, patches.shape[0], 1):
one_patch = patches[i, :, :]
print(i, one_patch.shape)
width_p, height_p = one_patch.shape
img_Image = Image.fromarray(one_patch)
img_Image = img_Image.convert("L")
img_Image.save(r"F:\tmp\results1/" + str(i) + ".png")
如下,是读取的原始图像,和crop
后的一个个散落的小图。尽管一个个小图在我们展示的时候,他们之间使用间隙的,但并不影响我们看到他的全貌。
还是一种图的样子,区别在与彩色图像成了灰度图,最右侧和最下侧的像素像是少了一些。这是因为不足一个patch
,被丢弃的原因。基于此,我们也能给猜出来,在后续merge
阶段,可能会还原回去的图像存在些许的差异。
1.2、merge 操作
merge
是crop
的一个逆过程。当时我们怎么拆的,现在就原路给拼接回去。拼接回去的图像尺寸和crop前的图像尺寸是一致的,当时被忽略的部分,都是0
。
- 移动的步长还是一致的,在行列方向分别还是:
patch_width - overlap_width
,和patch_height - overlap_height
; - 还要需要知道在行列方向,分别crop了多少次。以行为例,
width - patch_width
就是把第一个patch
块去除掉的行长度,再除以(patch_width - overlap_width)
,得到剩余部分可以走多少步,再+1
,把第一个块的数量补上,也就得到了在行方向上,有多少个patch了; - 以左上角的坐标点为准,
x:x+patch_width
和y:y+patch_height
就表示了一个区块patch
了,现在就把对应的patch像素,给赋值给原始图了。 - 因为
overlap
重叠区域,被多次覆盖,这部分需要求均值,巧妙的采用了一个新的像素块,专门记录被赋值的次数,最后除以对应的次数,就可能实现求均值的过程了。
为什么先需要把第一个patch
块的行列方向都先去掉呢?
因为第一个块是最特殊的,它被重叠的区域,只有
overlap_size_w
行和overlap_size_h
例,而其余的patch
块,重叠区域都会是2行和2列,都遵循步长的节奏。
width - patch_width
把第一个patch
块去除掉后的行长度,还能准确反映有多少个patch
吗?
答案是可以的,这是因为减去一个
patch
,无非就是少了一个overlap_size_h
的长度,去掉一个overlap_size_h
的长度,如果恰好整除,那么加上这个长度,也是多余的,无法再次构成一个新的patch;即便有剩余,也是无法组成一个新的patch
的。
下面是上面图像crop
阶段裁剪得到的patch
,加上merge
后的操作,如下:
def merge_patches(patches, volume_size, overlap_size):
"""
Merge the cropped patches into a complete 2D volume.
Args:
patches (np.ndarray): the cropped patches, with shape [num_patches, patch_width, patch_height]
volume_size (tuple or list): the size of the complete volume, with format [width, height]
overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height]
Returns:
np.ndarray: the merged volume, with shape [width, height]
"""
width, height = volume_size
patch_width, patch_height = patches.shape[1:]
overlap_width, overlap_height = overlap_size
num_patches_x = (width - patch_width) // (patch_width - overlap_width) + 1
num_patches_y = (height - patch_height) // (patch_height - overlap_height) + 1
print('merge:', num_patches_x, num_patches_y)
merged_volume = np.zeros(volume_size)
weight_volume = np.zeros(volume_size) # weight_volume的目的是用于记录每个像素在裁剪过程中被遍历的次数,最后用于求平均值
idx = 0
for x in range(num_patches_x):
for y in range(num_patches_y):
x_start = x * (patch_width - overlap_width)
y_start = y * (patch_height - overlap_height)
merged_volume[x_start:x_start+patch_width, y_start:y_start+patch_height] += patches[idx]
weight_volume[x_start:x_start+patch_width, y_start:y_start+patch_height] += 1
idx += 1
merged_volume /= weight_volume
return merged_volume
path = os.path.join(r'./results2', '10.png')
imgs = cv2.imread(path, 0)
volume_size = imgs.shape
patch_size=[96, 96]
overlap_size = [4, 4]
print('img shape:', imgs.shape)
patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)
merged_volume = merge_patches(patches, volume_size, overlap_size)
print('merged_volume shape:', merged_volume.shape)
merged_volume = Image.fromarray(merged_volume)
merged_volume = merged_volume.convert("L")
merged_volume.save(r"./results2/" + "merged_volume.png")
如下,果然和我们前面猜想的一样,在merge
后的图像,相比于原图,在右侧和下侧,都少了部分,这个问题后面在3D
时候,再细细的探讨。
二、3D crop and merge
3D
部分相比于2D
,无非就是多了一个深度信息,也就是z
轴信息。所以在cro
p阶段和merge
阶段,只需要按照2D
行列的方式,增加一个维度的信息即可。
尽管说仅仅是增加了一个纬度,但是对于很多人来说,理解一个二维的平面是好理解的,但是突然变成了三维,还是有些拗不过弯的。此时,如果能绘制出一个三维的图,帮助理解,就再好不过了。
代码如下所示:
import numpy as np
def crop_volume(volume, patch_size=[96, 96, 96], overlap_size=[4, 4, 4]):
"""
Crop a 3D volume into patches with specified size and overlap.
Args:
volume (np.ndarray): the 3D volume to be cropped, with shape [width, height, depth]
patch_size (tuple or list): the size of patch, with format [patch_width, patch_height, patch_depth]
overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height, overlap_depth]
Returns:
np.ndarray: the cropped patches, with shape [num_patches, patch_width, patch_height, patch_depth]
"""
width, height, depth = volume.shape
patch_width, patch_height, patch_depth = patch_size
overlap_width, overlap_height, overlap_depth = overlap_size
patches = []
for z in range(0, depth - patch_depth + 1, patch_depth - overlap_depth):
for y in range(0, height - patch_height + 1, patch_height - overlap_height):
for x in range(0, width - patch_width + 1, patch_width - overlap_width):
patch = volume[x:x+patch_width, y:y+patch_height, z:z+patch_depth]
patches.append(patch)
patches = np.asarray(patches)
return patches
def merge_patches(patches, volume_size, overlap_size):
"""
Merge the cropped patches into a complete 3D volume.
Args:
patches (np.ndarray): the cropped patches, with shape [num_patches, patch_width, patch_height, patch_depth]
volume_size (tuple or list): the size of the complete volume, with format [width, height, depth]
overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height, overlap_depth]
Returns:
np.ndarray: the merged volume, with shape [width, height, depth]
"""
width, height, depth = volume_size
patch_width, patch_height, patch_depth = patches.shape[1:]
overlap_width, overlap_height, overlap_depth = overlap_size
num_patches_x = (width - patch_width) // (patch_width - overlap_width) + 1
num_patches_y = (height - patch_height) // (patch_height - overlap_height) + 1
num_patches_z = (depth - patch_depth) // (patch_depth - overlap_depth) + 1
print('merge:', num_patches_x, num_patches_y, num_patches_z)
merged_volume = np.zeros(volume_size)
weight_volume = np.zeros(volume_size) # weight_volume的目的是用于记录每个像素在裁剪过程中被遍历的次数,最后用于求平均值
idx = 0
for z in range(num_patches_z):
for y in range(num_patches_y):
for x in range(num_patches_x):
x_start = x * (patch_width - overlap_width)
y_start = y * (patch_height - overlap_height)
z_start = z * (patch_depth - overlap_depth)
merged_volume[x_start:x_start+patch_width, y_start:y_start+patch_height, z_start:z_start+patch_depth] += patches[idx]
weight_volume[x_start:x_start+patch_width, y_start:y_start+patch_height, z_start:z_start+patch_depth] += 1
idx += 1
merged_volume /= weight_volume
return merged_volume
import os
import itk
nii_path = os.path.join(r'F:\tmp\results2', 'brain.nii.gz')
imgs = itk.array_from_image(itk.imread(nii_path))
imgs /= np.max(imgs)
volume_size = imgs.shape
patch_size = [96, 96, 96]
overlap_size = [32, 32, 32]
print('img shape:', imgs.shape)
# crop
patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)
print(patches.shape)
for d in range(0, patches.shape[0], 1):
one_patch = patches[d, :, :, :]*255
print(d, one_patch.shape)
width_p, height_p, depth_p = one_patch.shape
one_patch = itk.image_from_array(one_patch)
itk.imwrite(one_patch, os.path.join(r'F:\tmp\results2\patch', str(d) + ".nii.gz")) # 保存nii文件
# merge
merged_volume = merge_patches(patches, volume_size, overlap_size)
print('merged_volume shape:', merged_volume.shape)
merged_volume = itk.image_from_array(merged_volume)
itk.imwrite(merged_volume, r'F:\tmp\results2\merged_volume.nii.gz')
原始的图像,如下:
crop
一个块图像,如下:
patch_size = [96, 96, 96],overlap_size = [4, 4, 4]
时,merge
后的图像,发现丢失像素区域较多,如下:
patch_size = [96, 96, 96],overlap_size = [32, 32, 32]
时,merge
后的图像,发现丢失像素区域相对较少,如下:
对于大的patch size
,需要采用较大的overlap size
,否则会导致右侧和下侧丢弃的像素区域过多,在合并的时候,会丢失较多信息。所以说为了避免这个问题,需要使得两者之间达到一个相对的平衡。
究竟选择的patch size
有多大,这个还需要综合来考虑。包括原始图像的大小和GPU
的内存大小。太小了,对于空间相对位置信息的获取,肯定是不利的。太大了也可能内存占用太高,不太经济。
三、总结
这次就是一次对3D
图像大尺寸的一次处理的记录,之前对这块内容涉猎较少,也没有好好的理解。比如说对于脑部MRI
的处理中,就选择把无关区域去除,只留下可能存在目标区域,用于后续的预测。这也是用全图训练的一种取巧的方式。
至于新的patch
方法,可以与老方法进行相互的补充。尤其是对于尺寸较大的输入,patch
的方法,就更加的取巧了。文章来源:https://www.toymoban.com/news/detail-741094.html
除此之外,推荐下MONAI
这个库和nnU-Net、nnFormer、unetr plus plus
系列论文进行阅读和学习,对你做三维立体图像的分割有非常大的帮助,尤其是数据的前处理。赶紧看完了就去搜索看看,麻溜的。文章来源地址https://www.toymoban.com/news/detail-741094.html
到了这里,关于【3D 图像分割】基于 Pytorch 的 VNet 3D 图像分割9(patch 的 crop 和 merge 操作)的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!