In a Coursera guided project I am working on, the instructor uses
from skimage.transform import rescale
image_rescaled = rescale(rescale(image,0.5),2.0)
to distort the image.
The error that occurs on my own machine (it does not happen in the project's Jupyter notebook, probably because of different module and Python versions) is that the number of channels of image_rescaled increases by 1.
For example => images_normal.shape = (256, 256, 256, 3) and images_with_twice_reshape.shape = (256, 256, 256, 4).
This problem does not occur if I use rescale(rescale(image, 2.0), 0.5) instead.
Is this expected behaviour in a newer version of Python/skimage, or am I doing something wrong?
For additional reference (nothing has been removed from the source code, but the important parts are highlighted with #s):
import os
import re
from scipy import ndimage, misc
from skimage.transform import resize, rescale
from matplotlib import pyplot
import numpy as np
def train_batches(just_load_dataset=False):

    batches = 256 # Number of images to have at the same time in a batch
    batch = 0 # Number of images in the current batch (grows over time and then resets for each batch)
    batch_nb = 0 # Batch current index
    ep = 4 # Number of epochs

    images = []
    x_train_n = []
    x_train_down = []

    x_train_n2 = [] # Resulting high res dataset
    x_train_down2 = [] # Resulting low res dataset

    for root, dirnames, filenames in os.walk("data/cars_train.nosync"):
        for filename in filenames:
            if re.search(".(jpg|jpeg|JPEG|png|bmp|tiff)$", filename):
                filepath = os.path.join(root, filename)
                image = pyplot.imread(filepath)
                if len(image.shape) > 2:

                    image_resized = resize(image, (256, 256)) # Resize the image so that every image is the same size
                    #########################
                    x_train_n.append(image_resized) # Add this image to the high res dataset
                    x_train_down.append(rescale(rescale(image_resized, 0.5), 2.0)) # Rescale it 0.5x and 2x so that it is a low res image but still has 256x256 resolution
                    ########################
                    # >>>> x_train_down.append(rescale(rescale(image_resized, 2.0), 0.5)), this one works and gives the same shape of x_train_down and x_train_n.
                    ########################
                    batch += 1
                    if batch == batches:
                        batch_nb += 1

                        x_train_n2 = np.array(x_train_n)
                        x_train_down2 = np.array(x_train_down)

                        if just_load_dataset:
                            return x_train_n2, x_train_down2

                        print('Training batch', batch_nb, '(', batches, ')')

                        autoencoder.fit(x_train_down2, x_train_n2,
                                        epochs=ep,
                                        batch_size=10,
                                        shuffle=True,
                                        validation_split=0.15)

                        x_train_n = []
                        x_train_down = []

                        batch = 0

    return x_train_n2, x_train_down2
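The arrays whose shapes are quoted below can be pulled out without training via the just_load_dataset switch, for example:

x_train_n2, x_train_down2 = train_batches(just_load_dataset=True)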
With the code above I get x_train_n2.shape = (256, 256, 256, 3) and x_train_down2.shape = (256, 256, 256, 4).
I was able to reproduce your issue as follows:
import numpy as np
from skimage.transform import resize, rescale
image = np.random.random((512, 512, 3))
resized = resize(image, (256, 256))
rescaled2x = rescale(
    rescale(resized, 0.5),
    2,
)
print(rescaled2x.shape)
# prints (256, 256, 4)
The problem is that resize can infer that your final dimension is channels/RGB, because you give it a 2D output shape. rescale, on the other hand, treats your array as a 3D image of shape (256, 256, 3), which it downscales to (128, 128, 2), interpolating along the colours as well, as if they were another spatial dimension, and then upsamples to (256, 256, 4).
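Here is a small standalone sketch of those intermediate shapes (the exact numbers assume a scikit-image build that behaves as described above, i.e. no channel handling by default):

import numpy as np
from skimage.transform import resize, rescale

image = np.random.random((512, 512, 3))

resized = resize(image, (256, 256))   # explicit 2D target shape: the channel axis is left alone
print(resized.shape)                  # (256, 256, 3)

halved = rescale(resized, 0.5)        # no channel hint: every axis is scaled, colours included
print(halved.shape)                   # (128, 128, 2)

print(rescale(halved, 2).shape)       # (256, 256, 4)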
If you look at the rescale documentation, you will find the "multichannel" parameter, described as:

Whether the last axis of the image is to be interpreted as multiple channels or another spatial dimension.
So, updating my code:
rescaled2x = rescale(
    rescale(resized, 0.5, multichannel=True),
    2,
    multichannel=True,
)
print(rescaled2x.shape)
# prints (256, 256, 3)
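One caveat about versions (an assumption about your environment, not something taken from the course notebook): in newer scikit-image releases (roughly 0.19 onwards) multichannel is deprecated in favour of channel_axis and is eventually removed, so the equivalent fix there would look like:

rescaled2x = rescale(
    rescale(resized, 0.5, channel_axis=-1),   # treat the last axis as channels
    2,
    channel_axis=-1,
)
print(rescaled2x.shape)
# prints (256, 256, 3)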