我使用Pytorch DataLoader创建了我的";批处理数据";洛德,但我遇到了一些问题。
作为pytorch DataLoader Shuffer的定义。
shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False)
数据将在每个时期后重新整理。但是,尽管我将shuffle设置为False,但我可能也会在同一时期的每次迭代中得到完全不同的批次。
testData = torchvision.datasets.FashionMNIST(
root="data",
train=False,
download=True,
transform=ToTensor()
)
CurrentFoldTestDataLoader = data.DataLoader(testData, batch_size=32, shuffle=False)
for i in range(1000):
test_features, test_labels = next(iter(CurrentFoldTestDataLoader))
print(i,test_labels)
在这里,我在每次迭代中都得到了相同的批次。
0 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
1 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
2 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
3 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
4 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
5 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
6 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
7 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
8 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
9 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
10 tensor([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 4, 8, 0, 2, 5, 7, 9,
1, 4, 6, 0, 9, 3, 8, 8])
为什么会这样?我对shuffle的定义是否不准确?
代码的问题在于,您正在为for循环中的每个步骤重新实例化相同的迭代器。使用shuffle=False
,迭代器生成相同的第一批图像。尝试在循环之外实例化加载程序:
loader = data.DataLoader(testData, batch_size=32, shuffle=False)
for i, data in enumerate(loader):
test_features, test_labels = data
print(i, test_labels)