I am training a LeNet-300-100 fully-connected neural network on MNIST in Google Colab, using Python 3 and PyTorch 1.8.
To apply the transforms and download the MNIST dataset, I use the following code:
# MNIST dataset statistics:
# mean = tensor([0.1307]) & std dev = tensor([0.3081])
import numpy as np
from torchvision import transforms

mean = np.array([0.1307])
std_dev = np.array([0.3081])
transforms_apply = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std_dev)
])
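The dataset itself is then downloaded with a call along these lines (reconstructed to match the cell shown in the traceback below):

import torchvision

train_dataset = torchvision.datasets.MNIST(
    root='./data', train=True,
    transform=transforms_apply, download=True
)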
which gives the error:
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      2 train_dataset = torchvision.datasets.MNIST(
      3     root='./data', train=True,
----> 4     transform=transforms_apply, download=True
      5 )
      6

11 frames
/usr/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 503: Service Unavailable
What is going wrong?
I had the same 503 error, and this worked for me:
!wget www.di.ens.fr/~lelarge/MNIST.tar.gz
!tar -zxvf MNIST.tar.gz
from torchvision.datasets import MNIST
from torchvision import transforms
train_set = MNIST('./', download=True,
                  transform=transforms.Compose([
                      transforms.ToTensor(),
                  ]), train=True)
test_set = MNIST('./', download=True,
                 transform=transforms.Compose([
                     transforms.ToTensor(),
                 ]), train=False)
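A quick check that the extracted archive was actually picked up (a minimal sketch; the expected sizes are the standard 60,000/10,000 MNIST split):

print(len(train_set), len(test_set))  # expected: 60000 10000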
MNIST hosted at http://yann.lecun.com/exdb/mnist/ has been having a lot of trouble, so PyTorch got permission to host it elsewhere and it is now served from Amazon AWS. Unfortunately, the fix is currently only available in the nightly build (you can find the fixed code here).
A workaround I found useful is:
from torchvision import datasets, transforms

new_mirror = 'https://ossci-datasets.s3.amazonaws.com/mnist'
datasets.MNIST.resources = [
    ('/'.join([new_mirror, url.split('/')[-1]]), md5)
    for url, md5 in datasets.MNIST.resources
]

# Any transform works here; ToTensor() keeps the example minimal.
transform = transforms.ToTensor()
train_dataset = datasets.MNIST(
    "../data", train=True, download=True, transform=transform
)
According to torchvision issue 3549, this will be fixed in the next minor release.
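Once that release is out, upgrading torchvision in Colab should be enough to pick up the new mirror list (a sketch; pip may also need to pull in a matching torch build, and the runtime has to be restarted afterwards):

!pip install --upgrade torchvision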
This issue has already been resolved in torchvision==0.9.1, as per this. As a temporary solution, use the following workaround:
from torchvision import datasets, transforms
datasets.MNIST.resources = [
('https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz', 'f68b3c2dcbeaaa9fbdd348bbdeb94873'),
('https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz', 'd53e105ee54ea40749a09fcbcd1e9432'),
('https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz', '9fb629c4189551a2d022fa330f9573f3'),
('https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz', 'ec29112dd5afa0611ce80d1b7f02629c')
]
# AND the rest of your code as usual for train and test (EXAMPLE):
import torch

batch_sz = 100
tr_ = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
# MNIST
train_dataset = datasets.MNIST(
root='./dataset',
train=True,
transform=tr_,
download=True
)
test_dataset = datasets.MNIST(
root='./dataset',
train=False,
transform=tr_
)
# DataLoader
train_loader = torch.utils.data.DataLoader(
dataset=train_dataset,
batch_size=batch_sz,
shuffle=True
)
test_loader = torch.utils.data.DataLoader(
dataset=test_dataset,
batch_size=batch_sz,
shuffle=False
)
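A quick sanity check on the loaders defined above (a minimal sketch; the shapes assume batch_sz = 100 and the Normalize transform from tr_):

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([100, 1, 28, 28])
print(labels.shape)  # torch.Size([100])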
You can try:
from sklearn.datasets import fetch_openml

# as_frame=False returns plain numpy arrays (newer scikit-learn versions
# otherwise return a DataFrame, which has no reshape()).
mnist = fetch_openml('mnist_784', data_home=".", as_frame=False)
x = mnist.data
x = x.reshape((-1, 28, 28))
x = x.astype('float32')
y = mnist.target
y = y.astype('float32')
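Since the question is about PyTorch, here is a minimal sketch for wrapping these arrays in a TensorDataset (assuming the x and y arrays from above; note the OpenML copy is the full 70,000-sample set, not split into train/test):

import torch
from torch.utils.data import TensorDataset, DataLoader

# Scale pixels to [0, 1] and add a channel dimension: (N, 1, 28, 28).
x_t = torch.from_numpy(x).unsqueeze(1) / 255.0
y_t = torch.from_numpy(y).long()
dataset = TensorDataset(x_t, y_t)
loader = DataLoader(dataset, batch_size=100, shuffle=True)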
For PyTorch 0.4.0 in the Udacity notebooks. The solution is inspired by the answer above.
from torchvision import datasets, transforms

new_mirror = 'https://ossci-datasets.s3.amazonaws.com/mnist'
datasets.MNIST.urls = [
    str('/'.join([new_mirror, url.split('/')[-1]]))
    for url in datasets.MNIST.urls
]

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                                ])
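With the URLs patched, the dataset is then created as usual (a sketch; the root directory './data' is arbitrary):

train_set = datasets.MNIST('./data', download=True, train=True, transform=transform)
test_set = datasets.MNIST('./data', download=True, train=False, transform=transform)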
Use:
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
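The Keras loader returns plain NumPy arrays, so no torchvision download is involved at all; a quick check of what you get:

print(x_train.shape, x_train.dtype)  # (60000, 28, 28) uint8, pixel values in [0, 255]
print(y_train.shape, y_test.shape)   # (60000,) (10000,)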
You haven't done anything wrong. This is an issue with the platform where the data is hosted. With PyTorch, you can download MNIST using the following code:
import torch
import torchvision
from torchvision.datasets import MNIST
# Download training dataset
dataset = MNIST(root='data/', download=True)
The MNIST wrapper in torchvision.datasets used above tries a number of possible locations where the data is available. When you run the code, you can see that it first tries to download from the Yann LeCun site, fails, and then falls back to the other available mirrors.
Potential reason: the Yann LeCun site is missing an up-to-date SSL certificate, so some methods of downloading the files take this security measure into account while others do not.
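To see which locations the wrapper will try, you can inspect the class attributes (a sketch; the mirrors/resources attribute names assume a recent torchvision release, 0.9.1 or newer):

from torchvision.datasets import MNIST

print(MNIST.mirrors)    # the yann.lecun.com URL followed by the AWS mirror
print(MNIST.resources)  # (filename, md5) pairs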