I'm trying to load the MNIST Original dataset in Python. The sklearn.datasets.fetch_openml function doesn't seem to work for this.
Here is the code I'm using:
from sklearn.datasets import fetch_openml
dataset = fetch_openml("MNIST Original")
I get this error:
  File "generateClassifier.py", line 11, in <module>
    dataset = fetch_openml("MNIST Original")
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 526, in fetch_openml
    data_info = _get_data_info_by_name(name, version, data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 302, in _get_data_info_by_name
    data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 169, in _get_json_content_from_openml_api
    raise error
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 164, in _get_json_content_from_openml_api
    return _load_json()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 52, in wrapper
    return f()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 160, in _load_json
    with closing(_open_openml_url(url, data_home)) as response:
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 109, in _open_openml_url
    with closing(urlopen(req)) as fsrc:
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
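How can I fix this? Alternatively, is there another way to load the MNIST dataset into Python?
I'm using scikit-learn version 0.20.2.
I'm fairly new to programming in general, so I'd really appreciate a simple answer. Thanks!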
Try
mnist = fetch_openml('mnist_784')
I found it on https://www.openml.org/ under https://www.openml.org/d/554
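The older fetch_mldata() function downloads the dataset from mldata.org, which is unstable and often cannot be reached (fetch_openml() fetches from openml.org instead). Another option is to download the dataset manually from the original source. You can download the data from Kaggle (the MNIST data) and run the following code: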
from scipy.io import loadmat
mnist = loadmat("../input/mnist-original.mat")
mnist_data = mnist["data"].T
mnist_label = mnist["label"][0]
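If it helps to see what the transpose and the [0] index do to the array shapes, here is a minimal sketch. It assumes the .mat file stores "data" as a 784 x N array and "label" as a 1 x N array of floats, which is what the code above implies. The small file it writes (mnist-like.mat) is a made-up stand-in so the snippet runs without the Kaggle download.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Build a tiny stand-in file with the assumed layout of the Kaggle file:
# "data" is (784, n_samples) and "label" is (1, n_samples).
n_samples = 5
fake = {
    "data": np.zeros((784, n_samples), dtype=np.uint8),
    "label": np.arange(n_samples, dtype=np.float64).reshape(1, n_samples),
}
path = os.path.join(tempfile.mkdtemp(), "mnist-like.mat")  # hypothetical file name
savemat(path, fake)

mnist = loadmat(path)
mnist_data = mnist["data"].T                 # one row per image: (n_samples, 784)
mnist_label = mnist["label"][0].astype(int)  # flat vector of integer labels

# Each row can be reshaped back into a 28 x 28 image for plotting.
first_image = mnist_data[0].reshape(28, 28)
print(mnist_data.shape, mnist_label.shape, first_image.shape)
```

With the real file you would skip the savemat part and point loadmat at wherever you saved the download.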
You can use:
mnist = fetch_openml('mnist_784', version=1)
fetch_mldata has been deprecated since scikit-learn v0.20.
Check the version
import sklearn
sklearn.__version__
Load the dataset
from sklearn.datasets import fetch_openml
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
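As a rough sketch of what you might do next with X and y: the labels for mnist_784 come back as strings, and newer scikit-learn versions may return pandas objects rather than NumPy arrays, so it can help to normalize both before training. The helper name prepare_mnist and the synthetic demo arrays are my own, purely for illustration. The positional 60000/10000 split is an assumption that follows the linked example.

```python
import numpy as np


def prepare_mnist(X, y, n_train=60000):
    """Scale pixels to [0, 1], convert labels to integers, and split by position."""
    X = np.asarray(X, dtype=np.float64) / 255.0
    y = np.asarray(y).astype(int)  # labels arrive as strings such as '5'
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]


# Tiny synthetic stand-in so the sketch runs without downloading anything.
X_demo = np.full((10, 784), 255, dtype=np.uint8)
y_demo = np.array([str(i) for i in range(10)], dtype=object)

X_train, X_test, y_train, y_test = prepare_mnist(X_demo, y_demo, n_train=8)
print(X_train.shape, X_test.shape, y_test)
```

With the real data you would call prepare_mnist(X, y) and keep the default n_train.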
Example
- https://scikit-learn.org/stable/auto_examples/neural_networks/plot_mnist_filters.html
I ran into a similar problem. Updating sklearn fixed it for me.
I just ran the following command:
conda update scikit-learn
Then, to verify the version, you can do the following:
import nltk
import sklearn
print('nltk version: {}.'.format(nltk.__version__))
print('scikit-learn version: {}.'.format(sklearn.__version__))
Don't forget to restart the kernel after updating sklearn.
mnist = fetch_openml('mnist_784')
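If you are not sure whether your installed version is new enough, a small check like the one below may help. fetch_openml was added in scikit-learn 0.20. The helper name has_fetch_openml is hypothetical, and it only parses a version string, so it runs even without scikit-learn installed.

```python
import re


def has_fetch_openml(version):
    """Return True if a scikit-learn version string is 0.20 or newer."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 20)


# A few sample version strings for illustration.
for v in ("0.19.2", "0.20.2", "1.3.0"):
    print(v, has_fetch_openml(v))
```

In practice you would pass sklearn.__version__ instead of the sample strings.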