TensorFlow Saver.save cannot write to a Docker shared volume



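I have a Docker container with an autoencoder that can be started through a Flask server. All scripts are copied into the container's /root directory, and the container also has access to a shared volume, /data, which looks like this: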

/data
-/images
-/models
   -/Autoenc.exe.ckpt.data-00000-of-00001
   -/Autoenc.exe.ckpt.index
   -/Autoenc.exe.ckpt.meta
   -/checkpoint


/root
-myserver.py

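The server can successfully write images to the /data/images folder, but it fails to write to the /data/models directory. I instantiated the TensorFlow Saver like this: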

saver = tf.train.Saver()

and tried each of the following ways of writing the save files:

saver.save(sess, '/data/models/Autoenc.exe.ckpt')
saver.save(sess, '../data/models/Autoenc.exe.ckpt')

Fun fact: when I do it like this, it works without error:

saver.save(sess, './Autoenc.exe.ckpt')

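But this writes the files to the wrong location, where they are deleted when the Docker container is rebuilt. When the container is built with checkpoints already present in the directory shown above, restoring via
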
saver.restore(sess, "../data/models/Autoenc.exe.ckpt")

works without problems.
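
The relative paths above are resolved against the working directory of the server process, so it may be worth confirming where each one actually points inside the container. The following is only a minimal diagnostic sketch using the standard library; `describe_checkpoint_path` is a hypothetical helper name and not part of TensorFlow.

```python
import os


def describe_checkpoint_path(path):
    """Return the absolute form of `path` and whether its parent directory exists."""
    absolute = os.path.abspath(path)        # resolves '.' and '..' against the cwd
    directory = os.path.dirname(absolute)   # the folder the checkpoint would land in
    return absolute, os.path.isdir(directory)


if __name__ == "__main__":
    print("cwd:", os.getcwd())
    # The three prefixes tried in the question.
    for candidate in ("/data/models/Autoenc.exe.ckpt",
                      "../data/models/Autoenc.exe.ckpt",
                      "./Autoenc.exe.ckpt"):
        print(candidate, "->", describe_checkpoint_path(candidate))
```

If the two /data/models variants resolve to the same existing directory, the path itself is probably not the issue and the volume is the next thing to look at.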

Now let me show you the error message:

2018-02-20 15:00:52.868566: W tensorflow/core/framework/op_kernel.cc:1198] Unknown: ../data/models/Autoenc.exe.ckpt.data-00000-of-00001.tempstate17405837231896083449; Input/output error
2018-02-20 15:00:53.339357: W tensorflow/core/kernels/queue_base.cc:277] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
[2018-02-20 15:00:53,590] ERROR in app: Exception on /train/ [POST]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_call
    return fn(*args)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
    status, run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: ../data/models/Autoenc.exe.ckpt.data-00000-of-00001.tempstate17405837231896083449; Input/output error
         [[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, Variable, Variable/Adam, Variable/Adam_1, Variable_1, Variable_1/Adam, Variable_1/Adam_1, Variable_10, Variable_10/Adam, Variable_10/Adam_1, Variable_11, Variable_11/Adam, Variable_11/Adam_1, Variable_12, Variable_12/Adam, Variable_12/Adam_1, Variable_13, Variable_13/Adam, Variable_13/Adam_1, Variable_14, Variable_14/Adam, Variable_14/Adam_1, Variable_15, Variable_15/Adam, Variable_15/Adam_1, Variable_2, Variable_2/Adam, Variable_2/Adam_1, Variable_3, Variable_3/Adam, Variable_3/Adam_1, Variable_4, Variable_4/Adam, Variable_4/Adam_1, Variable_5, Variable_5/Adam, Variable_5/Adam_1, Variable_6, Variable_6/Adam, Variable_6/Adam_1, Variable_7, Variable_7/Adam, Variable_7/Adam_1, Variable_8, Variable_8/Adam, Variable_8/Adam_1, Variable_9, Variable_9/Adam, Variable_9/Adam_1, beta1_power, beta2_power)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/conda/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request
    resp = meth(*args, **kwargs)
  File "Server.py", line 140, in post
    auto.Do_Autoenc()
  File "/root/dense_autoencoder.py", line 163, in Do_Autoenc
    saver.save(sess, '../data/models/Autoenc.exe.ckpt')
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1593, in save
    {self.saver_def.filename_tensor_name: checkpoint_file})
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: ../data/models/Autoenc.exe.ckpt.data-00000-of-00001.tempstate17405837231896083449; Input/output error
         [[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, Variable, Variable/Adam, Variable/Adam_1, Variable_1, Variable_1/Adam, Variable_1/Adam_1, Variable_10, Variable_10/Adam, Variable_10/Adam_1, Variable_11, Variable_11/Adam, Variable_11/Adam_1, Variable_12, Variable_12/Adam, Variable_12/Adam_1, Variable_13, Variable_13/Adam, Variable_13/Adam_1, Variable_14, Variable_14/Adam, Variable_14/Adam_1, Variable_15, Variable_15/Adam, Variable_15/Adam_1, Variable_2, Variable_2/Adam, Variable_2/Adam_1, Variable_3, Variable_3/Adam, Variable_3/Adam_1, Variable_4, Variable_4/Adam, Variable_4/Adam_1, Variable_5, Variable_5/Adam, Variable_5/Adam_1, Variable_6, Variable_6/Adam, Variable_6/Adam_1, Variable_7, Variable_7/Adam, Variable_7/Adam_1, Variable_8, Variable_8/Adam, Variable_8/Adam_1, Variable_9, Variable_9/Adam, Variable_9/Adam_1, beta1_power, beta2_power)]]
Caused by op 'save/SaveV2', defined at:
  File "Server.py", line 212, in <module>
    app.run(host = '0.0.0.0')
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 841, in run
    run_simple(host, port, self, **options)
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 739, in run_simple
    inner()
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 702, in inner
    srv.serve_forever()
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 539, in serve_forever
    HTTPServer.serve_forever(self)
  File "/opt/conda/lib/python3.6/socketserver.py", line 238, in serve_forever
    self._handle_request_noblock()
  File "/opt/conda/lib/python3.6/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/opt/conda/lib/python3.6/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
  File "/opt/conda/lib/python3.6/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/opt/conda/lib/python3.6/socketserver.py", line 696, in __init__
    self.handle()
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 232, in handle
    rv = BaseHTTPRequestHandler.handle(self)
  File "/opt/conda/lib/python3.6/http/server.py", line 418, in handle
    self.handle_one_request()
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 267, in handle_one_request
    return self.run_wsgi()
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 209, in run_wsgi
    execute(self.server.app)
  File "/opt/conda/lib/python3.6/site-packages/werkzeug/serving.py", line 197, in execute
    application_iter = app(environ, start_response)
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/conda/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request
    resp = meth(*args, **kwargs)
  File "Server.py", line 140, in post
    auto.Do_Autoenc()
  File "/root/dense_autoencoder.py", line 139, in Do_Autoenc
    saver = tf.train.Saver()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 762, in _build_internal
    save_tensor = self._AddSaveOps(filename_tensor, saveables)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 297, in _AddSaveOps
    save = self.save_op(filename_tensor, saveables)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 240, in save_op
    tensors)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1174, in save_v2
    shape_and_slices=shape_and_slices, tensors=tensors, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
UnknownError (see above for traceback): ../data/models/Autoenc.exe.ckpt.data-00000-of-00001.tempstate17405837231896083449; Input/output error
         [[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, Variable, Variable/Adam, Variable/Adam_1, Variable_1, Variable_1/Adam, Variable_1/Adam_1, Variable_10, Variable_10/Adam, Variable_10/Adam_1, Variable_11, Variable_11/Adam, Variable_11/Adam_1, Variable_12, Variable_12/Adam, Variable_12/Adam_1, Variable_13, Variable_13/Adam, Variable_13/Adam_1, Variable_14, Variable_14/Adam, Variable_14/Adam_1, Variable_15, Variable_15/Adam, Variable_15/Adam_1, Variable_2, Variable_2/Adam, Variable_2/Adam_1, Variable_3, Variable_3/Adam, Variable_3/Adam_1, Variable_4, Variable_4/Adam, Variable_4/Adam_1, Variable_5, Variable_5/Adam, Variable_5/Adam_1, Variable_6, Variable_6/Adam, Variable_6/Adam_1, Variable_7, Variable_7/Adam, Variable_7/Adam_1, Variable_8, Variable_8/Adam, Variable_8/Adam_1, Variable_9, Variable_9/Adam, Variable_9/Adam_1, beta1_power, beta2_power)]]

If you need more information, feel free to ask. Thanks for any help, as I am starting to lose my mind.

I think it could be the permissions of the directory '/data/models/'. Please check which container user is running the Saver process.

To test, run docker exec -it <container> bash, and then try to create a file in the directory '/data/models/'.
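
The same check can be run from Python inside the container, under the same user as the server. This is a rough sketch rather than a definitive diagnosis: `probe_write` is a hypothetical helper, and it only shows whether an ordinary file can be created and flushed in a directory. The error above names a `.tempstate…` file in the models directory, so a failure here would at least point in the same direction.

```python
import os
import tempfile


def probe_write(directory):
    """Try to create, flush and remove a small file in `directory`.

    Returns (ok, detail): ok is True if the write succeeded; otherwise
    detail holds the exception type and message.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory, prefix="write_probe_") as handle:
            handle.write(b"probe")
            handle.flush()
            os.fsync(handle.fileno())  # push the data through to the mounted volume
        return True, ""
    except OSError as error:
        return False, "{}: {}".format(type(error).__name__, error)


if __name__ == "__main__":
    # os.getuid()/os.getgid() are POSIX-only, which fits a Linux container.
    print("uid:", os.getuid(), "gid:", os.getgid())
    for directory in ("/data/images", "/data/models"):
        print(directory, probe_write(directory))
```

If /data/images succeeds and /data/models fails, comparing the ownership and mode of the two directories on the host is a reasonable next step.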
