这是在Ubuntu 14.04 LTS上
编辑:
WAL-E使用pip安装,秘钥由envdir
管理,根据说明https://gist.github.com/elithrar/8682235和https://github.com/wal-e/wal-e#dependencies。
我正在尝试使用WAL-E恢复数据库,并且一切似乎都很顺利,因为我已经安装并运行了Postgres,并且可以轻松地创建或恢复数据库,并通过pgadmin在本地和远程访问它。当我尝试使用wal-e fetch-backup从S3备份执行恢复时,问题就出现了。在启动postgres之前,一切似乎都很顺利。
出现了许多错误,似乎是缺少包或权限问题,如下所示:
* Starting PostgreSQL 9.3 database server
* The PostgreSQL server failed to start. Please check the log output:
2015-07-11 00:41:11 EDT LOG: database system was interrupted; last known up at 2015-06-30 05:00:02 EDT
2015-07-11 00:41:11 EDT LOG: starting archive recovery
Traceback (most recent call last):
File "/usr/local/bin/wal-e", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3084, in <module>
@_call_aside
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3070, in _call_aside
f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3097, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 651, in _build_master
ws.require(__requires__)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 952, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 847, in resolve
new_requirements = dist.requires(req.extras)[::-1]
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2602, in requires
dm = self._dep_map
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2803, in _dep_map
self.__dep_map = self._compute_dependencies()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2825, in _compute_dependencies
for req in self._parsed_pkg_info.get_all('Requires-Dist') or []:
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2794, in _parsed_pkg_info
metadata = self.get_metadata(self.PKG_INFO)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1617, in get_metadata
return self._get(self._fn(self.egg_info, name))
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1728, in _get
with open(path, 'rb') as stream:
IOError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/six-1.9.0.dist-info/METADATA'
2015-07-11 00:41:11 EDT LOG: invalid checkpoint record
2015-07-11 00:41:11 EDT FATAL: could not locate required checkpoint record
2015-07-11 00:41:11 EDT HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label".
2015-07-11 00:41:11 EDT LOG: startup process (PID 1693) exited with exit code 1
2015-07-11 00:41:11 EDT LOG: aborting startup due to startup process failure
我有几个这样的问题,并且能够通过更改组和修改记录文件的权限来解决它们,以匹配目录中的其他文件,但我怀疑这个问题更多地与这些包的安装方式和/或安装位置有关。解决上述问题后,postgres仍然无法启动,返回如下:
* Starting PostgreSQL 9.3 database server
* The PostgreSQL server failed to start. Please check the log output:
2015-07-11 00:30:04 EDT LOG: database system was interrupted; last known up at 2015-06-30 05:00:02 EDT
2015-07-11 00:30:04 EDT LOG: starting archive recovery
Traceback (most recent call last):
File "/usr/local/bin/wal-e", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3084, in <module>
@_call_aside
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3070, in _call_aside
f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3097, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 651, in _build_master
ws.require(__requires__)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 952, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 839, in resolve
raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'wal-e==0.8.1' distribution was not found and is required by the application
2015-07-11 00:30:04 EDT LOG: invalid checkpoint record
2015-07-11 00:30:04 EDT FATAL: could not locate required checkpoint record
2015-07-11 00:30:04 EDT HINT: If you are not restoring from a backup, try removing the file "/var/lib/postgresql/9.3/main/backup_label".
2015-07-11 00:30:04 EDT LOG: startup process (PID 1495) exited with exit code 1
2015-07-11 00:30:04 EDT LOG: aborting startup due to startup process failure
它在抱怨the 'wal-e==0.8.1' distribution was not found...
,但它显然是安装和可执行的:
ls -l /usr/local/lib/python2.7/dist-packages
total 608
drwxr-sr-x 2 root staff 4096 Jul 10 20:14 argparse-1.3.0.dist-info
-rw-r--r-- 1 root staff 88400 Jul 10 20:14 argparse.py
-rw-r--r-- 1 root staff 65659 Jul 10 20:14 argparse.pyc
drwxr-sr-x 6 root staff 4096 Jul 9 22:36 azure
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 azure-0.11.1.egg-info
drwxr-sr-x 5 root staff 4096 Jul 9 22:36 babel
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 Babel-1.3.egg-info
drwxr-sr-x 57 root staff 4096 Jul 9 22:36 boto
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 boto-2.38.0.dist-info
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 concurrent
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 dateutil
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 debtcollector
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 debtcollector-0.5.0.dist-info
-rw-r--r-- 1 root staff 207 Jul 10 20:10 easy-install.pth
-rw-r--r-- 1 root staff 126 Jul 10 20:33 easy_install.py
-rw-r--r-- 1 root staff 315 Jul 10 20:33 easy_install.pyc
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 futures-3.0.3.dist-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 gevent
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 gevent-1.0.2.egg-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 greenlet-0.4.7.egg-info
-rwxr-xr-x 1 root staff 82869 Jul 9 22:36 greenlet.so
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 iso8601
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 iso8601-0.1.10.egg-info
drwxr-sr-x 15 root staff 4096 Jul 9 22:36 keystoneclient
drwxr-sr-x 2 root staff 4096 Jul 10 20:33 _markerlib
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 msgpack
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 msgpack_python-0.4.6.egg-info
drwxr-sr-x 5 root staff 4096 Jul 9 22:36 netaddr
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 netaddr-0.7.15.dist-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 netifaces-0.10.4.egg-info
-rwxr-xr-x 1 root staff 58386 Jul 9 22:36 netifaces.so
drwxr-sr-x 4 root staff 4096 Jul 9 22:36 oslo
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 oslo_config
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 oslo.config-1.14.0.dist-info
-rw-r--r-- 1 root root 299 Jul 9 22:35 oslo.config-1.14.0-py2.7-nspkg.pth
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 oslo_i18n
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 oslo.i18n-2.1.0.dist-info
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 oslo_serialization
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 oslo.serialization-1.7.0.dist-info
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 oslo_utils
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 oslo.utils-1.8.0.dist-info
-rw-r--r-- 1 root root 299 Jul 9 22:35 oslo.utils-1.8.0-py2.7-nspkg.pth
drwxr-sr-x 5 root staff 4096 Jul 10 20:14 pbr
drwxr-sr-x 2 root staff 4096 Jul 10 20:14 pbr-1.3.0.dist-info
drwxr-sr-x 4 root staff 4096 Jul 10 20:10 pip-7.1.0-py2.7.egg
drwxr-sr-x 3 root staff 4096 Jul 10 20:33 pkg_resources
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 python_dateutil-2.4.2.dist-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 python_keystoneclient-1.6.0.dist-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 python_swiftclient-2.4.0.dist-info
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 pytz
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 pytz-2015.4.dist-info
drwxr-sr-- 3 root staff 4096 Jul 9 23:25 requests
drwxr-sr-- 2 root staff 4096 Jul 9 23:25 requests-2.7.0.dist-info
drwxr-sr-x 3 root staff 4096 Jul 10 20:33 setuptools
drwxr-sr-x 2 root staff 4096 Jul 10 20:33 setuptools-18.0.1.dist-info
drwxr-sr-x 3 root staff 4096 Jul 9 22:36 simplejson
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 simplejson-3.7.3.egg-info
drwxr-sr-- 2 root staff 4096 Jul 9 23:26 six-1.9.0.dist-info
-rw-r--r-- 1 root root 29664 Jul 9 23:26 six.py
-rw-r--r-- 1 root root 29006 Jul 9 23:26 six.pyc
drwxr-sr-x 4 root staff 4096 Jul 9 22:36 stevedore
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 stevedore-1.6.0.dist-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 swiftclient
drwxr-sr-x 7 root staff 4096 Jul 9 22:35 wal_e
drwxr-sr-x 2 root staff 4096 Jul 9 22:35 wal_e-0.8.1.egg-info
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 wrapt
drwxr-sr-x 2 root staff 4096 Jul 9 22:36 wrapt-1.10.5.egg-info
我已经做了相当多的搜索,但没有找到任何有用的东西。任何建议或指出正确的方向,我们都很感激。
此外,解决这个问题以确保我的主安装在需要恢复时能够正常工作将是非常有趣的。如果不能恢复,备份就没有意义了。
我不完全确定是什么解决了The 'wal-e==0.8.1' distribution was not found
错误,但在重新启动时,我再也没有看到过。
除此之外,修复这个非常简单。
未设置许多python发行版的可执行位。
chmod o+x /usr/local/lib/python2.7/dist-packages/requests-2.7.0.dist-info/
chmod o+x /usr/local/lib/python2.7/dist-packages/requests
...
未正确创建backup_label的原因
在Master
中运行以下命令- su postgres
psql -c "select pg_start_backup('initial_backup');"
rsync -cva——inplace——exclude=pg_xlog/var/lib/postgresql/9.1/main/slave_IP_address:/var/lib/postgresql/9.1/主/
psql -c "select pg_stop_backup();"
cd/var/lib/postgresql/9.1/main
创建文件名recovery.conf 添加以下行
- standby_mode = 'on'primary_conninfo = 'host=master_IP_address .端口=5432用户=rep密码=yourpassword' trigger_file ="/tmp/postgresql.trigger.5432"
service postgresql restart