Quickly restarting a single node



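I have a Galera cluster made up of 2 nodes (mariadb-1.test.com: 10.10.10.21, mariadb-2.test.com: 10.10.10.22) and an arbitrator, on 3 different CentOS 8 servers.
Is it possible to restart a node without deleting /var/lib/mysql and/or using galera_new_cluster?
Is it possible to restart a node in a different way than after a complete cluster crash?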

[centos@mariadb-1 ~]$ sudo systemctl restart mariadb.service 
Job for mariadb.service failed because a fatal signal was delivered to the control process.
See "systemctl status mariadb.service" and "journalctl -xe" for details.
[centos@mariadb-1 ~]$ sudo systemctl status mariadb.service
● mariadb.service - MariaDB 10.3 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2021-12-16 00:07:08 CET; 29s ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 9887 ExecStart=/usr/libexec/mysqld --basedir=/usr $MYSQLD_OPTS $_WSREP_NEW_CLUSTER (code=exited, status=1/FAILURE)
Process: 9849 ExecStartPre=/usr/libexec/mysql-prepare-db-dir mariadb.service (code=exited, status=0/SUCCESS)
Process: 9824 ExecStartPre=/usr/libexec/mysql-check-socket (code=exited, status=0/SUCCESS)
Main PID: 9887 (code=exited, status=1/FAILURE)
Status: "MariaDB server is down"
déc. 16 00:06:36 mariadb-1.test.com systemd[1]: Starting MariaDB 10.3 database server...
déc. 16 00:06:36 mariadb-1.test.com mysql-prepare-db-dir[9849]: Database MariaDB is probably initialized in /var/lib/mysql already, nothing is done.
déc. 16 00:06:36 mariadb-1.test.com mysql-prepare-db-dir[9849]: If this is not the case, make sure the /var/lib/mysql is empty before running mysql-prepare-db-dir.
déc. 16 00:06:36 mariadb-1.test.com mysqld[9887]: 2021-12-16  0:06:36 0 [Note] /usr/libexec/mysqld (mysqld 10.3.28-MariaDB) starting as process 9887 ...
déc. 16 00:07:08 mariadb-1.test.com systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
déc. 16 00:07:08 mariadb-1.test.com systemd[1]: mariadb.service: Failed with result 'exit-code'.
déc. 16 00:07:08 mariadb-1.test.com systemd[1]: Failed to start MariaDB 10.3 database server.

journalctl -xe does not give any more details.
Contents of /var/log/mariadb/mariadb.log on the node after the failed restart:

2021-12-16  0:14:43 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
2021-12-16  0:14:43 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
2021-12-16  0:14:43 0 [Note] WSREP: wsrep_load(): Galera 3.32(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
2021-12-16  0:14:43 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2021-12-16  0:14:43 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
2021-12-16  0:14:43 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.10.10.21; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; 
2021-12-16  0:14:43 0 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 00000000-0000-0000-0000-000000000000:-1
2021-12-16  0:14:43 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2021-12-16  0:14:43 0 [Note] WSREP: wsrep_sst_grab()
2021-12-16  0:14:43 0 [Note] WSREP: Start replication
2021-12-16  0:14:43 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2021-12-16  0:14:43 0 [Note] WSREP: protonet asio version 0
2021-12-16  0:14:43 0 [Note] WSREP: Using CRC-32C for message checksums.
2021-12-16  0:14:43 0 [Note] WSREP: backend: asio
2021-12-16  0:14:43 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2021-12-16  0:14:43 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2021-12-16  0:14:43 0 [Note] WSREP: restore pc from disk failed
2021-12-16  0:14:43 0 [Note] WSREP: GMCast version 0
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-12-16  0:14:43 0 [Note] WSREP: EVS version 0
2021-12-16  0:14:43 0 [Note] WSREP: gcomm: connecting to group 'test-wsrep', peer '10.10.10.21:,10.10.10.22:'
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://10.10.10.21:4567
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') connection established to 3cfd9c42 tcp://10.10.10.22:4567
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.10.30:4567 
2021-12-16  0:14:43 0 [Note] WSREP: (c96725c5, 'tcp://0.0.0.0:4567') connection established to 34bdff1b tcp://10.10.10.30:4567
2021-12-16  0:14:43 0 [Note] WSREP: declaring 34bdff1b at tcp://10.10.10.30:4567 stable
2021-12-16  0:14:43 0 [Note] WSREP: declaring 3cfd9c42 at tcp://10.10.10.22:4567 stable
2021-12-16  0:14:43 0 [Note] WSREP: Node 34bdff1b state prim
2021-12-16  0:14:43 0 [Note] WSREP: view(view_id(PRIM,34bdff1b,340) memb {
34bdff1b,0
3cfd9c42,0
c96725c5,0
} joined {
} left {
} partitioned {
})
2021-12-16  0:14:43 0 [Note] WSREP: save pc into disk
2021-12-16  0:14:44 0 [Note] WSREP: gcomm: connected
2021-12-16  0:14:44 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2021-12-16  0:14:44 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2021-12-16  0:14:44 0 [Note] WSREP: Opened channel 'test-wsrep'
2021-12-16  0:14:44 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2021-12-16  0:14:44 0 [Note] WSREP: Waiting for SST to complete.
2021-12-16  0:14:44 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2021-12-16  0:14:44 0 [Note] WSREP: STATE EXCHANGE: sent state msg: c9b97488-5dfc-11ec-a809-f37d727cc3ec
2021-12-16  0:14:44 0 [Note] WSREP: STATE EXCHANGE: got state msg: c9b97488-5dfc-11ec-a809-f37d727cc3ec from 0 (garb)
2021-12-16  0:14:44 0 [Note] WSREP: STATE EXCHANGE: got state msg: c9b97488-5dfc-11ec-a809-f37d727cc3ec from 1 (mariadb-2.test.com)
2021-12-16  0:14:44 0 [Note] WSREP: STATE EXCHANGE: got state msg: c9b97488-5dfc-11ec-a809-f37d727cc3ec from 2 (mariadb-1.test.com)
2021-12-16  0:14:44 0 [Note] WSREP: Quorum results:
version    = 4,
component  = PRIMARY,
conf_id    = 289,
members    = 2/3 (joined/total),
act_id     = 66241951,
last_appl. = -1,
protocols  = 0/9/3 (gcs/repl/appl),
group UUID = 52e841e5-398a-11eb-8a1c-8efbc5e6d73d
2021-12-16  0:14:44 0 [Note] WSREP: Flow-control interval: [28, 28]
2021-12-16  0:14:44 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 66241951)
2021-12-16  0:14:44 2 [Note] WSREP: State transfer required: 
Group state: 52e841e5-398a-11eb-8a1c-8efbc5e6d73d:66241951
Local state: 00000000-0000-0000-0000-000000000000:-1
2021-12-16  0:14:44 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2021-12-16  0:14:44 2 [Note] WSREP: New cluster view: global state: 52e841e5-398a-11eb-8a1c-8efbc5e6d73d:66241951, view# 290: Primary, number of nodes: 3, my index: 2, protocol version 3
2021-12-16  0:14:44 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2021-12-16  0:14:44 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.10.10.21' --datadir '/var/lib/mysql/' --parent '10502' --mysqld-args --basedir=/usr'
2021-12-16  0:14:44 2 [Note] WSREP: Prepared SST request: rsync|10.10.10.21:4444/rsync_sst
2021-12-16  0:14:44 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2021-12-16  0:14:44 2 [Note] WSREP: Assign initial position for certification: 66241951, protocol version: 4
2021-12-16  0:14:44 0 [Note] WSREP: Service thread queue flushed.
2021-12-16  0:14:44 2 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (52e841e5-398a-11eb-8a1c-8efbc5e6d73d): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():467. IST will be unavailable.
2021-12-16  0:14:44 0 [Note] WSREP: Member 2.0 (mariadb-1.test.com) requested state transfer from '*any*'. Selected 1.0 (mariadb-2.test.com)(SYNCED) as donor.
2021-12-16  0:14:44 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 66241951)
2021-12-16  0:14:44 2 [Note] WSREP: Requesting state transfer: success, donor: 1
2021-12-16  0:14:44 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 52e841e5-398a-11eb-8a1c-8efbc5e6d73d:66241951
2021-12-16  0:14:44 0 [Warning] WSREP: 1.0 (mariadb-2.test.com): State transfer to 2.0 (mariadb-1.test.com) failed: -255 (Unknown error 255)
2021-12-16  0:14:44 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():780: Will never receive state. Need to abort.
2021-12-16  0:14:44 0 [Note] WSREP: gcomm: terminating thread
2021-12-16  0:14:44 0 [Note] WSREP: gcomm: joining thread
2021-12-16  0:14:44 0 [Note] WSREP: gcomm: closing backend
2021-12-16  0:14:45 0 [Note] WSREP: view(view_id(NON_PRIM,34bdff1b,340) memb {
c96725c5,0
} joined {
} left {
} partitioned {
34bdff1b,0
3cfd9c42,0
})
2021-12-16  0:14:45 0 [Note] WSREP: view((empty))
2021-12-16  0:14:45 0 [Note] WSREP: gcomm: closed
2021-12-16  0:14:45 0 [Note] WSREP: /usr/libexec/mysqld: Terminated.
WSREP_SST: [ERROR] Parent mysqld process (PID:10502) terminated unexpectedly. (20211216 00:14:46.490)
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 10560 (20211216 00:14:46.491)
WSREP_SST: [INFO] Joiner cleanup done. (20211216 00:14:46.996)
[centos@mariadb-1 ~]$ 
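The log above shows why a full state transfer (SST) was requested instead of an incremental one (IST): the saved local state was the all-zero UUID with seqno -1, which cannot match the group state. The comparison can be sketched in Python as below. The names `parse_state` and `transfer_needed` and the `donor_cached_downto` parameter are hypothetical helpers for illustration, not part of Galera. The gcache check assumes that IST also needs the donor to still hold the missing write-sets, and treating any negative seqno as "SST" is a simplification.

```python
from typing import Optional, Tuple


def parse_state(state: str) -> Tuple[str, int]:
    """Split a state string of the form 'UUID:seqno' into (uuid, seqno)."""
    uuid, _, seqno = state.rpartition(":")
    return uuid, int(seqno)


def transfer_needed(local_state: str, group_state: str,
                    donor_cached_downto: Optional[int] = None) -> str:
    """Return 'none', 'IST' or 'SST' for a joiner (simplified model)."""
    local_uuid, local_seqno = parse_state(local_state)
    group_uuid, group_seqno = parse_state(group_state)

    # A different UUID or an unknown position rules out IST.
    if local_uuid != group_uuid or local_seqno < 0:
        return "SST"
    # Already at the group's position: nothing to transfer.
    if local_seqno == group_seqno:
        return "none"
    # Assumption: the donor must still cache seqno local_seqno + 1 onwards.
    if donor_cached_downto is not None and donor_cached_downto > local_seqno + 1:
        return "SST"
    return "IST"


# Values taken from the "State transfer required" lines in the log.
print(transfer_needed(
    "00000000-0000-0000-0000-000000000000:-1",
    "52e841e5-398a-11eb-8a1c-8efbc5e6d73d:66241951",
))  # prints "SST"
```

With a local state that shares the group UUID and a seqno inside the donor's cached range, the same function returns "IST".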

The node that stayed in the cluster seems fine:

MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_%' ; 
+-------------------------------+--------------------------------------+
| Variable_name                 | Value                                |
+-------------------------------+--------------------------------------+
| wsrep_applier_thread_count    | 1                                    |
| wsrep_apply_oooe              | 0.000220                             |
| wsrep_apply_oool              | 0.000000                             |
| wsrep_apply_window            | 1.000220                             |
| wsrep_causal_reads            | 0                                    |
| wsrep_cert_deps_distance      | 3770.895462                          |
| wsrep_cert_index_size         | 69                                   |
| wsrep_cert_interval           | 0.091845                             |
| wsrep_cluster_conf_id         | 268                                  |
| wsrep_cluster_size            | 2                                    |
| wsrep_cluster_state_uuid      | 52e841e5-398a-11eb-8a1c-8efbc5e6d73d |
| wsrep_cluster_status          | Primary                              |
| wsrep_cluster_weight          | 2                                    |
| wsrep_commit_oooe             | 0.000000                             |
| wsrep_commit_oool             | 0.000000                             |
| wsrep_commit_window           | 1.000000                             |
| wsrep_connected               | ON                                   |
| wsrep_desync_count            | 0                                    |
| wsrep_evs_delayed             |                                      |
| wsrep_evs_evict_list          |                                      |
| wsrep_evs_repl_latency        | 0/0/0/0/0                            |
| wsrep_evs_state               | OPERATIONAL                          |
| wsrep_flow_control_active     | false                                |
| wsrep_flow_control_paused     | 0.000000                             |
| wsrep_flow_control_paused_ns  | 3297992241                           |
| wsrep_flow_control_recv       | 9                                    |
| wsrep_flow_control_requested  | false                                |
| wsrep_flow_control_sent       | 9                                    |
| wsrep_gcomm_uuid              | 3cfd9c42-e33a-11eb-aea6-1fc34d7a6b30 |
| wsrep_gmcast_segment          | 0                                    |
| wsrep_incoming_addresses      | ,10.10.10.22:3306                    |
| wsrep_last_committed          | 66241750                             |
| wsrep_local_bf_aborts         | 0                                    |
| wsrep_local_cached_downto     | 66130787                             |
| wsrep_local_cert_failures     | 0                                    |
| wsrep_local_commits           | 116324                               |
| wsrep_local_index             | 1                                    |
| wsrep_local_recv_queue        | 0                                    |
| wsrep_local_recv_queue_avg    | 0.006762                             |
| wsrep_local_recv_queue_max    | 31                                   |
| wsrep_local_recv_queue_min    | 0                                    |
| wsrep_local_replays           | 0                                    |
| wsrep_local_send_queue        | 0                                    |
| wsrep_local_send_queue_avg    | 0.000003                             |
| wsrep_local_send_queue_max    | 2                                    |
| wsrep_local_send_queue_min    | 0                                    |
| wsrep_local_state             | 4                                    |
| wsrep_local_state_comment     | Synced                               |
| wsrep_local_state_uuid        | 52e841e5-398a-11eb-8a1c-8efbc5e6d73d |
| wsrep_open_connections        | 0                                    |
| wsrep_open_transactions       | 0                                    |
| wsrep_protocol_version        | 9                                    |
| wsrep_provider_name           | Galera                               |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>    |
| wsrep_provider_version        | 3.32(rXXXX)                          |
| wsrep_ready                   | ON                                   |
| wsrep_received                | 26471367                             |
| wsrep_received_bytes          | 30177334116                          |
| wsrep_repl_data_bytes         | 91246197                             |
| wsrep_repl_keys               | 9616150                              |
| wsrep_repl_keys_bytes         | 79743208                             |
| wsrep_repl_other_bytes        | 0                                    |
| wsrep_replicated              | 117234                               |
| wsrep_replicated_bytes        | 2538681128                           |
| wsrep_rollbacker_thread_count | 1                                    |
| wsrep_thread_count            | 2                                    |
+-------------------------------+--------------------------------------+
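To read the handful of values in that output that indicate a healthy node, the table can be parsed and compared against the expected values. This is a minimal Python sketch. The helper names `parse_status_table` and `unhealthy`, and the choice of which four variables to check, are my own assumptions and not an official health check. `SAMPLE` is a shortened copy of the output above.

```python
from typing import Dict

# Values a healthy, synced node is expected to report (assumed selection).
EXPECTED = {
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_ready": "ON",
    "wsrep_connected": "ON",
}


def parse_status_table(text: str) -> Dict[str, str]:
    """Parse the client's '| name | value |' table into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


def unhealthy(status: Dict[str, str]) -> Dict[str, object]:
    """Return the checked variables whose value differs from EXPECTED."""
    return {name: status.get(name)
            for name, wanted in EXPECTED.items()
            if status.get(name) != wanted}


SAMPLE = """
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 2       |
| wsrep_cluster_status      | Primary |
| wsrep_connected           | ON      |
| wsrep_evs_delayed         |         |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+
"""

print(unhealthy(parse_status_table(SAMPLE)))  # prints {}
```

An empty result means all four checked variables have the expected values, as they do for the surviving node here.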

Thanks.

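Problem solved: the state snapshot transfer method is rsync, and on the receiving node there was a directory in /var/lib/mysql that mysql had no write permission on. I deleted that directory and the node restarted successfully.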
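For anyone who hits the same failure, entries like that can be found by walking the data directory and checking the classic owner/group/other write bits for the mysql user. The Python sketch below is only a rough check: `can_write` and `not_writable_by` are hypothetical helper names, it assumes a non-root uid, and it ignores ACLs and SELinux, either of which can also block rsync on CentOS.

```python
import os
import stat
from typing import Iterable, List


def can_write(st: os.stat_result, uid: int, gids: Iterable[int]) -> bool:
    """Classic Unix permission check using mode bits only."""
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)


def not_writable_by(root: str, uid: int, gids: Iterable[int]) -> List[str]:
    """List files and directories under root that the uid cannot write to."""
    gids = set(gids)
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not can_write(os.lstat(path), uid, gids):
                problems.append(path)
    return sorted(problems)


# Example use on the joiner node (run with enough rights to read the tree):
#   import pwd
#   pw = pwd.getpwnam("mysql")
#   groups = os.getgrouplist("mysql", pw.pw_gid)
#   print(not_writable_by("/var/lib/mysql", pw.pw_uid, groups))
```

Any path it prints is worth inspecting before retrying the restart.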
