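I am having authentication problems when trying to log in with cqlsh. I am running a 3-node cluster, with the nodes spread across three different physical nodes in Kubernetes. It ran like a charm for the past month, but about a week ago it started going down. Below you can see the responses I get when I try to log in to the different nodes. (FYI, it is not always node 0 that has the problem; I have also seen node 1 fail the same way while node 0 worked fine.) The problem seems similar to "Cassandra PasswordAuthenticator causing timeouts", but the suggestions there did not help.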
I have no name!@cassandra-0:/$ cqlsh -u cassandra -p abc123
Connection error: ('Unable to connect to any servers', {'127.0.0.1:9042': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})
I have no name!@cassandra-1:/$ cqlsh -u cassandra -p abc123
Python 2.7 support is deprecated. Install Python 3.6+ or set CQLSH_NO_WARN_PY2 to suppress this message.
Connected to mycluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.0 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cassandra@cqlsh>
I have no name!@cassandra-2:/$ cqlsh -u cassandra -p abc123
Python 2.7 support is deprecated. Install Python 3.6+ or set CQLSH_NO_WARN_PY2 to suppress this message.
Connected to mycluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.0 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cassandra@cqlsh>
Here is the log from node 0:
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 MessagingMetrics.java:206 - HINT_RSP messages were dropped in last 5000 ms: 0 internal and 4 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 8582 ms
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 MessagingMetrics.java:206 - HINT_REQ messages were dropped in last 5000 ms: 0 internal and 2 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 8582 ms
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 StatusLogger.java:65 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 StatusLogger.java:69 - ReadStage 0 0 17086 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 StatusLogger.java:69 - CompactionExecutor 0 0 195522 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 StatusLogger.java:69 - MutationStage 0 0 88515 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - MemtableReclaimMemory 0 0 210 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - PendingRangeCalculator 0 0 13 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - GossipStage 0 0 660680 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - SecondaryIndexManagement 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - HintsDispatcher 2 0 37776 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - Repair-Task 0 0 5 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - Native-Transport-Requests 0 0 17111 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - RequestResponseStage 0 0 20 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - MemtableFlushWriter 0 0 210 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - PerDiskMemtableFlushWriter_0 0 0 210 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - MemtablePostFlush 0 0 255 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - ValidationExecutor 0 0 41 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - ViewBuildExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,091 StatusLogger.java:69 - InternalResponseStage 0 0 151676 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:69 - AntiEntropyStage 0 0 249 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:69 - CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:79 - CompactionManager 0 0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:91 - MessagingService n/a 0/0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:101 - Cache Type Size Capacity KeysToSave
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:103 - KeyCache 11452 75497472 all
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:109 - RowCache 0 0 all
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:116 - Table Memtable ops,data
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,092 StatusLogger.java:119 - system_schema.columns 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.types 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.indexes 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.keyspaces 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.dropped_columns 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.aggregates 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.triggers 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.tables 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.views 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system_schema.functions 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.compaction_history 3,634
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.IndexInfo 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.repairs 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.size_estimates 49344,1087744
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.table_estimates 98688,2339296
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.paxos 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.built_views 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.peer_events 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.peers_v2 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.peers 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.peer_events_v2 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.batches 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,093 StatusLogger.java:119 - system.transferred_ranges 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.transferred_ranges_v2 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.view_builds_in_progress 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.local 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.sstable_activity 229,2816
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.available_ranges_v2 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.available_ranges 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,094 StatusLogger.java:119 - system.prepared_statements 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,095 StatusLogger.java:119 - system_auth.roles 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,095 StatusLogger.java:119 - system_auth.role_members 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,095 StatusLogger.java:119 - system_auth.resource_role_permissons_index 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,095 StatusLogger.java:119 - system_auth.network_permissions 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,095 StatusLogger.java:119 - system_auth.role_permissions 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,100 StatusLogger.java:119 - system_distributed.parent_repair_history 10,418916
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,100 StatusLogger.java:119 - system_distributed.repair_history 29495,15967
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,100 StatusLogger.java:119 - system_distributed.view_build_status 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,101 StatusLogger.java:119 - system_traces.sessions 0,0
INFO [ScheduledTasks:1] 2022-10-26 17:39:00,101 StatusLogger.java:119 - system_traces.events 0,0
Keyspace replication:
cassandra@cqlsh> describe keyspace system_auth;
CREATE KEYSPACE system_auth WITH replication = {'class':
'NetworkTopologyStrategy', 'datacenter1': '3'} AND durable_writes = true;
nodetool status:
I have no name!@cassandra-prod-2:/$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.233.92.33   3.63 MiB  256     100.0%            xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  rack1
UN  10.233.96.184  4.2 MiB   256     100.0%            xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  rack1
UN  10.233.90.48   3.24 MiB  256     100.0%            xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  rack1
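I have tried nodetool repair, without success.
Does anyone know what is going on?
I have tried restarting all the C* nodes as well as the underlying nodes in the k8s cluster, and running nodetool repair, but none of it helped.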
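The symptom you describe, where you are sometimes unable to connect to the cluster with cqlsh, is not so much an authentication problem as a case of nodes being intermittently unresponsive.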
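The log entries you posted show that nodes are dropping hint messages. If you recall, coordinators (responsible for coordinating a write request by sending the mutations to all replicas) store "hints" when a replica does not acknowledge a write within the write request timeout.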
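A "hint" contains the IP of the replica that missed the write plus the mutation payload. When the replica comes back online, the coordinator "replays" the mutations from the hints to the replica (known as hinted handoff in Cassandra).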
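To make the store-then-replay idea concrete, here is a toy Python model of it. This is only an illustration of the concept described above, not how Cassandra implements hinted handoff; the `Hint` and `ToyCoordinator` names and the `acked`/`deliver` parameters are made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Hint:
    target: str     # address of the replica that missed the write
    mutation: dict  # the write payload to replay later


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for a coordinator; not Cassandra's real code."""
    hints: list = field(default_factory=list)

    def write(self, mutation, replicas, acked):
        """Store a hint for every replica that did not acknowledge in time.

        `acked` is the set of replicas that answered within the timeout.
        """
        for replica in replicas:
            if replica not in acked:
                self.hints.append(Hint(replica, mutation))

    def replay(self, replica, deliver):
        """Replay, then discard, the stored hints for a replica that is back."""
        pending = [h for h in self.hints if h.target == replica]
        for hint in pending:
            deliver(hint.target, hint.mutation)
        self.hints = [h for h in self.hints if h.target != replica]
        return len(pending)


if __name__ == "__main__":
    coord = ToyCoordinator()
    replicas = ["10.233.92.33", "10.233.96.184", "10.233.90.48"]
    # Pretend the third replica did not acknowledge the write in time.
    coord.write({"key": "k1", "value": "v1"}, replicas,
                acked={"10.233.92.33", "10.233.96.184"})
    delivered = []
    coord.replay("10.233.90.48", lambda target, m: delivered.append((target, m)))
    print(delivered)
```

In this model a hint only exists because a replica was slow or down, which is why dropped hint traffic points at unresponsive nodes rather than at authentication.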
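Dropped messages are a symptom of a bigger problem. Nodes drop messages because they are overloaded; it is a load-shedding mechanism for when they cannot take any more requests.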
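If you want to track how often this happens, a small script can pull the dropped-message counts out of the log. The sketch below is only written against the two MessagingMetrics lines quoted in the question, so treat the regular expression as an assumption that may need adjusting for other Cassandra versions; `parse_dropped` is a hypothetical helper name.

```python
import re

# Pattern modelled on the MessagingMetrics lines in the question's node 0 log.
DROPPED_RE = re.compile(
    r"(?P<verb>\w+) messages were dropped in last (?P<window_ms>\d+) ms: "
    r"(?P<internal>\d+) internal and (?P<cross_node>\d+) cross node\. "
    r"Mean internal dropped latency: (?P<internal_latency_ms>\d+) ms and "
    r"Mean cross-node dropped latency: (?P<cross_latency_ms>\d+) ms"
)

NUMERIC_FIELDS = ("window_ms", "internal", "cross_node",
                  "internal_latency_ms", "cross_latency_ms")


def parse_dropped(lines):
    """Return one dict per dropped-message log line found in `lines`."""
    rows = []
    for line in lines:
        match = DROPPED_RE.search(line)
        if match:
            row = {"verb": match.group("verb")}
            for name in NUMERIC_FIELDS:
                row[name] = int(match.group(name))
            rows.append(row)
    return rows


# The two dropped-message lines copied from the question.
SAMPLE = [
    "INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 MessagingMetrics.java:206 - "
    "HINT_RSP messages were dropped in last 5000 ms: 0 internal and 4 cross node. "
    "Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 8582 ms",
    "INFO [ScheduledTasks:1] 2022-10-26 17:39:00,090 MessagingMetrics.java:206 - "
    "HINT_REQ messages were dropped in last 5000 ms: 0 internal and 2 cross node. "
    "Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 8582 ms",
]

if __name__ == "__main__":
    for row in parse_dropped(SAMPLE):
        print(row["verb"], row["cross_node"], row["cross_latency_ms"])
```

In the posted log, all of the drops are cross-node and the mean cross-node latency is above 8 seconds, which fits a node that is too slow to respond rather than one that is rejecting credentials.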
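Look for GC pauses, which are another symptom of overload. If the pods are resource-constrained, consider allocating more RAM so that you can increase the heap size. You can also increase the capacity of the cluster by adding more nodes (pods). Cheers!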