I am using HAProxy to load balance my cluster of MQTT brokers. Each MQTT broker can easily handle up to 100,000 connections, but the problem I am facing with HAProxy is that each node only gets up to about 30k connections. Whenever any node approaches ~32k connections, the HAProxy CPU suddenly spikes to 100% and all connections start dropping.
The problem with this is that for every 30k connections I have to roll out another MQTT broker. How can I increase this to at least 60k connections per MQTT broker node?
My VM: 1 CPU, 2 GB RAM. I have tried increasing the number of CPUs, but I ran into the same problem.
My configuration –
bind 0.0.0.0:1883
maxconn 1000000
mode tcp
#sticky session load balancing – new feature
stick-table type string len 32 size 200k expire 30m
stick on req.payload(0,0),mqtt_field_value(connect,client_identifier)
option clitcpka # For TCP keep-alive
option tcplog
timeout client 600s
timeout server 2h
timeout check 5000
server mqtt1 10.20.236.140:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt2 10.20.236.142:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
server mqtt3 10.20.236.143:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5
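(For context, all of the directives above sit together in a single proxy section; the global and defaults sections are not shown. Stripped down to a skeleton — the section name below is a placeholder — it looks like this:)

listen mqtt   # placeholder name, not my actual section name
    bind 0.0.0.0:1883
    mode tcp
    maxconn 1000000
    # stick-table, stick on, options and timeouts as listed above
    server mqtt1 10.20.236.140:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5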
I have also tuned these system parameters –
sysctl -w net.core.somaxconn=60000
sysctl -w net.ipv4.tcp_max_syn_backlog=16384
sysctl -w net.core.netdev_max_backlog=16384
sysctl -w net.ipv4.ip_local_port_range='1024 65535'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16777216'
sysctl -w net.ipv4.tcp_wmem='1024 4096 16777216'
modprobe ip_conntrack
sysctl -w net.nf_conntrack_max=1000000
sysctl -w net.netfilter.nf_conntrack_max=1000000
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
sysctl -w net.ipv4.tcp_max_tw_buckets=1048576
sysctl -w net.ipv4.tcp_fin_timeout=15
tee -a /etc/security/limits.conf << EOF
root soft nofile 1048576
root hard nofile 1048576
haproxy soft nproc 1048576
haproxy hard nproc 1048576
EOF
Output of haproxy -v –
HAProxy version 2.4.18-1ppa1~focal 2022/07/27 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.18.html
Running on: Linux 5.4.0-122-generic #138-Ubuntu SMP Wed Jun 22 15:00:31 UTC 2022 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-96Se88/haproxy-2.4.18=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wdeclaration-after-statement -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1
DEBUG =
Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC +PROMEX -MEMORY_PROFILING
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_THREADS=64, default=1).
Built with OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020
Running on OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE2 version : 10.34 2019-11-21
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with gcc compiler version 9.4.0
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
Available services : prometheus-exporter
Available filters :
[SPOE] spoe
[CACHE] cache
[FCGI] fcgi-app
[COMP] compression
[TRACE] trace
I don't see anything in your config that would explain why you can't get past 30K connections without a CPU spike. I'm also not sure those tuned system parameters are doing much for you.
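One thing I would double-check, since it isn't in your snippet, is the global section: the process-wide maxconn and the thread count live there, and the "default=1" in your -v output reflects the single CPU that box had. When you add CPUs, it's worth confirming the running process actually uses them. A minimal sketch of what I mean — the values are examples, not a recommendation:

global
    # example values only
    maxconn 1000000        # process-wide cap; if unset it is derived from the file-descriptor limit
    nbthread 2             # normally defaults to one thread per CPU the process is bound to
    stats socket /run/haproxy/admin.sock mode 660 level admin   # lets you inspect the running process

With the stats socket you can check what the process actually ended up with (Maxconn, Maxsock and Nbthread all appear in the "show info" output).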
For reference, I have successfully run HAProxy (the vanilla Docker image haproxy:2.4.17) with 2 CPUs and 4G of RAM, and it can reach a maxconn of 150K without pegging the CPU.
Besides the per-instance scaling problem you're hitting, the other issue you may run into is the assumption of 1 HAProxy to 1 MQTT broker.
"The problem with this is that for every 30k connections I have to roll out another MQTT broker."
In my experience it's better to run multiple HAProxy nodes in front of a single broker. (Not sure what your constraints are.) Having more than one HAProxy instance per backend is critical: when you lose an instance, you only lose a fraction of your traffic. (When, not if.) That's the key part, because when the traffic drops, all of those clients try to reconnect at the same time. This is how you accidentally DDoS yourself just by losing a VM or a pod.
At your current vertical scale (CPU and memory) you get 30K connections per node. With maxconn set to 30K per instance, that works out to roughly 34 nodes to cover the 1,000,000 maxconn you have configured. If you run them as Kubernetes pods with an ELB in front of the HAProxy cluster, that should be fairly easy to pull off. Either way, you need to figure out how to run HAProxy as a cluster without duplicating the backends.
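One wrinkle if you run several HAProxy instances in front of the same brokers: your stickiness relies on a stick-table, which is local to each HAProxy process. One way to keep the instances consistent is to replicate the table with a peers section, roughly like this (peer names and addresses are placeholders):

peers mqtt_peers
    # placeholder addresses – use the real HAProxy node IPs
    peer lb1 10.0.0.11:10000
    peer lb2 10.0.0.12:10000

listen mqtt
    bind 0.0.0.0:1883
    mode tcp
    stick-table type string len 32 size 200k expire 30m peers mqtt_peers
    stick on req.payload(0,0),mqtt_field_value(connect,client_identifier)
    server mqtt1 10.20.236.140:1883 check-send-proxy send-proxy-v2 check inter 10s fall 2 rise 5

Each instance needs to know which peer entry is itself; by default HAProxy matches the peer name against the local hostname, or you can set it explicitly with the -L command-line option.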
A good rule of thumb is "scale out when scaling up stops working".