Apache Curator / ZooKeeper ConnectionLossException, possibly with a memory leak

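I have been working on a process that continuously monitors distributed atomic long counters. Once a minute it checks them using the getCounter method of the class below, which obtains its client from the ZkClient enum. In fact, I have several threads running, each monitoring a different counter (a DistributedAtomicLong) stored in a ZooKeeper node. Each thread specifies the path of its counter through the parameters of getCounter.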

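    import java.util.Properties;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
    import org.apache.curator.framework.recipes.atomic.PromotedToLock;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.retry.RetryNTimes;

    public class TagserterZookeeperManager {

        // Enum singleton holding one shared CuratorFramework client.
        public enum ZkClient {
            COUNTER("10.11.18.25:2181");  // Integration URL

            private CuratorFramework client;

            private ZkClient(String servers) {
                // TagserterConfigs is the application's own configuration class.
                Properties props = TagserterConfigs.ZOOKEEPER.getProperties();
                String zkFromConfig = props.getProperty("servers", "");
                if (zkFromConfig != null && !zkFromConfig.isEmpty()) {
                    servers = zkFromConfig.trim();
                }
                ExponentialBackoffRetry exponentialBackoffRetry = new ExponentialBackoffRetry(1000, 3);
                client = CuratorFrameworkFactory.newClient(servers, exponentialBackoffRetry);
                client.start();
            }

            public CuratorFramework getClient() {
                return client;
            }
        }

        // Joins the non-empty segments into a path such as /a/b/c.
        public static String buildPath(String... node) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < node.length; i++) {
                if (node[i] != null && !node[i].isEmpty()) {
                    sb.append("/");
                    sb.append(node[i]);
                }
            }
            return sb.toString();
        }

        // Creates a new DistributedAtomicLong for the given path on every call.
        public static DistributedAtomicLong getCounter(String taskType, int hid, String jobId, String countType) {
            String path = buildPath(taskType, hid + "", jobId, countType);
            PromotedToLock.Builder builder = PromotedToLock.builder().lockPath(path + "/lock").retryPolicy(new ExponentialBackoffRetry(10, 10));
            DistributedAtomicLong count = new DistributedAtomicLong(ZkClient.COUNTER.getClient(), path, new RetryNTimes(5, 20), builder.build());
            return count;
        }
    }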

This is how I call the method from within a thread:

    DistributedAtomicLong counterTotal = TagserterZookeeperManager
                        .getCounter("testTopic", hid, jobId, "test");

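What I see now is that after the threads have been running for a few hours, at some point I start getting the following org.apache.zookeeper.KeeperException$ConnectionLossException when reading the count from the counter returned by getCounter: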

    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /contentTaskProd
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1073)
        at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:215)
        at org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
        at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
        at org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
        at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.getCurrentValue(DistributedAtomicValue.java:254)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicValue.get(DistributedAtomicValue.java:91)
        at org.apache.curator.framework.recipes.atomic.DistributedAtomicLong.get(DistributedAtomicLong.java:72)
        ...

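I keep getting this exception for a while, and I have a feeling it is causing some internal memory leak that eventually leads to an OutOfMemory error, at which point the whole process dies. Does anyone know what the reason might be? Why would ZooKeeper suddenly start throwing connection loss exceptions? After the process has died, I can connect to ZooKeeper manually through another small console program I wrote (which also uses Curator), and everything looks fine there.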

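To monitor a node in ZooKeeper with Curator you can use NodeCache. This will not solve your connection problems, but instead of polling the node every minute you get a push event whenever the node changes.

In my experience, NodeCache handles disconnecting and reconnecting quite well.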
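
The NodeCache suggestion could look roughly like the sketch below. It is only an illustration under a few assumptions: the class name CounterWatcher and its two methods are hypothetical, and it assumes the counter node holds the 8-byte big-endian long that DistributedAtomicLong writes. It does not address the ConnectionLoss errors themselves.

```java
import java.nio.ByteBuffer;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.cache.ChildData;
import org.apache.curator.framework.recipes.cache.NodeCache;

// Hypothetical helper, not part of the question's code.
public class CounterWatcher {

    // Assumption: the node payload is an 8-byte big-endian long.
    // Returns null when the payload is missing or has a different size.
    static Long decodeCounter(byte[] data) {
        if (data == null || data.length != Long.BYTES) {
            return null;
        }
        return ByteBuffer.wrap(data).getLong();
    }

    // Starts a NodeCache on the counter path and prints the value on each
    // change event. The caller is responsible for closing the returned cache.
    static NodeCache watch(CuratorFramework client, String path) throws Exception {
        final NodeCache cache = new NodeCache(client, path);
        cache.getListenable().addListener(() -> {
            ChildData current = cache.getCurrentData(); // null if the node does not exist
            Long value = (current == null) ? null : decodeCounter(current.getData());
            System.out.println(path + " -> " + value);
        });
        cache.start();
        return cache;
    }
}
```

A thread would then call something like `NodeCache cache = CounterWatcher.watch(TagserterZookeeperManager.ZkClient.COUNTER.getClient(), path);` once, keep the cache for as long as it needs updates, and call `cache.close()` when done, rather than reading the counter every minute. Note that Curator 5.x marks NodeCache as deprecated in favor of CuratorCache, so on a newer version the equivalent class may be the better choice.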
