Axon handles merging segments incorrectly



We have an application that uses Axon Server, and we are implementing autoscaling of its instances in a k8s cluster.

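The scaling code checks the Axon Server API to determine which processors have instances with idle threads or unclaimed segments. If an instance with an idle thread is found, a segment is split. If a warning with the message "Not all segments claimed" is seen, the processor's segments are merged. Once the split/merge request has completed, we poll the API for processor information and wait for the tracker count to change accordingly.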
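
The decision logic described above could be sketched roughly as follows. This is only an illustration: `ProcessorStatus`, `decide`, and `awaitTrackerCount` are hypothetical names, the status fields stand in for whatever the Axon Server API actually returns, and giving the merge check priority over the split check is an assumption, not something the scaler is known to do.

```java
import java.util.function.IntSupplier;

final class ScalingDecision {

    enum Action { SPLIT, MERGE, NONE }

    /** Hypothetical snapshot of the processor information read from the API. */
    static final class ProcessorStatus {
        final int idleThreads;               // threads without a claimed segment
        final boolean notAllSegmentsClaimed; // the "Not all segments claimed" warning
        final int trackerCount;              // number of active trackers

        ProcessorStatus(int idleThreads, boolean notAllSegmentsClaimed, int trackerCount) {
            this.idleThreads = idleThreads;
            this.notAllSegmentsClaimed = notAllSegmentsClaimed;
            this.trackerCount = trackerCount;
        }
    }

    /** Assumption: unclaimed segments are handled before idle threads. */
    static Action decide(ProcessorStatus status) {
        if (status.notAllSegmentsClaimed) {
            return Action.MERGE;
        }
        if (status.idleThreads > 0) {
            return Action.SPLIT;
        }
        return Action.NONE;
    }

    /**
     * Polls the supplied tracker count until it equals the expected value.
     * Returns false if the attempts run out or the thread is interrupted.
     */
    static boolean awaitTrackerCount(IntSupplier poll, int expected,
                                     int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (poll.getAsInt() == expected) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }
}
```

In a real scaler, the `IntSupplier` would wrap the HTTP call that fetches the processor information; bounding the number of attempts keeps a failed merge from blocking the scaler indefinitely.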

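When scaling up (splitting), this works fine. When scaling down (merging), we frequently see an exception in the application log related to the management of tokens in the JPA token store.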

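The log below was written 300 ms after the merge API request was sent. Before that, 2 instances were running, each configured with 2 threads. We scaled up to 4 threads in total, and then I killed one instance. This left 2 unclaimed segments, hence the need to merge. Ideally we would merge and move segments before an instance dies, but we do need to be able to handle an unexpected instance death like this.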

2020-06-12 14:41:51.417  INFO 14056 [:] --- [EventHandler]-0] o.h.e.internal.DefaultLoadEventListener  : HHH000327: Error performing load command : org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [org.axonframework.eventhandling.tokenstore.jpa.TokenEntry#org.axonframework.eventhandling.tokenstore.jpa.TokenEntry$PK@9e1d11cf]
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [org.axonframework.eventhandling.tokenstore.jpa.TokenEntry#org.axonframework.eventhandling.tokenstore.jpa.TokenEntry$PK@9e1d11cf]
at org.hibernate.dialect.lock.PessimisticWriteSelectLockingStrategy.lock(PessimisticWriteSelectLockingStrategy.java:76)
at org.hibernate.persister.entity.AbstractEntityPersister.lock(AbstractEntityPersister.java:1928)
at org.hibernate.event.internal.AbstractLockUpgradeEventListener.upgradeLock(AbstractLockUpgradeEventListener.java:82)
at org.hibernate.event.internal.DefaultLoadEventListener.loadFromSessionCache(DefaultLoadEventListener.java:569)
at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:444)
at org.hibernate.event.internal.DefaultLoadEventListener.load(DefaultLoadEventListener.java:222)
at org.hibernate.event.internal.DefaultLoadEventListener.lockAndLoad(DefaultLoadEventListener.java:406)
at org.hibernate.event.internal.DefaultLoadEventListener.doOnLoad(DefaultLoadEventListener.java:127)
at org.hibernate.event.internal.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:92)
at org.hibernate.internal.SessionImpl.fireLoad(SessionImpl.java:1256)
at org.hibernate.internal.SessionImpl.access$1900(SessionImpl.java:207)
at org.hibernate.internal.SessionImpl$IdentifierLoadAccessImpl.doLoad(SessionImpl.java:2866)
at org.hibernate.internal.SessionImpl$IdentifierLoadAccessImpl.load(SessionImpl.java:2847)
at org.hibernate.internal.SessionImpl.find(SessionImpl.java:3482)
at org.hibernate.internal.SessionImpl.find(SessionImpl.java:3456)
at jdk.internal.reflect.GeneratedMethodAccessor219.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:308)
at com.sun.proxy.$Proxy277.find(Unknown Source)
at org.axonframework.eventhandling.tokenstore.jpa.JpaTokenStore.loadToken(JpaTokenStore.java:216)
at org.axonframework.eventhandling.tokenstore.jpa.JpaTokenStore.storeToken(JpaTokenStore.java:111)
at org.axonframework.eventhandling.TrackingEventProcessor$MergeSegmentInstruction.runSafe(TrackingEventProcessor.java:1385)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.lambda$null$0(TrackingEventProcessor.java:1139)
at org.axonframework.common.transaction.TransactionManager.executeInTransaction(TransactionManager.java:47)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.lambda$run$1(TrackingEventProcessor.java:1139)
at org.axonframework.common.ProcessUtils.executeWithRetry(ProcessUtils.java:33)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.run(TrackingEventProcessor.java:1139)
at org.axonframework.eventhandling.TrackingEventProcessor.processInstructions(TrackingEventProcessor.java:332)
at org.axonframework.eventhandling.TrackingEventProcessor.processingLoop(TrackingEventProcessor.java:297)
at org.axonframework.eventhandling.TrackingEventProcessor$TrackingSegmentWorker.run(TrackingEventProcessor.java:1161)
at org.axonframework.eventhandling.TrackingEventProcessor$WorkerLauncher.run(TrackingEventProcessor.java:1276)
at java.base/java.lang.Thread.run(Thread.java:834)
2020-06-12 14:41:51.431 ERROR 14056 [:] --- [ault-executor-0] o.a.a.c.p.EventProcessorControlService   : Failed to merge segment [3] for processor [MyEventHandler]
java.util.concurrent.CompletionException: javax.persistence.OptimisticLockException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [org.axonframework.eventhandling.tokenstore.jpa.TokenEntry#org.axonframework.eventhandling.tokenstore.jpa.TokenEntry$PK@9e1d11cf]
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.run(TrackingEventProcessor.java:1143)
at org.axonframework.eventhandling.TrackingEventProcessor.processInstructions(TrackingEventProcessor.java:332)
at org.axonframework.eventhandling.TrackingEventProcessor.processingLoop(TrackingEventProcessor.java:297)
at org.axonframework.eventhandling.TrackingEventProcessor$TrackingSegmentWorker.run(TrackingEventProcessor.java:1161)
at org.axonframework.eventhandling.TrackingEventProcessor$WorkerLauncher.run(TrackingEventProcessor.java:1276)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: javax.persistence.OptimisticLockException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [org.axonframework.eventhandling.tokenstore.jpa.TokenEntry#org.axonframework.eventhandling.tokenstore.jpa.TokenEntry$PK@9e1d11cf]
at org.hibernate.internal.ExceptionConverterImpl.wrapStaleStateException(ExceptionConverterImpl.java:226)
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:93)
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:200)
at org.hibernate.internal.SessionImpl.find(SessionImpl.java:3515)
at org.hibernate.internal.SessionImpl.find(SessionImpl.java:3456)
at jdk.internal.reflect.GeneratedMethodAccessor219.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke(SharedEntityManagerCreator.java:308)
at com.sun.proxy.$Proxy277.find(Unknown Source)
at org.axonframework.eventhandling.tokenstore.jpa.JpaTokenStore.loadToken(JpaTokenStore.java:216)
at org.axonframework.eventhandling.tokenstore.jpa.JpaTokenStore.storeToken(JpaTokenStore.java:111)
at org.axonframework.eventhandling.TrackingEventProcessor$MergeSegmentInstruction.runSafe(TrackingEventProcessor.java:1385)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.lambda$null$0(TrackingEventProcessor.java:1139)
at org.axonframework.common.transaction.TransactionManager.executeInTransaction(TransactionManager.java:47)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.lambda$run$1(TrackingEventProcessor.java:1139)
at org.axonframework.common.ProcessUtils.executeWithRetry(ProcessUtils.java:33)
at org.axonframework.eventhandling.TrackingEventProcessor$Instruction.run(TrackingEventProcessor.java:1139)
... 5 common frames omitted

Is this caused by a misconfiguration of our JPA token store, or is there some incantation I am missing?

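As I'm sure you already know, @ptomli, you have found a bug on the Axon Framework side regarding this problem. You filed an issue for it on the Axon Server SE GitHub repository (which can be found here), and that issue explains some of the additional steps in more detail.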

The AxonIQ team has investigated this and confirmed that it is indeed a bug. It is tracked as issue #1451 and is resolved in this pull request.

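In short, it has to do with an ordering requirement in the framework, which expects the user to supply the lowest segmentId of the pair of segments to be merged. The fix for this problem is currently included in 4.4.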
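
Until the fix is available, a caller could work around the ordering requirement by always requesting the merge on the lower id of the pair. The sketch below assumes the mask-based segment scheme used by tracking processors, in which the two halves of a split differ only in the highest bit of the segment mask. The class and method names are hypothetical and not part of the Axon API.

```java
final class MergeTarget {

    /**
     * Id of the segment that the given segment would merge with,
     * assuming the pair differs only in the highest bit of the mask.
     */
    static int partnerOf(int segmentId, int mask) {
        int parentMask = mask >>> 1;     // mask of the merged (parent) segment
        int splitBit = mask ^ parentMask; // the bit that distinguishes the pair
        return segmentId ^ splitBit;
    }

    /** The lower id of the pair, which is the id to put in the merge request. */
    static int idToRequest(int segmentId, int mask) {
        return Math.min(segmentId, partnerOf(segmentId, mask));
    }

    public static void main(String[] args) {
        // With four segments (ids 0..3, mask 3), segment 3 pairs with segment 1.
        System.out.println(idToRequest(3, 3));
    }
}
```

Under that assumption, the failed request in the log ("Failed to merge segment [3]") targeted the higher id of the pair; asking to merge segment 1 instead would satisfy the ordering requirement.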

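The investigation also uncovered an additional problem: a merge operation is not allowed if none of the threads can claim the segment to be merged. The delegation process that would allow this will be addressed on the Axon Server side, and that work is tracked in issue #136.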

Hopefully this gives you all the information you need to continue, @ptomli, and lets others know how this problem was resolved.
