Spring Data JPA Is Too Slow



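I recently switched my application to Spring Boot 2. I rely on Spring Data JPA to handle all transactions, and I noticed a huge speed difference compared to my old configuration. Storing about 1000 elements used to take about 6 seconds; now it takes more than 25 seconds. I have seen many posts about batching with Data JPA, but none of them worked.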

Let me show you the two configurations:

Entity (common to both):

@Entity
@Table(name = "category")
public class CategoryDB implements Serializable
{
    private static final long serialVersionUID = -7047292240228252349L;

    @Id
    @Column(name = "category_id", length = 24)
    private String category_id;

    @Column(name = "category_name", length = 50)
    private String name;

    @Column(name = "category_plural_name", length = 50)
    private String pluralName;

    @Column(name = "url_icon", length = 200)
    private String url;

    @Column(name = "parent_category", length = 24)
    @JoinColumn(name = "parent_category", referencedColumnName = "category_id")
    private String parentID;

    // Getters & Setters
}

Old repository (showing only the insert):

@Override
public Set<String> insert(Set<CategoryDB> element)
{
    Set<String> ids = new HashSet<>();
    Transaction tx = session.beginTransaction();
    for (CategoryDB category : element)
    {
        String id = (String) session.save(category);
        ids.add(id);
    }
    tx.commit();
    return ids;
}

Old Hibernate XML configuration file:

<property name="show_sql">true</property>
<property name="format_sql">true</property>
<!-- connection information -->
<property name="hibernate.connection.driver_class">com.mysql.cj.jdbc.Driver</property>
<property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
<!-- database pooling information -->
<property name="connection_provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">100</property>
<property name="hibernate.c3p0.timeout">300</property>
<property name="hibernate.c3p0.max_statements">50</property>
<property name="hibernate.c3p0.idle_test_period">3000</property>

Old statistics:

18949156 nanoseconds spent acquiring 2 JDBC connections;
5025322 nanoseconds spent releasing 2 JDBC connections;
33116643 nanoseconds spent preparing 942 JDBC statements;
3185229893 nanoseconds spent executing 942 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
3374152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)

New repository:

@Repository
public interface CategoryRepository extends JpaRepository<CategoryDB, String>
{
    @Query("SELECT cat.parentID FROM CategoryDB cat WHERE cat.category_id = :#{#category.category_id}")
    String getParentID(@Param("category") CategoryDB category);
}

In my service I use saveAll().

New application.properties:

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.hikari.connection-timeout=6000
spring.datasource.hikari.maximum-pool-size=10
spring.jpa.properties.hibernate.show_sql=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.generate_statistics = true
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQLDialect
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true

New statistics:

24543605 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
136919170 nanoseconds spent preparing 942 JDBC statements;
5457451561 nanoseconds spent executing 941 JDBC statements;
19985781508 nanoseconds spent executing 19 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
20256178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

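Maybe I have misunderstood something about Spring. This is a huge performance difference and I have hit a dead end. Any hint about what is going wrong here would be greatly appreciated.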

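Let's merge the statistics so they can be compared easily. Old lines are prefixed with o, new lines with n. Lines with a count of 0 are ignored. The nanosecond values are written with a space so that the digits before it read as milliseconds.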

o:    18 949156 nanoseconds spent acquiring 2 JDBC connections;
n:    24 543605 nanoseconds spent acquiring 1 JDBC connections;
o:    33 116643 nanoseconds spent preparing 942 JDBC statements;
n:   136 919170 nanoseconds spent preparing 942 JDBC statements;
o:  3185 229893 nanoseconds spent executing 942 JDBC statements;
n:  5457 451561 nanoseconds spent executing 941 JDBC statements; // losing ~2 sec
o:            0 nanoseconds spent executing 0 JDBC batches;
n: 19985 781508 nanoseconds spent executing 19 JDBC batches; // losing ~20 sec
o:  3374 152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
n: 20256 178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections); // losing ~17 sec, processing 3 times the entities
o:         6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
n:            0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

The following seem to be the relevant points:

  • The new version executes 19 batches that take 20 seconds; the old version has no batches at all.

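  • The new version performs 3 flushes instead of 1, which together take 20 seconds, or roughly 6 times as long. This is probably more or less the same extra time as the batches, since the batches are surely part of these flushes.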
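The arithmetic behind these two points can be checked quickly. The sketch below uses only numbers copied from the statistics and the batch_size=50 setting in the question; the class and method names are made up for illustration.

```java
final class StatsCheck {
    // Values copied from the statistics and properties in the question.
    static final int ENTITIES = 941;
    static final int BATCH_SIZE = 50; // hibernate.jdbc.batch_size
    static final long OLD_FLUSH_NANOS = 3_374_152_568L;
    static final long NEW_FLUSH_NANOS = 20_256_178_886L;
    static final int NEW_FLUSH_COUNT = 3;
    static final int NEW_FLUSHED_ENTITIES = 2823;

    // Number of JDBC batches needed to send `rows` rows, rounding up.
    static int expectedBatches(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    // How many times longer the new flushes took compared to the old one.
    static double flushSlowdown() {
        return (double) NEW_FLUSH_NANOS / OLD_FLUSH_NANOS;
    }

    public static void main(String[] args) {
        System.out.println("expected batches:   " + expectedBatches(ENTITIES, BATCH_SIZE));
        System.out.println("entities per flush: " + NEW_FLUSHED_ENTITIES / NEW_FLUSH_COUNT);
        System.out.println("flush slowdown:     " + flushSlowdown());
    }
}
```

With 941 rows and a batch size of 50, rounding up gives 19 batches, which matches the reported count and is at least consistent with every row being written once through batching. 2823 is exactly 3 × 941, so each of the three flushes appears to have visited every entity.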

Although batching should make things faster, there are reports of it making things slower, especially with MySQL: Why Spring's jdbcTemplate.batchUpdate() so slow?

This brings us to a few things you can try or investigate:

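  • Disable batching, to test whether you are actually hitting some kind of slow-batch problem.
  • Use the linked SO post to speed up batching.
  • Log the SQL statements that are actually executed to find the differences. Since this produces rather long logs, try extracting only the SQL statements into two files and comparing them with a diff tool.
  • Log the flushes, to get an idea of what triggers the extra ones.
  • Use breakpoints and a debugger, or additional logging, to find out which entities are being flushed and why there are more of them in the second variant.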
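Before running a full diff on the two SQL logs, a rough statement count per log can already show whether one variant issues extra SELECTs or INSERTs. The following is only a sketch: it assumes the logs were produced with show_sql/format_sql as in the question, so that each statement begins on a line starting with its keyword (optionally after a "Hibernate:" prefix). The class name and the log file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class SqlLogSummary {
    private static final List<String> KEYWORDS =
            Arrays.asList("select", "insert", "update", "delete");

    // Counts lines that start a statement, grouped by the leading SQL keyword.
    // Subselects that begin on their own line are counted too, so treat the
    // result as a first impression, not an exact statement count.
    static Map<String, Integer> countByKeyword(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            String text = line.trim().toLowerCase(Locale.ROOT);
            if (text.startsWith("hibernate:")) {
                text = text.substring("hibernate:".length()).trim();
            }
            for (String keyword : KEYWORDS) {
                if (text.equals(keyword) || text.startsWith(keyword + " ")) {
                    counts.merge(keyword, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java SqlLogSummary old.log new.log
        for (String path : args) {
            System.out.println(path + " -> " + countByKeyword(Files.readAllLines(Paths.get(path))));
        }
    }
}
```

If the counts differ between the old and the new log, that narrows down which statements to look at in the diff.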

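All of the suggestions above operate on JPA. But your statistics and the content of your question suggest that you are doing simple inserts into one or a few tables. Doing this with JDBC, e.g. with CCD_5, might be more efficient and is at least easier to understand.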
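A JDBC-level insert could look roughly like the following. This is a minimal sketch using plain java.sql batching, which is not necessarily the helper the answer had in mind. The class CategoryJdbcInsert, the CategoryRow holder and the batchSize parameter are hypothetical; the table and column names are taken from the annotations in the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

final class CategoryJdbcInsert {
    // Table and column names as declared by @Table/@Column in the question.
    static final String INSERT_SQL =
            "INSERT INTO category (category_id, category_name, category_plural_name, "
                    + "url_icon, parent_category) VALUES (?, ?, ?, ?, ?)";

    // Hypothetical plain holder for one row; not part of the original code.
    static final class CategoryRow {
        final String id, name, pluralName, url, parentId;

        CategoryRow(String id, String name, String pluralName, String url, String parentId) {
            this.id = id;
            this.name = name;
            this.pluralName = pluralName;
            this.url = url;
            this.parentId = parentId;
        }
    }

    // Inserts all rows in one transaction, sending them in chunks of batchSize.
    // Returns the number of rows handed to the driver.
    static int insertAll(Connection connection, List<CategoryRow> rows, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        int sent = 0;
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (CategoryRow row : rows) {
                ps.setString(1, row.id);
                ps.setString(2, row.name);
                ps.setString(3, row.pluralName);
                ps.setString(4, row.url);
                ps.setObject(5, row.parentId, Types.VARCHAR); // parent may be null
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    sent += pending;
                    pending = 0;
                }
            }
            if (pending > 0) { // send the last, partially filled batch
                ps.executeBatch();
                sent += pending;
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
        return sent;
    }
}
```

Whether this is actually faster than the JPA variant depends on the driver and its batching behaviour, so it is worth measuring against the same data set.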

You can use the JDBC template directly; it is much faster than Data JPA.
