How do I create a Parquet file larger than the memory allocated to the node?



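I am trying to create a Parquet file from a table stored in MySQL. The source table contains millions of rows, and after a few minutes I get a "GC overhead limit exceeded" exception.

Can Apache Drill be configured so that the operation temporarily uses disk when no more RAM is available?

These are the steps I took before getting the error: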

  • Put the MySQL JDBC connector in jars/3rdparty
  • Run sqlline.bat -u "jdbc:drill:zk=local"
  • Navigate to http://localhost:8047/storage
  • Configure a new storage plugin to connect to MySQL
  • Navigate to http://localhost:8047/query and run the following queries
  • ALTER SESSION SET `store.format` = 'parquet'
  • ALTER SESSION SET `store.parquet.compression` = 'snappy'
  • CREATE TABLE dfs.tmp.`bigtable.parquet` AS (SELECT * FROM mysql.schema.bigtable)

Then I get the following error and the application exits:

Node ran out of Heap memory, exiting.
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2149)
    at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1956)
    at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3308)
    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:463)
    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3032)
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2280)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2546)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2504)
    at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1370)
    at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
    at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
    at org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup(JdbcRecordReader.java:177)
    at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:101)
    at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:128)
    at org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch(JdbcBatchCreator.java:40)
    at org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch(JdbcBatchCreator.java:33)
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:151)
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
    at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
    at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
    at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:105)
    at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Check drill-env.sh, located in <drill_installation_directory>/conf.

The default values are:

DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_HEAP="4G"

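The default memory for a Drillbit is 8G, but Drill prefers 16G or more, depending on the workload.

If you have enough RAM, you can configure it to 16G.

You can read about this in detail in the Drill documentation.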
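As a rough sketch of that change, the two variables shown above could be edited in drill-env.sh as follows. The 16G figure comes from the recommendation above; the 8G heap value is only an assumption for illustration, added because the reported error is a heap OutOfMemoryError rather than a direct-memory one. Size both to the RAM the machine actually has.

```shell
# drill-env.sh (sketch): raise the Drillbit memory limits.
# 16G direct memory follows the recommendation above.
DRILL_MAX_DIRECT_MEMORY="16G"
# Assumed value for illustration: the stack trace reports a heap
# OutOfMemoryError, so the heap limit is the one that error depends on.
DRILL_HEAP="8G"
```

Drill reads this file at startup, so the Drillbit has to be restarted for new values to take effect.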
