我正在尝试为Spark Streaming项目设置一个开发环境,该项目需要将数据写入Hive。我有一个集群,1主,2从和1开发机器(在Intellij Idea 14编码)。
在spark shell中,一切似乎都工作得很好,我能够通过spark 1.5使用DataFrame.write.insertInto("testtable")将数据存储到Hive的默认数据库中
然而,当在IDEA中创建一个scala项目并使用相同的集群和相同的设置运行它时,当在metastore数据库中创建事务连接工厂时抛出错误,假设在mysql中是metastore_db。
这是hive-site.xml:
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://10.1.50.73:9083</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://10.1.50.73:3306/metastore_db?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>sa</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>huanhuan</value>
</property>
</configuration>
我运行IDEA的机器可以远程登录Mysql和Hive来创建表,所以应该没有权限问题。以下是log4j输出:
> /home/stdevelop/jdk1.7.0_79/bin/java -Didea.launcher.port=7536 -Didea.launcher.bin.path=/home/stdevelop/idea/bin -Dfile.encoding=UTF-8 -classpath /home/stdevelop/jdk1.7.0_79/jre/lib/plugin.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/deploy.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jfxrt.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/charsets.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/javaws.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jfr.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jce.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/jsse.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/rt.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/resources.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/management-agent.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/zipfs.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunec.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunpkcs11.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/sunjce_provider.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/localedata.jar:/home/stdevelop/jdk1.7.0_79/jre/lib/ext/dnsns.jar:/home/stdevelop/IdeaProjects/StreamingIntoHive/target/scala-2.10/classes:/root/.sbt/boot/scala-2.10.4/lib/scala-library.jar:/home/stdevelop/SparkDll/spark-assembly-1.5.0-hadoop2.5.2.jar:/home/stdevelop/SparkDll/datanucleus-api-jdo-3.2.6.jar:/home/stdevelop/SparkDll/datanucleus-core-3.2.10.jar:/home/stdevelop/SparkDll/datanucleus-rdbms-3.2.9.jar:/home/stdevelop/SparkDll/mysql-connector-java-5.1.35-bin.jar:/home/stdevelop/idea/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain StreamingIntoHive 10.1.50.68 8080
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/09/22 19:43:18 INFO SparkContext: Running Spark version 1.5.0
15/09/22 19:43:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/22 19:43:22 INFO SecurityManager: Changing view acls to: root
15/09/22 19:43:22 INFO SecurityManager: Changing modify acls to: root
15/09/22 19:43:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/09/22 19:43:26 INFO Slf4jLogger: Slf4jLogger started
15/09/22 19:43:26 INFO Remoting: Starting remoting
15/09/22 19:43:26 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.1.50.68:58070]
15/09/22 19:43:26 INFO Utils: Successfully started service 'sparkDriver' on port 58070.
15/09/22 19:43:26 INFO SparkEnv: Registering MapOutputTracker
15/09/22 19:43:26 INFO SparkEnv: Registering BlockManagerMaster
15/09/22 19:43:26 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e7fdc896-ebd2-4faa-a9fe-e61bd93a9db4
15/09/22 19:43:26 INFO MemoryStore: MemoryStore started with capacity 797.6 MB
15/09/22 19:43:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-fb07a3ad-8077-49a8-bcaf-12254cc90282/httpd-0bb434c9-1418-49b6-a514-90e27cb80ab1
15/09/22 19:43:27 INFO HttpServer: Starting HTTP Server
15/09/22 19:43:27 INFO Utils: Successfully started service 'HTTP file server' on port 38865.
15/09/22 19:43:27 INFO SparkEnv: Registering OutputCommitCoordinator
15/09/22 19:43:29 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/09/22 19:43:29 INFO SparkUI: Started SparkUI at http://10.1.50.68:4040
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/mysql-connector-java-5.1.35-bin.jar at http://10.1.50.68:38865/jars/mysql-connector-java-5.1.35-bin.jar with timestamp 1442922209496
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-api-jdo-3.2.6.jar at http://10.1.50.68:38865/jars/datanucleus-api-jdo-3.2.6.jar with timestamp 1442922209498
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-rdbms-3.2.9.jar at http://10.1.50.68:38865/jars/datanucleus-rdbms-3.2.9.jar with timestamp 1442922209534
15/09/22 19:43:29 INFO SparkContext: Added JAR /home/stdevelop/SparkDll/datanucleus-core-3.2.10.jar at http://10.1.50.68:38865/jars/datanucleus-core-3.2.10.jar with timestamp 1442922209564
15/09/22 19:43:30 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/09/22 19:43:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.1.50.71:7077...
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20150922062654-0004
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/0 on worker-20150921191458-10.1.50.71-44716 (10.1.50.71:44716) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/0 on hostPort 10.1.50.71:44716 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/1 on worker-20150921191456-10.1.50.73-36446 (10.1.50.73:36446) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/1 on hostPort 10.1.50.73:36446 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor added: app-20150922062654-0004/2 on worker-20150921191456-10.1.50.72-53999 (10.1.50.72:53999) with 1 cores
15/09/22 19:43:32 INFO SparkDeploySchedulerBackend: Granted executor ID app-20150922062654-0004/2 on hostPort 10.1.50.72:53999 with 1 cores, 1024.0 MB RAM
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/1 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/0 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/2 is now LOADING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/0 is now RUNNING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/1 is now RUNNING
15/09/22 19:43:32 INFO AppClient$ClientEndpoint: Executor updated: app-20150922062654-0004/2 is now RUNNING
15/09/22 19:43:33 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 60161.
15/09/22 19:43:33 INFO NettyBlockTransferService: Server created on 60161
15/09/22 19:43:33 INFO BlockManagerMaster: Trying to register BlockManager
15/09/22 19:43:33 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.68:60161 with 797.6 MB RAM, BlockManagerId(driver, 10.1.50.68, 60161)
15/09/22 19:43:33 INFO BlockManagerMaster: Registered BlockManager
15/09/22 19:43:34 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
15/09/22 19:43:35 INFO SparkContext: Added JAR /home/stdevelop/Builds/streamingintohive.jar at http://10.1.50.68:38865/jars/streamingintohive.jar with timestamp 1442922215169
15/09/22 19:43:39 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@10.1.50.72:40110/user/Executor#-132020084]) with ID 2
15/09/22 19:43:39 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@10.1.50.71:38248/user/Executor#-1615730727]) with ID 0
15/09/22 19:43:40 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.72:37819 with 534.5 MB RAM, BlockManagerId(2, 10.1.50.72, 37819)
15/09/22 19:43:40 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.71:48028 with 534.5 MB RAM, BlockManagerId(0, 10.1.50.71, 48028)
15/09/22 19:43:42 INFO HiveContext: Initializing execution hive, version 1.2.1
15/09/22 19:43:42 INFO ClientWrapper: Inspected Hadoop version: 2.5.2
15/09/22 19:43:42 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.5.2
15/09/22 19:43:42 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@10.1.50.73:56385/user/Executor#1871695565]) with ID 1
15/09/22 19:43:43 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.50.73:43643 with 534.5 MB RAM, BlockManagerId(1, 10.1.50.73, 43643)
15/09/22 19:43:45 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/09/22 19:43:45 INFO ObjectStore: ObjectStore, initialize called
15/09/22 19:43:47 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/09/22 19:43:47 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/09/22 19:43:47 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:43:48 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:43:58 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/09/22 19:44:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:10 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:10 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:12 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
15/09/22 19:44:12 INFO ObjectStore: Initialized ObjectStore
15/09/22 19:44:13 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/09/22 19:44:14 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/09/22 19:44:15 INFO HiveMetaStore: Added admin role in metastore
15/09/22 19:44:15 INFO HiveMetaStore: Added public role in metastore
15/09/22 19:44:16 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/09/22 19:44:16 INFO HiveMetaStore: 0: get_all_databases
15/09/22 19:44:16 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
15/09/22 19:44:17 INFO HiveMetaStore: 0: get_functions: db=default pat=*
15/09/22 19:44:17 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
15/09/22 19:44:17 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/09/22 19:44:18 INFO SessionState: Created local directory: /tmp/9ee94679-df51-46bc-bf6f-66b19f053823_resources
15/09/22 19:44:18 INFO SessionState: Created HDFS directory: /tmp/hive/root/9ee94679-df51-46bc-bf6f-66b19f053823
15/09/22 19:44:18 INFO SessionState: Created local directory: /tmp/root/9ee94679-df51-46bc-bf6f-66b19f053823
15/09/22 19:44:18 INFO SessionState: Created HDFS directory: /tmp/hive/root/9ee94679-df51-46bc-bf6f-66b19f053823/_tmp_space.db
15/09/22 19:44:19 INFO HiveContext: default warehouse location is /user/hive/warehouse
15/09/22 19:44:19 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
15/09/22 19:44:19 INFO ClientWrapper: Inspected Hadoop version: 2.5.2
15/09/22 19:44:19 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.5.2
15/09/22 19:44:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/22 19:44:22 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/09/22 19:44:22 INFO ObjectStore: ObjectStore, initialize called
15/09/22 19:44:23 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/09/22 19:44:23 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/09/22 19:44:23 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/09/22 19:44:25 WARN HiveMetaStore: Retrying creating default database after error: Error creating transactional connection factory
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:227)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:186)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:393)
at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:175)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:178)
at StreamingIntoHive$.main(StreamingIntoHive.scala:42)
at StreamingIntoHive.main(StreamingIntoHive.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
NestedThrowablesStackTrace:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:227)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:186)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:393)
at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:175)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:178)
at StreamingIntoHive$.main(StreamingIntoHive.scala:42)
at StreamingIntoHive.main(StreamingIntoHive.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.doLoadClass(IsolatedClientLoader.scala:165)
at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.loadClass(IsolatedClientLoader.scala:153)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.datanucleus.store.rdbms.connectionpool.DBCPConnectionPoolFactory.createConnectionPool(DBCPConnectionPoolFactory.java:59)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:85)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
………………
………………
Process finished with exit code 1
===============谁能帮我找出原因吗?谢谢。
我相信添加hiveContext可能会有所帮助,如果没有在代码中添加的话。
import org.apache.spark.sql.hive.HiveContext
import hiveContext.implicits._
import hiveContext.sql
val hiveContext = new HiveContext(sc)
hive上下文增加了在MetaStore中查找表和使用HiveQL编写查询的支持。没有Hive部署的用户仍然可以创建HiveContext。如果没有通过hive-site.xml配置,上下文将自动在当前目录中创建metastore_db和warehouse。——来自spark示例
问题已被解决,
实际上我手动创建了metastore_db,所以如果spark连接mysql,它将通过直接sql DbType测试。然而,由于它没有直接传递mysql sql,因此spark使用derby作为默认的metastore db,如log4j中所示。这意味着spark没有连接到mysql中的metastore_db,尽管hive-site.xml配置正确。我在问题评论中找到了一个解决方案。希望能有所帮助
如果您使用Mysql数据库,请在spark-env.sh文件中指定Mysql驱动程序路径
SampleCode:出口SPARK_CLASSPATH = "/home/$ {user.name}/工作/apache-hive-2.0.0-bin/lib/mysql-connector-java-5.1.38-bin.jar"
出口HADOOP_CONF_DIR = $ HADOOP_HOME/etc/hadoop
出口YARN_CONF_DIR = $ HADOOP_HOME/etc/hadoop
出口SPARK_LOG_DIR =/home/hadoop/工作/spark-2.0.0-bin-hadoop2.7/日志
出口SPARK_WORKER_DIR =/home/hadoop/工作/spark-2.0.0-bin-hadoop2.7/日志/worker_dir
更多信息:https://www.youtube.com/watch?v=y3E8tUFhBt0