我正在尝试在 EMR 上启动一个简单的应用程序。
我使用 SBT 编译一个 jar:
SBT:
name := "exampleTest1 Project"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"
libraryDependencies += "joda-time" % "joda-time" % "2.1"
libraryDependencies += "org.joda" % "joda-convert" % "1.2"
斯卡拉代码:
/* exampleTest1.scala */
package org.apache.spark.examples
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.joda.time.DateTime
import org.joda.time.format.DateTimeFormatter
import org.joda.time.format.DateTimeFormat
object exampleTest1 {
def main() {
val conf = new SparkConf().setAppName("exampleTest1")
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val yesterday = DateTime.now().minusDays(1)
val TestString = "s3n://logs.xxxxxxx.com/yyyyyy/zzzz/"+yesterday.toString("yyyy/MM/dd/yyyy-MM-dd*")
val fullconv = sqlContext.read.format("com.databricks.spark.csv").option("header", "false").option("inferSchema", "false").load("s"+TestString).selectExpr("C0 as datetime", "C1 as ip").registerTempTable("example")
sqlContext.sql("""select * from example limit 1000""").coalesce(10).write.format("com.databricks.spark.csv").save(s"s3n://dev.xxx.com/newSparkResults/sample"+yesterday)
}
}
当我尝试向 EMR 上的现有集群添加步骤时,出现以下错误
16/02/09 10:55:11 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:12 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:13 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:14 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:15 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:16 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:17 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:18 INFO yarn.Client: Application report for application_1455007292848_0007 (state: ACCEPTED)
16/02/09 10:55:19 INFO yarn.Client: Application report for application_1455007292848_0007 (state: FAILED)
16/02/09 10:55:19 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1455007292848_0007 failed 2 times due to AM Container for appattempt_1455007292848_0007_000002 exited with exitCode: 10
For more detailed output, check application tracking page:http://ip-10-65-65-226.ec2.internal:8088/cluster/app/application_1455007292848_0007Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1455007292848_0007_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
退出代码 10 是什么意思? 更好的是 - 我做错了什么?
任何想法将不胜感激
这可能意味着很多事情 - 您应该查看容器日志以获取解释:
yarn logs -applicationId application_1455007292848_0007
将日志打印到标准输出。