Using scala-eclipse for spark



Can someone help me with how to use the scala-eclipse IDE for Spark? I came across this link - http://syndeticlogic.net/?p=311 - but I could not follow it. I ran the command mvn -Phadoop2 eclipse:clean eclipse:eclipse inside the spark directory, and after a long series of downloads it gave me some errors. Please help. Thanks.

Below is the error I got:
[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM .......................... SUCCESS [5:22.386s]
[INFO] Spark Project Core ................................ SUCCESS [17:20.807s]
[INFO] Spark Project Bagel ............................... FAILURE [2.159s]
[INFO] Spark Project GraphX .............................. SKIPPED
[INFO] Spark Project ML Library .......................... SKIPPED
[INFO] Spark Project Streaming ........................... SKIPPED
[INFO] Spark Project Tools ............................... SKIPPED
[INFO] Spark Project Catalyst ............................ SKIPPED
[INFO] Spark Project SQL ................................. SKIPPED
[INFO] Spark Project Hive ................................ SKIPPED
[INFO] Spark Project REPL ................................ SKIPPED
[INFO] Spark Project Assembly ............................ SKIPPED
[INFO] Spark Project External Twitter .................... SKIPPED
[INFO] Spark Project External Kafka ...................... SKIPPED
[INFO] Spark Project External Flume ...................... SKIPPED
[INFO] Spark Project External ZeroMQ ..................... SKIPPED
[INFO] Spark Project External MQTT ....................... SKIPPED
[INFO] Spark Project Examples ............................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 25:15.115s
[INFO] Finished at: Wed May 07 15:27:51 GMT+05:30 2014
[INFO] Final Memory: 22M/81M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop2" could not be activated because it does not exist.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project spark-bagel_2.10: Failed to resolve dependencies for one or more projects in the reactor. Reason: Missing:
[ERROR] ----------
[ERROR] 1) org.apache.spark:spark-core_2.10:jar:1.0.0-SNAPSHOT
[ERROR] 
[ERROR] Try downloading the file manually from the project website.
[ERROR] 
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.apache.spark -DartifactId=spark-core_2.10 -Dversion=1.0.0-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file
[ERROR] 
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.apache.spark -DartifactId=spark-core_2.10 -Dversion=1.0.0-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR] 
[ERROR] Path to dependency:
[ERROR] 1) org.apache.spark:spark-bagel_2.10:jar:1.0.0-SNAPSHOT
[ERROR] 2) org.apache.spark:spark-core_2.10:jar:1.0.0-SNAPSHOT
[ERROR] 
[ERROR] ----------
[ERROR] 1 required artifact is missing.
[ERROR] 
[ERROR] for artifact:
[ERROR] org.apache.spark:spark-bagel_2.10:jar:1.0.0-SNAPSHOT
[ERROR] 
[ERROR] from the specified remote repositories:
[ERROR] maven-repo (http://repo.maven.apache.org/maven2, releases=true, snapshots=false),
[ERROR] apache-repo (https://repository.apache.org/content/repositories/releases, releases=true, snapshots=false),
[ERROR] jboss-repo (https://repository.jboss.org/nexus/content/repositories/releases, releases=true, snapshots=false),
[ERROR] mqtt-repo (https://repo.eclipse.org/content/repositories/paho-releases, releases=true, snapshots=false),
[ERROR] cloudera-repo (https://repository.cloudera.com/artifactory/cloudera-repos, releases=true, snapshots=false),
[ERROR] apache.snapshots (http://repository.apache.org/snapshots, releases=false, snapshots=true),
[ERROR] central (http://repo.maven.apache.org/maven2, releases=true, snapshots=false)
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :spark-bagel_2.10

This happens because there is no profile named hadoop2 in the pom.xml. The closest matches are hadoop-2.2, hadoop-2.3, etc.

You can run the following command instead:

mvn -Phadoop-2.2 eclipse:clean eclipse:eclipse

Alternatively, you can run 'mvn help:all-profiles' to list all available profiles and pick one of those.
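
For what it's worth, the Spark build of that era also lets you pin the exact Hadoop version alongside the profile (the version number below is just an example):

mvn -Phadoop-2.2 -Dhadoop.version=2.2.0 eclipse:clean eclipse:eclipse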

If you want to contribute to the Apache Spark project, then:

  • Go to your spark home and run sbt/sbt eclipse (see the command sketch after this list)
  • In Scala IDE, select File | Import | Existing Projects into Workspace.
  • Select the root directory MY_SPARK_HOME
  • Select Search for nested projects
  • Select the projects that you need
  • Do not select "Copy projects into workspace".
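
From a terminal, that first step looks like this (MY_SPARK_HOME is a placeholder for wherever you checked out Spark):

cd MY_SPARK_HOME
sbt/sbt eclipse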

If you want to use the Spark libraries in an application you are building, you can create a jar with the sbt/sbt assembly command and then add that jar as a library to your application project.
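
As a minimal sketch of what such an application could look like (Spark 1.0-era API; the object name and input path are made up for illustration, and the assembly jar is assumed to be on the project's build path):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._   // brings in reduceByKey on pair RDDs

// Counts words in a local text file, printing each (word, count) pair.
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("local[2]")              // run locally with 2 threads
    val sc = new SparkContext(conf)

    val counts = sc.textFile("input.txt") // illustrative input path
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}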

You can also refer to the eclipse documentation here: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-Eclipse
