我正在尝试通过scala使用gcloud compute ssh运行hive查询
首先,这是我尝试过的
scala> import sys.process._
scala> val results = Seq("hive", "-e", "show databases;").!!
asd
zxc
qwe
scala>
这很好。现在,我想运行相同的 hive 命令,但针对 GCP 集群。我的虚拟机上有 gcloud 设置,从命令行,我可以轻松做到
$ gcloud compute ssh --zone myZone myNode --internal-ip -- 'hive -e "show databases;"'
Updating project ssh metadata...⠶Updated [https://www.googleapis.com/compute/v1/projects/myProject].
Updating project ssh metadata...done.
Waiting for SSH key to propagate.
Warning: Permanently added 'compute.2746937995265952194' (RSA) to the list of known hosts.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 19 100 19 0 0 2982 0 --:--:-- --:--:-- --:--:-- 3166
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: true
OK
asd
zxc
qwe
现在,我想使用 scala 运行上述内容。这是我尝试过的
scala> val results = Seq("gcloud", "compute", "ssh", "--zone", "myZone", "myNode", "--internal-ip", "--", "hive", "-e" ,"show databases").!!
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 19 100 19 0 0 3270 0 --:--:-- --:--:-- --:--:-- 3800
Pseudo-terminal will not be allocated because stdin is not a terminal.
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: true
NoViableAltException(-1@[846:1: ddlStatement : ( createDatabaseStatement | switchDatabaseStatement | dropDatabaseStatement | createTableStatement | dropTableStatement | truncateTableStatement | alterStatement | descStatement | showStatement | metastoreCheck | createViewStatement | createMaterializedViewStatement | dropViewStatement | dropMaterializedViewStatement | createFunctionStatement | createMacroStatement | createIndexStatement | dropIndexStatement | dropFunctionStatement | reloadFunctionStatement | dropMacroStatement | analyzeStatement | lockStatement | unlockStatement | lockDatabase | unlockDatabase | createRoleStatement | dropRoleStatement | ( grantPrivileges )=> grantPrivileges | ( revokePrivileges )=> revokePrivileges | showGrants | showRoleGrants | showRolePrincipals | showRoles | grantRole | revokeRole | setRole | showCurrentRole | abortTransactionStatement );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:144)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:3757)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2382)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
FAILED: ParseException line 1:4 cannot recognize input near 'show' '<EOF>' '<EOF>' in ddl statement
java.lang.RuntimeException: Nonzero exit value: 64
at scala.sys.package$.error(package.scala:27)
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:132)
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:102)
... 50 elided
scala>
为什么我会收到此错误?我也试过
scala> val results = Seq("gcloud", "compute", "ssh", "--zone", "myZone", "myNode", "--internal-ip", "--", "hive", "-e" ,"show databases;").!!
但得到了同样的错误。然后我试了
scala> val results = Seq("gcloud", "compute", "ssh", "--zone", "myZone", "myNode", "--internal-ip", "--", "'hive -e "show databases;"'").!!
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 19 100 19 0 0 3245 0 --:--:-- --:--:-- --:--:-- 3800
Pseudo-terminal will not be allocated because stdin is not a terminal.
bash: hive -e "show databases;": command not found
java.lang.RuntimeException: Nonzero exit value: 127
at scala.sys.package$.error(package.scala:27)
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:132)
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:102)
... 50 elided
如何使用 scala 正确运行 gcloud comput ssh?
上一个示例中不需要单引号。您正在尝试传递字符串:
hive -e "show databases;"
为了好玩,我会在 Scala 中使用三引号:
"""hive -e "show databases;""""
以避免反斜杠。良好命令行中的单引号由 bash 处理。
这是在 bash 中起作用的方法:
$ gcloud compute ssh --zone myZone myNode --internal-ip -- 'hive -e "show databases;"'
scala.sys.process
在某个时候得到了一些基本的解析。此文件名中有一个空格必须用引号括起来。令人惊讶的是,它似乎做了贝壳式的引号:
$ scala
Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 11.0.3).
Type in expressions for evaluation. Or try :help.
scala> import scala.sys.process._
import scala.sys.process._
scala> "ls -l /tmp/skypeforlinux Crashes".!!
ls: cannot access '/tmp/skypeforlinux': No such file or directory
ls: cannot access 'Crashes': No such file or directory
java.lang.RuntimeException: Nonzero exit value: 2
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:155)
at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:112)
... 28 elided
scala> """ls -l "/tmp/skypeforlinux Crashes"""".!!
res1: String =
"total 0
"
scala> """ls -l '/tmp/skypeforlinux Crashes'""".!!
res2: String =
"total 0
"
scala> """ls -l /tmp/skypeforlin'ux Cr'ashes""".!!
res3: String =
"total 0
"
scala> """echo 'hive -e "show databases;"'""".!!
res4: String =
"hive -e "show databases;"
"
"my house"
两边的双引号是文件名的一部分:
scala> """ls '/tmp/"my house"'""".!!
res5: String =
"/tmp/"my house"
"
我想代码是我学习 shell 样式引号如何工作的地方,尽管我从来没有机会使用这些知识。除了这个答案,所以谢谢你给我这个机会。