Spark Cassandra Integration using Calliope Library: no records are displayed



I am trying to connect to Cassandra using Tuplejump's calliope-sql via spark-shell.

Spark version: 1.1.0

Connection:

./spark-shell  --master spark://PCSS-HDOP04:7077 --jars calliope-sql-assembly-1.1.0-CTP-U2.jar,calliope-sql_2.10-1.1.0-CTP-U2.jar,spark-cassandra-assembly-1.0.0-SNAPSHOT-jar-with-dependencies.jar,stargate-core-0.9.9.jar,calliope-core-assembly-1.1.0-CTP-U2.jar --conf "spark.cassandra.connection.host=10.234.31.231"

Commands executed:

import com.datastax.spark.connector._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
 val conf = new SparkConf(true).set("spark.cassandra.connection.host", "10.234.31.231")
 val sc = new SparkContext("spark://PCSS-HDOP04:7077", "test", conf)
val sqlContext = new org.apache.spark.sql.CassandraAwareSQLContext(sc)
import sqlContext.createSchemaRDD
sqlContext.sql("select * from roadtrips.roadtrip")

Output:

scala> val res = sqlContext.sql("select * from roadtrips.roadtrip")
15/01/15 14:55:41 INFO CassandraAwareSQLContext$$anon$1: LOOKING UP DB [None] for CF [roadtrips.roadtrip]
15/01/15 14:55:41 INFO CassandraAwareSQLContext$$anon$1: INTERPRETED AS DB [Some(roadtrips)] for CF [roadtrip]
ArrayBuffer(id#21, destination_city_name#22, destination_state_abr#23, distance#24, elapsed_time#25, origin_city_name#26, origin_state_abr#27)
res: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[6] at RDD at SchemaRDD.scala:103
== Query Plan ==
== Physical Plan ==
CassandraTableScan [id#21,destination_city_name#22,destination_state_abr#23,distance#24,elapsed_time#25,origin_city_name#26,origin_state_abr#27], (CassandraRelation 10.234.31.231, 9042, 9160, roadtrips, roadtrip, org.apache.spark.sql.CassandraAwareSQLContext@54bebc7b, None, None, false, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml)), []
scala>

Cassandra table:

 id | destination_city_name | destination_state_abr | distance | elapsed_time | origin_city_name | origin_state_abr
----+-----------------------+-----------------------+----------+--------------+------------------+------------------
 23 |           Los Angeles |                    CA |     2475 |         1700 |         New York |               NY
 33 |           Los Angeles |                    CA |     2475 |         1444 |         New York |               NY

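The command retrieves only the column names, not the records.

Because a query may return a large number of records, the results are not displayed by default; the shell prints only the SchemaRDD and its query plan. To see some of the records retrieved into the RDD, use the first or take methods: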

val res = sqlContext.sql("select * from roadtrips.roadtrip")
res.first()
res.take(3)
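
As a rough sketch of the same idea, the lines below print rows on the driver instead of relying on the REPL echo. They assume the spark-shell session from the question, with sqlContext already created and the roadtrips.roadtrip table reachable; the limit of 10 rows is an arbitrary choice for illustration, not something from the original post.

```scala
// Assumes the spark-shell session above, where sqlContext is the
// CassandraAwareSQLContext created in the question.
val res = sqlContext.sql("select * from roadtrips.roadtrip")

// sql() only builds the query plan. An action such as count(), first(),
// take() or collect() is what actually runs the scan against Cassandra.
println(res.count())

// take(n) brings at most n rows back to the driver, which is the safer
// choice when the table may be large. The value 10 is an arbitrary limit.
res.take(10).foreach(row => println(row.mkString(" | ")))

// collect() brings every row back to the driver; use it only when the
// whole result is known to fit in driver memory.
res.collect().foreach(println)
```

For a table as small as the one shown in the question, collect() is fine; for anything larger, take(n) keeps the amount of data pulled to the driver bounded.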
