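I am trying to connect Spark to a Cassandra database from Java. For this I am using the latest version of the Spark-Cassandra-Connector, 2.4.0. The connection itself works and I get the data back as an RDD, but I cannot read the data out of that structure. If I supply the third (row reader factory) argument of cassandraTable(), I get: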
> Wrong 3rd argument type. Found: 'java.lang.Class<com.journaldev.sparkdemo.JohnnyDeppDetails>', required: 'com.datastax.spark.connector.rdd.reader.RowReaderFactory<T>'
Can anyone tell me which version I should use, or what is wrong here?
CassandraTableScanJavaRDD priceRDD2 = CassandraJavaUtil.javaFunctions(sc).cassandraTable(keyspace, table, JohnnyDeppDetails.class);
My pom.xml:
<!-- Import Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.5.0-M2</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.9</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-mapping</artifactId>
<version>2.1.9</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>
Instead of passing the class instance, you need to create a RowReaderFactory with the mapRowTo function, for example (this is from my own example):
CassandraJavaRDD<UUIDData> uuids = javaFunctions(spark.sparkContext())
.cassandraTable("test", "utest", mapRowTo(UUIDData.class));
When you write data back, you can likewise convert the class into the corresponding factory with the mapToRow function.
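To see why the compiler rejects a bare Class argument, here is a minimal, self-contained sketch that does not use the connector at all. Every name in it (RowReaderSketch, ToyRowReaderFactory, toyMapRowTo, toyCassandraTable, UuidData) is a hypothetical stand-in invented for illustration, and the reflection-based mapping is only an assumption about the general idea, not how the connector works internally. It shows a method that expects a factory, and a helper that wraps a Class into one.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowReaderSketch {

    // Hypothetical stand-in for a row reader factory: turns one row into a T.
    public interface ToyRowReaderFactory<T> {
        T read(Map<String, Object> row);
    }

    // Hypothetical bean whose field names match the column names.
    public static class UuidData {
        public int id;
        public String name;
    }

    // Toy counterpart of mapRowTo: wraps a Class into a factory that
    // fills the bean's fields by column name using reflection.
    public static <T> ToyRowReaderFactory<T> toyMapRowTo(Class<T> type) {
        return row -> {
            try {
                T bean = type.getDeclaredConstructor().newInstance();
                for (Map.Entry<String, Object> column : row.entrySet()) {
                    Field field = type.getDeclaredField(column.getKey());
                    field.setAccessible(true);
                    field.set(bean, column.getValue());
                }
                return bean;
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException("cannot map row to " + type.getName(), ex);
            }
        };
    }

    // Toy counterpart of cassandraTable: the last parameter is a factory,
    // so passing UuidData.class directly is a type error.
    public static <T> List<T> toyCassandraTable(List<Map<String, Object>> rows,
                                                ToyRowReaderFactory<T> factory) {
        List<T> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            result.add(factory.read(row));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "first");

        // toyCassandraTable(rows, UuidData.class) would not compile:
        // Class<UuidData> is not a ToyRowReaderFactory<UuidData>.
        List<UuidData> data = toyCassandraTable(
                Collections.singletonList(row), toyMapRowTo(UuidData.class));

        System.out.println(data.get(0).id + " " + data.get(0).name); // prints "1 first"
    }
}
```

The commented-out call in main mirrors the call in the question: the argument has the right information but the wrong type, and the wrapping helper is what bridges the two.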