How to store a multi-dimensional array in Cassandra and Hive

The example I am following is:

https://keras.io/examples/nlp/pretrained_word_embeddings/

In this example, the embedding matrix is generated in the following section:

import numpy as np

# voc, word_index and embeddings_index are built earlier in the linked tutorial
num_tokens = len(voc) + 2
embedding_dim = 100
hits = 0
misses = 0
# Prepare embedding matrix
embedding_matrix = np.zeros((num_tokens, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # Words not found in embedding index will be all-zeros.
        # This includes the representation for "padding" and "OOV"
        embedding_matrix[i] = embedding_vector
        hits += 1
    else:
        misses += 1
print("Converted %d words (%d misses)" % (hits, misses))

How can this matrix be pushed to Cassandra and Hive? I tried the following query:

statement = "CREATE TABLE schema.upcoming_calendar3 (embedding_matrix list<frozen<set>>, PRIMARY KEY (embeddings_matrix))"

However, this gives me the following error:

InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid non-frozen collection type for PRIMARY KEY component embedding_matrix"

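I would also like to send the same data to Hive.

Any help on which data types to use in Cassandra and Hive, and on a more efficient way to send the matrix to the database, would be appreciated.

Currently, I am pushing the data like this: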

statement = "INSERT INTO schema.upcoming_calendar3 (embedding_matrix) VALUES (%s)" % (embedding_matrix)

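If you want to use the column as a primary key, declare the outermost collection as frozen:

embedding_matrix frozen<list<set<text>>>

In Hive, the corresponding data type is array<array<type>>; see the Hive manual.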
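As for a more efficient way to send the data, one option (a different layout from the single frozen collection above, not something the question or the linked tutorial prescribes) is to store one row per token, with the token index as the key and the vector as a list of doubles. The sketch below is only a rough illustration: the keyspace `my_keyspace`, the table `embeddings`, the column names and the helper `matrix_to_rows` are all hypothetical, and the driver calls are shown in comments because they need a live cluster.

```python
# Hypothetical one-row-per-token layouts (names are assumptions, not from the question).
CASSANDRA_DDL = (
    "CREATE TABLE IF NOT EXISTS my_keyspace.embeddings ("
    "token_id int PRIMARY KEY, "
    "vector list<double>)"
)
HIVE_DDL = (
    "CREATE TABLE IF NOT EXISTS embeddings ("
    "token_id INT, vector ARRAY<DOUBLE>)"
)
# '?' placeholders are meant for a prepared statement, so values are bound
# by the driver rather than formatted into the query string.
INSERT_CQL = (
    "INSERT INTO my_keyspace.embeddings (token_id, vector) VALUES (?, ?)"
)


def matrix_to_rows(matrix):
    """Convert a 2-D matrix (NumPy array or nested lists) into
    (row_index, [python floats]) tuples, one per row."""
    return [(i, [float(x) for x in row]) for i, row in enumerate(matrix)]


if __name__ == "__main__":
    demo_matrix = [[0.0, 0.0], [0.5, -1.25]]  # stand-in for embedding_matrix
    for token_id, vector in matrix_to_rows(demo_matrix):
        print(token_id, vector)

    # With the DataStax Python driver and a reachable cluster, the rows
    # could then be written roughly like this:
    #   from cassandra.cluster import Cluster
    #   session = Cluster(["127.0.0.1"]).connect()
    #   session.execute(CASSANDRA_DDL)
    #   prepared = session.prepare(INSERT_CQL)
    #   for token_id, vector in matrix_to_rows(embedding_matrix):
    #       session.execute(prepared, (token_id, vector))
```

Keeping the vector out of the primary key avoids having to freeze it, and each row stays small instead of the whole matrix sitting in one cell. Whether this fits depends on how the embeddings are read back.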
