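Cassandra connector: difference between joinWithCassandraTable and leftJoinWithCassandraTable (cannot resolve symbol)

Question

I am trying to access data from Cassandra using the DataStax Cassandra connector. The code below works for me. After the join, I am trying to sum the value column from the RDD with the one from Cassandra.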

tm(a.joinWithCassandraTable("ks","tbl")
  .on(SomeColumns("key","key2","key3","key4","key5","key6","key7","key8","key9","key10","key11","key12","key13","key14","key15","column1","column2","column3","column4","column5"))
  .select("value1")
  .map { case (ip, row) =>
    IP(ip.key, ip.key2, ip.key3, ip.key4, ip.key5, ip.key6, ip.key7, ip.key8, ip.key9, ip.key10,
       ip.key11, ip.key12, ip.key13, ip.key14, ip.key15,
       ip.column1, ip.column2, ip.column3, ip.column4, ip.column5,
       ip.value1 + row.getLong("value1"))
  }
  .saveToCassandra("ks", "tbl"))

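However, when I try a left join, I get "cannot resolve symbol getLong". I believe this is because a left join does not guarantee a value, since the match may be empty, but I can't work out how to code this in Scala.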

tm(a.leftJoinWithCassandraTable("ks","tbl")
  .on(SomeColumns("key","key2","key3","key4","key5","key6","key7","key8","key9","key10","key11","key12","key13","key14","key15","column1","column2","column3","column4","column5"))
  .select("value1")
  .map { case (ip, row) =>
    IP(ip.key, ip.key2, ip.key3, ip.key4, ip.key5, ip.key6, ip.key7, ip.key8, ip.key9, ip.key10,
       ip.key11, ip.key12, ip.key13, ip.key14, ip.key15,
       ip.column1, ip.column2, ip.column3, ip.column4, ip.column5,
       ip.value1 + row.getLong("value1"))
  }
  .saveToCassandra("ks", "tbl"))

Any help is appreciated. If any more information is needed, let me know and I will try to add it.

Tags: apache-spark, cassandra, datastax, spark-cassandra-connector

Answer


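Because a left join may find no matching data in Cassandra, you get an Option[Row] instead of a Row object. getLong is a method of the row, not of the Option wrapping it, which is why the symbol cannot be resolved.

Instead of .map { case (ip, row) => ...} you can write: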

.map { case (ip, row) => 
  row match {
    case None => ip
    case Some(data) => IP(...., ip.value1 + data.getLong("value1"))
  }
}

In this case, when you have no data (None), you simply return the IP object itself; when you have data, you construct a new IP object.
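
The None / Some handling can be tried out without a Spark cluster. The following is only a minimal sketch of the same pattern on a plain Scala collection: the two-field IP, the FakeRow class and the merge function are hypothetical stand-ins invented for illustration (the real IP has 21 fields, and the real row type comes from the connector). It is written script-style, for the REPL or a Scala script.

```scala
// Hypothetical, simplified stand-ins (not the connector's types).
final case class IP(key: String, value1: Long)

final case class FakeRow(columns: Map[String, Long]) {
  // Mimics reading a bigint column by name.
  def getLong(name: String): Long = columns(name)
}

// Each element has the shape produced by a left join: (left value, Option[row]).
def merge(pair: (IP, Option[FakeRow])): IP = pair match {
  case (ip, None)       => ip // no match in the table: keep the input unchanged
  case (ip, Some(data)) => ip.copy(value1 = ip.value1 + data.getLong("value1"))
}

val joined: Seq[(IP, Option[FakeRow])] = Seq(
  (IP("a", 10L), Some(FakeRow(Map("value1" -> 5L)))), // matching row found
  (IP("b", 7L), None)                                  // no matching row
)

val merged: Seq[IP] = joined.map(merge)
```

The same logic can also be written with the Option combinators, for example row.map(data => ...).getOrElse(ip), if you prefer that to an explicit match.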

