What are the advantages of using "new style" Cassandra paging state over "old style" token functions?

Question

I understand that there are two ways of iterating over a large result set in Cassandra:

  1. Querying explicitly with tokens, as discussed in this article on "Displaying rows from an unordered partitioner with the TOKEN function". This appears to have been the only way of doing things prior to Cassandra 2.0.
  2. Using "paging state".

Paging state appears to be the suggested way of doing things these days, but doing it the old token way still works.

Aside from it being the blessed way of doing things, which is of course a type of advantage, I'd love to understand what are the particular advantages of using the "new" method over the "old"? Is there a reason I should not use token in this way?

Tags: cassandra

Answer


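Whether to use paging or tokens really depends on your requirements and technical abilities. From my point of view, paging is good for fetching data from big partitions, or when you don't have much data in your table and can simply use select * from table.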

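But if you have multiple servers in the cluster and a lot of data, then using token will allow you to read data from specific servers (if you set the routing key correctly) and in parallel (the Spark Cassandra Connector uses token for precisely this reason). This is a great advantage over paging, where you use a single coordinator node that has to go to other nodes to fetch the data it doesn't hold itself. For some people, though, it isn't easy to implement, because you need to cover edge cases, for example when a token range doesn't start exactly at the minimum value. I have an example in Java of how to do this, if you need it.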
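
To make the edge case concrete, here is a rough sketch (in Python, and not the Java example mentioned above) of how a token ring could be cut into per-range queries. It assumes Murmur3Partitioner, whose tokens run from -2^63 to 2^63-1. The table name ks.tbl, the key column id, and the helper names split_ring, to_cql and contains are all made up for illustration. In a real application the ring tokens would come from the driver's cluster metadata, and each query would be routed to a replica that owns the range.

```python
MIN_TOKEN = -2**63      # smallest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1   # largest Murmur3Partitioner token


def split_ring(ring_tokens):
    """Cut the ring into (lower_exclusive, upper_inclusive] ranges.

    None means "no bound on this side". The range between the largest and
    the smallest token wraps around the ring, so it is emitted as two
    half-open pieces; this is the case that does not start at MIN_TOKEN.
    """
    tokens = sorted(set(ring_tokens))
    if not tokens:
        raise ValueError("need at least one ring token")
    # Wrap-around range: everything above the largest token,
    # plus everything up to and including the smallest token.
    ranges = [(tokens[-1], None), (None, tokens[0])]
    # Ordinary ranges between consecutive tokens.
    ranges += list(zip(tokens, tokens[1:]))
    return ranges


def to_cql(rng, table="ks.tbl", key="id"):
    """Build the SELECT for one range (hypothetical table and key names)."""
    lower, upper = rng
    conditions = []
    if lower is not None:
        conditions.append(f"token({key}) > {lower}")
    if upper is not None:
        conditions.append(f"token({key}) <= {upper}")
    return f"SELECT * FROM {table} WHERE " + " AND ".join(conditions)


def contains(rng, token):
    """True if the token falls inside the range (used to check coverage)."""
    lower, upper = rng
    return (lower is None or token > lower) and (upper is None or token <= upper)


if __name__ == "__main__":
    # Made-up ring of three node tokens.
    for rng in split_ring([-6000, 0, 6000]):
        print(to_cql(rng))
```

Each generated query can then be run independently, and in parallel if you like. The ranges do not overlap and together cover every possible token, so each row is read exactly once.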

