Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/12087090/

Date: 2020-10-31 07:32:34  Source: igfitidea

Using Scan in HBase with start row, end row and a filter

Tags: java, hbase, database-scan

Asked by Andrea

I need to use a Scan in HBase for scanning all rows that meet certain criteria: that's the reason why I will use a filter (really a compound filter list that includes two SingleColumnValueFilter). Now, I have my rowKeys structured in this way:

a.b.x|1|1252525  
a.b.x|1|2373273  
a.b.x|1|2999238  
...  
a.b.x|2|3000320  
a.b.x|2|4000023  
...  
a.b.y|1|1202002  
a.b.y|1|1778949  
a.b.y|1|2738273  

and as an additional requirement, I need to iterate only those rows having a rowKey starting with "a.b.x|1"

Now, the questions

  1. If I add a PrefixFilter to my filter list, does the scanner still scan all rows (applying the filter to each of them)?
  2. If I instantiate the Scan with a startRow (the prefix) and the filter list (without the PrefixFilter), I understand that the scan starts from the given row prefix. So, assuming I use "a.b.x." as startRow, will the scan also cover the a.b.y rows?
  3. What is the behaviour if I use new Scan(startRow, endRow) and then setFilter? In other words: what about the missing constructor Scan(byte [] start, byte [] end, Filter filter)?

Thanks in advance
Andrea

Answered by srav

Row keys are sorted lexicographically in HBase, so all the "a.b.x|1" rows come before the "a.b.x|2" rows, and so on. Because row keys are stored as byte arrays and sorted byte by byte, be careful with row keys that are not fixed length and with keys that mix different character classes. For your requirement, something along these lines should work:

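import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Create a Scan with start (inclusive) and stop (exclusive) row keys
Scan scan = new Scan(Bytes.toBytes("a.b.x|1"), Bytes.toBytes("a.b.x|2"));

// Attach the column filters you already have to this Scan
scan.setFilter(colFilter);

// Get a scanner from your table and iterate through the results
ResultScanner scanner = table.getScanner(scan);
try {
    for (Result result = scanner.next(); result != null; result = scanner.next()) {
        // Use the result object
    }
} finally {
    scanner.close(); // release the scanner's resources
}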
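
To see why the stop row matters, here is a minimal sketch that imitates the row-key range of a scan using a sorted set from the Java standard library. It does not touch HBase: the class ScanRangeSketch and the methods scanFrom and scanRange are hypothetical stand-ins, and the keys are the ones from the question. The assumption is that a scan reads rows in sorted key order from the start row (inclusive) up to the stop row (exclusive), or to the end of the table when no stop row is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class ScanRangeSketch {

    // Stand-in for a table: row keys kept in sorted order. For plain ASCII keys,
    // String ordering matches the byte-wise ordering used for row keys.
    static final TreeSet<String> ROWS = new TreeSet<>(Arrays.asList(
            "a.b.x|1|1252525", "a.b.x|1|2373273", "a.b.x|1|2999238",
            "a.b.x|2|3000320", "a.b.x|2|4000023",
            "a.b.y|1|1202002", "a.b.y|1|1778949", "a.b.y|1|2738273"));

    // Start row only: every key from startRow (inclusive) to the end of the table.
    public static List<String> scanFrom(String startRow) {
        return new ArrayList<>(ROWS.tailSet(startRow, true));
    }

    // Start and stop row: startRow is inclusive, stopRow is exclusive.
    public static List<String> scanRange(String startRow, String stopRow) {
        return new ArrayList<>(ROWS.subSet(startRow, true, stopRow, false));
    }

    public static void main(String[] args) {
        // Without a stop row, the a.b.x|2 and a.b.y keys are visited as well.
        System.out.println(scanFrom("a.b.x|1"));
        // With a stop row, only the a.b.x|1 keys fall inside the range.
        System.out.println(scanRange("a.b.x|1", "a.b.x|2"));
    }
}
```

In this model a start row alone does not stop at the end of the prefix, so a.b.y is still visited. The column filter then only has to look at the rows inside the range.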

update: ToBytes should be toBytes
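
The warning above about row keys that are not fixed length can be made concrete with a small standard-library sketch. It sorts a few keys with an unsigned byte-by-byte comparison, which is the ordering assumed here for row keys. The class RowKeyOrderSketch, its methods, and the extra keys "a.b.x|10|500" and "a.b.X|1|1" are made up for illustration and are not part of the original question.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowKeyOrderSketch {

    // Unsigned, byte-by-byte comparison of two keys (requires Java 9+).
    public static int compareRowKeys(byte[] left, byte[] right) {
        return Arrays.compareUnsigned(left, right);
    }

    // Returns a copy of the keys sorted by their UTF-8 bytes.
    public static List<String> sortAsRowKeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> compareRowKeys(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "a.b.x|2|3000320",
                "a.b.x|10|500",     // hypothetical key with a two-digit middle field
                "a.b.x|1|1252525",
                "a.b.y|1|1202002",
                "a.b.X|1|1");       // hypothetical key with an upper-case letter
        for (String key : sortAsRowKeys(keys)) {
            System.out.println(key);
        }
    }
}
```

Two things stand out in the sorted result. The upper-case "a.b.X" key sorts before every lower-case "a.b.x" key, and "a.b.x|10|500" sorts before "a.b.x|1|1252525" because the byte for '0' is smaller than the byte for '|'. A range from "a.b.x|1" to "a.b.x|2" would therefore also pick up the "a.b.x|10" key, which is one reason to prefer zero-padded, fixed-length fields.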

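
In the answer the stop row "a.b.x|2" was written by hand. As a rough sketch, a stop row can also be derived from any prefix by incrementing its last byte. The helper stopRowForPrefix below is hypothetical, uses only the Java standard library, and assumes that an empty stop row means "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRowSketch {

    // Returns the smallest key that sorts after every key starting with the prefix.
    // Trailing 0xFF bytes cannot be incremented, so they are dropped first.
    // An empty result means there is no upper bound (assumed: scan to the end).
    public static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte by one
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("a.b.x|1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8));

        // Including the trailing delimiter in the prefix keeps keys such as
        // "a.b.x|10|..." out of the range.
        byte[] strict = stopRowForPrefix("a.b.x|1|".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(strict, StandardCharsets.UTF_8));
    }
}
```

The prefix itself would be passed as the start row and the returned bytes as the stop row of the Scan.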