happybase crashes when trying to scan a very large HBase column

Posted by brgchamk on 2021-06-09 in HBase

My code is as follows:

count = 0
for key, data in table.scan(columns=["raw:dataInfo"]):
    count += 1
    ...

The column raw:dataInfo may be as large as 50 MB. When I run the code above, happybase crashes and raises the following exception:

Traceback (most recent call last):
  File "happybasetestscan.py", line 8, in <module>
    for key,data in table.scan(columns=["raw:sample"],limit=10):
  File "/usr/lib/python2.6/site-packages/happybase/table.py", line 374, in scan
    self.name, scan, {})
.......
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Any ideas on how to count rows when the column is this large? Thanks!

Answer #1, by sdnqo3pr

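My guess is that the connection to the Thrift server is failing. happybase reports (through the Thrift library) that it could not read any data from the socket.
In any case, if you want to run a full table scan just to count rows (inefficient, but workable), use a filter when scanning: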
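If the cell values themselves are needed, one thing that may be worth trying is a smaller scan batch. `Table.scan` accepts a `batch_size` argument (default 1000) that controls how many rows are fetched per round trip to the Thrift server. With cells of up to 50 MB, a large batch could produce a very large response. Whether that is what breaks the connection here is an assumption, not a confirmed diagnosis. A minimal sketch, where `iter_large_column` is a hypothetical helper name:

```python
def iter_large_column(table, column, batch_size=1):
    """Yield (row_key, data) pairs, fetching only `batch_size` rows per round trip."""
    # Assumption: a small batch keeps each Thrift response small when cells are huge.
    # The trade-off is more round trips to the server.
    for row_key, data in table.scan(columns=[column], batch_size=batch_size):
        yield row_key, data


# Hypothetical usage (host and table name are placeholders):
# import happybase
# connection = happybase.Connection('thrift-host')
# table = connection.table('mytable')
# count = sum(1 for _ in iter_large_column(table, 'raw:dataInfo'))
```

The helper takes the table as a parameter, so it can be exercised against a stub object without a running HBase cluster.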


# Scan, get only keys (data will be empty)
scanner = table.scan(
    row_start=b'aaa',
    row_stop=b'bbb',
    filter=b'KeyOnlyFilter() AND FirstKeyOnlyFilter()',
)

for row_key, data in scanner:
    pass  # do something with row_key

See https://github.com/wbolster/happybase/issues/12#issuecomment-12754400 for more information.
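The key-only scan from this answer can be wrapped into a small counting helper. This is only a sketch: `count_rows` is a hypothetical name, and the row range is left optional so that omitting it scans the whole table.

```python
def count_rows(table, row_start=None, row_stop=None):
    """Count rows using a key-only scan, so cell values are not transferred."""
    scanner = table.scan(
        row_start=row_start,
        row_stop=row_stop,
        # Same filter string as in the answer: drop values, keep one cell per row.
        filter=b'KeyOnlyFilter() AND FirstKeyOnlyFilter()',
    )
    count = 0
    for _row_key, _data in scanner:
        count += 1
    return count


# Hypothetical usage (host and table name are placeholders):
# import happybase
# connection = happybase.Connection('thrift-host')
# table = connection.table('mytable')
# print(count_rows(table))
```

Because the filter strips the values on the server side, the 50 MB cells should never need to travel over the Thrift connection for a plain count.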
