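I have a dataset with some corrupted data: one string column has values that contain \u0000. I need to filter all of them out, and the only thing I control is the WHERE clause. I tried WHERE field NOT LIKE concat('%', chr(00), '%'), but my Hive distribution (AWS EMR) does not recognize chr(). Is there another way to write my WHERE clause so that it filters out values containing \u0000 without using chr()?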
wwtsj6pe1#
Try the following:
WHERE field NOT LIKE '%\000%'
7hiiyaii2#
You can try this:
SELECT '\u0000' AS text;
+-------+--+
| text  |
+-------+--+
|       |
+-------+--+
-- NOT EMPTY

SELECT '\u0000abc' AS text;
+-------+--+
| text  |
+-------+--+
| abc   |
+-------+--+
-- NOT EMPTY
So:
SELECT text FROM (SELECT '\u0000abc' AS text) AS t WHERE text NOT LIKE('\u0000%');
+-------+--+
| text  |
+-------+--+
+-------+--+
-- EMPTY

SELECT text FROM (SELECT '\u0000abc' AS text) AS t WHERE text LIKE('\u0000%');
+-------+--+
| text  |
+-------+--+
| abc   |
+-------+--+
-- NOT EMPTY
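The queries above only match \u0000 at the start of the string. To cover the case in the question, where the character can appear anywhere in the value, the pattern would presumably need a wildcard on both sides. A minimal sketch, assuming the '\u0000' literal is interpreted the same way as in the queries above; the table name my_table is hypothetical and field is the column from the question:

```sql
-- my_table is a hypothetical name used only for illustration.
-- Keeps rows whose field does not contain \u0000 anywhere.
SELECT *
FROM my_table
WHERE field NOT LIKE '%\u0000%';
```

Note that NOT LIKE evaluates to NULL when field is NULL, so those rows are dropped as well. If they should be kept, the condition could be written as WHERE field IS NULL OR field NOT LIKE '%\u0000%'.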