sql - How to search for text in all rows without specifying each column individually

c90pui9n posted on 2021-05-29 in Hadoop

For example:
Given the following table and data, find the rows that contain the word "on" (case-insensitive).

```
create table t (i int,dt date,s1 string,s2 string,s3 string)
;

insert into t
select  inline
        (
            array
            (
                struct(1,date '2017-03-15','Now we take our time','so nonchalant','And spend our nights so bon vivant')
               ,struct(2,date '2017-03-16','Quick as a wink','She changed her mind','She stood on the tracks')
               ,struct(3,date '2017-03-17','But I’m talking a Greyhound','On the Hudson River Line','I’m in a New York state of mind')
            )
        )
;

select * from t
;
```

```
+-----+------------+-----------------------------+--------------------------+------------------------------------+
| t.i | t.dt       | t.s1                        | t.s2                     | t.s3                               |
+-----+------------+-----------------------------+--------------------------+------------------------------------+
| 1   | 2017-03-15 | Now we take our time        | so nonchalant            | And spend our nights so bon vivant |
| 2   | 2017-03-16 | Quick as a wink             | She changed her mind     | She stood on the tracks            |
| 3   | 2017-03-17 | But I’m talking a Greyhound | On the Hudson River Line | I’m in a New York state of mind    |
+-----+------------+-----------------------------+--------------------------+------------------------------------+
```

nlejzf6q1#

### Simple (but limited) solution

This solution is relevant only for tables that contain "primitive" types (no structs, arrays, maps, etc.).
The problem with this solution is that the columns are concatenated without a delimiter (and no, `concat_ws(*)` raises an exception), so the words at the column boundaries merge into a single word, e.g. `Greyhound` and `On` become `GreyhoundOn`.

```
select  i
       ,regexp_replace(concat(*),'(?i)on','==>$0<==') as rec

from    t

where   concat(*) rlike '(?i)on'
;
```

```
+---+-----------------------------------------------------------------------------------------------------------+
| i | rec                                                                                                       |
+---+-----------------------------------------------------------------------------------------------------------+
| 1 | 12017-03-15Now we take our timeso n==>on<==chalantAnd spend our nights so b==>on<== vivant                |
| 2 | 22017-03-16Quick as a winkShe changed her mindShe stood ==>on<== the tracks                               |
| 3 | 32017-03-17But I’m talking a Greyhound==>On<== the Huds==>on<== River LineI’m in a New York state of mind |
+---+-----------------------------------------------------------------------------------------------------------+
```
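
The boundary problem becomes visible as soon as the pattern asks for a whole word. A minimal sketch, assuming a Hive version that accepts `select` without `from`; the two literals are the `s1` and `s2` values of row 3, the backslashes are doubled on the assumption that Hive unescapes string literals before the regex engine sees them, and `concat_ws` is given its arguments explicitly here, which is exactly what the question wants to avoid:

```
select  -- 'GreyhoundOn': no word boundary before 'On', so the whole-word pattern does not match
        concat('But I’m talking a Greyhound','On the Hudson River Line')
            rlike '(?i)\\bon\\b' as glued
        -- 'Greyhound|||On': the delimiter restores the boundary, so the pattern matches
       ,concat_ws('|||','But I’m talking a Greyhound','On the Hudson River Line')
            rlike '(?i)\\bon\\b' as delimited
;
```

This is the motivation for building a delimited string before searching for whole words.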

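### Complex (but flexible) solution

This solution is relevant only for tables that contain "primitive" types (no structs, arrays, maps, etc.).
I'm pushing the envelope here, but I managed to generate a delimited string that contains all the columns.
It is now possible to search for whole words.
`(?ix)` turns on case-insensitive matching (`i`) and free-spacing mode (`x`, whitespace inside the pattern is ignored); see http://www.regular-expressions.info/modifiers.html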

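```
-- note: concat(*) over the subquery is i followed by delim_rec,
-- which is why every row in the output starts with i twice
select  i
       ,regexp_replace(concat(*),'(?ix)\\b on \\b','==>$0<==') as delim_rec

from   (select  i
                -- build a format string with one %s per column, separated by '|||',
                -- then let printf fill it with all the column values
               ,printf(concat('%s',repeat('|||%s',field(unhex(1),*,unhex(1))-2)),*) as delim_rec

        from    t
        ) t

where   delim_rec rlike '(?ix)\\b on \\b'
;
```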

```
+---+------------------------------------------------------------------------------------------------------------------+
| i | delim_rec                                                                                                        |
+---+------------------------------------------------------------------------------------------------------------------+
| 2 | 22|||2017-03-16|||Quick as a wink|||She changed her mind|||She stood ==>on<== the tracks                         |
| 3 | 33|||2017-03-17|||But I’m talking a Greyhound|||==>On<== the Hudson River Line|||I’m in a New York state of mind |
+---+------------------------------------------------------------------------------------------------------------------+
```
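
The inner query is dense, so it may help to evaluate its pieces separately. This is only a sketch of how the expression appears to work, assuming the five-column table `t` from the question, no column that holds the same binary value as `unhex(1)`, and a Hive version that accepts `select` without `from`; the literal `4` is `field(...)-2` worked out by hand for this table.

```
-- field(x, a, b, ...) returns the 1-based position of the first argument
-- after x that equals x (0 if there is none). unhex(1) is a binary value that
-- equals none of the int/date/string columns, so the first match is the copy
-- appended after *, i.e. position (number of columns)+1, which is 6 for t.
select  field(unhex(1),*,unhex(1)) as pos
from    t
limit   1
;

-- 6-2 = 4 repetitions of '|||%s' after the leading '%s'
-- give one placeholder per column.
select  concat('%s',repeat('|||%s',4)) as fmt
;

-- printf then fills the placeholders with the column values, in order.
select  printf('%s|||%s|||%s|||%s|||%s',*) as delim_rec
from    t
;
```

Because the column count comes from `field`, the same expression should carry over to a table with a different number of primitive columns without editing the format string by hand.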

### Using an additional external table

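```
-- read the files of t as raw lines: '\0' does not occur in the data,
-- so each whole line lands in the single column rec
create external table t_ext (rec string)
row format delimited
fields terminated by '\0'
location '/user/hive/warehouse/t'
;

-- \x01 is Hive's default field delimiter for text tables
select  cast(split(rec,'\\x01')[0] as int) as i
       ,regexp_replace(regexp_replace(rec,'(?ix)\\b on \\b','==>$0<=='),'\\x01','|||') as rec

from    t_ext

where   rec rlike '(?ix)\\b on \\b'
;
```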

```
+---+-----------------------------------------------------------------------------------------------------------------+
| i | rec                                                                                                             |
+---+-----------------------------------------------------------------------------------------------------------------+
| 2 | 2|||2017-03-16|||Quick as a wink|||She changed her mind|||She stood ==>on<== the tracks                         |
| 3 | 3|||2017-03-17|||But I’m talking a Greyhound|||==>On<== the Hudson River Line|||I’m in a New York state of mind |
+---+-----------------------------------------------------------------------------------------------------------------+
```
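
The `location` in the `create external table` statement has to point at the directory that holds the files of `t`. The path shown is a common default warehouse location and may differ on your cluster. One way to check it (a sketch, output omitted because it depends on the installation):

```
-- the "Location:" line in the output is the directory to use for t_ext
describe formatted t
;
```

This approach also presumes that `t` is stored as delimited text with Hive's default `\x01` field separator; for other storage formats (for example ORC), reading the files as plain text lines is not expected to work.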
