How to filter records in Hive using Spark

Posted by wmomyfyw on 2021-06-26 in Hive


Why is the string not being compared?
My input is -

+-------+
|      y|
+-------+
| ""no""|
| ""no""|
| ""no""|
|""yes""|
| ""no""|
| ""no""|
| ""no""|
| ""no""|
|""yes""|
| ""no""|
| ""no""|
| ""no""|
| ""no""|
|""yes""|
| ""no""|
| ""no""|
+-------+

I am querying -

sqlContext.sql("select count(y) from dummy where y='yes'").show()

The output is -

+---+
|_c0|
+---+
|  0|
+---+
`y` is declared as a string type in the DDL.

Hive apache-spark apache-spark-sql hiveql

Source: https://stackoverflow.com/questions/44823303/how-to-filter-records-in-hive-using-spark

1 Answer


u7up0aaq1#

You should try this:

sqlContext.sql("select count(y) from dummy where y='\"\"yes\"\"'").show()

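Note that your data is ""yes"", not just yes. The double quotes are part of the stored value, so comparing against 'yes' matches nothing.
You still need to clean up the data :)
Or do this: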

sqlContext.sql("select count(y) from dummy where y like '%yes%'").show()
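
As a rough sketch of the "clean up the data" suggestion, the quotes could be stripped before comparing. The first half below is plain Python that only illustrates why the equality fails. The second half builds a query around `regexp_replace`, which is a built-in Hive / Spark SQL function. The names `SAMPLE_ROWS`, `strip_quotes`, `CLEAN_QUERY` and `count_yes` are made up for this illustration, and it assumes the double quotes are unwanted everywhere in the column.

```python
# A few values copied from the `y` column in the question.
# The double quotes are part of the data itself.
SAMPLE_ROWS = ['""no""', '""no""', '""yes""', '""no""', '""yes""']


def strip_quotes(value):
    # Hypothetical helper: drop every double quote from the value,
    # which is what regexp_replace does in the query below.
    return value.replace('"', '')


# Plain equality compares the whole string, so 'yes' never matches.
exact_matches = sum(1 for v in SAMPLE_ROWS if v == 'yes')

# After stripping the quotes the comparison behaves as expected.
cleaned_matches = sum(1 for v in SAMPLE_ROWS if strip_quotes(v) == 'yes')

# The same idea expressed as a query string (assumption: table `dummy`
# and column `y` as in the question).
CLEAN_QUERY = "select count(y) from dummy where regexp_replace(y, '\"', '') = 'yes'"


def count_yes(sql_context):
    # Run CLEAN_QUERY on an existing SQLContext / HiveContext
    # and return the single count value.
    return sql_context.sql(CLEAN_QUERY).collect()[0][0]


print(exact_matches, cleaned_matches)  # prints: 0 2
```

With the `sqlContext` from the question this would be called as `count_yes(sqlContext)`. Keep in mind that `like '%yes%'` also matches any value that merely contains `yes` (for example `yesterday`), so an exact comparison on the cleaned value is stricter.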

Posted 2021-06-26
