Read data in 100-day chunks until the complete data set is available in Hive

Posted by z31licg0 on 2021-06-27 in Hive


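I am using a bash script to copy data from prod to test so that I can test against it in Hive. While doing this for one table, I ran into a heap memory error. To work around it, I plan to read the data in 100-day chunks, from the run date (the day the script is executed) back to the earliest date for which data is available. Could you tell me how to implement this in bash? Also, is there any approach other than changing the memory settings?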

Hive Database bigdata bash unix

Source: https://stackoverflow.com/questions/52706909/read-data-for-every-100-days-untill-we-get-the-complete-data-in-hive

1 Answer


beq87vna1#

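Essentially, you need to run a HiveQL (.hql) script from the shell.
Create a .hql script whose query pulls only the last 100 days or so of data, for example `example.hql`:

```
select * from my_database.my_table
where insert_date BETWEEN '2018-07-01' AND '2018-10-01';
```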

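You can now run this script with the Hive CLI: `hive -f example.hql`. Alternatively, you can create a shell script, `run.sh`, and execute the query inside it:

```
#!/bin/bash

# Run the query and write the result set to select.txt.
hive -e "select * from my_database.my_table
where insert_date BETWEEN '2018-07-01' AND '2018-10-01'" > select.txt

# Capture Hive's exit status immediately, before another command overwrites $?.
result=$?
if [ "$result" -ne 0 ]; then
    echo "Error!!!!"
    echo "Hive error number is: $result"
    exit 1
else
    echo "no error, do your stuffs"
fi
```

Then execute the shell script: `sh run.sh`.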
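The question asks for every 100-day window up to the run date, not just one fixed range, so the single query above could be wrapped in a loop. The following is a minimal sketch of that idea, not a definitive implementation. It assumes GNU `date` (for the `-d` option) and that `insert_date` holds `YYYY-MM-DD` values. The names `date_windows`, `START_DATE`, `END_DATE`, `CHUNK_DAYS`, and `DRY_RUN`, the start date `2018-01-01`, and the output file naming are all hypothetical choices made for illustration. By default the script only prints the Hive commands it would run.

```shell
#!/bin/bash
# Sketch: run the query once per window of at most CHUNK_DAYS days.
# Assumes GNU date; all variable and function names are illustrative.

# Print one "from to" pair per line (both dates inclusive) covering
# the range start..end in consecutive, non-overlapping windows.
date_windows() {
    local start=$1 end=$2 chunk_days=$3
    local from=$start to
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    while [[ ! $from > $end ]]; do
        to=$(date -u -d "$from + $((chunk_days - 1)) days" +%F)
        if [[ $to > $end ]]; then
            to=$end   # clamp the last window to the end date
        fi
        echo "$from $to"
        from=$(date -u -d "$to + 1 day" +%F)
    done
}

START_DATE=${START_DATE:-2018-01-01}   # hypothetical earliest insert_date
END_DATE=${END_DATE:-$(date -u +%F)}   # run date (today)
CHUNK_DAYS=${CHUNK_DAYS:-100}

while read -r from to; do
    query="select * from my_database.my_table where insert_date BETWEEN '$from' AND '$to'"
    out_file="select_${from}_${to}.txt"
    if [[ ${DRY_RUN:-1} == 1 ]]; then
        # Dry run (the default in this sketch): show the command only.
        echo "hive -e \"$query\" > $out_file"
    elif ! hive -e "$query" > "$out_file"; then
        echo "Hive failed for window $from to $to" >&2
        exit 1
    fi
done < <(date_windows "$START_DATE" "$END_DATE" "$CHUNK_DAYS")
```

Running it with `DRY_RUN=0 START_DATE=2018-01-01 bash chunks.sh` (the file name is arbitrary) would execute one Hive query per window. Because `BETWEEN` is inclusive at both ends and each window starts the day after the previous one ends, every date falls into exactly one window, provided `insert_date` is a plain date rather than a timestamp.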

Answered 2021-06-27
