HDInsight and Hive queries

Posted by kxe2p93d on 2021-06-26 in Hive


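We are doing a POC for HDInsight, and I am new to this technology. What we are trying to do is send some data to Azure and write some Hive queries. We have managed the first part: we can push some test data to Azure Blob using AzCopy (I know there are also Azure Tables and Azure Queues, but for the POC Azure Blob is fine).
We can work with this blob from Visual Studio. However, we also want to examine HDInsight and its MapReduce capabilities.
With this background, I have a few questions: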

1. Do I need to copy the data from Azure Blob to somewhere else before writing Hive queries in Ambari, or can Ambari talk directly to data stored in Azure Blob?
2. Is this the right way to process data (keep the data in Azure Blob and use HDInsight/Ambari to process it)?
3. If point 2 is correct, does that mean HDInsight is used only for parallel processing with its MapReduce feature?

Thanks a lot for your insights.

Hive Azure ambari azure-hdinsight

Source: https://stackoverflow.com/questions/50100785/hdinsight-and-hive-queries

1 answer


yzxexxkh1#

1. Yes, HDInsight can read data stored in Blob storage. Examples:
   https://docs.microsoft.com/en-us/azure/hdinsight/hadoop/apache-hadoop-linux-tutorial-get-started
   http://blogs.msdn.microsoft.com/azuredatalake/2017/04/06/azure-hdinsight-3-6-five-things-that-will-make-data-developer-happy/
2. Yes. Depending on what you want to do, you can use Spark, MapReduce, Pig, or Hive to process the data. A good starting point is https://www.edx.org/course/processing-big-data-with-hadoop-in-azure-hdinsight
3. Yes, the data is processed by one of the distributed frameworks, such as Spark, MapReduce, Hive, or Pig.
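To make point 1 a bit more concrete, here is a minimal sketch of how a Hive table can be mapped onto files that stay in Blob storage. It assumes the storage account is attached to the cluster (as its default or a linked account) and that the data is comma-delimited text. The container, account, folder, table, and column names are all hypothetical placeholders, and the two helper functions are only for illustration. They just build the `wasbs://` location and the HiveQL text, they do not talk to Azure.

```python
def wasbs_uri(container, account, path=""):
    """Build a wasbs:// URI for a folder in a blob container."""
    return "wasbs://{}@{}.blob.core.windows.net/{}".format(
        container, account, path.lstrip("/")
    )


def external_table_ddl(table, columns, location):
    """Return HiveQL that maps an external table onto delimited text files."""
    cols = ",\n  ".join(
        "{} {}".format(name, hive_type) for name, hive_type in columns.items()
    )
    return (
        "CREATE EXTERNAL TABLE IF NOT EXISTS {} (\n  {}\n)\n".format(table, cols)
        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        + "STORED AS TEXTFILE\n"
        + "LOCATION '{}';".format(location)
    )


# Hypothetical names: replace with your own container, account, and folder.
location = wasbs_uri("poc-data", "mystorageacct", "/input/events/")
ddl = external_table_ddl(
    "events", {"event_id": "INT", "payload": "STRING"}, location
)
print(ddl)
```

The printed statement can be pasted into a Hive query editor such as the Hive View in Ambari. Because the table is external, Hive reads the files where they are, and dropping the table later leaves the blobs untouched.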

Answered 2021-06-26
