I have a table in Hive with the following structure:
> describe volatility2;
Query: describe volatility2
+------------------+---------------+---------+
| name | type | comment |
+------------------+---------------+---------+
| version | int | |
| unmappedmkfindex | int | |
| mfvol | array<string> | |
+------------------+---------------+---------+
It was created from Spark HiveContext code using the DataFrame API, like this:
val volDF = hc.createDataFrame(volRDD)
volDF.saveAsTable(volName)
It inherits the structure of the RDD, as defined in this schema:
def schemaVolatility: StructType = StructType(
StructField("Version", IntegerType, false) ::
StructField("UnMappedMKFIndex", IntegerType, false) ::
StructField("MFVol", DataTypes.createArrayType(StringType), true) :: Nil)
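However, when I try to select from this table using the latest Impala JDBC driver, the last column is not visible to it. My query is very simple: it just tries to print the data to the console, exactly as in the sample code that ships with the driver download: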
String sqlStatement = "select * from default.volatility2";
Class.forName(jdbcDriverName);
con = DriverManager.getConnection(connectionUrl);
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(sqlStatement);
System.out.println("\n== Begin Query Results ======================");
ResultSetMetaData metadata = rs.getMetaData();
for (int i = 1; i <= metadata.getColumnCount(); i++) {
    System.out.println(metadata.getColumnName(i) + ":" + metadata.getColumnTypeName(i));
}
System.out.println("== End Query Results =======================\n\n");
The console output is:
== Begin Query Results ======================
version:version
unmappedmkfindex:unmappedmkfindex
== End Query Results =======================
Is this a bug in the driver, or am I missing something?
1 Answer
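I found the answer to my own question. I'm posting it here so it can help others and save them some searching. Apparently Impala recently introduced support in SQL for so-called "complex types", which include ARRAY. The documentation is here: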
http://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_complex_types.html#complex_types_using
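Based on that, what I had to do was change the query to look like the following:
With that change I got the correct, expected results.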
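For anyone landing here, a rough sketch of the general shape such a query can take, going by the join notation in the linked Impala complex types documentation: as I understand it, select * leaves complex-type columns out of the result set, so the array has to be referenced in the FROM clause as if it were a joined table, and its elements are read through the item pseudocolumn. The class name, the buildArrayQuery helper and the aliases t and a are made up for illustration; this is not necessarily the exact query used above, and it assumes the table is stored as Parquet.

```java
import java.util.StringJoiner;

public class ImpalaArrayQuery {

    // Hypothetical helper: builds a query that joins a table with one of its
    // own ARRAY columns, so every array element is returned as a separate row.
    static String buildArrayQuery(String table, String arrayColumn, String... scalarColumns) {
        StringJoiner select = new StringJoiner(", ");
        for (String col : scalarColumns) {
            select.add("t." + col);
        }
        // "item" is the pseudocolumn Impala exposes for the elements of an ARRAY.
        select.add("a.item");
        // The array column is addressed through the table alias in the FROM clause.
        return "SELECT " + select + " FROM " + table + " t, t." + arrayColumn + " a";
    }

    public static void main(String[] args) {
        String sqlStatement =
            buildArrayQuery("default.volatility2", "mfvol", "version", "unmappedmkfindex");
        // Prints the statement; pass it to stmt.executeQuery(...) as in the question.
        System.out.println(sqlStatement);
    }
}
```

The resulting string can be dropped in place of sqlStatement in the code from the question; each row then carries version, unmappedmkfindex and one element of mfvol.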