I want to split a string so that I can do an area conversion. My data looks like this:
(149Sq.Yards)
(151Sq.Yards)
(190Sq.Yards)
(190Sq.Yards)
I want to split the data above so that the number is separated from the unit, like this:
149 sq.yards
151 sq.yards
I tried the following code:
a = LOAD '/user/ahmedabad/Makkan_PropertyDetails_Apartment_Ahmedabad.csv' using PigStorage('\t') as (SourceWebSite:chararray,PropertyID:chararray,ListedOn:chararray,ContactName:chararray,TotalViews:int,Price:chararray,PriceperArea:chararray,NoOfBedRooms:int,NoOfBathRooms:int,FloorNoOfProperty:chararray,TotalFloors:int,Possession:chararray,BuiltUpArea:chararray,Furnished:chararray,Ownership:chararray,NewResale:chararray,Facing:chararray,title:chararray,PropertyAddress:chararray,NearByFacilities:chararray,PropertyFeatures:chararray,Sellerinfo:chararray,Description:chararray);
b = FOREACH a GENERATE BuiltUpArea;
c = FILTER b BY (BuiltUpArea matches '.*Sq.Yards.*');
d = FOREACH c GENERATE (bigdecimal) REGEX_EXTRACT(BuiltUpArea,'(.*)', 1) * 9;
When I DUMP d, every value is printed as null.
1 Answer
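The regular expression you used, '(.*)', matches every character, so the capture group returns the whole string and Pig effectively tries to evaluate (149Sq.Yards * 9). The value "149Sq.Yards" cannot be cast to a number, so the cast yields null, and that is why the output is null. A regular expression that captures only the digits extracts the number on its own, so the multiplication becomes (149 * 9).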
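A minimal sketch of a digits-only extraction, reusing relation c and the BuiltUpArea field from the question. The pattern '[^0-9]*([0-9]+).*' is an assumption for illustration, not the expression from the original answer. It assumes every value contains one whole number before the unit, and it keeps the factor 9 from the question. The cast is changed to int here on the assumption that the areas are whole numbers.

```pig
-- Assumption: BuiltUpArea looks like '149Sq.Yards', possibly with a leading '('.
-- [^0-9]* skips any leading non-digit characters,
-- ([0-9]+) captures the first run of digits as group 1,
-- .* consumes the rest of the string (the unit text).
d = FOREACH c GENERATE
        ((int) REGEX_EXTRACT(BuiltUpArea, '[^0-9]*([0-9]+).*', 1)) * 9 AS ConvertedArea;

DUMP d;
```

For the value 149Sq.Yards the capture group is the string 149, which casts cleanly, so the expression evaluates 149 * 9. If some areas contain decimals (for example 149.5), the pattern and the cast would both need adjusting, for instance by capturing '([0-9]+\\.?[0-9]*)' and casting to double.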