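I'm trying to build a regex pattern in Python that will match strings like these:
"THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND OVER)", "VEHICLE - STOLEN", "TRANSPORTATION FACILITY (AIRPORT)", "5600 N FIGUEROA ST" and "400 WORLD WY".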
import re
hello = {"meta": 1, "reza": [[ "row-f696.af3d.c3v9", "00000000-0000-0000-2D2F-EA38F9F11DB9", 0, 1642111191, 1642111191, "{ }", "201412343", "2020-06-15T00:00:00", "2020-06-15T00:00:00", "0700", "14", "Pacific", "1494", "1", "331", "THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND OVER)", "1606 0344 1300 1402", "60", "F", "W", "212", "TRANSPORTATION FACILITY (AIRPORT)", "IC", "Invest Cont", "331", "998", "400 WORLD WY", "33.9433", "-118.4072" ] ,
[ "row-f2wh.yte2-zhv8", "00000000-0000-0000-0BF4-2A6281C66DEF", 0, 1636553859, 1636553859, "{ }", "201107194", "2020-03-11T00:00:00", "2020-03-11T00:00:00", "1100", "11", "Northeast", "1118", "1", "510", "VEHICLE - STOLEN", "0", "108", "PARKING LOT", "IC", "Invest Cont", "510", "5600 N FIGUEROA ST", "34.114", "-118.1949" ]]}
crime = []
for items in hello["reza"]:
    for item in items:
        pattern = re.compile(r'[A-Z].*')
        crime = re.findall(pattern,str(item))
print(crime)
1 Answer
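The most obvious problem in the code is that you reassign `crime` on every iteration of the nested loop, so what you print is only the result of the last `findall` call. Since `findall` returns a list of all matches in `str(item)`, you end up with an empty list, because the last item, `-118.1949`, contains no match. Also, you don't describe how you want to filter the results. Your pattern `[A-Z].*` matches from the first uppercase letter to the end of the string, so it returns nothing for strings without an uppercase letter, and it can never return `5600 N FIGUEROA ST` in full, since a match cannot start at the leading digits.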
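Both points can be seen in a small sketch. The `items` list here is a hypothetical, shortened sample built from values in the question, and using `extend` is just one way to keep earlier matches rather than overwrite them:

```python
import re

# Hypothetical sample: a few values taken from the rows in the question.
items = ["Pacific", "5600 N FIGUEROA ST", "34.114", "-118.1949"]

# Compile the pattern once, outside the loop.
pattern = re.compile(r"[A-Z].*")

crime = []
for item in items:
    # extend() adds this item's matches to the list instead of replacing it.
    crime.extend(pattern.findall(str(item)))

print(crime)  # ['Pacific', 'N FIGUEROA ST']

# The last item has no uppercase letter, so on its own it yields an empty list.
print(pattern.findall("-118.1949"))  # []
```

Note that the address comes back as `N FIGUEROA ST`: the house number is lost because the match starts at the first uppercase letter.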
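Here is a suggestion: keep the strings that contain at least three spaces and that do not start with digits directly followed by `-`, and replace multiple spaces with a single space.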
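A filter along these lines could be sketched as follows. This is not the answer's own code, and it assumes a slightly different rule: a value is kept if it has at least three uppercase letters and does not start with digits directly followed by `-`. The helper names `keep` and `extract` and the shortened `rows` sample are hypothetical.

```python
import re

# Assumed rule (not taken from the original answer): keep a value if it has
# at least three uppercase letters and does not start with digits plus "-".
DIGITS_THEN_DASH = re.compile(r"\d+-")
UPPERCASE = re.compile(r"[A-Z]")
SPACES = re.compile(r" {2,}")


def keep(value):
    """Return True if value looks like a crime description or an address."""
    if DIGITS_THEN_DASH.match(value):  # e.g. dates and UUID-like IDs
        return False
    return len(UPPERCASE.findall(value)) >= 3


def extract(rows):
    """Collect the kept strings from every row, collapsing runs of spaces."""
    result = []
    for row in rows:
        for item in row:
            if isinstance(item, str) and keep(item):
                result.append(SPACES.sub(" ", item))
    return result


# Hypothetical, shortened row built from values in the question.
rows = [[
    "row-f2wh.yte2-zhv8", "00000000-0000-0000-0BF4-2A6281C66DEF", 0,
    "2020-03-11T00:00:00", "Northeast", "VEHICLE - STOLEN",
    "IC", "Invest Cont", "5600 N FIGUEROA   ST", "34.114", "-118.1949",
]]

print(extract(rows))  # ['VEHICLE - STOLEN', '5600 N FIGUEROA ST']
```

Under this assumed rule, values such as `PARKING LOT` in the full rows would be kept as well, so the threshold may need tuning for the real data.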