re.search for the first date in a filename [duplicate]

Posted by k2fxgqgv on 2023-03-31 in Other


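This question already has answers here:

re.search if/if not, but "nonetype object has no attribute 'group'" (2 answers)
Closed yesterday.
This post was edited and submitted for review yesterday and failed to reopen the post:
Original close reason(s) were not resolved
Can someone help me extract the first date, using regex, from a filename formatted like this: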

TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv

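But I get the following error on the re.search line: AttributeError: 'NoneType' object has no attribute 'group'
I assume my re.search pattern is not written correctly. Can someone correct my re.search for the filename above? I only want to extract the first date from the filename. I am new to regular expressions. Thanks!
I tried the following, expecting to extract the first date, in the format 2022-03-04, from each filename:
date = re.search('\b(\d{4}-\d{2}-\d{2})\.', filename)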

regex

Source: https://stackoverflow.com/questions/75879875/re-search-for-first-of-two-dates-in-filenames

2 Answers


#1 4zcjmb1e

There are several problems with your regex.
First, the pattern itself is wrong for this filename:

\b       # Match a word boundary (non-word character followed by word character or vice versa)
(        # followed by a group which consists of
  \d{4}- # 4 digits and '-', then
  \d{2}- # 2 digits and '-', then
  \d{2}  # another 2 digits
)        # and finally followed by
\.       # a dot

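Because your filename (TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv) contains nothing that fits this pattern, re.search() fails and returns None. Calling .group() on that None is what raises the AttributeError.

There is no dot after 2022-03-04.
\b does not match, because _ and 2 are both considered word characters.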
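
Both failures can be checked one at a time. The following is a minimal sketch that uses only the filename from the question; the variable names are illustrative.

```python
import re

filename = 'TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv'

# '_' and digits both count as word characters, so there is no
# word boundary between '_' and '2'.
boundary_before_date = re.search(r'\b2022', filename)

# The date is followed by '-', not '.', so requiring a literal dot fails too.
date_then_dot = re.search(r'\d{4}-\d{2}-\d{2}\.', filename)

# The pattern from the question combines both requirements.
original = re.search(r'\b(\d{4}-\d{2}-\d{2})\.', filename)

print(boundary_before_date, date_then_dot, original)  # None None None
```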

So the regex should be changed to something like this:

(?<=_)   # Match something preceded by '_', which will not be included in our match,
\d{4}-   # 4 digits and '-', then
\d{2}-   # 2 digits and '-', then
\d{2}    # another 2 digits, then
\b       # a word boundary
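
As a side note, Python can take an annotated layout like the one above almost as is: with the re.VERBOSE flag, whitespace and # comments inside the pattern are ignored. A sketch, where the name DATE_RE is only illustrative:

```python
import re

# re.VERBOSE lets the pattern carry its own line-by-line comments.
DATE_RE = re.compile(r"""
    (?<=_)   # preceded by '_', which is not part of the match
    \d{4}-   # 4 digits and '-'
    \d{2}-   # 2 digits and '-'
    \d{2}    # another 2 digits
    \b       # a word boundary
""", re.VERBOSE)

m = DATE_RE.search('TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv')
print(m.group(0) if m else None)  # 2022-03-04
```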

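Now, notice those backslashes? Remember that inside an ordinary string literal they have to be escaped again. A raw string takes care of this for you: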

r'(?<=_)\d{4}-\d{2}-\d{2}\b'
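
To see why the raw string matters, here is a small sketch comparing the two kinds of literal. In an ordinary Python string, '\b' is the backspace character, so the regex engine never sees a word-boundary token.

```python
import re

plain = '\b'    # ordinary literal: a single backspace character
raw = r'\b'     # raw literal: backslash followed by 'b'

print(len(plain), len(raw))    # 1 2
print(plain == '\x08')         # True

# Doubling every backslash by hand is equivalent to using a raw string.
escaped = '(?<=_)\\d{4}-\\d{2}-\\d{2}\\b'
print(escaped == r'(?<=_)\d{4}-\d{2}-\d{2}\b')    # True

# Without the raw prefix, the pattern looks for a literal backspace.
print(re.search('\bTEST', 'TEST_2022'))               # None
print(re.search(r'\bTEST', 'TEST_2022') is not None)  # True
```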

Try it:

import re

filename = 'TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv'
match = re.search(r'(?<=_)\d{4}-\d{2}-\d{2}\b', filename).group(0)

print(match)  # 2022-03-04

2023-03-31

#2 ttygqcqt

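You need to check whether the regex matched before trying to extract the matched text.

import re

# filenames: any iterable of file name strings
for filename in filenames:
    # Pattern as in the question; adjust it to fit your filenames
    match = re.search(r'\b(\d{4}-\d{2}-\d{2})\.', filename)
    if not match:
        continue  # no match: skip instead of calling .group() on None
    date = match.group(1)
    ...

Also note the use of an r'...' raw string, and of group(1) to extract only the text matched by the parenthesized expression.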
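
The check can also be wrapped in a small helper that returns None when nothing matches. This is only a sketch: the names first_date and DATE_RE are hypothetical, and the pattern is the lookbehind version from the other answer, which assumes the date directly follows an underscore.

```python
import re
from typing import Optional

# Assumed pattern: a YYYY-MM-DD date directly after '_', captured in group 1.
DATE_RE = re.compile(r'(?<=_)(\d{4}-\d{2}-\d{2})\b')

def first_date(filename: str) -> Optional[str]:
    """Return the first date found after an underscore, or None."""
    match = DATE_RE.search(filename)
    return match.group(1) if match else None

print(first_date('TEST_2022-03-04-05-30-20.csv_parsed.csv_encrypted.csv'))  # 2022-03-04
print(first_date('no_date_here.csv'))  # None
```

Returning None leaves it to the caller to decide whether a filename without a date should be skipped or treated as an error.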

2023-03-31
