pandas: Python regex to match all sets of words in a path string

llmtgqce  posted on 2023-03-16  in Python

I have a DataFrame like this:

paths.modified

0   {'/user/withdraw/lnurl': {'operations': {'added': ['GET'], 'deleted': ['POST']}}}
1   {'/user/withdraw/lnurl': {'operations': {'added': ['POST'], 'deleted': ['GET']}}}
2   {'/user': {'operations': {'deleted': ['PUT']}}}
3   {'/user': {'operations': {'added': ['PUT']}}}
4   {'/leaderboard': {'operations': {'deleted': ['PUT']}}}
5   {'/dummy': {'operations': {'modified': {'GET': {'operationID': {'from': '', 'to': 'dummyGet'}}, 'PUT': {'operationID': {'from': '', 'to': 'dummyPut'}}}}}, '/file_response': {'operations': {'modified': {'GET': {'operationID': {'from': '', 'to': 'file_responseGet'}}}}}, '/html': {'operations': {'modified': {'POST': {'operationID': {'from': '', 'to': 'htmlPost'}}}}}, '/raw_json': {'operations': {'modified': {'GET': {'operationID': {'from': '', 'to': 'raw_jsonGet'}}}}}, '/solo-object': {'operations': {'modified': {'POST': {'operationID': {'from': '', 'to': 'solo_objectPost'}}}}}
6   {'/login/joule': {'operations': {'deleted': ['GET']}}}
7   {'/login/joule': {'operations': {'added': ['GET']}}}
8   {'/': {'operations': {'modified': {'GET': {'summary': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition returning a valid/invalid badge', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition returning a valid/invalid badge'}, 'description': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition provided via `url` parameter\nreturning a valid/invalid badge\n', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition provided via `url` parameter\nreturning a valid/invalid badge\n'}, 'parameters': {'added': {'query': ['resolve', 'resolveFully', 'validateInternalRefs', 'validateExternalRefs', 'resolveRequestBody', 'resolveCombinators', 'allowEmptyStrings', 'legacyYamlDeserialization', 'inferSchemaType', 'jsonSchemaValidation', 'legacyJsonSchemaValidation']}}}, 'POST': {'summary': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition returning a valid/invalid badge', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition returning a valid/invalid badge'}, 'description': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition provided in request body\nreturning a valid/invalid badge\n', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition provided in request body\nreturning a valid/invalid badge\n'}, 'parameters': {'added': {'query': ['resolve', 'resolveFully', 'validateInternalRefs', 'validateExternalRefs', 'resolveRequestBody', 'resolveCombinators', 'allowEmptyStrings', 'legacyYamlDeserialization', 'inferSchemaType', 'jsonSchemaValidation', 'legacyJsonSchemaValidation']}}}}}}, '/debug': {'operations': {'modified': {'GET': {'summary': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition returning a validation response', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition returning a validation response'}, 'description': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition provided via `url` 
parameter\nreturning a validation response containing any found validation errors\n', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition provided via `url` parameter\nreturning a validation response containing any found validation errors\n'}, 'parameters': {'added': {'query': ['resolve', 'resolveFully', 'validateInternalRefs', 'validateExternalRefs', 'resolveRequestBody', 'resolveCombinators', 'allowEmptyStrings', 'legacyYamlDeserialization', 'inferSchemaType', 'jsonSchemaValidation', 'legacyJsonSchemaValidation']}}}, 'POST': {'summary': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition returning a validation response', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition returning a validation response'}, 'description': {'from': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.0 definition provided via request body\nreturning a validation response containing any found validation errors\n', 'to': 'Validates a Swagger/OpenAPI 2.0 or an OpenAPI 3.x definition provided via request body\nreturning a validation response containing any found validation errors\n'}, 'parameters': {'added': {'query': ['resolve', 'resolveFully', 'validateInternalRefs', 'validateExternalRefs', 'resolveRequestBody', 'resolveCombinators', 'allowEmptyStrings', 'legacyYamlDeserialization', 'inferSchemaType', 'jsonSchemaValidation', 'legacyJsonSchemaValidation']}}}}}}} 
9   {'/{tenant}/forwarders/v2beta1/certificates': {'operations': {'modified': {'POST': {'requestBody': {'content': {'mediaTypeModified': {'application/json': {'schema': {'required': {'stringsdiff': {'added': ['pem']}}}}}}}}}}}}

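I have a regex that is supposed to match only the paths at the start of each entry, that is, the top-level keys in the DataFrame values above, for example /user/withdraw/lnurl, /user, or /leaderboard.
This is my regex in Python: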

'[A-Za-z0-9\-\/{}]+'

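It matches any string made up of alphanumeric characters, hyphens, forward slashes, and curly braces. In short, it looks for strings that begin and end with a single quote and allows any combination of the characters above between those quotes.
But when I apply this pattern in the following code: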

import re

paths_column = paths['paths.modified']

pattern = r"'[A-Za-z0-9\-\/{}]+'"

paths_list = [re.findall(pattern, str(path_dict)) for path_dict in paths_column]

# count the number of paths for each row using list comprehension
count_list = [len(paths) for paths in paths_list]
paths['Paths_modified'] = count_list

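It captures the desired paths, but it also captures all the other words such as operations, added, GET, and so on, which I do not want. I tested the regex in regex101 and it produced the correct matches there, but for some reason I cannot work out why it fails in the code.
Edit: I edited the DataFrame (the 5th value) so that it has multiple paths in a single string, all of which should be matched by the regex.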
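
A minimal reproduction of the symptom described above, using row 2 from the DataFrame as a plain string. It suggests why the extra words appear: keys and values such as 'operations' or 'PUT' are also quoted and consist only of characters from the class, so the pattern has no way to tell them apart from paths. The variable names here are illustrative only.

```python
import re

# Row 2 from the question, converted with str() as in the original code.
row = str({'/user': {'operations': {'deleted': ['PUT']}}})

# The pattern from the question: a quoted run of letters, digits, -, /, { and }.
pattern = r"'[A-Za-z0-9\-\/{}]+'"

matches = re.findall(pattern, row)
print(matches)       # the path plus every other quoted word in the row
print(len(matches))  # this is the count that ends up in 'Paths_modified'
```

Nothing in the character class requires a leading slash, which is the one feature that separates the paths from the other quoted tokens in this data.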

p8h8hvxi  #1

Could you try the regex pattern /[\w{}-]+ and see whether it fits your use case?
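
A quick sketch of how the suggested pattern behaves with re.findall, followed by one possible variant. The variant is an assumption and not part of this answer: it requires an opening quote, a leading slash, and a closing quote followed by a colon, so that only dict keys that look like paths are captured. It would also pick up nested keys that start with a slash, if the data contained any.

```python
import re

# Row 0 from the question, converted with str() as in the original code.
row = str({'/user/withdraw/lnurl': {'operations': {'added': ['GET'],
                                                   'deleted': ['POST']}}})

# Pattern from this answer: matches one slash-prefixed segment at a time,
# so a three-segment path comes back as three separate matches.
segments = re.findall(r"/[\w{}-]+", row)
print(segments)

# Assumed variant (hypothetical, not from the answer): a quoted dict key that
# starts with '/'. The '*' lets the root path '/' match as well.
key_pattern = r"'(/[\w{}/-]*)':"
paths = re.findall(key_pattern, row)
print(paths)

# A row with several paths, shaped like row 5 in the question.
multi = str({'/dummy': {'operations': {}}, '/solo-object': {'operations': {}}})
multi_paths = re.findall(key_pattern, multi)
print(multi_paths)
```

If the column already holds real dicts rather than strings, counting the top-level keys directly with len(path_dict) may be simpler than any regex; whether that applies depends on how the DataFrame was built.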

jfgube3f  #2

As the name suggests, re.findall() finds all occurrences of the specified pattern. If you only want the first occurrence, use re.search().
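
A short comparison of the two calls on row 6 from the question, using the question's own pattern. This is only a sketch; the variable names are illustrative.

```python
import re

# Row 6 from the question, converted with str() as in the original code.
row = str({'/login/joule': {'operations': {'deleted': ['GET']}}})
pattern = r"'[A-Za-z0-9\-\/{}]+'"

# re.findall returns every non-overlapping match as a list of strings.
all_matches = re.findall(pattern, row)

# re.search returns a Match object for the first match only, or None.
first = re.search(pattern, row)
first_text = first.group(0) if first else None

print(all_matches)
print(first_text)
```

Note that re.search yields at most one path per row, so a row holding several paths (like row 5 in the question) would be counted as one.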
