regex: Debugging a regular expression in JavaScript

Posted by hm2xizp9 on 2023-03-20 in Java

I have this text stored in a variable named description:

`This is a code update`


*Official Name:

*Pub:

*Agency:

*Reference: https://docs.google.com/document/d/1FFTgytIIcMYnCCgp2cKuUWIwdz7MFolLzCci_-OQn9c/edit#heading=h.81ay6ysgrxtb
https://docs.google.com/document/d/1FFTgytIIcMYnCCgp2cKuUWIwdz7MFolLzCci_-OQn9c/edit#heading=h.81ay6ysgrxtb

*Citation: rg


*Draft Doc Title:

*Draft Source Doc:

*Draft Drive:


*Final Doc Title:

*Final Source Doc:

*Final Drive:
    
*Effective Date:

With the code below, I get back an array with two elements:

//3. Extract Reference    
var reference = description.search("Reference");
if(reference != -1){        
    reference = description.match(/(?<=^\*\s*Reference\s*:)[\s]*[\n]*.*?(?=\n\*)/ms);   
    reference  = reference?.[0].trim();  
    reference = reference.split(/[\r\n]+/);        
}else{
    reference = '';
}
console.log('Reference:');
console.log(reference);

Output:
["https://docs.google.com/document/d/1FFTgytIIcMYnCCgp2cKuUWIwdz7MFolLzCci_-OQn9c/edit#heading=h.81ay6ysgrxtb", "https://docs.google.com/document/d/1FFTgytIIcMYnCCgp2cKuUWIwdz7MFolLzCci_-OQn9c/edit#heading=h.81ay6ysgrxtb"]
However, when I change the description text to:

`This is a code update`   

*Official Name:

*Pub:

*Agency:

*Reference: 

*Citation: rg


*Draft Doc Title:

*Draft Source Doc:

*Draft Drive:


*Final Doc Title:

*Final Source Doc:

*Final Drive:


*Effective Date:

the code returns *Citation: rg. It should return an empty string. Where did I go wrong? Thanks.


Answer #1 (dced5bon)

The problem

Here is a trick many programmers use as a first debugging step: they explain the code to a rubber duck.

/                           
  (?<=^\*\s*Reference\s*:)  # Match after '* Reference:'
  [\s]*[\n]*                # 0 or more whitespaces followed by 0 or more line breaks
  .*?                       # 0 or more characters, including line breaks, lazily
  (?=\n\*)                  # until we meet a line break followed by '*'
/ms                         # flags: 'm' lets '^' match at the start of each line, 's' lets '.' match line breaks too

which matches:

/
  '*Reference:'
  '\n'
  '*Citation: rg\n\n'
  '\n\*'               # Followed by 'Draft Doc Title'
/

Also, [\s]*[\n]* is effectively the same as \s*, because "\n" is already included in "\s".
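To make the walkthrough above concrete, here is a minimal sketch that reproduces the unwanted match. The sample string `emptyRefText` is a shortened stand-in for the second description (an assumption for illustration, not the full text from the question), and `simplified` is the same pattern with `[\s]*[\n]*` collapsed to `\s*`.

```javascript
// Shortened stand-in for the second description (empty Reference field).
const emptyRefText = [
  '*Agency:',
  '',
  '*Reference: ',
  '',
  '*Citation: rg',
  '',
  '',
  '*Draft Doc Title:',
].join('\n');

// Pattern from the question, and the same pattern with [\s]*[\n]* replaced by \s*.
const original = /(?<=^\*\s*Reference\s*:)[\s]*[\n]*.*?(?=\n\*)/ms;
const simplified = /(?<=^\*\s*Reference\s*:)\s*.*?(?=\n\*)/ms;

const originalMatch = emptyRefText.match(original)[0];
const simplifiedMatch = emptyRefText.match(simplified)[0];

// The greedy whitespace run swallows the blank lines, so the lazy .*? starts
// on the next field and runs until the line break before '*Draft Doc Title:'.
console.log(JSON.stringify(originalMatch));
console.log(originalMatch.trim());               // *Citation: rg
console.log(originalMatch === simplifiedMatch);  // true
```

The greedy whitespace run is what lets the match cross into the next field: by the time `.*?` starts, the engine has already moved past the line break that should have ended the match.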
The solution

/
  (?<=^\*\s*Reference\s*:)  # Match after '* Reference:'
  (?:                       # a non-capturing group, consists of
    .                       # a character
    (?!^\*)                 # which is not followed by a '*' at the start of a line
  )*                        # 0 or more times
/ms

Try it on regex101.com.
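As a rough sketch of how this pattern could replace the one in the question's code: the function name `extractReference`, the `example.com` URLs, and the choice to return an empty array for an empty field are all assumptions for illustration (the question asked for an empty string, which you could return instead).

```javascript
// Pattern from this answer: consume characters one at a time, stopping before
// any character that is followed by a line starting with '*'.
const referencePattern = /(?<=^\*\s*Reference\s*:)(?:.(?!^\*))*/ms;

// Hypothetical helper: returns the non-empty lines of the Reference field.
function extractReference(description) {
  const match = description.match(referencePattern);
  if (!match) return []; // no '*Reference:' line at all
  return match[0]
    .split(/[\r\n]+/)
    .map((line) => line.trim())
    .filter(Boolean); // drop blank lines left over from the split
}

// Shortened stand-ins for the two descriptions in the question.
const filledText =
  '*Agency:\n\n*Reference: https://example.com/a\nhttps://example.com/b\n\n*Citation: rg\n';
const blankText = '*Agency:\n\n*Reference: \n\n*Citation: rg\n';

console.log(extractReference(filledText)); // two URLs
console.log(extractReference(blankText));  // []
```

Because the group is repeated zero or more times, the pattern still matches when the field is empty. It just matches only whitespace, which the trim and filter steps discard.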


Answer #2 (km0tfn4u)

You can update the pattern from my previous answer so that it requires at least one non-whitespace character after *Reference: on the same line.
Using JavaScript:

(?<=^\*\s*Reference[^\S\n]*:[^\S\n]*)\S[^]*?(?=^\s*\*)

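Explanation

  • (?<= Positive lookbehind, assert that what is to the left is
  • ^\*\s*Reference Match *Reference at the start of a line (this relies on the m flag), allowing optional whitespace after the *
  • [^\S\n]*:[^\S\n]* Match : between optional spaces, without matching newlines
  • ) Close the lookbehind
  • \S Match a non-whitespace character
  • [^]*? Match any character, including newlines, as few times as possible
  • (?=^\s*\*) Positive lookahead, assert the start of a line, then match optional whitespace characters followed by *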

See a match here and no match here.
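A small sketch of how this first pattern behaves in code, assuming the `m` flag so that `^` matches at line starts. The helper name `referenceOrEmpty` and the sample strings are hypothetical. Since the pattern produces no match for an empty field, the caller has to handle `null`.

```javascript
// First pattern from this answer; the m flag makes ^ match at each line start.
const strictPattern = /(?<=^\*\s*Reference[^\S\n]*:[^\S\n]*)\S[^]*?(?=^\s*\*)/m;

// Hypothetical helper: map "no match" to an empty string.
function referenceOrEmpty(description) {
  const match = description.match(strictPattern);
  return match ? match[0].trim() : '';
}

// Shortened stand-ins for the two descriptions in the question.
const withUrls =
  '*Reference: https://example.com/a\nhttps://example.com/b\n\n*Citation: rg\n';
const withoutUrls = '*Reference: \n\n*Citation: rg\n';

console.log(referenceOrEmpty(withUrls).split(/[\r\n]+/)); // two URLs
console.log(JSON.stringify(referenceOrEmpty(withoutUrls))); // ""
```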
If you want an empty string back instead of no match, you can omit matching the non-whitespace character:

(?<=^\*\s*Reference[^\S\n]*:)[^]*?(?=^\s*\*)

See a match here and matching a space here.
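One detail worth checking with this second pattern: for an empty field the raw match is whitespace rather than a literal empty string, so a `trim()` is still needed. A minimal sketch, again assuming the `m` flag and a hypothetical shortened input:

```javascript
// Second pattern from this answer, with the m flag.
const lenientPattern = /(?<=^\*\s*Reference[^\S\n]*:)[^]*?(?=^\s*\*)/m;

// Shortened stand-in for the description with an empty Reference field.
const emptyField = '*Reference: \n\n*Citation: rg\n';

// The match succeeds even though the field is empty...
const rawMatch = emptyField.match(lenientPattern)[0];
console.log(JSON.stringify(rawMatch)); // " \n"

// ...but it contains only whitespace, so trimming yields the empty string.
console.log(rawMatch.trim() === ''); // true
```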
