Regex: advanced removal of single-line PHP comments

Posted by pbwdgjma on 2023-10-22 in PHP

I want to remove all single-line comments from some PHP code such as the following, using Visual Basic:

<?php
some code
// "(\/\/[^;)]*)(?=\?\>)"         <-- NOT REMOVED
/* a first comment */
/* a second comment with some // inside
// a single line ; comment inside         <-- NOT REMOVED
// a comment with a ; website http://www.google.com         <-- NOT REMOVED
a third comment
*/
/* a fourth comment with some // inside */
/* a fifth comment with some ftp://ftp.google;com//onefolder inside */
some code
if {// a comment         <-- NOT REMOVED
    doit();
} /* another comment */
some more code// a comment         <-- NOT REMOVED
some more code2// a comment         <-- NOT REMOVED
?>
<?php  // a comment with a website http://www.google.com ?>
<?php and some more code // and a comment ?>
<?php var = 'http://www.google;com'; // and a comment ?>
<?php var = 'https://www.google;com'; // and a comment ?>
<?php var = 'ftp://www.google;com'; // and a comment ?>
<?php var = 'ftp://www.google;com'; // a comment with a website http://www.google.com ?>
<?php var = 'ftp://www.google;com'; // a comment ?         <-- NOT REMOVED
?>

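I am currently using the pattern "(\/\/[^;)]*)(?=\?\>)", which works for part of the input, but some lines are still left over (see the <-- NOT REMOVED markers). I have not managed to tell the regex to remove everything up to a ?> sequence if one occurs before the end of the line.
Can you help me improve this regex?
It should produce the following (or something close to it; any leftover \n and runs of spaces that remain after the comments are stripped can be removed in a later cleanup pass):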

<?php
some code  

/* a first comment */
/* a second comment with some // inside <-- END CAN BE REMOVED, NO PROBLEM
  

a third comment
*/
/* a fourth comment with some // inside */ <-- CAN BE REMOVED IF THE ENDING */ IS KEPT
/* a fifth comment with some ftp://ftp.google;com//onefolder inside */
some code
if {
    doit();
} /* another comment */
some more code
some more code2
?>
<?php ?>
<?php and some more code ?>
<?php var = 'http://www.google;com'; ?>
<?php var = 'https://www.google;com'; ?>
<?php var = 'ftp://www.google;com'; ?>
<?php var = 'ftp://www.google;com'; ?>
<?php var = 'ftp://www.google;com'; 
?>

Note: as long as the result is code the compiler can read, I have no problem doing this in multiple passes.

n53p2ov0 #1

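Updated after the sample input was extended (the pattern starts with a literal space followed by *):
 *(?<!:)//(?:.*?)(?= ?\*\/| ?\?>|$)
Basically, it comes down to how you want to guard against URLs (which contain //): I use a negative lookbehind to check that there is no ":" immediately before the //. You may want to clean up any leftover <?php ?> lines in a follow-up (non-regex) replacement.
This pattern consumes zero or more spaces before the start of the comment, but leaves a possible space before */ or ?>.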
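As a rough illustration of how the pattern above behaves, here is a minimal sketch in Python rather than the Visual Basic mentioned in the question. The function name strip_line_comments and the sample lines are illustrative only, and multiline mode is assumed so that $ matches at the end of every line.

```python
import re

# Pattern from the answer above, including its leading " *" (optional spaces).
# re.MULTILINE is an assumption: it lets "$" match at the end of each line.
SINGLE_LINE_COMMENT = re.compile(
    r" *(?<!:)//(?:.*?)(?= ?\*\/| ?\?>|$)",
    re.MULTILINE,
)


def strip_line_comments(php_source: str) -> str:
    """Remove // comments unless the // directly follows ':' (as in a URL)."""
    return SINGLE_LINE_COMMENT.sub("", php_source)


if __name__ == "__main__":
    sample = "\n".join([
        "some more code// a comment",
        "<?php var = 'http://www.google;com'; // and a comment ?>",
        "/* a fourth comment with some // inside */",
    ])
    print(strip_line_comments(sample))
```

One limitation worth noting: the (?<!:) guard only protects a // that directly follows a colon. A later // in the same URL, as in ftp://ftp.google;com//onefolder, is still treated as the start of a comment.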

q9rjltbz #2

Demo: https://regex101.com/r/Pua5qG/latest

(?x)                        # free-spacing mode
(                           # never match // inside ...
['"].*\/\/.*['"]            # ... a quoted string
|
\/\*.*?\*\/                 # ... a /* */ comment
)
(*SKIP)(*FAIL)              # if either of those matched, skip past it and fail
|
(?P<commented_code>\/\/.+?) # capture the // comment
(?P<end>\?>$)?              # capture a closing ?> at end of line so it can be kept
$
and replace with ${end}, so that only the comment is discarded.
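The (*SKIP)(*FAIL) verbs are PCRE features and are not available in Python's re module, so the sketch below swaps in a different technique for the same idea: the protected text (quoted strings containing // and /* */ comments) is matched by a first alternative and written back unchanged by a replacement callback, while a // comment is replaced by its optional trailing ?>. The group name keep and the function name strip_comments_keep_strings are hypothetical; the other group names follow the answer.

```python
import re

# Not the PCRE pattern itself: the (*SKIP)(*FAIL) branch is emulated with a
# "keep" group that the callback puts back unchanged (an assumption of this sketch).
COMMENT_OR_PROTECTED = re.compile(
    r"""
    (?P<keep>                     # text that must survive untouched:
        ['"].*//.*['"]            #   a quoted string containing //
      | /\*.*?\*/                 #   a block comment closed on the same line
    )
    |
    (?P<commented_code>//.+?)     # a // comment ...
    (?P<end>\?>$)?                # ... optionally ending in ?> at end of line
    $
    """,
    re.VERBOSE | re.MULTILINE,
)


def _replace(match: re.Match) -> str:
    # Protected text is returned as-is; a comment collapses to its "?>" (if any).
    if match.group("keep") is not None:
        return match.group("keep")
    return match.group("end") or ""


def strip_comments_keep_strings(php_source: str) -> str:
    """Drop // comments while leaving strings and one-line /* */ comments alone."""
    return COMMENT_OR_PROTECTED.sub(_replace, php_source)


if __name__ == "__main__":
    line = "<?php var = 'http://www.google;com'; // and a comment ?>"
    print(strip_comments_keep_strings(line))
```

Because . does not match a newline here, a /* */ comment that spans several lines is not protected, so // text on its inner lines is removed, which is what the expected output in the question asks for.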
