Parsing PyLint output with a Python script

cu6pst1q  posted on 2023-02-21 in Python

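I'm trying to write a simple pylint parser that, given a Python project, extracts information such as code smells, class names, and scores.
In particular, the script must analyze each Python file and generate a DataFrame containing the information mentioned above (code smells, project name, and score).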

import os
import pandas as pd
from pylint.lint import Run

def pylint_project(project_name):

    global project_df
    pylint_options = ["--disable=F0010"]
    python_files = [f for f in os.listdir(project_name) if f.endswith('.py')]
    for file in python_files:
        file_path = os.path.join(project_name, file)
        pylint_output = Run([file_path] + pylint_options)
        smell_count = pylint_output.lstrip().split()[1]
        score = pylint_output.split()[-2]
        project_df = pd.DataFrame({
            "project_name": [project_name],
            "smell_count": [smell_count],
            "score": [score]
        })

    return project_df

path = "path/to/analyze"
com = pylint_project(path)
com.to_csv("path/to/save")

However, this snippet doesn't work properly; in fact, it only prints:

********* Module setup
E:\python_projects\machine_learning_projects\alibi\setup.py:17:0: C0301: Line too long (110/100) (line-too-long)
E:\python_projects\machine_learning_projects\alibi\setup.py:1:0: C0114: Missing module docstring (missing-module-docstring)
E:\python_projects\machine_learning_projects\alibi\setup.py:4:0: C0116: Missing function or method docstring (missing-function-docstring)
E:\python_projects\machine_learning_projects\alibi\setup.py:5:48: C0103: Variable name "f" doesn't conform to snake_case naming style (invalid-name)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:0: W0122: Use of exec (exec-used)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:5: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:5: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
E:\python_projects\machine_learning_projects\alibi\setup.py:34:18: E0602: Undefined variable '__version__' (undefined-variable)

------------------------------------------------------------------
Your code has been rated at 0.00/10 (previous run: 0.00/10, +0.00

No dataset is saved, though, and it also seems to analyze only one file (setup.py).
How can I fix this?

qlvxas9a  #1

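The script below may contain more information than you actually want to use, so adjust it to your needs. In particular, I wasn't sure whether you also wanted the code smells themselves, so I simply put them in their own DataFrame.
First, note the use of glob, which, unlike os.listdir, returns all files in a folder recursively. If you have a virtual environment folder inside the project folder, you'll need some condition to avoid running pylint on those folders.
Capturing pylint's output with StringIO has already been pointed out in other threads, for example here. Note also that Run calls sys.exit() when it finishes unless told not to (via do_exit=False, spelled exit=False in newer pylint releases), which is why your script stopped after setup.py and never wrote the CSV.
I use JSONReporter to get output that is easy to parse; for the score value, see this answer.
Consider using tqdm with the for loop.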
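The note about virtual environment folders can be sketched as a small file-collection helper. This is only a minimal sketch: the `find_python_files` name and the `EXCLUDED_DIRS` set are hypothetical and not part of pylint, and the folder names are assumptions you should adapt to your own project layout.

```python
from pathlib import Path

# Hypothetical folder names to skip; adjust to your project layout.
EXCLUDED_DIRS = {".venv", "venv", "env", ".git", "__pycache__", "build", "dist"}


def find_python_files(root, excluded_dirs=EXCLUDED_DIRS):
    """Return a sorted list of .py files under root, skipping excluded folders."""
    root = Path(root)
    files = []
    for path in root.rglob("*.py"):
        # Check only the folders between root and the file itself, so that
        # a parent of root that happens to be named "env" does not matter.
        relative_dirs = path.relative_to(root).parts[:-1]
        if any(part in excluded_dirs for part in relative_dirs):
            continue
        if path.is_file():
            files.append(str(path))
    return sorted(files)
```

In the script below, the list returned by this helper could take the place of the `glob(...)` call in the for loop, and wrapping it in `tqdm(...)` would give a progress bar if tqdm is installed.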

from pylint.reporters import JSONReporter
from pylint.lint import Run
from glob import glob
from io import StringIO
import pandas as pd
import json
import os

def pylint_project(path):
    pylint_options = ["--disable=F0010"]
    pylint_overview = []
    pylint_results = []
    glob_pattern = os.path.join(path, "**", "*.py")
    for filepath in glob(glob_pattern, recursive=True):
        reporter_buffer = StringIO()
        # exit=False keeps Run from calling sys.exit(); older pylint releases spell it do_exit=False
        results = Run([filepath] + pylint_options, reporter=JSONReporter(reporter_buffer), exit=False)
        score = results.linter.stats.global_note
        file_results = json.loads(reporter_buffer.getvalue())
        pylint_results.extend(file_results)
        pylint_overview.append({
            "filepath": os.path.realpath(filepath),
            "smell_count": len(file_results),
            "score": score
        })
    return pd.DataFrame(pylint_overview), pd.DataFrame(pylint_results)

if __name__ == "__main__":
    overview, results = pylint_project(".")
    print("### Overview")
    print(overview)
    print("\n### All Results")
    print(results)

Output of the above script:

### Overview
                    filepath  smell_count     score
0  /path/to/pylint_parser.py            8  6.923077

### All Results
         type         module             obj  line  column  endLine  endColumn              path                      symbol                                            message message-id
0  convention  pylint_parser                    17       0      NaN        NaN  pylint_parser.py               line-too-long                            Line too long (105/100)      C0301
1  convention  pylint_parser                     1       0      NaN        NaN  pylint_parser.py    missing-module-docstring                           Missing module docstring      C0114
2  convention  pylint_parser  pylint_project    10       0     10.0       18.0  pylint_parser.py  missing-function-docstring               Missing function or method docstring      C0116
3     warning  pylint_parser  pylint_project    17       8     17.0       15.0  pylint_parser.py        redefined-outer-name  Redefining name 'results' from outer scope (li...      W0621
4  convention  pylint_parser                     3       0      3.0       21.0  pylint_parser.py          wrong-import-order  standard import "from glob import glob" should...      C0411
5  convention  pylint_parser                     4       0      4.0       23.0  pylint_parser.py          wrong-import-order  standard import "from io import StringIO" shou...      C0411
6  convention  pylint_parser                     6       0      6.0       11.0  pylint_parser.py          wrong-import-order  standard import "import json" should be placed...      C0411
7  convention  pylint_parser                     7       0      7.0        9.0  pylint_parser.py          wrong-import-order  standard import "import os" should be placed b...      C0411
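If you only need counts per category rather than the full table, the parsed messages can be tallied with `collections.Counter`. This is a minimal sketch, assuming each message is a dict with `type` and `symbol` keys, as in the columns of the output above. The `summarize_messages` name and the abridged sample records are hypothetical, for illustration only.

```python
from collections import Counter


def summarize_messages(messages):
    """Count pylint messages by category ('type') and by rule name ('symbol')."""
    by_type = Counter(msg["type"] for msg in messages)
    by_symbol = Counter(msg["symbol"] for msg in messages)
    return by_type, by_symbol


# Abridged sample records shaped like the rows in the output above.
sample = [
    {"type": "convention", "symbol": "line-too-long", "message-id": "C0301"},
    {"type": "convention", "symbol": "missing-module-docstring", "message-id": "C0114"},
    {"type": "warning", "symbol": "redefined-outer-name", "message-id": "W0621"},
    {"type": "convention", "symbol": "wrong-import-order", "message-id": "C0411"},
    {"type": "convention", "symbol": "wrong-import-order", "message-id": "C0411"},
]

by_type, by_symbol = summarize_messages(sample)
print(by_type.most_common())
print(by_symbol.most_common(1))
```

Inside the script above, `file_results` (the list returned by `json.loads`) could be passed to the same function to get per-file counts.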
