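I am trying to write a simple pylint parser that, given a Python project, extracts information such as code smells, class names, and scores.
In particular, the script must analyze every Python file and produce a DataFrame containing the information mentioned above (code smells, project name, and score).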
import os
import pandas as pd
from pylint.lint import Run

def pylint_project(project_name):
    global project_df
    pylint_options = ["--disable=F0010"]
    python_files = [f for f in os.listdir(project_name) if f.endswith('.py')]
    for file in python_files:
        file_path = os.path.join(project_name, file)
        pylint_output = Run([file_path] + pylint_options)
        smell_count = pylint_output.lstrip().split()[1]
        score = pylint_output.split()[-2]
        project_df = pd.DataFrame({
            "project_name": [project_name],
            "smell_count": [smell_count],
            "score": [score]
        })
    return project_df

path = "path/to/analyze"
com = pylint_project(path)
com.to_csv("path/to/save")
However, this snippet does not work properly; in fact, it only prints:
********* Module setup
E:\python_projects\machine_learning_projects\alibi\setup.py:17:0: C0301: Line too long (110/100) (line-too-long)
E:\python_projects\machine_learning_projects\alibi\setup.py:1:0: C0114: Missing module docstring (missing-module-docstring)
E:\python_projects\machine_learning_projects\alibi\setup.py:4:0: C0116: Missing function or method docstring (missing-function-docstring)
E:\python_projects\machine_learning_projects\alibi\setup.py:5:48: C0103: Variable name "f" doesn't conform to snake_case naming style (invalid-name)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:0: W0122: Use of exec (exec-used)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:5: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
E:\python_projects\machine_learning_projects\alibi\setup.py:10:5: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
E:\python_projects\machine_learning_projects\alibi\setup.py:34:18: E0602: Undefined variable '__version__' (undefined-variable)
------------------------------------------------------------------
Your code has been rated at 0.00/10 (previous run: 0.00/10, +0.00
However, no dataset is saved, and it also seems to analyze only one file (setup.py).
How can I fix this?
1 Answer
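The script below may contain more information than you actually want to use, so adapt it to your needs. In particular, I don't know whether you want the code smells themselves, so I simply put them in a DataFrame of their own.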
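A minimal sketch of a script along these lines, not necessarily the exact code this answer refers to. The names find_python_files, parse_messages, pylint_project, and EXCLUDED_DIRS are hypothetical, and reading the score from result.linter.stats.global_note assumes a reasonably recent pylint (2.12 or later, as far as I know). pandas and pylint are imported inside pylint_project so that the two helper functions can be tried without them.

```python
import json
from glob import glob
from io import StringIO
from pathlib import Path

# Hypothetical set of folder names to skip (virtual environments, caches).
EXCLUDED_DIRS = frozenset({"venv", ".venv", "env", "site-packages", "__pycache__"})


def find_python_files(project_path, excluded=EXCLUDED_DIRS):
    """Recursively list .py files, skipping anything inside an excluded folder."""
    pattern = str(Path(project_path) / "**" / "*.py")
    kept = []
    for file_path in glob(pattern, recursive=True):
        # Only look at the part of the path below the project root.
        relative_parts = Path(file_path).relative_to(project_path).parts
        if not excluded.intersection(relative_parts):
            kept.append(file_path)
    return sorted(kept)


def parse_messages(json_text):
    """Turn the text written by JSONReporter into a list of message dicts."""
    return json.loads(json_text) if json_text.strip() else []


def pylint_project(project_path):
    """Run pylint on each file and return (scores_df, smells_df)."""
    # Third-party imports are local so the helpers above work without them.
    import pandas as pd
    from pylint.lint import Run
    from pylint.reporters import JSONReporter

    score_rows, smell_rows = [], []
    for file_path in find_python_files(project_path):
        buffer = StringIO()
        # exit=False keeps pylint from calling sys.exit() after the first file.
        result = Run([file_path], reporter=JSONReporter(buffer), exit=False)
        messages = parse_messages(buffer.getvalue())
        score_rows.append({
            "file": file_path,
            "smell_count": len(messages),
            # Assumption: pylint >= 2.12 exposes the score as an attribute.
            "score": result.linter.stats.global_note,
        })
        smell_rows.extend({"file": file_path, **msg} for msg in messages)
    return pd.DataFrame(score_rows), pd.DataFrame(smell_rows)
```

Calling scores, smells = pylint_project("path/to/analyze") and then scores.to_csv(...) would save one row per file. The two problems in the question come from Run itself: by default it prints its report to stdout and then exits the interpreter, so the loop never reaches the second file and there is no string to split. Passing a reporter and exit=False addresses both.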
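First, note the use of glob rather than os.listdir: with a ** pattern and recursive=True, it returns all files in the folder recursively. If you have a virtual environment folder inside the project folder, you need some condition to avoid running pylint on the files in it. Using StringIO to capture pylint's output has already been pointed out in several other threads. I use JSONReporter to get output that is easy to parse; the score value is read as described in a separate answer. Consider wrapping the for loop's iterable in tqdm to get a progress bar.
Output of the script above: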