Pandas: use a dictionary to convert multiple columns into percentages

Posted by iecba09b on 2023-08-01 in Other


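I have a DataFrame of 100 participants who completed different tests at 2 different time points. Here I show only three of those tests: AB, LC, and MA. The values are raw scores, and each test has a different maximum (150 for AB, 12 for LC, and 2000 for MA).
The DataFrame looks like this: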

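import pandas as pd

# Raw scores per participant (ID) and visit; named `data` to avoid shadowing the built-in `dict`
data = {
    "ID": [1, 1, 2, 2, 3, 3],
    "Visit": [1, 2, 1, 2, 1, 2],
    "AB": [30, 40, 50, 20, 10, 10],
    "LC": [1, 4, 5, 6, 8, 9],
    "MA": [300, 900, 400, 400, 450, 350]
}

df = pd.DataFrame(data)
print(df)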

   ID  Visit  AB  LC   MA
0   1      1  30   1  300
1   1      2  40   4  900
2   2      1  50   5  400
3   2      2  20   6  400
4   3      1  10   8  450
5   3      2  10   9  350

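I want to compute a percentage score for each row based on each test's maximum, and store those scores in new columns with the suffix "_percent" (i.e. "AB_percent", "LC_percent", "MA_percent"). Is there a way to map each column to a percentage using the different maximum values stored in a dictionary, and save the results to new columns suffixed with "_percent"?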

pandas

Source: https://stackoverflow.com/questions/76721453/pandas-use-dictionary-to-convert-multiple-columns-into-percentages

4 Answers


0md85ypi1#

my_dict = {'AB': 150 , 'LC': 12 , 'MA': 2000}

for exam_name, max_mark in my_dict.items():
    df[exam_name + '_percent'] = (df[exam_name] * 100 )/max_mark

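The loop above can be wrapped in a small helper so that the original frame is left untouched and a missing column is reported early. This is only a sketch: the function name `add_percent_columns` and its `suffix` parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd


def add_percent_columns(df, max_scores, suffix="_percent"):
    """Return a copy of df with one percentage column per key in max_scores.

    Hypothetical helper for illustration, not a pandas API.
    """
    missing = [col for col in max_scores if col not in df.columns]
    if missing:
        raise KeyError(f"columns not found in DataFrame: {missing}")

    out = df.copy()
    for col, max_score in max_scores.items():
        # Same arithmetic as the loop above: raw score as a share of the maximum.
        out[col + suffix] = out[col] * 100 / max_score
    return out


df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "Visit": [1, 2, 1, 2, 1, 2],
    "AB": [30, 40, 50, 20, 10, 10],
    "LC": [1, 4, 5, 6, 8, 9],
    "MA": [300, 900, 400, 400, 450, 350],
})
result = add_percent_columns(df, {"AB": 150, "LC": 12, "MA": 2000})
print(result)
```

Returning a copy is a design choice for this sketch; assigning in place, as the answer does, works just as well when the original frame does not need to be preserved.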

Answered 2023-08-01

ru9i0ody2#

Here is one way:

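# Maximum score of each test, as stated in the question
max_score = {
    "AB": 150,
    "LC": 12,
    "MA": 2000
}
# Converting the dict to a Series makes the division align on column labels
df[["AB_percent", "LC_percent", "MA_percent"]] = df[["AB", "LC", "MA"]].div(pd.Series(max_score), axis=1).mul(100)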

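The same `div` idea can also be combined with `add_suffix` and `join`, so the new column names are derived from the dictionary keys instead of being typed out. A minimal sketch, assuming the maxima stated in the question (150, 12, 2000):

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3],
    "Visit": [1, 2, 1, 2, 1, 2],
    "AB": [30, 40, 50, 20, 10, 10],
    "LC": [1, 4, 5, 6, 8, 9],
    "MA": [300, 900, 400, 400, 450, 350],
})
# Maxima taken from the question; a Series so the division aligns on column labels
max_score = pd.Series({"AB": 150, "LC": 12, "MA": 2000})

percent = (
    df[max_score.index]        # only the test columns, in dictionary order
    .div(max_score, axis=1)    # each column divided by its own maximum
    .mul(100)
    .add_suffix("_percent")    # AB -> AB_percent, and so on
)
df = df.join(percent)
print(df)
```

Adding a test then only requires a new entry in `max_score`.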

Answered 2023-08-01

5jvtdoz23#

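This can be done with the following code (note that it divides by the largest value observed in each column, not by the test's maximum possible score, and yields a fraction between 0 and 1 rather than a percentage):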

df["AB_percent"] = df["AB"]/max(df["AB"])

Or, more generally:

for col in ["AB", "LC", "MA"]:
    df[col + "_percent"] = df[col]/max(df[col])

This results in:

   ID  Visit  AB  LC   MA  AB_percent  LC_percent  MA_percent
0   1      1  30   1  300         0.6    0.111111    0.333333
1   1      2  40   4  900         0.8    0.444444    1.000000
2   2      1  50   5  400         1.0    0.555556    0.444444
3   2      2  20   6  400         0.4    0.666667    0.444444
4   3      1  10   8  450         0.2    0.888889    0.500000
5   3      2  10   9  350         0.2    1.000000    0.388889

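Dividing by the largest observed value and dividing by the test's maximum possible score give different results. A short sketch of the contrast for the `LC` column, assuming the maximum of 12 given in the question; the column names `LC_vs_observed` and `LC_vs_possible` are invented for this example:

```python
import pandas as pd

df = pd.DataFrame({"LC": [1, 4, 5, 6, 8, 9]})

observed_max = df["LC"].max()   # best score anyone actually achieved
possible_max = 12               # maximum of the test, taken from the question

# Share of the best observed score vs. share of the achievable score
df["LC_vs_observed"] = df["LC"] / observed_max * 100
df["LC_vs_possible"] = df["LC"] / possible_max * 100
print(df)
```

With the observed maximum, the top scorer always lands on 100, so the result depends on who happens to be in the sample. The dictionary of fixed maxima avoids that.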

Answered 2023-08-01

vxf3dgd44#

I can't think of a built-in pandas feature for this, so I would solve it with a simple little algorithm like this:

import pandas as pd

# named `data` to avoid shadowing the built-in `dict`
data = {
    "ID": [1, 1, 2, 2, 3, 3],
    "Visit": [1, 2, 1, 2, 1, 2],
    "AB": [30, 40, 50, 20, 10, 10],
    "LC": [1, 4, 5, 6, 8, 9],
    "MA": [300, 900, 400, 400, 450, 350]
}
# dictionary with the maximum score of each test
max_values = {'AB': 150, 'LC': 12, 'MA': 2000}

# iterate over the maximum values, using the test name
# as the name of the data column to convert
for test in max_values:
    new_column_percentages = []
    # iterate over the entries in that column
    for result in data[test]:
        # calculate the percentage; styling is another topic,
        # take a look at 'str.format()'
        percent = (result / max_values[test]) * 100
        # append the result to the list that is then...
        new_column_percentages.append(percent)
    # ...added as a new column to the data dictionary
    data[test + '_percent'] = new_column_percentages

df = pd.DataFrame(data)
print(df)

## OUTPUT:
   ID  Visit  AB  LC   MA  AB_percent  LC_percent  MA_percent
0   1      1  30   1  300   20.000000    8.333333        15.0
1   1      2  40   4  900   26.666667   33.333333        45.0
2   2      1  50   5  400   33.333333   41.666667        20.0
3   2      2  20   6  400   13.333333   50.000000        20.0
4   3      1  10   8  450    6.666667   66.666667        22.5
5   3      2  10   9  350    6.666667   75.000000        17.5

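By the way, if you export data like this to Excel, calculating and formatting this kind of thing becomes effortless :) But I also enjoy a good old "all in code" solution myself.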
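The code comment above points to `str.format()` for styling the result. As a rough sketch of what that might look like, the numeric column can be kept for calculations and a separate display column built from it. The column name `AB_percent_label` and the one-decimal format are arbitrary choices for this example:

```python
import pandas as pd

df = pd.DataFrame({"AB": [30, 40, 50, 20, 10, 10]})
df["AB_percent"] = df["AB"] / 150 * 100   # 150 is the AB maximum from the question

# Keep the float column for arithmetic; build a string column only for display.
df["AB_percent_label"] = df["AB_percent"].map("{:.1f}%".format)
print(df)
```

Formatting into strings is best left as the last step, since the labelled column can no longer be sorted or averaged numerically.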

Answered 2023-08-01
