pandas 如何突出显示包含分类变量的单元格？

siotufzp 于 2023-04-04 发布在其他

关注(0)|答案(3)|浏览(122)

我有一个名为value_matrix_classification的pandas数据框，如下所示：

{('wind_on_share',
  'Wind-onshore power generation'): {('AIM/CGE 2.0',
   'ADVANCE_2020_WB2C'): 'high', ('AIM/CGE 2.0',
   'ADVANCE_2030_Price1.5C'): 'high', ('AIM/CGE 2.0',
   'ADVANCE_2030_WB2C'): 'high', ('IMAGE 3.0.1',
   'ADVANCE_2020_WB2C'): 'low', ('IMAGE 3.0.1',
   'ADVANCE_2030_WB2C'): 'low', ('MESSAGE-GLOBIOM 1.0',
   'ADVANCE_2020_WB2C'): 'low'},
 ('wind_off_share',
  'Wind-offshore power generation'): {('AIM/CGE 2.0',
   'ADVANCE_2020_WB2C'): nan, ('AIM/CGE 2.0',
   'ADVANCE_2030_Price1.5C'): nan, ('AIM/CGE 2.0',
   'ADVANCE_2030_WB2C'): nan, ('IMAGE 3.0.1',
   'ADVANCE_2020_WB2C'): 'low', ('IMAGE 3.0.1',
   'ADVANCE_2030_WB2C'): 'low', ('MESSAGE-GLOBIOM 1.0',
   'ADVANCE_2020_WB2C'): 'low'}}

右边的两列包含分类变量low, medium and high。
我想突出显示pandas数据框，以便分别为高、中、低和NaN值提供红色、橙子、黄色和背景色。
我写了下面的函数

def highlight_cells(x):
    if x == "high":
        color = "red"
    elif x=="medium":
        color = "orange"
    elif x=="low":
        color = "yellow"
    else:
        color = "gray"
    
    return [f"background-color: {color}"]

并将其应用于 Dataframe

value_matrix_classification.style.apply(highlight_cells)

然而，这给出了ValueError：Series的真值不明确。请使用.empty、.bool（）、. item（）、.any（）或.all（）。在这里突出显示的适当方式是什么？
我能够突出显示具有空值的单元格，只使用

value_matrix_classification.style.highlight_null(null_color = "gray")

我在这里附上截图只是为了方便读者。

如何根据给定的类别高亮显示所有单元格：低，中，高？

pandas

来源：https://stackoverflow.com/questions/69181205/how-can-i-highlight-cells-with-categorical-variables

3条答案

按热度按时间

3htmauhk1#

apply将整行或整列作为输入。请改用applymap。
请参阅此Pandas文档部分。
编辑：您还希望highlight_cells只返回f"background-color: {color}"，而不是 Package 在列表中。

赞(0）回复(0）举报 2023-04-04

zour9fqk2#

Series.map + fillna为每个列创建一系列样式是解决此类问题的更常见方法：

def highlight_cells(x):
    return 'background-color: ' + x.map(
        # Associate Values to a given colour code
        {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
    ).fillna('gray')  # Fill unmapped values with default

value_matrix_classification.style.apply(highlight_cells)

每一列被Map到一组新的颜色代码。
这就是如何使用第二列作为引用来确定样式，但是Styler.apply将调用子集中的所有列：

value_matrix_classification.iloc[:, 1].map(
    {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
)

AIM/CGE 2.0          ADVANCE_2020_WB2C            NaN
                     ADVANCE_2030_Price1.5C       NaN
                     ADVANCE_2030_WB2C            NaN
IMAGE 3.0.1          ADVANCE_2020_WB2C         yellow
                     ADVANCE_2030_WB2C         yellow
MESSAGE-GLOBIOM 1.0  ADVANCE_2020_WB2C         yellow
Name: (wind_off_share, Wind-offshore power generation), dtype: object

然后使用fillna将未Map的值替换为默认值。请注意，这不是NaN repr，而是Map字典中未出现的 any 值的默认值：

value_matrix_classification.iloc[:, 1].map(
    {'high': 'red', 'medium': 'orange', 'low': 'yellow'}
).fillna('gray')

AIM/CGE 2.0          ADVANCE_2020_WB2C           gray  # NaN replaced with gray
                     ADVANCE_2030_Price1.5C      gray
                     ADVANCE_2030_WB2C           gray
IMAGE 3.0.1          ADVANCE_2020_WB2C         yellow
                     ADVANCE_2030_WB2C         yellow
MESSAGE-GLOBIOM 1.0  ADVANCE_2020_WB2C         yellow
Name: (wind_off_share, Wind-offshore power generation), dtype: object

最后，添加css属性：
一个三个三个一个

赞(0）回复(0）举报 2023-04-04

jpfvwuh43#

为了添加更多细节，假设您有

np.random.seed(0)
df = pd.DataFrame(np.random.randn(4,2), columns=list('AB'))

>>> df

   A         B
0 -0.686760 -0.791461
1 -0.497699 -1.287310
2  0.793787  0.525824
3  0.501172  1.695914

为了理解发生了什么，我们将一列与值=0.2进行比较。返回一列布尔值。对于and, or, not, if, while也是如此。当您有多个条件时，您将返回多个列。

>>> df.B > 0.2

0     True
1     True
2    False
3    False
Name: B, dtype: bool

现在让我们做比较

if df.B > 0.2:
   print("do something")

 >>> ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

上面的比较等于下面的情况，不清楚结果应该是什么。它应该是True，因为它不是零长度？False，因为有False值？不清楚，所以pandas引发ValueError：

if Series([True, True, False, False]) > 0.2:
     print("do something")

因此，我们需要将这些多个值转换为单个bool值，这取决于我们想要做什么。

if pd.Series([True, True, False, False]).any(): # evaluates to True
   print("I checked if there was any True value in the Series!)

>>> I checked if there was any True value in the Series!

if pd.Series([True, True, False, False]).all(): # Evaluates to False
   print("I checked if there were all True values in the Series!")

赞(0）回复(0）举报 2023-04-04

我来回答

pandas 如何突出显示包含分类变量的单元格？

3条答案

相关问题

热门标签

最新问答