matplotlib: Show correlation values in a pairplot

Posted by mcvgt66p on 2023-10-24 in Other

I have the following data:

prop_tenure  prop_12m  prop_6m  
0.00         0.00      0.00   
0.00         0.00      0.00   
0.06         0.06      0.10   
0.38         0.38      0.25   
0.61         0.61      0.66   
0.01         0.01      0.02   
0.10         0.10      0.12   
0.04         0.04      0.04   
0.22         0.22      0.22

I made a pairplot as follows:

sns.pairplot(data)
plt.show()

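However, I would like to display the correlation coefficient between each pair of variables and, if possible, the skewness and kurtosis of each variable. How can this be done in seaborn?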

bsxbgnwa  #1

As far as I know, there is no out-of-the-box function for this, so you have to create your own:

from scipy.stats import pearsonr
import matplotlib.pyplot as plt 

def corrfunc(x, y, ax=None, **kws):
    """Plot the correlation coefficient in the top left hand corner of a plot."""
    r, _ = pearsonr(x, y)
    ax = ax or plt.gca()
    ax.annotate(f'ρ = {r:.2f}', xy=(.1, .9), xycoords=ax.transAxes)

Example using your input:

import seaborn as sns; sns.set(style='white')
import pandas as pd

data = {'prop_tenure': [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22], 
        'prop_12m':    [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22], 
        'prop_6m':     [0.0, 0.0, 0.10, 0.25, 0.66, 0.02, 0.12, 0.04, 0.22]}

df = pd.DataFrame(data)

g = sns.pairplot(df)
g.map_lower(corrfunc)
plt.show()
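
The skewness and kurtosis part of the question can be handled the same way on the diagonal. The following is only a minimal sketch. It assumes that scipy.stats.skew and scipy.stats.kurtosis with their default settings (biased estimators, Fisher/excess kurtosis) are the statistics wanted. The names moment_label and momentfunc are hypothetical helpers, and the annotation position is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy.stats import kurtosis, skew


def moment_label(values):
    """Return a two-line label with sample skewness and excess kurtosis."""
    arr = pd.Series(values, dtype=float).dropna().to_numpy()
    # scipy defaults: biased estimators; kurtosis() is Fisher (normal -> 0).
    return f"skew = {skew(arr):.2f}\nkurt = {kurtosis(arr):.2f}"


def momentfunc(x, hue=None, ax=None, **kws):
    """Write the label onto the current diagonal axes (hypothetical helper)."""
    ax = ax or plt.gca()
    ax.annotate(moment_label(x), xy=(.45, .95), xycoords="axes fraction",
                ha="left", va="top", fontsize=9)


if __name__ == "__main__":
    df = pd.DataFrame({
        "prop_tenure": [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22],
        "prop_12m":    [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22],
        "prop_6m":     [0.0, 0.0, 0.10, 0.25, 0.66, 0.02, 0.12, 0.04, 0.22],
    })
    g = sns.pairplot(df)
    g.map_diag(momentfunc)  # adds the text on top of the existing histograms
    plt.show()
```

The helper accepts hue and extra keyword arguments only so that it tolerates whatever the grid passes in. It ignores them and describes each variable as a whole.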

0vvn1miw  #2

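Just to mention: with recent versions of seaborn (>0.11.0), the answer above no longer works. Adding a hue=None parameter to the function signature makes it work again.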

def corrfunc(x, y, hue=None, ax=None, **kws):
    """Plot the correlation coefficient in the top left hand corner of a plot."""
    r, _ = pearsonr(x, y)
    ax = ax or plt.gca()
    ax.annotate(f'ρ = {r:.2f}', xy=(.1, .9), xycoords=ax.transAxes)

See this issue: https://github.com/mwaskom/seaborn/issues/2307#issuecomment-702980853
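
For convenience, here is the updated function combined with the data from the question as one runnable script. It is a sketch rather than a definitive version. The call to DataFrame.corr() is an addition that is not part of the answers, included only to cross-check the numbers drawn on the grid.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy.stats import pearsonr


def corrfunc(x, y, hue=None, ax=None, **kws):
    """Plot the correlation coefficient in the top left hand corner of a plot."""
    r, _ = pearsonr(x, y)
    ax = ax or plt.gca()
    ax.annotate(f"ρ = {r:.2f}", xy=(.1, .9), xycoords="axes fraction")


df = pd.DataFrame({
    "prop_tenure": [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22],
    "prop_12m":    [0.0, 0.0, 0.06, 0.38, 0.61, 0.01, 0.10, 0.04, 0.22],
    "prop_6m":     [0.0, 0.0, 0.10, 0.25, 0.66, 0.02, 0.12, 0.04, 0.22],
})

# Pearson correlation matrix (pandas default), for comparison with the plot.
corr = df.corr()

if __name__ == "__main__":
    print(corr.round(2))
    g = sns.pairplot(df)
    g.map_lower(corrfunc)
    plt.show()
```

Because prop_tenure and prop_12m hold identical values in the question, their coefficient is 1, which makes a quick check that the annotation and the matrix agree.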

siotufzp  #3

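If you want to include the correlation value for each hue level, I modified the code above accordingly. Give it an upvote if you find it useful.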

def corrfunc(x, y, hue=None, ax=None, **kws):
    '''Plot the correlation coefficient in the top left hand corner of a plot.'''
    if hue is not None:
        hue_order = pd.unique(g.hue_vals)
        color_dict = dict(zip(hue_order, sns.color_palette('tab10', hue_order.shape[0]) ))
        groups = x.groupby(g.hue_vals)
        r_values = []
        for name, group in groups:
            mask = (~group.isnull()) & (~y[group.index].isnull())
            if mask.sum() > 0:
                r, _ = pearsonr(group[mask], y[group.index][mask])
                r_values.append((name, r))
        text = '\n'.join([f'{name}: ρ = {r:.2f}' for name, r in r_values])
        fontcolors = [color_dict[name] for name in hue_order]
        
    else:
        mask = (~x.isnull()) & (~y.isnull())
        if mask.sum() > 0:
            r, _ = pearsonr(x[mask], y[mask])
            text = f'ρ = {r:.2f}'
            fontcolors = 'grey'
            # print(fontcolors)
        else:
            text = ''
            fontcolors = 'grey'
        
    ax = ax or plt.gca()
    if hue is not None:
        for i, name in enumerate(hue_order):
            text_i = [f'{name}: ρ = {r:.2f}' for n, r in r_values if n==name][0]
            # print(text_i)
            color_i = fontcolors[i]
            ax.annotate(text_i, xy=(.02, .98-i*.05), xycoords='axes fraction', ha='left', va='top',
                        color=color_i, fontsize=10)
    else:
        ax.annotate(text, xy=(.02, .98), xycoords='axes fraction', ha='left', va='top',
                    color=fontcolors, fontsize=10)

penguins = sns.load_dataset('penguins')
g = sns.pairplot(penguins, hue='species',diag_kind='hist',kind='reg', plot_kws={'line_kws':{'color':'red'}})
g.map_lower(corrfunc, hue='species')
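
The function above reads the hue values from the global grid object g (g.hue_vals), so it only works after g exists under that name. Below is a sketch of a variant without that dependency. It assumes a seaborn version (0.11.1 or later, as far as I can tell) that passes the hue column and hue_order to a callback whose signature contains hue, provided map_lower is called without a hue argument. The name group_correlations is a hypothetical helper, the demo data are made up, and the text colours match the grid only if it was drawn with seaborn's default palette.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.stats import pearsonr


def group_correlations(x, y, hue=None):
    """Return {level: Pearson r} over rows where x, y (and hue) are present.

    Without hue, all rows form one group stored under the key None.
    """
    frame = pd.DataFrame({"x": x, "y": y})
    if hue is None:
        groups = [(None, frame.dropna())]
    else:
        frame["hue"] = hue
        groups = frame.dropna().groupby("hue", sort=False, observed=True)
    out = {}
    for name, grp in groups:
        if len(grp) >= 2:  # pearsonr needs at least two points
            out[name] = pearsonr(grp["x"].to_numpy(), grp["y"].to_numpy())[0]
    return out


def corrfunc(x, y, hue=None, ax=None, **kws):
    """Annotate the top left corner with one correlation per hue level."""
    ax = ax or plt.gca()
    r_by_level = group_correlations(x, y, hue)
    if hue is None:
        levels, colors = [None], ["grey"]
    else:
        levels = kws.get("hue_order")
        if levels is None:
            levels = pd.unique(hue.dropna())
        levels = list(levels)
        # Assumption: the grid uses seaborn's current default palette.
        colors = sns.color_palette(n_colors=len(levels))
    row = 0
    for name, color in zip(levels, colors):
        if name not in r_by_level:
            continue  # level had too few complete pairs
        prefix = "" if name is None else f"{name}: "
        ax.annotate(f"{prefix}ρ = {r_by_level[name]:.2f}",
                    xy=(.02, .98 - row * .06), xycoords="axes fraction",
                    ha="left", va="top", color=color, fontsize=10)
        row += 1


if __name__ == "__main__":
    # Synthetic demo data (made up): two groups with opposite-sign correlation.
    rng = np.random.default_rng(0)
    n = 30
    demo = pd.DataFrame({"group": np.repeat(["a", "b"], n),
                         "u": rng.normal(size=2 * n)})
    sign = np.where(demo["group"] == "a", 1.0, -1.0)
    demo["v"] = sign * demo["u"] + rng.normal(scale=0.5, size=2 * n)
    g = sns.pairplot(demo, hue="group", diag_kind="hist")
    g.map_lower(corrfunc)  # no hue argument: seaborn supplies the hue column
    plt.show()
```

Levels with fewer than two complete pairs are skipped instead of raising an error, and the remaining labels are stacked without gaps.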
