Print/log everything that pandas.testing.assert_frame_equal() finds

Posted by fdbelqdn on 2024-01-04 in Other

Consider the following test, which is meant to detect any difference between two matrices within a given tolerance.

import pandas as pd

def test_two_cubes_linewise() -> None:
    df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5.0], "B": [4, 5, 6, 7, 8.0001]})
    df2 = pd.DataFrame({"A": [97, 98, 3, 4, 5], "B": [99, 5, 6, 7, 8.0002]})
    pd.testing.assert_frame_equal(df1, df2, rtol=1e-3, check_dtype=False)

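pandas.testing.assert_frame_equal() does this job well. There are several differences between df1 and df2:

  • 5 and 5.0 have different data types.
  • 1 is not equal to 97
  • 2 is not equal to 98
  • 4 is not equal to 99
  • 8.0001 is not equal to 8.0002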
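
Whether each listed pair falls inside the tolerance can be checked by hand. The snippet below is only a rough sketch: it uses `math.isclose` as a stand-in for the comparison pandas performs internally (an assumption, the exact rule may differ between pandas versions), with the `rtol=1e-3` from the test and an assumed absolute tolerance of `1e-8`.

```python
import math

rtol = 1e-3   # relative tolerance passed in the test
atol = 1e-8   # assumed absolute tolerance (the pandas default)

# The four value pairs listed above, as (df1 value, df2 value).
pairs = [(1, 97), (2, 98), (4, 99), (8.0001, 8.0002)]

# math.isclose stands in for pandas' own check here; this is an assumption.
within_tolerance = [
    math.isclose(left, right, rel_tol=rtol, abs_tol=atol) for left, right in pairs
]

for (left, right), ok in zip(pairs, within_tolerance):
    print(f"{left} vs {right}: within tolerance = {ok}")
```

Only the last pair passes, since 0.0001 is far smaller than 0.1 % of 8.0002.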

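Because the last difference is below the tolerance, the assert statement only detects the other differences, as intended. However, when I run the test, the assertion error message shows only the first difference:
[screenshot of the assertion error message]
Any suggestions on how to get information about all of the differences?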

Answer #1, by qq24tv8q

First, you do not need to handle any exceptions yourself.
Let's look at the assert_frame_equal signature:

pd.testing.assert_frame_equal(
    left,
    right,
    check_dtype: "bool | Literal['equiv']" = True,
    check_index_type: "bool | Literal['equiv']" = 'equiv',
    check_column_type: "bool | Literal['equiv']" = 'equiv',
    check_frame_type: 'bool' = True,
    check_names: 'bool' = True,
    by_blocks: 'bool' = False,
    check_exact: 'bool' = False,
    check_datetimelike_compat: 'bool' = False,
    check_categorical: 'bool' = True,
    check_like: 'bool' = False,
    check_freq: 'bool' = True,
    check_flags: 'bool' = True,
    rtol: 'float' = 1e-05,
    atol: 'float' = 1e-08,
    obj: 'str' = 'DataFrame') -> 'None'

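As you can see, check_dtype defaults to True. When the dtypes differ, that check fails first and the values are never compared, so the other differences stay hidden. I suggest modifying your code as follows: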

import pytest
import pandas as pd

@pytest.mark.parametrize("check_dtype", [False, True])
def test_two_cubes_linewise(check_dtype) -> None:
    df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5.0], "B": [4, 5, 6, 7, 8.0001]})
    df2 = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [5, 5, 6, 7, 8.0002]})

    pd.testing.assert_frame_equal(df1, df2, rtol=1e-3,  check_dtype=check_dtype)


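The change passes a boolean flag to the test, so the same test runs once with check_dtype=False and once with check_dtype=True. For more details, see the pytest documentation: https://docs.pytest.org/en/7.3.x/how-to/parametrize.html
Then you can run pytest --no-header: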

> pytest --no-header
============================================================= test session starts ==============================================================
collected 2 items                                                                                                                              

the_pandas_test.py FF                                                                                                                    [100%]

=================================================================== FAILURES ===================================================================
________________________________________________________ test_two_cubes_linewise[False] ________________________________________________________

check_dtype = False
    
>       pd.testing.assert_frame_equal(df1, df2, rtol=1e-3,  check_dtype=check_dtype)

the_pandas_test.py:9: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   AssertionError: DataFrame.iloc[:, 1] (column name="B") are different
E   
E   DataFrame.iloc[:, 1] (column name="B") values are different (20.0 %)
E   [index]: [0, 1, 2, 3, 4]
E   [left]:  [4.0, 5.0, 6.0, 7.0, 8.0001]
E   [right]: [5.0, 5.0, 6.0, 7.0, 8.0002]
E   At positional index 0, first diff: 4.0 != 5.0

testing.pyx:173: AssertionError
________________________________________________________ test_two_cubes_linewise[True] _________________________________________________________

check_dtype = True
    
>       pd.testing.assert_frame_equal(df1, df2, rtol=1e-3,  check_dtype=check_dtype)
E       AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="A") are different
E       
E       Attribute "dtype" are different
E       [left]:  float64
E       [right]: int64

the_pandas_test.py:9: AssertionError
=========================================================== short test summary info ============================================================
FAILED the_pandas_test.py::test_two_cubes_linewise[False] - AssertionError: DataFrame.iloc[:, 1] (column name="B") are different
FAILED the_pandas_test.py::test_two_cubes_linewise[True] - AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="A") are different
============================================================== 2 failed in 0.54s ===============================================================
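
Each run in the output above still reports a single column, because the assertion stops at the first failure. One possible workaround, sketched here under assumptions, is to compare the frames column by column with `pd.testing.assert_series_equal` and keep every message instead of stopping at the first. The helper name `collect_column_failures` is hypothetical and is not part of pandas; the data is taken from the question.

```python
import pandas as pd


def collect_column_failures(left, right, **kwargs):
    """Compare shared columns one by one and keep every failure message.

    Hypothetical helper (not a pandas API). Extra keyword arguments are
    forwarded to pd.testing.assert_series_equal.
    """
    failures = {}
    for name in left.columns.intersection(right.columns):
        try:
            pd.testing.assert_series_equal(left[name], right[name], **kwargs)
        except AssertionError as exc:
            # Store the message instead of letting the first failure abort the loop.
            failures[name] = str(exc)
    return failures


df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5.0], "B": [4, 5, 6, 7, 8.0001]})
df2 = pd.DataFrame({"A": [97, 98, 3, 4, 5], "B": [99, 5, 6, 7, 8.0002]})

failures = collect_column_failures(df1, df2, rtol=1e-3, check_dtype=False)
for name, message in failures.items():
    print(f"--- column {name} ---\n{message}\n")
```

This yields one message per failing column. Each message still names only the first differing position within its column, so it narrows the search rather than listing every cell.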


You may want to try the other flags to get more useful results.
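
If none of the flags gives a full report, a cell-by-cell listing can be built outside of assert_frame_equal. The sketch below assumes both frames are numeric and share the same shape, index, and columns. It uses `numpy.isclose`, whose tolerance rule is close to but not guaranteed identical to the one pandas applies. The function name `find_differences` is hypothetical.

```python
import numpy as np
import pandas as pd


def find_differences(left, right, rtol=1e-5, atol=1e-8):
    """List every cell whose two values differ by more than the tolerance.

    Hypothetical helper; assumes numeric frames with identical shape and labels.
    """
    left_values = left.to_numpy(dtype=float)
    right_values = right.to_numpy(dtype=float)
    # np.isclose approximates the tolerance rule; pandas' own check may differ slightly.
    close = np.isclose(left_values, right_values, rtol=rtol, atol=atol, equal_nan=True)
    rows, cols = np.nonzero(~close)
    return pd.DataFrame(
        {
            "index": left.index.to_numpy()[rows],
            "column": left.columns.to_numpy()[cols],
            "left": left_values[rows, cols],
            "right": right_values[rows, cols],
        }
    )


df1 = pd.DataFrame({"A": [1, 2, 3, 4, 5.0], "B": [4, 5, 6, 7, 8.0001]})
df2 = pd.DataFrame({"A": [97, 98, 3, 4, 5], "B": [99, 5, 6, 7, 8.0002]})

differences = find_differences(df1, df2, rtol=1e-3)
print(differences)
```

Inside a test, something like `assert differences.empty, differences.to_string()` would put the whole table into the failure message, so every out-of-tolerance cell shows up in one run.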
