How to filter a pandas DataFrame using variables in the condition

vyswwuz2 posted on 2023-06-20 in Other

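I have files (10.230.30.146_480.txt,10.20.24.16,10.55.30.2), and I need to use the first part of each file name as variable1 and the second part as variable2.
I used this code:
for txt in na_sheets:
    x = txt.replace('.txt', '')
    y = x.split("_", 1)
    variable1 = y[0]
    variable2 = y[1]
    df1 = df[(df['MSAN_IP'] == 'variable1') & (df['OUTER_VLAN'] == variable2)]
So I created a for loop to iterate over variable1 and variable2, filter the DataFrame df, and pass these variables into the filter condition.
The output is an empty DataFrame with only the headers.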

Answer #1, by 4nkexdtk

IIUC, you can use:

na_sheets = [
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt"
]

dfs = {
    fn: df.loc[(df["MSAN_IP"] == v1) & (df["OUTER_VLAN"] == int(v2))]
    # removesuffix (Python 3.9+) drops only the trailing ".txt"; maxsplit=1 splits at the first underscore
    for fn in na_sheets for v1, v2 in [fn.removesuffix(".txt").split("_", 1)]
}
NB: this creates a dictionary of DataFrames whose keys are the file names.

Output:

for k, v in dfs.items():
    print(k, v, sep="\n", end="\n\n")

10.230.30.146_480.txt
         MSAN_IP  OUTER_VLAN
2  10.230.30.146         480

10.20.24.16_480.txt
       MSAN_IP  OUTER_VLAN
0  10.20.24.16         480

10.55.30.2_383.txt
      MSAN_IP  OUTER_VLAN
1  10.55.30.2         383
Input used:
import pandas as pd

df = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146"],
    "OUTER_VLAN": [480, 383, 480],
})
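
For anyone wondering why the filter in the question came back empty, here is a minimal sketch. It assumes OUTER_VLAN is an integer column, as in the sample input above; the asker's real dtype is not shown, so treat that part as a guess. The names quoted, str_vs_int and fixed are only illustrative.

```python
import pandas as pd

# Sample data copied from the answer above (assumed, not the asker's real data).
df = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146"],
    "OUTER_VLAN": [480, 383, 480],
})

# Both parts of the file name come out of split() as strings.
variable1, variable2 = "10.230.30.146_480.txt".replace(".txt", "").split("_", 1)

# Quoting the name compares against the literal text 'variable1',
# which no row contains, so the result is empty.
quoted = df[df["MSAN_IP"] == "variable1"]

# Comparing an integer column with the string '480' never matches either.
str_vs_int = df[df["OUTER_VLAN"] == variable2]

# Passing the variable itself and converting the VLAN to int selects the row.
fixed = df[(df["MSAN_IP"] == variable1) & (df["OUTER_VLAN"] == int(variable2))]

print(len(quoted), len(str_vs_int), len(fixed))
```

If OUTER_VLAN is actually stored as text in your data, the int() conversion is not needed and only the quoting matters.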
Answer #2, by kqlmhetl

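You don't get a result because your for loop overwrites the values on every iteration, so you only keep the last one. You can do this instead: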

na_sheets = (
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt"
)

k=[txt.replace('.txt','').split("_", 1) for txt in na_sheets]
#[['10.230.30.146', '480'], ['10.20.24.16', '480'], ['10.55.30.2', '383']]

variable1 = [x[0] for x in k]
#['10.230.30.146', '10.20.24.16', '10.55.30.2']

variable2 = [x[1] for x in k]
#['480', '480', '383']

Now you can use them in a DataFrame.

df = pd.DataFrame({
    "MSAN_IP": variable1,
    "OUTER_VLAN": variable2,
})
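
One possible way to go from these lists to an actual filter is an inner merge against the original table. This is only a sketch: the sample data in df is borrowed from the other answer plus one made-up non-matching row, the names keys and filtered are hypothetical, and it assumes OUTER_VLAN is an integer column in the data being filtered.

```python
import pandas as pd

# Assumed sample data; the last row is a hypothetical non-matching entry.
df = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146", "10.1.1.1"],
    "OUTER_VLAN": [480, 383, 480, 100],
})

na_sheets = (
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt",
)

# One [ip, vlan] pair per file name, both still strings at this point.
k = [txt.replace(".txt", "").split("_", 1) for txt in na_sheets]

# Table of wanted (MSAN_IP, OUTER_VLAN) pairs.
keys = pd.DataFrame(k, columns=["MSAN_IP", "OUTER_VLAN"])

# Match the dtype of the column being filtered so the merge keys compare equal.
keys["OUTER_VLAN"] = keys["OUTER_VLAN"].astype(df["OUTER_VLAN"].dtype)

# Inner merge keeps only rows of df whose pair appears in keys.
filtered = df.merge(keys, on=["MSAN_IP", "OUTER_VLAN"], how="inner")
print(filtered)
```

The merge filters on the pairs together, so an IP that appears with a different VLAN would not be selected.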
