How do I filter out columns that are all zeros in Python?

mccptt67  posted on 2023-03-28  in Python

I have some structures that I need to filter. Is there a nice way to do this in Python?
I have an ugly way of doing it, but I'd like to clean it up:

original_header = ['a','b','c']
original_rows = [[1,0,1], [0,0,0], [1,0,0]]

processed_header, processed_rows = some_cool_utility(original_header, original_rows)

assert_equals(['a', 'c'], processed_header)
assert_equals([[1,1], [0,0], [1,0]], processed_rows)

dzjeubhm1#

original_header = ['a','b','c']
original_rows = [[1,0,1], [0,0,0], [1,0,0]]

# Transpose rows to get columns. zip() is lazy in Python 3, so wrap it in
# list() because the columns are iterated twice below.
columns = list(zip(*original_rows))

# True for each column that should be kept (i.e. is not a column of all zeros)
not_all_zero = [any(x) for x in columns]

# Filter the header and the columns with that mask
processed_header = [x for i, x in enumerate(original_header) if not_all_zero[i]]
processed_columns = [x for i, x in enumerate(columns) if not_all_zero[i]]

# Transpose the remaining columns back into rows
processed_rows = list(zip(*processed_columns))

print(processed_header)  # ['a', 'c']
print(processed_rows)    # [(1, 1), (0, 0), (1, 0)]

Note that this returns a list of *tuples*, not a list of lists.
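
If a list of lists is needed, as in the question's assertions, each tuple can be converted afterwards. A minimal sketch, assuming Python 3; the `rows_as_lists` name is only illustrative and the input is the output shown above:

```python
# Output of the transposition above: a list of tuples
processed_rows = [(1, 1), (0, 0), (1, 0)]

# Convert every tuple to a list so the result matches the question's format
rows_as_lists = [list(row) for row in processed_rows]

print(rows_as_lists)  # [[1, 1], [0, 0], [1, 0]]
```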


wpx232ag2#

Using NumPy:

import numpy as np

original_rows = np.asarray([[1,0,1], [0,0,0], [1,0,0]])
original_labels = np.asarray(["a", "b", "c"])

# Boolean mask: True for each column that contains at least one nonzero value.
nonzero_cols = np.any(original_rows!=0, axis=0)

# Get data only where column is not all zeros.
nonzero_data = original_rows[:, nonzero_cols]
nonzero_labels = original_labels[nonzero_cols]

axr492tv3#

This should work:

>>> original_header = ['a','b','c']
>>> original_rows = [[1,0,1], [0,0,0], [1,0,0]]
>>> columns = zip(*original_rows)
>>> filtered = [(h, col)
...             for h, col
...             in zip(original_header, columns)
...             if any(col)]
>>> header, kept_columns = zip(*filtered)
>>> header
('a', 'c')
>>> kept_columns
((1, 0, 1), (1, 0, 0))
>>> list(zip(*kept_columns))
[(1, 1), (0, 0), (1, 0)]

**Edit:** Fixed; the filtered list comprehension added an extra transposition that I hadn't looked at carefully.
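
One edge case is worth noting for this approach: if every column is all zeros, the filtered list is empty and unpacking `zip(*filtered)` into two names raises a `ValueError`. The following is a minimal sketch of a guarded version, assuming Python 3; the function name `filter_columns` is hypothetical and not part of the answer:

```python
def filter_columns(header, rows):
    # Pair each header with its column and keep pairs with any nonzero value
    filtered = [(h, col) for h, col in zip(header, zip(*rows)) if any(col)]
    if not filtered:
        # Nothing survives: return an empty header and one empty list per row
        return [], [[] for _ in rows]
    kept_header, kept_columns = zip(*filtered)
    # Transpose the kept columns back into rows, as lists
    return list(kept_header), [list(row) for row in zip(*kept_columns)]

print(filter_columns(['a', 'b', 'c'], [[1, 0, 1], [0, 0, 0], [1, 0, 0]]))
print(filter_columns(['a', 'b'], [[0, 0], [0, 0]]))
```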


vfhzx4xs4#

If you're not tied to the format of the data, storing it as a dictionary makes this much simpler:

original_header = ['a','b','c']
original_rows = [[1,0,1], [0,0,0], [1,0,0]]

# Restructure data into an easier-to-process dict: column name -> column values
to_dict = dict(zip(original_header, zip(*original_rows)))
print(to_dict)  # {'a': (1, 0, 1), 'b': (0, 0, 0), 'c': (1, 0, 0)}

# Keep only the keys whose values are not all zero
filtered_dict = {k: v for (k, v) in to_dict.items()
                 if not all(x == 0 for x in v)}

print(filtered_dict)  # {'a': (1, 0, 1), 'c': (1, 0, 0)}
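
If the header-and-rows layout from the question is needed again afterwards, the dictionary can be unpacked back. This is a sketch under two assumptions: Python 3.7 or later (where dicts preserve insertion order) and unique header names, since duplicate names would collapse into a single key. The variable names are illustrative:

```python
original_header = ['a', 'b', 'c']
original_rows = [[1, 0, 1], [0, 0, 0], [1, 0, 0]]

# Column name -> column values, then drop the all-zero columns
columns_by_name = dict(zip(original_header, zip(*original_rows)))
filtered = {k: v for k, v in columns_by_name.items() if any(v)}

# Keys give the header; transposing the values gives the rows again
processed_header = list(filtered)
processed_rows = [list(row) for row in zip(*filtered.values())]

print(processed_header)  # ['a', 'c']
print(processed_rows)    # [[1, 1], [0, 0], [1, 0]]
```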

piztneat5#

The following works:

all_rows = original_rows[:] #make a copy
all_rows.insert(0, original_header)

all_columns = list(zip(*all_rows)) #transpose
filtered_columns = [col for col in all_columns if any(col[1:])] #remove columns that only contain 0's
filtered_rows = [list(tp) for tp in zip(*filtered_columns)] #transpose back, convert each element to a list

processed_header = filtered_rows[0]
processed_rows = filtered_rows[1:]

ippsafx76#

Just for the record:

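def some_cool_utility(header, rows):
    # Pair each header with its column (zip(*rows) transposes the rows),
    # keeping only the columns that contain a nonzero value
    data = [element for element in zip(header, zip(*rows)) if any(element[1])]
    head, columns = zip(*data)
    # Transpose the kept columns back into rows
    return list(head), [list(row) for row in zip(*columns)]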
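
A variant that avoids transposing back is to compute a keep/drop mask once and apply it to the header and to each row. The sketch below is not taken from any of the answers; it swaps in `itertools.compress` from the standard library, and `filter_zero_columns` is a hypothetical name:

```python
from itertools import compress

def filter_zero_columns(header, rows):
    # One boolean per column: True if the column has any nonzero value
    keep = [any(col) for col in zip(*rows)]
    # compress() yields only the items whose mask entry is true
    new_header = list(compress(header, keep))
    new_rows = [list(compress(row, keep)) for row in rows]
    return new_header, new_rows

header, rows = filter_zero_columns(['a', 'b', 'c'],
                                   [[1, 0, 1], [0, 0, 0], [1, 0, 0]])
print(header)  # ['a', 'c']
print(rows)    # [[1, 1], [0, 0], [1, 0]]
```

Because each row is filtered in place, the result is already a list of lists and the number of rows is preserved.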
