Sorting a list using list.remove() in Python

to94eoyn  posted on 2022-12-21 in Python

I am trying to sort keys_list by taking a list that is already in the order I want (sorted_category_list) and removing from it every item that does not appear in keys_list.

sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'Master Men', 'Master Women', 'U21 Men', 'U21 Women',
                        'U17 Men', 'U17 Women', 'U17 Men', 'U17 Women', 'U15 Mixed', 'Hardtail', 'E-Bike']
keys_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

for category in sorted_category_list:
    if category not in keys_list:
        sorted_category_list.remove(category)

print(sorted_category_list)
print(keys_list)

However, this is the result I get. It seems to remove some items but not others, so I don't know what I'm doing wrong:

['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'Master Women', 'U21 Men', 'U21 Women', 'U17 Men', 'U17 Men', 'U15 Mixed', 'E-Bike']
['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

bkhjykvo1#

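This is because list.remove() removes only the first matching element it finds, so if the list contains two equal elements, only one of them is removed. Building a new list with a comprehension avoids the problem: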

sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'Master Men', 'Master Women', 'U21 Men', 'U21 Women',
                        'U17 Men', 'U17 Women', 'U17 Men', 'U17 Women', 'U15 Mixed', 'Hardtail', 'E-Bike']
keys_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

sorted_category_list = [a for a in sorted_category_list if a in keys_list]
print(sorted_category_list)
print(keys_list)
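
Note that sorted_category_list in the question contains 'U17 Men' and 'U17 Women' twice, so the comprehension keeps 'U17 Men' twice. The question does not say whether that is intended. As a minimal sketch, assuming duplicates should collapse to their first occurrence, dict.fromkeys (which preserves insertion order in Python 3.7+) can drop the repeats; the names filtered and unique_filtered are only illustrative:

```python
sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women',
                        'Master Men', 'Master Women', 'U21 Men', 'U21 Women',
                        'U17 Men', 'U17 Women', 'U17 Men', 'U17 Women',
                        'U15 Mixed', 'Hardtail', 'E-Bike']
keys_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women',
             'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

# Keep only the categories that also appear in keys_list (order preserved)
filtered = [a for a in sorted_category_list if a in keys_list]

# Assumption: repeated categories should appear only once in the result
unique_filtered = list(dict.fromkeys(filtered))
print(unique_filtered)
```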

wztqucjr2#

Convert keys_list to a set:

keys_list = {'Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women'}

Then use that set to filter the list:

sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women', 'Master Men', 'Master Women', 'U21 Men', 'U21 Women',
                        'U17 Men', 'U17 Women', 'U17 Men', 'U17 Women', 'U15 Mixed', 'Hardtail', 'E-Bike']

sorted_category_list[:] = [i for i in sorted_category_list if i in keys_list]
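
The slice assignment (`[:] =`) replaces the contents of the existing list object instead of binding the name to a new list. A small sketch of the difference, using made-up short lists (items, alias, rebound and other are hypothetical names for illustration):

```python
keys = {'a', 'c'}

# Slice assignment mutates the existing list, so every reference sees the change
items = ['a', 'b', 'c', 'd']
alias = items
items[:] = [i for i in items if i in keys]
print(alias)   # same object as items

# Plain assignment binds the name to a new list; other references are untouched
rebound = ['a', 'b', 'c', 'd']
other = rebound
rebound = [i for i in rebound if i in keys]
print(other)   # still the original four elements
```

Which form is preferable depends on whether other parts of the program hold a reference to the same list.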

a1o7rhls3#

I would suggest appending the items that appear in both lists to a new list. That way you avoid modifying your original list.

repeats = []
# Iterate over the reference list so the result keeps its order
for item in sorted_category_list:
    if item in keys_list:
        repeats.append(item)
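
Another way to put keys_list into the reference order without modifying either list is to sort it by each item's position in sorted_category_list. This is only a sketch of an alternative technique, not what the answer above does. It assumes every entry of keys_list occurs in sorted_category_list (otherwise the lookup raises KeyError); order and sorted_keys are illustrative names:

```python
sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women',
                        'Master Men', 'Master Women', 'U21 Men', 'U21 Women',
                        'U17 Men', 'U17 Women', 'U17 Men', 'U17 Women',
                        'U15 Mixed', 'Hardtail', 'E-Bike']
keys_list = ['Elite Men', 'Elite Women', 'Open Men', 'Open Women',
             'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

# Map each category to the position of its first occurrence in the reference list
order = {}
for position, name in enumerate(sorted_category_list):
    order.setdefault(name, position)

# Sort a copy of keys_list by that position; keys_list itself is left unchanged
sorted_keys = sorted(keys_list, key=order.__getitem__)
print(sorted_keys)
```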

bfrts1fy4#

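The problem is that you are iterating over the list and modifying it at the same time.
Consider the list ['a', 'b', 'c', 'd'] and this code:

chars = ['a', 'b', 'c', 'd']
for char in chars:
    if char == 'a':
        chars.remove(char)  # mutates the list that is being iterated

In this example, the iteration proceeds as follows:
Iteration 1: char = 'a' (index 0).
Because that element was removed, the next index to be visited is 1.
But the list is now ['b', 'c', 'd'], so the element at index 1 is 'c', and 'b' is skipped.
So in your case, the first element removed is 'Master Men', which means the next element, 'Master Women', is skipped. That is why it is still in the list. Every time an element is removed, the element that follows it is skipped.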
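
The skipping can be made visible by recording which elements the loop actually visits. The sketch below does that, and then repeats the loop over a shallow copy (`[:]`), one common way to remove from a list safely; the variable names are only illustrative:

```python
# Removing while iterating over the same list: 'b' is never visited
chars_skip = ['a', 'b', 'c', 'd']
visited = []
for char in chars_skip:
    visited.append(char)
    if char == 'a':
        chars_skip.remove(char)
print(visited)      # ['a', 'c', 'd']
print(chars_skip)   # ['b', 'c', 'd']

# Iterating over a copy: every element is visited, the original is modified
chars_copy = ['a', 'b', 'c', 'd']
visited_copy = []
for char in chars_copy[:]:
    visited_copy.append(char)
    if char == 'a':
        chars_copy.remove(char)
print(visited_copy)  # ['a', 'b', 'c', 'd']
print(chars_copy)    # ['b', 'c', 'd']
```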


a14dhokn5#

If you must modify a list (or other iterable) while iterating over it, you can iterate backwards, like this:

def clean_dataset(data: list, items_to_remove: list) -> list:
    end_index = len(data) - 1
    #enumerate the reversed list to iterate backwards from the last index
    for index, value in enumerate(reversed(data)):
        if value in items_to_remove:
            del data[end_index - index]
    return data

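This works fine on small datasets but quickly becomes unusable as the dataset grows. It might be possible to optimize it if you can delete large slices of the list instead of deleting items one at a time. If you cannot delete large slices, it is better to append to a new list, as already suggested: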

def new_dataset(data: list, items_to_remove: list) -> list:
    new_list = []
    for value in data:
        if value not in items_to_remove:
            new_list.append(value)
    return new_list

Out of curiosity, I timed both approaches on a small and a large dataset. Even with only 750,000 items, appending to a new list is much faster:

import timeit

sorted_category_list = ['Elite Men', 'Elite Women', 'Open Men',
                        'Open Women', 'Master Men', 'Master Women',
                         'U21 Men', 'U21 Women','U17 Men', 'U17 Women',
                          'U17 Men', 'U17 Women', 'U15 Mixed',
                           'Hardtail', 'E-Bike']

sorted_category_list3 = ['Elite Men', 'Elite Women', 'Open Men',
                        'Open Women', 'Master Men', 'Master Women',
                         'U21 Men', 'U21 Women','U17 Men', 'U17 Women',
                          'U17 Men', 'U17 Women', 'U15 Mixed',
                           'Hardtail', 'E-Bike']*50000

keys_list = ['Elite Men', 'Elite Women', 'Open Men',
             'Open Women', 'U15 Mixed', 'U17 Men', 'U21 Men', 'U21 Women']

if __name__ == "__main__":
    print('timing:')

    x1 = timeit.timeit("clean_dataset(sorted_category_list, keys_list)",
                        setup="from __main__ import clean_dataset,\
                             sorted_category_list, keys_list",
                             number=1)
    print(f"removal - small dataset:\t {x1:15.15f}")

    x2 = timeit.timeit("new_dataset(sorted_category_list, keys_list)",
                        setup="from __main__ import new_dataset,\
                             sorted_category_list, keys_list",
                             number=1)
    print(f"append - small dataset: \t {x2:15.15f}")

    y1 = timeit.timeit("clean_dataset(sorted_category_list3, keys_list)",
                        setup="from __main__ import clean_dataset,\
                             sorted_category_list3, keys_list",
                             number=1)
    print(f"removal - large dataset:\t {y1:15.15f}")

    y2 = timeit.timeit("new_dataset(sorted_category_list3, keys_list)",
                        setup="from __main__ import new_dataset,\
                             sorted_category_list3, keys_list",
                             number=1)
    print(f"append - large dataset: \t {y2:15.15f}")

Output:

timing:
removal - small dataset:         0.000006600000000
append - small dataset:          0.000005500000000
removal - large dataset:         17.711741400000001
append - large dataset:          0.064716900000001
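
One likely reason in-place removal scales so badly: a CPython list is a contiguous array, so each `del data[i]` has to shift every element after position i one slot to the left. The sketch below counts those shifts for backward deletion on a tiny alternating list. count_shifts_backward is a hypothetical helper written for illustration and is not part of the benchmark above:

```python
def count_shifts_backward(data, items_to_remove):
    """Delete matching items back to front and count shifted elements."""
    data = list(data)              # work on a copy
    remove = set(items_to_remove)  # set gives fast membership tests
    shifts = 0
    for i in range(len(data) - 1, -1, -1):
        if data[i] in remove:
            # Every element after index i moves one slot to the left
            shifts += len(data) - i - 1
            del data[i]
    return shifts, data

small_shifts, small_result = count_shifts_backward(['a', 'b'] * 3, ['a'])
large_shifts, large_result = count_shifts_backward(['a', 'b'] * 6, ['a'])
print(small_shifts)  # 6
print(large_shifts)  # 21
```

Doubling the input here more than triples the number of shifts, which is the quadratic growth that shows up in the large-dataset timing. Appending to a new list touches each element only once.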
