python-3.x Can you have a progress bar for sorting a list?

Posted by xmjla07d on 2023-08-08 in Python


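I have a list of about 50k elements of a custom data type (the type itself is probably not important for my question), and I sort the list with Python's built-in list.sort() method.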

myList: List[Foo] = ...
myList.sort(key=Foo.x)

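Since the sort takes a few minutes, I would like to have a progress bar for the sorting process. I could not find any solution online.
Is this even possible? I know that sorting algorithms can be complex and that it may not be possible to measure sorting progress at all. For my use case, though, a "rough" measurement would be fine, such as 25%, 50%, 75%...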


Source: https://stackoverflow.com/questions/76661046/can-you-have-a-progress-bar-for-sorting-a-list

3 Answers

#1 quhf5bfb

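Given the interface that sort provides, you don't have many options for hooking into the actual sorting algorithm. However, if sorting 50K keys is slow, the most likely cause is a slow key function, and the keys are computed before the actual sorting takes place.
From the documentation:
The key corresponding to each item in the list is calculated once and then used for the entire sorting process.
So, if you count how many times the key function has been called, you get a rough estimate of the overall sort progress. To do this, you can create a wrapper around the key function that handles the bookkeeping: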

import time

def progress_sort(data, *, key=lambda v: v, on_increment=None):
    total = len(data)
    # Report roughly every 10%; max() avoids a zero step for lists shorter than 10
    step = max(1, total // 10)

    if on_increment is None:
        start = time.time()

        def on_increment(c):
            print(f"{time.time() - start}: {c/total * 100}%")

    count = 0

    def progress_key(val):
        nonlocal count
        # Called once per element, before the actual sorting starts
        if count % step == 0:
            on_increment(count)
        count += 1
        return key(val)

    data.sort(key=progress_key)
    on_increment(total)

An example with some dummy data and a slow key function:

import random
import time

def slow_key(val):
    # Simulate an expensive key computation
    time.sleep(1.0/500_000)
    return val

data = [random.randint(-50_000, 50_000)/1.0 for i in range(50_000)]
progress_sort(data, key=slow_key)

0.0: 0.0%
0.5136210918426514: 10.0%
1.0435900688171387: 20.0%
1.6074442863464355: 30.0%
2.156496524810791: 40.0%
2.9734878540039062: 50.0%
3.4794368743896484: 60.0%
4.016523599624634: 70.0%
4.558118104934692: 80.0%
5.047779083251953: 90.0%
5.545809030532837: 100.0%

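This approach can then be combined with whatever library you want to use for status updates. You may want to further configure the data passed to the hook, but the principle is the same.
Here is an example using tqdm: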

import random
import time
from tqdm import tqdm

def slow_key(val):
    time.sleep(1.0/500_000)
    return val

data = [random.randint(-50_000, 50_000)/1.0 for i in range(50_001)]

with tqdm(total=len(data), desc="sorting") as pbar:
    # pbar.update() takes an increment, so pass the difference to the current position
    progress_sort(data, key=slow_key, on_increment=lambda c: pbar.update(c - pbar.n))
    pbar.set_description("Finished")

sorting:  80%|███████▉  | 40000/50001 [00:05<00:01, 5802.30it/s]

Finished: 100%|██████████| 50001/50001 [00:07<00:00, 6489.14it/s]
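
The approach above relies on the documented behavior that the key function runs exactly once per element. A minimal sketch to check this on your own data, assuming a hypothetical helper named count_key_calls (not part of the answer above):

```python
import random

def count_key_calls(data, key):
    """Sort a copy of data and return (sorted_copy, number_of_key_calls)."""
    calls = 0

    def counting_key(value):
        nonlocal calls
        calls += 1
        return key(value)

    result = sorted(data, key=counting_key)
    return result, calls

values = [random.random() for _ in range(1_000)]
result, calls = count_key_calls(values, key=lambda v: -v)
print(f"{calls} key calls for {len(values)} elements")
```

If the number of calls matches the number of elements, counting key calls is a sound proxy for progress, provided the key computation dominates the run time.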

Posted 2023-08-08

#2 zkure5ic

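I also assume that computing the keys is the slow part of the sort, which is what you would expect for a relatively small list (50k elements). The solution in this answer is to build an intermediate list that contains only the keys and the object references. That step can be measured, and progress can be displayed after each object's key has been determined. For this demo, the process is made slow by using a key routine that sleeps for 100 ms.
The real sort at the end should then run very quickly, because all keys have already been precomputed.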

#!/usr/bin/python

import time
import random
from operator import itemgetter


class Custom:
    def __init__(self):
        self._key = random.randint(1,50000)

    @property
    def key(self):
        # print('slow key')
        time.sleep(0.1)
        return self._key

    def __repr__(self):
        return f"Custom(key={self._key})"

mylist = [Custom() for i in range(40)]

print(mylist)
mylist2 = []
display_inc = 5
display = 0
for x, ele in enumerate(mylist):
    mylist2.append((ele.key,ele))
    if x/len(mylist) * 100 >= display:
        print(f"{display}% done")
        # stop displaying after 90
        if display >= 90:
            display = 110
        display += display_inc
print("95% done")
mylist2.sort(key=itemgetter(0))
mylist = [i[1] for i in mylist2]
print("100% done")
print(mylist)

Output:

$ python slowsort.py 
[Custom(key=22549), Custom(key=5431), Custom(key=8895), Custom(key=10837), Custom(key=12652), Custom(key=43897), Custom(key=24724), Custom(key=16014), Custom(key=46022), Custom(key=25979), Custom(key=45115), Custom(key=45442), Custom(key=42306), Custom(key=17611), Custom(key=25113), Custom(key=12924), Custom(key=21902), Custom(key=1661), Custom(key=6475), Custom(key=41993), Custom(key=40334), Custom(key=44407), Custom(key=20747), Custom(key=7635), Custom(key=38258), Custom(key=45187), Custom(key=13048), Custom(key=18952), Custom(key=46592), Custom(key=10790), Custom(key=24978), Custom(key=5349), Custom(key=47924), Custom(key=12413), Custom(key=7147), Custom(key=17528), Custom(key=3035), Custom(key=16639), Custom(key=17059), Custom(key=25630)]
0% done
5% done
10% done
15% done
20% done
25% done
30% done
35% done
40% done
45% done
50% done
55% done
60% done
65% done
70% done
75% done
80% done
85% done
90% done
95% done
100% done
[Custom(key=1661), Custom(key=3035), Custom(key=5349), Custom(key=5431), Custom(key=6475), Custom(key=7147), Custom(key=7635), Custom(key=8895), Custom(key=10790), Custom(key=10837), Custom(key=12413), Custom(key=12652), Custom(key=12924), Custom(key=13048), Custom(key=16014), Custom(key=16639), Custom(key=17059), Custom(key=17528), Custom(key=17611), Custom(key=18952), Custom(key=20747), Custom(key=21902), Custom(key=22549), Custom(key=24724), Custom(key=24978), Custom(key=25113), Custom(key=25630), Custom(key=25979), Custom(key=38258), Custom(key=40334), Custom(key=41993), Custom(key=42306), Custom(key=43897), Custom(key=44407), Custom(key=45115), Custom(key=45187), Custom(key=45442), Custom(key=46022), Custom(key=46592), Custom(key=47924)]

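If, for some reason, the actual sort on the precomputed keys is slow, you could divide the list into smaller lists, sort those, and then merge them back into one larger list. That is a bit messy, though, so I would first want to understand whether it is necessary. Hopefully the significant slowness is in the key generation.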
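
The divide-and-merge idea could look roughly like the following. This is a sketch under assumptions, not the answer author's code: chunked_sort and its on_progress callback are hypothetical names, the keys are assumed to be cheap (for example already precomputed), and the final merge step is not measured.

```python
import heapq
import random

def chunked_sort(items, key=None, chunks=10, on_progress=None):
    """Sort items in up to `chunks` runs, report after each run, then merge."""
    if on_progress is None:
        def on_progress(done, total):
            print(f"runs sorted for {done}/{total} elements")

    total = len(items)
    # Ceiling division, with a minimum run size of 1
    size = max(1, -(-total // chunks))
    runs = []
    for start in range(0, total, size):
        runs.append(sorted(items[start:start + size], key=key))
        on_progress(min(start + size, total), total)

    # heapq.merge lazily merges the already sorted runs into one sorted stream
    return list(heapq.merge(*runs, key=key))

data = [random.randint(0, 50_000) for _ in range(50_000)]
result = chunked_sort(data)
print(result == sorted(data))
```

Note that the key function is called again during the merge, so this only pays off when the keys are cheap to evaluate. The built-in sort is usually fast enough on precomputed keys that the extra complexity is not needed.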

Posted 2023-08-08

#3 svgewumm

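Yes, it is possible to create a progress bar for a sorting process in Python. As you correctly point out, sorting algorithms can be complex, and measuring progress precisely may not be feasible in all cases. However, as you suggest, you can get a rough estimate of progress by using a wrapper class for the sort keys.
To do this, create a wrapper class that holds the original sort key together with some progress bookkeeping. The wrapper keeps a count of the comparisons performed so far, and from that count you can estimate progress.
Here is a basic outline of how you could implement this: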

import time
from typing import List

class ProgressSortKey:
    # Shared by all instances so the count covers the whole sort
    comparisons_done = 0

    def __init__(self, key, total_elements):
        self.key = key
        self.total_elements = total_elements

    def __lt__(self, other):
        # list.sort() only uses "<", so every comparison passes through here
        ProgressSortKey.comparisons_done += 1
        return self.key < other.key

def sort_with_progress(lst):
    total_elements = len(lst)
    progress_lst = [ProgressSortKey(item, total_elements) for item in lst]

    # Perform the sorting using the modified sort keys
    progress_lst.sort()

    # Return the original objects without the progress information
    return [item.key for item in progress_lst]

# Example usage:
myList: List[Foo] = ...
sorted_list = sort_with_progress(myList)

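You can now sort the list with the sort_with_progress function. As written, it only counts comparisons; turning that count into a progress figure requires comparing it against an expected total, for which n·log2(n) is a common rough guess for n elements. Keep in mind that this is a rough estimate and may not be accurate, for example when the input is already partly sorted. For your use case, though, it should give a good enough indication of sorting progress.
To visualize progress as a progress bar, you can write a simple function that prints it in the desired format: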

def print_progress_bar(iteration, total, prefix='', suffix='', decimals=1, length=50, fill='█'):
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filled_length = int(length * iteration // total)
    bar = fill * filled_length + '-' * (length - filled_length)
    print(f'\r{prefix} |{bar}| {percent}% {suffix}', end='\r')
    # Print new line when progress is complete
    if iteration == total:
        print()

# Example usage for sorting:
for i, item in enumerate(sort_with_progress(myList)):
    # Do something with the sorted item
    # ...

    # Update and print the progress bar
    print_progress_bar(i + 1, len(myList), prefix='Progress:', suffix='Complete', length=50)

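This prints a progress bar that fills up as the loop runs and completes with the last item. Note that in this example the bar advances while you iterate over the already sorted result, not during the sort itself; to show progress during the sort, print_progress_bar would have to be called from code that runs while sorting, such as the comparison method.
Keep in mind that the exact implementation and behavior may depend on the details of your custom data type Foo and on how the sorting algorithm performs on it, but this outline should give you a starting point for building a progress bar.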
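
One way to report progress while the sort is actually running is to count comparisons inside the wrapper and compare them against an estimated total. The following is a sketch under assumptions, not a definitive implementation: the function name sort_with_comparison_progress and its on_progress callback are hypothetical, and the n·log2(n) figure is only a rough upper guess for shuffled input (already ordered data needs far fewer comparisons, so the estimate can jump to 100% early).

```python
import math
import random

def sort_with_comparison_progress(items, on_progress, steps=10):
    """Return a sorted copy of items, reporting an estimated fraction done."""
    n = len(items)
    # Assumption: roughly n*log2(n) comparisons for shuffled input
    expected = max(1, int(n * math.log2(n))) if n > 1 else 1
    report_every = max(1, expected // steps)
    done = 0

    class Counted:
        __slots__ = ("value",)

        def __init__(self, value):
            self.value = value

        def __lt__(self, other):
            nonlocal done
            done += 1
            if done % report_every == 0:
                # Cap below 100% until the sort has really finished
                on_progress(min(done / expected, 0.99))
            return self.value < other.value

    # Each element is wrapped once; sorted() then compares the wrappers with "<"
    result = sorted(items, key=Counted)
    on_progress(1.0)
    return result

data = list(range(20_000))
random.shuffle(data)
fractions = []
result = sort_with_comparison_progress(data, fractions.append)
print(f"{len(fractions)} progress updates, sorted correctly: {result == sorted(data)}")
```

Routing every comparison through Python code slows the sort down noticeably, so this is only worthwhile when the comparisons themselves, rather than the key computation, dominate the run time.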

Posted 2023-08-08
