Python: how to find the best combination that fits a distribution

Posted by icomxhvb on 2023-04-28 in Python


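I am trying to sample a dataframe in such a way that the sum of the sampled contents fits a specific distribution.
Say I want to fill a basket with fruit, but I can only do so by adding mixes of fruit, and my basket can only hold a certain amount of each fruit. I can represent the contents of the dataframe as a matrix, like this:

| Mix | Apples | Bananas | Oranges |
| --- | --- | --- | --- |
| 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 2 |
| 3 | 3 | 4 | 3 |

My basket can hold 3 apples, 3 bananas and 3 oranges, so if I pick mix 1 and mix 2 I reach the basket's maximum capacity exactly.
My problem is that the dataframe contains thousands of mixes, so searching for combinations in a greedy fashion would take a very long time. Is there a way to approximate a possible combination? (I can even tolerate a bit of error, i.e. although mix 3 has one banana too many, I would accept it if that helps reduce the runtime.)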
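
The basket example above can be checked directly. The following is only a minimal sketch of the arithmetic, assuming the three mixes from the table; the names `mixes`, `capacity`, `combo` and `overshoot` are illustrative and not from the original post.

```python
import pandas as pd

# The three mixes from the table above (column names are illustrative).
mixes = pd.DataFrame(
    {"apples": [1, 2, 3], "bananas": [1, 2, 4], "oranges": [1, 2, 3]},
    index=pd.Index([1, 2, 3], name="Mix"),
)

# Basket capacity per fruit, as stated in the question.
capacity = pd.Series({"apples": 3, "bananas": 3, "oranges": 3})

# Mix 1 plus mix 2 fills the basket exactly.
combo = mixes.loc[[1, 2]].sum()
print((combo == capacity).all())  # True

# Mix 3 alone overshoots by one banana: the error the question would tolerate.
overshoot = (mixes.loc[3] - capacity).clip(lower=0)
print(overshoot.sum())  # 1
```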

python

Source: https://stackoverflow.com/questions/76117563/how-to-find-the-best-combination-that-fits-a-distribution

2 Answers


xqnpmsa81#

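Runtime should not be a problem for Python here. The script below takes less than a second to filter the possible fruit baskets out of 1 million rows: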

    import pandas as pd
    import numpy as np

    # Create a NumPy array of random integers between 1 and 5
    # with shape (1000000, 3), i.e. 1,000,000 rows.
    arr = np.random.randint(low=1, high=6, size=(1000000, 3))

    # Build a dataframe from the random values and name its index 'Mix'.
    df = pd.DataFrame(data=arr, columns=['apples', 'bananas', 'oranges'])
    df.index.name = 'Mix'
    print(df.head(3))

    # Select all rows with at most 3 apples, bananas and oranges,
    # sorted in descending order.
    fits = df[(df.apples <= 3) & (df.bananas <= 3) & (df.oranges <= 3)]
    print(fits.sort_values(by=['apples', 'bananas', 'oranges'], ascending=False))

This is the resulting output:

            apples  bananas  oranges
    Mix
    5            3        3        3
    39           3        3        3
    303          3        3        3
    979          3        3        3
    1019         3        3        3
    ...        ...      ...      ...
    998527       1        1        1
    998799       1        1        1
    999238       1        1        1
    999280       1        1        1
    999441       1        1        1


Posted 2023-04-28

ve7v8dk22#

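Instead of iterating over every row, you can use dataframe filters and let pandas do the work efficiently. Create a new column that holds the sum of all fruit in each mix. Then add filters to your dataframe to weed out the mixes that obviously won't work. One filter can be that the total number of fruit in a mix must be <= the total number of fruit that fits in your basket. You can also add a filter per fruit to drop mixes that exceed that fruit's capacity. Finally, sort by the sum to get the best candidates first.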
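
The steps described above could look roughly like this. It is a minimal sketch rather than a definitive implementation: the random data, the `capacity` values (taken from the question's example) and the names `total`, `fits_total`, `fits_each` and `candidates` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1,000 random mixes with 1 to 5 pieces of each fruit.
rng = np.random.default_rng(0)
fruits = ["apples", "bananas", "oranges"]
df = pd.DataFrame(rng.integers(1, 6, size=(1000, 3)), columns=fruits)
df.index.name = "Mix"

# Assumed basket capacity per fruit (values from the question's example).
capacity = pd.Series({"apples": 3, "bananas": 3, "oranges": 3})

# New column: total number of fruit in each mix.
df["total"] = df[fruits].sum(axis=1)

# Filter 1: a mix must not hold more fruit than the basket holds in total.
fits_total = df["total"] <= capacity.sum()

# Filter 2: no single fruit may exceed its own capacity.
fits_each = df[fruits].le(capacity, axis="columns").all(axis=1)

# Sort by total so the fullest admissible mixes come first.
candidates = df[fits_total & fits_each].sort_values("total", ascending=False)
print(candidates.head())
```

Note that this only ranks individual mixes. Combining several mixes into one basket, as the question asks, would still need a search over the filtered candidates.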

Posted 2023-04-28
