python 分形维数的质量半径法

我有一些图像，我想计算质量半径维，以确定图像中的分形特征。以下是其中之一：
Ottawa.png：

质量尺寸定义位于某个半径内的区域与该半径（或长方体）的大小之间的关系。这是针对从质心开始的各种半径执行的。质量尺寸可以从面积（非黑色像素的数量）的双对数图作为半径（以像素为单位）的函数来估计。
我的步骤：

求质心
创建一个半径为r的圆，找出非黑色像素的个数
将这些值的对数放入2个列表（R，N（R））中
迭代直到我们完成对图像的分析
在（R，N（R））之间进行线性回归
由ax+B得到的系数a是维数

我使用以下代码来计算分形维数：

from scipy.stats import linregress
from skimage import io
import matplotlib.pyplot as plt
import numpy as np

# Verify if a pixel is black
def test_pixel(l):
    return (l == np.array([0, 0, 0])).all()

# Find the center of the image
def center_of_mass(img):
    i, j, f = img.shape
    x,y,p = 0,0,0
    for k1 in range(1, i - 1):
        for k2 in range(1, j - 1):
            if not test_pixel(img[k1][k2]):
                p += 1
                x += k2
                y += k1
    return int(x / p), int(y / p)

# Return two lists of logs : the radius and the number of pixel at the radius
def D_R(img):
    (x, y) = center_of_mass(img)
    i, j, f = img.shape
    k = int(max(j - x, i - y, x, y)) # Stop when the image is analyzed
    X, Y, A, p = [], [], 0, 0
    for t in range(1, k):
        for k1 in range(-t, t+1):
            for k2 in range(-t, t+1):
                # Verify if (x+k2, y+k1) is in the image
                if -1 < (x + k2) < j and -1 < (y + k1) < i:
                    # Test if (x+k2, y+k1) is not black and is in the circle of radius t
                    if A == 0 and k1 ** 2 + k2 ** 2 < t ** 2 and not test_pixel(img[k1+y][k2+x]):
                        p += 1
                    elif A ** 2 <= k1 ** 2 + k2 ** 2 < t ** 2 and not test_pixel(img[k1+y][k2+x]):
                        p += 1
        A = t
        if p != 0:
            Y.append(np.log(p))
            X.append(np.log(t))
    return X, Y

I = io.imread('Ottawa.png')
X, Y = D_R(I)
a, b, r, p_value, std_err = linregress(X, Y)
print(a) # Corresponds to the Dimension
print(b)
print(r**2)

但是，这段代码真的很慢。
我想知道怎样才能使这段代码更有效率。

你的代码之所以这么慢是因为你的三个嵌套循环。我想你已经自己解决了这个问题，但是循环（尤其是嵌套循环）在Python中是一个坏主意。你应该尝试使用Numpy broadcasting。
只有一个细节需要注意：因为你使用的是PNG，你可能会得到四个通道（其中一个是alpha通道），这可能不是你想要的。
如果你有一个循环，它的输出只依赖于迭代变量，而这个变量是一个简单的range，那么通常可以用np.arange来替换它。如果你有嵌套循环，你可以将np.arange s广播到一起，以产生2D（或nD）数组，这些数组复制了你的for循环。在本例中，使用k1和k2很容易做到这一点，消除了两个内部循环：

k1 = np.arange(-t, t+1)[:, np.newaxis]
k2 = np.arange(-t, t+1)

然后，你可以将if/elif部分的大部分内容转换为布尔数学：

r = k1**2 + k2**2
x_k2 = x + k2
y_k1 = y + k1
p_mat = (-1 < x_k2) & (x_k2 < j) & (-1 < y_k1) & (y_k1 < i) & ((A == 0) | (A**2 <= r)) & (r < t**2)

请注意，这里的r现在是一个2D数组，并导致所有其他操作/数组广播到相同的形状。
这种方法唯一真实的的复杂之处在于实际处理图像，因为图像的索引并不总是产生大小一致的输出，例如在图像的边缘。由于边缘情况，这有点复杂，但结果是这样的：

cropped_img = img[max(y-t, 0):y+t+1,max(x-t, 0):x+t+1]
offset_y = max(-(y-t), 0)
offset_x = max(-(x-t), 0)
p_mat = p_mat[offset_y:offset_y+cropped_img.shape[0], offset_x:offset_x+cropped_img.shape[1]] & (np.any(cropped_img != 0, axis=-1))

如果只留下t循环，这已经显著提高了性能：

def D_R(img):
    (x, y) = center_of_mass(img)
    i, j, f = img.shape
    k = int(max(j - x, i - y, x, y)) # Stop when the image is analyzed
    X, Y, A, p = [], [], 0, 0
    for t in range(1, k):
        k1 = np.arange(-t, t+1)[:, np.newaxis]
        k2 = np.arange(-t, t+1)
        r = k1**2 + k2**2
        x_k2 = x + k2
        y_k1 = y + k1
        p_mat = (-1 < x_k2) & (x_k2 < j) & (-1 < y_k1) & (y_k1 < i) & ((A == 0) | (A**2 <= r)) & (r < t**2)
        cropped_img = img[max(y-t, 0):y+t+1,max(x-t, 0):x+t+1]
        offset_y = max(-(y-t), 0)
        offset_x = max(-(x-t), 0)
        p_mat = p_mat[offset_y:offset_y+cropped_img.shape[0], offset_x:offset_x+cropped_img.shape[1]] & (np.any(cropped_img != 0, axis=-1))
        p += p_mat.sum()
        A = t
        if p != 0:
            Y.append(np.log(p))
            X.append(np.log(t))
    return X, Y

请注意，您的center_of_mass函数仍然非常慢。您可以尝试应用我前面所说的用np.arange替换循环来改进它。
还有一个进一步的改进，你可以完全删除t循环：如果你仔细想想，你的t循环基本上是定义一个以质心为中心的不断增长的圆，并在其边缘执行某些计算。如果你认为t是一个三维空间，这就产生了一个圆锥的壳。圆锥体中的每个值与底层图像中的值以&组合，并对结果进行累积求和。就像这样：

def D_R(img):
    (x, y) = center_of_mass(img)
    i, j, f = img.shape
    k = int(max(j - x, i - y, x, y)) # Stop when the image is analyzed
    X, Y, A, p = [], [], 0, 0
    cone = k - 1 - np.sqrt((np.arange(i)[:, np.newaxis] - y) ** 2 + (np.arange(j) - x) ** 2)
    cone = np.maximum(cone.astype(np.int64), 0)
    cone = cone >= (k - 1 - np.arange(k).reshape((-1, 1, 1)))
    cone = cone[1:]
    cone[1:] &= ~cone[:-1]
    p_vec = np.any(img != 0, axis=-1) & cone
    p_vec = np.cumsum(np.sum(p_vec, (1, 2)))
    return np.log(np.arange(1, k)[p_vec != 0]).tolist(), np.log(p_vec[p_vec != 0]).tolist()

然而，虽然这得到了类似的结果，原来的，它不是绝对准确的，所以我假设我错过了一些细节。不过很近，你可以继续练习.
以这种方式构建函数对于应用其他有趣的东西（如np.einsum）具有很大的潜力，因为&运算等效于布尔乘法，np.any运算等效于沿着该轴的总和，np.einsum是为这两个运算而构建的。
显然，在所有这些情况下，在计算时间和内存之间存在折衷。最初的实现严重依赖于前者，而我的最后一个版本需要大量的后者。不过如果你用np.einsum重写它，你可以在低内存使用的情况下获得相当好的性能。
PS：在这一行：

k = int(max(j - x, i - y, x, y)) # Stop when the image is analyzed

你确定这是正确的吗？在我看来，你打算覆盖整个图像，但因为你的工作与圆圈，这错过了所有的角落的形象。也许这就是你的意图，也许不是。如果不是，我建议你的意思是：

k = int((max(j-x, x)**2 + max(i-y, y)**2)**0.5)

但我只是猜测。
编辑：关于我使用einsum的建议，事实证明这并不真正有效：问题是，使用最多内存的数组是圆锥体，它不能转换为einsum（至少，我看不到任何方法）。
但是，还有其他方法可以减少内存和时间：而不是将圆锥体的第一步转换为布尔数组，只需直接使用值。这在准确性方面略有妥协（与您的原始实现相比），但在性能和内存使用方面有了巨大的飞跃（我甚至无法看到内存使用情况，并且它需要以前最好的四分之一的时间）。它使用一个名为np.bincount的函数：

def D_R(img):
    (x, y) = center_of_mass(img)
    i, j, f = img.shape
    k = int(max(j - x, i - y, x, y))  # Stop when the image is analyzed
    cone = k - 1 - \
        np.sqrt((np.arange(i)[:, np.newaxis] - y)
                ** 2 + (np.arange(j) - x) ** 2)
    cone = np.maximum(cone.round().astype(np.int64), 0)
    p_vec = np.any(img != 0, axis=-1) * cone
    p_vec = np.bincount(p_vec.ravel(), minlength=k)[1:][::-1]
    p_vec = np.cumsum(p_vec)
    return np.log(np.arange(1, k)[p_vec != 0]).tolist(), np.log(p_vec[p_vec != 0]).tolist()

这是不同的原因再次可能是由于您正在测试的“圆”的确切厚度。在这种方法中，圆非常细，因为它们永远不会重叠。

python 分形维数的质量半径法

1条答案

相关问题

热门标签

最新问答