scipy: merging two CSR matrices when their dimensions are incompatible

Posted by 2izufjch on 2022-11-09 in Other

I have two sparse matrices. The first one looks like this:

<1x40 sparse matrix of type '<class 'numpy.intc'>'
    with 10 stored elements in Compressed Sparse Row format>

And the second:

<9x15426 sparse matrix of type '<class 'numpy.int64'>'
    with 25 stored elements in Compressed Sparse Row format>

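I want to append the 40 columns of the first matrix to each of the 9 rows of the second matrix (each row being <1x15426>), so that the resulting matrix is: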

<9x15466 sparse matrix of type '<class 'numpy.int64'>'
    with 25 stored elements in Compressed Sparse Row format>

Can this be done without converting to a dense array? Thanks!


lokaqttq1#

OK, my previous answer was correct, but I posted it too hastily. Here is a better attempt:

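from scipy.sparse import csr_matrix, hstack
import numpy as np

csr1 = csr_matrix(
    np.random.randint(0, 3, (9, 15426)),
)
csr2 = csr_matrix(
    np.random.randint(0, 3, (1, 40)),
)
# A (9 x 1) column of ones times the (1 x 40) row repeats that row 9 times.
# Matching csr2's dtype avoids an unwanted upcast to float.
ones_column = csr_matrix(np.ones((csr1.shape[0], 1), dtype=csr2.dtype))
# format="csr" guarantees a CSR result regardless of hstack's default.
result = hstack((csr1, ones_column @ csr2), format="csr")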
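
As a cross-check, here is a minimal sketch of a different way to get the same layout: repeat the row with `vstack` and then `hstack` it onto the larger matrix. The helper name `append_row_to_each` and the tiny input matrices are made up for illustration; they do not come from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack, vstack


def append_row_to_each(big: csr_matrix, row: csr_matrix) -> csr_matrix:
    """Append the single sparse row `row` to the right of every row of `big`."""
    # Repeat the 1 x k row once per row of `big`; nothing is densified here.
    tiled = vstack([row] * big.shape[0], format="csr")
    return hstack((big, tiled), format="csr")


# Small deterministic inputs (hypothetical, for illustration only).
big = csr_matrix(np.array([[1, 0, 0],
                           [0, 2, 0]]))
row = csr_matrix(np.array([[0, 5]]))

result = append_row_to_each(big, row)
print(result.toarray())
# [[1 0 0 0 5]
#  [0 2 0 0 5]]
```

Because the inputs are all integer matrices, no float column of ones is involved, so the result should keep an integer dtype.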

kninwzqo2#

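Yes, it is possible. You just need to reach into the matrix internals and loop over them carefully. Here is an example (the conversion to lists could probably be skipped, but I find concatenating lists simpler):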

from scipy.sparse import csr_matrix
import numpy as np

csr1 = csr_matrix(
    np.random.randint(0, 3, (9, 15426)),
)
csr2 = csr_matrix(
    np.random.randint(0, 3, (1, 40)),
)

def glue_csr2_to_csr1(csr1: csr_matrix, csr2: csr_matrix) -> csr_matrix:
    """Append the single row of csr2 to the right of every row of csr1."""
    curr_pointer = 0
    # Work on plain Python lists so that rows can be concatenated with `+`.
    csr1_data = list(csr1.data)
    csr1_indices = list(csr1.indices)
    csr1_indptr = list(csr1.indptr)
    csr2_data = list(csr2.data)
    csr2_indices = list(csr2.indices)
    csr2_indptr = list(csr2.indptr)

    new_pointers = [0]
    new_data = []
    new_indices = []

    for row_index in range(len(csr1.indptr) - 1):
        # Stored values and column indices of the current row of csr1.
        row_data = csr1_data[csr1_indptr[row_index]:csr1_indptr[row_index + 1]]
        row_indices = csr1_indices[csr1_indptr[row_index]:csr1_indptr[row_index + 1]]
        new_data += row_data + csr2_data
        # Shift csr2's column indices past the last column of csr1.
        new_indices += row_indices + [x + csr1.shape[1] for x in csr2_indices]
        # csr2_indptr[1] is the number of stored elements in csr2's only row.
        curr_pointer += len(row_data) + csr2_indptr[1]
        new_pointers.append(curr_pointer)
    # Pass the target shape explicitly instead of resizing afterwards.
    return csr_matrix(
        (new_data, new_indices, new_pointers),
        shape=(csr1.shape[0], csr1.shape[1] + csr2.shape[1]),
    )

glue_csr2_to_csr1(csr1, csr2)

Output (the number of stored elements depends on the random data):

<9x15466 sparse matrix of type '<class 'numpy.int64'>'
    with 92799 stored elements in Compressed Sparse Row format>
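
The function above works directly on the three arrays behind a CSR matrix (`data`, `indices`, `indptr`). For readers who have not used them before, here is a minimal sketch of what they contain, using a made-up 2 x 3 matrix that is not part of the original question:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up 2 x 3 matrix, used only to show the layout.
m = csr_matrix(np.array([[1, 0, 2],
                         [0, 0, 3]]))

data = m.data.tolist()        # stored values, row by row
indices = m.indices.tolist()  # column index of each stored value
indptr = m.indptr.tolist()    # row i lives in data[indptr[i]:indptr[i + 1]]

print(data, indices, indptr)  # [1, 2, 3] [0, 2, 2] [0, 2, 3]

# Slice out the second row (index 1) the same way the loop above does.
start, end = indptr[1], indptr[2]
row_values = data[start:end]
row_columns = indices[start:end]
print(row_values, row_columns)  # [3] [2]
```

Appending a row to the right therefore comes down to extending each row's slice with the extra values and with column indices shifted by the original column count.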

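If I misunderstood your question and you do not need to add the elements of the second matrix (only pad with zeros), then you can simply call csr1.resize(9, 15466), and that should be it.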
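
A minimal sketch of that zero-padding variant, on a small made-up matrix rather than the 9 x 15426 one from the question. Note that `resize` modifies the matrix in place and does not return a new one:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 2 x 3 input, for illustration only.
m = csr_matrix(np.array([[1, 0, 2],
                         [0, 3, 0]]))
nnz_before = m.nnz

# Grow to 2 x 5 in place; the new columns contain no stored elements.
m.resize(2, 5)

print(m.shape, m.nnz)  # (2, 5) 3
print(m.toarray())
# [[1 0 2 0 0]
#  [0 3 0 0 0]]
```

The number of stored elements is unchanged, which matches the "25 stored elements" in the result the question asks for.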
