Python: How to access sparse matrix elements?

Question

Asked by siamii

type(A)
<class 'scipy.sparse.csc.csc_matrix'>
A.shape
(8529, 60877)
print A[0,:]
  (0, 25)   1.0
  (0, 7422) 1.0
  (0, 26062)    1.0
  (0, 31804)    1.0
  (0, 41602)    1.0
  (0, 43791)    1.0
print A[1,:]
  (0, 7044) 1.0
  (0, 31418)    1.0
  (0, 42341)    1.0
  (0, 47125)    1.0
  (0, 54376)    1.0
print A[:,0]
  #nothing returned

What I don't understand is that A[1,:] should select elements from the 2nd row, yet print A[1,:] shows entries with row index 0, as if they came from the 1st row. Also, print A[:,0] should return the first column, but nothing is printed. Why?

Answer 1

Accepted answer by Warren Weckesser

A[1,:] is itself a sparse matrix with shape (1, 60877). This is what you are printing, and it has only one row, so all the row coordinates are 0.

For example:

In [41]: a = csc_matrix([[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]])

In [42]: a.todense()
Out[42]: 
matrix([[ 1,  0,  0,  0],
        [ 0,  0, 10, 11],
        [ 0,  0,  0, 99]], dtype=int64)

In [43]: print(a[1, :])
  (0, 2)    10
  (0, 3)    11

In [44]: print(a)
  (0, 0)    1
  (1, 2)    10
  (1, 3)    11
  (2, 3)    99

In [45]: print(a[1, :].toarray())
[[ 0  0 10 11]]

You can select columns, but if there are no nonzero elements in the column, nothing is displayed when it is output with print:

In [46]: a[:, 3].toarray()
Out[46]: 
array([[ 0],
       [11],
       [99]])

In [47]: print(a[:,3])
  (1, 0)    11
  (2, 0)    99

In [48]: a[:, 1].toarray()
Out[48]: 
array([[0],
       [0],
       [0]])

In [49]: print(a[:, 1])


In [50]:

The last print call shows no output because the column a[:, 1] has no nonzero elements.
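To tell an empty column apart from a failed selection, it can help to inspect the slice rather than print it. The following is a minimal sketch that reuses the matrix a from above; the names col3 and col1 are illustrative only and do not appear in the original answer.

```python
from scipy.sparse import csc_matrix

a = csc_matrix([[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]])

col3 = a[:, 3]   # column that holds two stored entries
col1 = a[:, 1]   # column that holds no stored entries

# Each slice is still a sparse matrix with a single column, even when empty.
# nnz is the number of stored entries in the slice.
print(col3.shape, col3.nnz)
print(col1.shape, col1.nnz)
```

An nnz of 0 together with the expected shape suggests the selection worked and the column simply contains nothing to print.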

Answer 2

Answered by TheGrimmScientist

To answer your title's question using a different technique than your question's details:

csc_matrix gives you the method .nonzero().

Given:

>>> import numpy as np
>>> from scipy.sparse import csc_matrix
>>> 
>>> row = np.array( [0, 1, 3])
>>> col = np.array( [0, 2, 3])
>>> data = np.array([1, 4, 16])
>>> A = csc_matrix((data, (row, col)), shape=(4, 4))

You can access the indices pointing to non-zero data by:

>>> rows, cols = A.nonzero()
>>> rows
array([0, 1, 3], dtype=int32)
>>> cols
array([0, 2, 3], dtype=int32)

Which you can then use to access your data, without ever needing to make a dense version of your sparse matrix:

>>> [((i, j), A[i,j]) for i, j in zip(*A.nonzero())]
[((0, 0), 1), ((1, 2), 4), ((3, 3), 16)]
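As a possible alternative to indexing A[i, j] once per entry, the matrix can be converted to COO format, which exposes parallel arrays of row indices, column indices, and values. This is a sketch under the assumption that the same A as above is used; the names coo and entries are illustrative. Unlike .nonzero(), the COO arrays also include any explicitly stored zeros, which this example does not have.

```python
import numpy as np
from scipy.sparse import csc_matrix

row = np.array([0, 1, 3])
col = np.array([0, 2, 3])
data = np.array([1, 4, 16])
A = csc_matrix((data, (row, col)), shape=(4, 4))

# COO keeps three parallel arrays: row index, column index, stored value.
coo = A.tocoo()

# Convert NumPy scalars to plain Python ints for readable output.
entries = [((int(i), int(j)), int(v))
           for i, j, v in zip(coo.row, coo.col, coo.data)]
print(entries)
```

This reads every stored entry in one pass and, like the list comprehension above, never builds a dense copy of the matrix.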

Answer 3

Answered by Satyam

If this is for calculating TF-IDF scores using TfidfTransformer, you can get the IDF values from tfidf.idf_. For the sparse array itself, say a, call a.toarray().

toarray returns an ndarray; todense returns a matrix. If you want a matrix, use todense; otherwise, use toarray.
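The difference between the two conversions can be checked directly. A small sketch, assuming the example matrix from the accepted answer; dense_array and dense_matrix are illustrative names.

```python
import numpy as np
from scipy.sparse import csc_matrix

a = csc_matrix([[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]])

dense_array = a.toarray()    # plain numpy.ndarray
dense_matrix = a.todense()   # numpy.matrix

# Both hold the same values; only the container type differs.
print(type(dense_array).__name__, dense_array[1, 2])
print(type(dense_matrix).__name__, dense_matrix[1, 2])
```

Either result supports ordinary [i, j] indexing, at the cost of allocating the full dense shape, which may be too large for a matrix like the 8529 x 60877 one in the question.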

Answer 4

Answered by Rohan Pillai

I fully acknowledge all the other given answers. This is simply a different approach.

To demonstrate this example I am creating a new sparse matrix:

from scipy.sparse import csc_matrix
a = csc_matrix([[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]])
print(a)

Output:

(0, 0)  1
(1, 2)  10
(1, 3)  11
(2, 3)  99

To access this easily, like the way we access a list, I converted it into a list.

temp_list = []
for i in a:
    # each i is a 1 x n sparse row; densify it and keep its single row
    temp_list.append(i.toarray()[0].tolist())

print(temp_list)

Output:

[[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]]

This might look stupid, since I am creating a sparse matrix and converting it back, but some functions, such as TfidfVectorizer, return a sparse matrix as output, and handling those can be tricky. This is one way to extract data from a sparse matrix.
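When the matrix is too large to densify row by row, one option is to read only the stored entries of each row from the CSR representation. This is a sketch rather than part of the original answer; rows_as_dicts is a hypothetical name, and the dict-per-row layout is just one convenient choice.

```python
from scipy.sparse import csc_matrix

a = csc_matrix([[1, 0, 0, 0], [0, 0, 10, 11], [0, 0, 0, 99]])

# CSR stores each row's column indices and values contiguously;
# indptr[r]:indptr[r + 1] marks the slice belonging to row r.
csr = a.tocsr()

rows_as_dicts = []
for r in range(csr.shape[0]):
    start, end = csr.indptr[r], csr.indptr[r + 1]
    cols = csr.indices[start:end]
    vals = csr.data[start:end]
    # Map column index -> value for the stored entries of this row.
    rows_as_dicts.append({int(c): int(v) for c, v in zip(cols, vals)})

print(rows_as_dicts)
```

Each dict holds only the entries actually stored in that row, so memory use scales with the number of stored values rather than with the full matrix shape.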
