Original question: http://stackoverflow.com/questions/20459536/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Convert Pandas dataframe to Sparse Numpy Matrix directly
Asked by user7289
I am creating a matrix from a Pandas dataframe as follows:
dense_matrix = np.array(df.as_matrix(columns = None), dtype=bool).astype(np.int)
I then convert it into a sparse matrix with:
sparse_matrix = scipy.sparse.csr_matrix(dense_matrix)
Is there any way to go from a df straight to a sparse matrix?
Thanks in advance.
Accepted answer by Dan Allan
df.values is a NumPy array, and accessing the values that way is always faster than going through np.array.
scipy.sparse.csr_matrix(df.values)
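As a minimal sketch of the one-step conversion, assuming a small made-up frame (the column names and values below are illustrative only and not from the question): note that `DataFrame.as_matrix` and `np.int`, both used in the question, have since been removed from pandas and NumPy, so this sketch uses `df.values` and the built-in `int` instead.

```python
import pandas as pd
import scipy.sparse

# Hypothetical example data; any mostly-zero numeric frame would do.
df = pd.DataFrame({"a": [0, 2, 0], "b": [0, 0, 5], "c": [1, 0, 0]})

# Reproduce the question's 0/1 indicator: nonzero -> 1, zero -> 0.
indicator = df.values.astype(bool).astype(int)

# Build the CSR matrix straight from the underlying NumPy array.
sparse_matrix = scipy.sparse.csr_matrix(indicator)

print(sparse_matrix.toarray())
```

If the original values are wanted rather than a 0/1 indicator, the `astype` calls can presumably be dropped and `df.values` passed in unchanged.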
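You might need to take the transpose first, as in df.values.T, if you want one matrix row per DataFrame column. In a DataFrame the rows (index) are axis 0 and the columns are axis 1, and df.values has the same (rows, columns) shape.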
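One way to decide whether the transpose is needed is simply to compare shapes. The sketch below assumes a hypothetical frame with 2 rows and 3 columns, chosen only so that the two orientations are distinguishable.

```python
import pandas as pd
import scipy.sparse

# Hypothetical 2-row, 3-column frame so the two orientations differ.
df = pd.DataFrame({"x": [1, 0], "y": [0, 0], "z": [0, 3]})

# One sparse row per DataFrame row.
by_rows = scipy.sparse.csr_matrix(df.values)

# One sparse row per DataFrame column.
by_cols = scipy.sparse.csr_matrix(df.values.T)

print(df.shape, by_rows.shape, by_cols.shape)
```

Whichever shape matches what the downstream code expects is the one to keep.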

