Python: Convert a list to a 1-column pandas DataFrame
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/32138205/
Asked by HackCode
I have a file with many lines. I am reading each line, splitting it into words/numbers, and storing them in a list. After this, I am trying to convert this list into a 1-column pandas DataFrame.
However, after running my code I get only one row full of lists. What I need is 1 column with a variable number of rows, each holding a value.
Here is the code snippet I wrote:
for line1 in file:
    test_set=[]
    test_set.append(next(file).split())
    df1 = DataFrame({'test_set': [test_set]})
My output is something like:
test_set
0 [[1, 0, 0, 0, 0, 0, 1, 1, 1, 0]]
But what I want is:
test_set
0 1
1 0
2 0
3 0
4 0
5 0
6 1
7 1
8 1
9 0
Any suggestions on what I'm doing wrong, or how I can implement this? Thanks.
Input data sample:
id1 id2 id3 id4
0 1 0 1
1 1 0 0
id10 id5 id6 id7
1 1 0 1
1 0 0 1
.
.
.
Accepted answer by HackCode
Turns out I just had to add this:
df1 = DataFrame({'test_set': value for value in test_set})
But I'm still hoping for a less costly answer, because this also increases the complexity by another factor of 'n', which is not good enough.
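One possible way to avoid the extra pass, sketched here under assumptions rather than taken from the original thread: collect the tokens with `list.extend` instead of `append`, so that `test_set` stays a flat list and can be passed to `DataFrame` directly. The in-memory `lines` list is a hypothetical stand-in for the file, and its contents are invented sample data. The tokens remain strings, because `split()` returns strings.

```python
import pandas as pd

# Hypothetical stand-in for the lines of the input file (invented sample data).
lines = ["0 1 0 1", "1 1 0 0"]

test_set = []
for line in lines:
    # extend adds each token individually; append would nest one list per line
    test_set.extend(line.split())

# One row per token, in a single column
df1 = pd.DataFrame({'test_set': test_set})
print(df1.shape)  # (8, 1)
```

Each token is visited once while reading, so no second loop over `test_set` is needed before building the frame.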
Answered by EdChum
You want this instead:
df1 = DataFrame({'test_set': test_set})
There is no need to wrap the list in another list. By doing that, you're effectively stating that your df data is a list with a single element, which is itself another list.
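To make the difference concrete, here is a minimal sketch. It assumes `test_set` is a flat list holding the ten values from the question's desired output; the names `wrapped` and `flat` are illustrative and not from the original post.

```python
import pandas as pd

# Assumed sample data: the ten values from the question's desired output.
test_set = [1, 0, 0, 0, 0, 0, 1, 1, 1, 0]

# Wrapping the list again gives one row whose single cell holds the whole list.
wrapped = pd.DataFrame({'test_set': [test_set]})

# Passing the list directly gives one row per element.
flat = pd.DataFrame({'test_set': test_set})

print(wrapped.shape)  # (1, 1)
print(flat.shape)     # (10, 1)
```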
EDIT
Looking at your input data, you can just load it and then construct your df as a single column like so:
In [134]:
# load the data
import io
import pandas as pd
t="""id1 id2 id3 id4
0 1 0 1
1 1 0 0"""
df = pd.read_csv(io.StringIO(t), sep=r'\s+')
df
Out[134]:
id1 id2 id3 id4
0 0 1 0 1
1 1 1 0 0
Now transpose the df and use a list comprehension; this constructs your lists, which are then concatenated using pd.concat:
In [142]:
pd.concat([df.T[x] for x in df.T], ignore_index=True)
Out[142]:
0 0
1 1
2 0
3 1
4 1
5 1
6 0
7 0
dtype: int64
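An alternative sketch, not part of the original answer: because the values are wanted row by row, flattening the underlying NumPy array in row-major order should give the same sequence for this sample. It assumes pandas 0.24 or later for `DataFrame.to_numpy`; the name `flat` is illustrative.

```python
import io
import pandas as pd

t = """id1 id2 id3 id4
0 1 0 1
1 1 0 0"""
df = pd.read_csv(io.StringIO(t), sep=r'\s+')

# ravel() flattens row by row (C order), matching the concat over df.T
flat = pd.Series(df.to_numpy().ravel())
print(flat.tolist())  # [0, 1, 0, 1, 1, 1, 0, 0]
```

This avoids building one Series per row, which may matter for files with many lines.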
Answered by YOBA
This should be fine:
df1 = DataFrame({'test_set': test_set})
test_set is already a list, so you don't have to loop over it; you can pass it directly as a value to pandas.
print(df1)
test_set
0 1
1 0
2 0
3 0
4 0
5 0
6 1
7 1
8 1
9 0