Python: put multiple lists into a dataframe
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/30522724/
Take multiple lists into dataframe
Asked by jfalkson
How do I take multiple lists and put them as different columns in a Python dataframe? I tried this solution but had some trouble.
Attempt 1:
- Have three lists, and zip them together and use that
res = zip(lst1,lst2,lst3)
- Yields just one column
Attempt 2:
percentile_list = pd.DataFrame({'lst1Tite' : [lst1],
                                'lst2Tite' : [lst2],
                                'lst3Tite' : [lst3] },
                               columns=['lst1Tite','lst1Tite', 'lst1Tite'])
- Yields either 1 row by 3 columns (as written above) or, if I transpose it, 3 rows by 1 column
How do I get a 100 row (length of each independent list) by 3 column (three lists) pandas dataframe?
Accepted answer by maxymoo
I think you're almost there. Try removing the extra square brackets around the lst variables (also, you don't need to specify the column names when you're creating a dataframe from a dict like this):
import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
    {'lst1Title': lst1,
     'lst2Title': lst2,
     'lst3Title': lst3
    })
percentile_list
lst1Title lst2Title lst3Title
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
...
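To make the effect of the extra brackets concrete, here is a minimal sketch comparing the two constructions. The three short lists are illustrative stand-ins for the 100-element lists in the question.

```python
import pandas as pd

# Small illustrative lists (stand-ins for the 100-element lists above).
lst1 = [1, 2, 3]
lst2 = [4, 5, 6]
lst3 = [7, 8, 9]

# Wrapping each list in brackets makes the whole list a single cell,
# so the result has 1 row and 3 columns.
wrapped = pd.DataFrame({'lst1Title': [lst1],
                        'lst2Title': [lst2],
                        'lst3Title': [lst3]})

# Passing the lists directly gives one row per element: 3 rows, 3 columns.
flat = pd.DataFrame({'lst1Title': lst1,
                     'lst2Title': lst2,
                     'lst3Title': lst3})

print(wrapped.shape)
print(flat.shape)
```

Checking .shape like this is a quick way to confirm which of the two layouts you actually built.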
If you need a more performant solution you can use np.column_stack rather than zip as in your first attempt. This gave around a 2x speedup on the example here, but in my opinion it comes at a bit of a cost in readability:
import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]),
                               columns=['lst1Title', 'lst2Title', 'lst3Title'])
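One thing to keep in mind with np.column_stack is that it builds a single NumPy array first, so all columns end up sharing one dtype. The sketch below, using made-up integer and float lists, shows the integer column becoming float, whereas the dict construction keeps a separate dtype per column.

```python
import numpy as np
import pandas as pd

# Illustrative data: one list of ints, one list of floats.
ints = [1, 2, 3]
floats = [0.5, 1.5, 2.5]

# column_stack produces one 2-D array with a common dtype (float here),
# so both resulting columns are float.
stacked = pd.DataFrame(np.column_stack([ints, floats]),
                       columns=['ints', 'floats'])

# Building from a dict keeps each column's own dtype.
from_dict = pd.DataFrame({'ints': ints, 'floats': floats})

print(stacked.dtypes)
print(from_dict.dtypes)
```

If the lists in the example all hold the same type, as in the answer above, this difference does not show up.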
Answer by Aditya Guru
Just adding that, using the first approach, it can be done as:
pd.DataFrame(list(map(list, zip(lst1,lst2,lst3))))
Answer by Abhinav Gupta
Adding to Aditya Guru's answer here. There is no need to use map. You can do it simply by:
pd.DataFrame(list(zip(lst1, lst2, lst3)))
This will set the column names to 0, 1, 2. To set your own column names, you can pass the keyword argument columns to the method above.
pd.DataFrame(list(zip(lst1, lst2, lst3)),
             columns=['lst1_title', 'lst2_title', 'lst3_title'])
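A caveat with the zip approach: zip stops at the shortest input, so lists of unequal length are silently truncated. The sketch below uses two made-up lists to show this, and shows itertools.zip_longest from the standard library as one possible way to keep every element and leave missing values in the gaps.

```python
from itertools import zip_longest

import pandas as pd

# Illustrative lists of different lengths.
short = [1, 2]
long = [10, 20, 30]

# zip stops at the shortest list, so the third element of `long` is dropped.
truncated = pd.DataFrame(list(zip(short, long)),
                         columns=['short', 'long'])

# zip_longest pads the shorter list with None, which pandas treats as missing.
padded = pd.DataFrame(list(zip_longest(short, long)),
                      columns=['short', 'long'])

print(truncated)
print(padded)
```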
Answer by oopsi
Adding one more solution, which scales to any number of lists.
lists = [lst1, lst2, lst3, lst4]
df = pd.concat([pd.Series(x) for x in lists], axis=1)
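As a sketch of how this scales, the keys argument of pd.concat can supply the column names in the same call, and Series of unequal length are aligned on their index, with missing values filling the gaps. The list contents and the names in `names` are made up for illustration.

```python
import pandas as pd

# Illustrative lists; the last one is deliberately shorter.
lists = [[1, 2, 3], [4, 5, 6], [7, 8]]
names = ['a', 'b', 'c']  # hypothetical column names

# keys= labels each Series, which become the column names with axis=1.
df = pd.concat([pd.Series(x) for x in lists], axis=1, keys=names)

print(df)
```

Here column 'c' has a missing value in its last row rather than raising an error, which may or may not be what you want.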
Answer by Vivek Ananthan
Adding to the above answers, we can create the columns on the fly:
import pandas as pd
df = pd.DataFrame()
list1 = list(range(10))
list2 = list(range(10,20))
df['list1'] = list1
df['list2'] = list2
print(df)
Hope it helps!
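When adding columns one at a time, each new list has to match the length set by the first column. A minimal sketch with made-up values, showing the ValueError pandas raises on a mismatch:

```python
import pandas as pd

df = pd.DataFrame()
df['list1'] = [0, 1, 2]  # the first assignment fixes the index at 3 rows

error_message = None
try:
    df['list2'] = [10, 20]  # wrong length: 2 values for 3 rows
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

df['list2'] = [10, 20, 30]  # matching length works
print(df)
```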
Answer by dabru
@oopsi used pd.concat() but didn't include the column names. You could do the following, which, unlike the first solution in the accepted answer, gives you explicit control over the column order (it avoids dicts, which were unordered before Python 3.7):
import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
s1 = pd.Series(lst1, name='lst1Title')
s2 = pd.Series(lst2, name='lst2Title')
s3 = pd.Series(lst3, name='lst3Title')
percentile_list = pd.concat([s1,s2,s3], axis=1)
percentile_list
Out[2]:
lst1Title lst2Title lst3Title
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
7 7 7 7
8 8 8 8
...
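On Python 3.7 and later, dicts keep insertion order, and a reasonably recent pandas follows that order when building columns, so the dict approach can also give a predictable column order. The sketch below assumes such versions and uses made-up values; the columns argument can additionally be used to reorder the keys explicitly.

```python
import pandas as pd

# Keys are deliberately inserted out of alphabetical order.
data = {'lst3Title': [7, 8], 'lst1Title': [1, 2], 'lst2Title': [4, 5]}

# Assumes Python 3.7+ and a recent pandas: columns follow insertion order.
as_inserted = pd.DataFrame(data)

# columns= selects and orders the dict keys explicitly.
reordered = pd.DataFrame(data,
                         columns=['lst1Title', 'lst2Title', 'lst3Title'])

print(list(as_inserted.columns))
print(list(reordered.columns))
```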