Python: put multiple lists into a dataframe

Note: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/30522724/

Date: 2020-08-19 08:35:34  Source: igfitidea

Take multiple lists into dataframe

python numpy pandas

Asked by jfalkson

How do I take multiple lists and put them as different columns in a Python dataframe? I tried this solution but had some trouble.

Attempt 1:

  • I have three lists, zip them together, and use the result: res = zip(lst1,lst2,lst3)
  • This yields just one column

Attempt 2:

percentile_list = pd.DataFrame({'lst1Tite' : [lst1],
                                'lst2Tite' : [lst2],
                                'lst3Tite' : [lst3] }, 
                                columns=['lst1Tite','lst1Tite', 'lst1Tite'])
  • This yields either one row by 3 columns (as written above) or, if I transpose it, 3 rows by 1 column

How do I get a 100 row (length of each independent list) by 3 column (three lists) pandas dataframe?

Accepted answer by maxymoo

I think you're almost there. Try removing the extra square brackets around the lists (also, you don't need to specify the column names when you're creating a dataframe from a dict like this):

import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
    {'lst1Title': lst1,
     'lst2Title': lst2,
     'lst3Title': lst3
    })

percentile_list
    lst1Title  lst2Title  lst3Title
0          0         0         0
1          1         1         1
2          2         2         2
3          3         3         3
4          4         4         4
5          5         5         5
6          6         6         6
...
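
To see why the extra brackets matter, here is a small sketch comparing the two shapes. The three-element lists are hypothetical stand-ins for the 100-element lists in the question.

```python
import pandas as pd

# Hypothetical short lists, used only for illustration
lst1 = [1, 2, 3]
lst2 = [4, 5, 6]
lst3 = [7, 8, 9]

# Wrapping each list in brackets turns it into a single cell in a single row
wrapped = pd.DataFrame({'lst1Title': [lst1],
                        'lst2Title': [lst2],
                        'lst3Title': [lst3]})

# Passing the lists directly makes each element its own row
unwrapped = pd.DataFrame({'lst1Title': lst1,
                          'lst2Title': lst2,
                          'lst3Title': lst3})

print(wrapped.shape)    # (1, 3)
print(unwrapped.shape)  # (3, 3)
```

In the wrapped version, each cell holds an entire list object, which is why transposing it gives 3 rows and 1 column.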

If you need a more performant solution, you can use np.column_stack rather than zip as in your first attempt. This gives around a 2x speedup on the example here, though in my opinion it comes at a bit of a cost in readability:

import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]), 
                               columns=['lst1Title', 'lst2Title', 'lst3Title'])
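
One thing to keep in mind with np.column_stack is that it builds a single NumPy array first, so all columns end up sharing one dtype. The sketch below uses hypothetical lists of ints and floats to illustrate this; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical lists with different element types
ints = [1, 2, 3]
floats = [0.5, 1.5, 2.5]

# column_stack produces one 2-D array, so the ints are upcast to float
stacked = pd.DataFrame(np.column_stack([ints, floats]),
                       columns=['ints', 'floats'])

# The dict approach infers a dtype for each column separately
from_dict = pd.DataFrame({'ints': ints, 'floats': floats})

print(stacked.dtypes)
print(from_dict.dtypes)
```

If the lists hold different kinds of values, the dict-based approach may therefore be the safer choice.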

Answer by Aditya Guru

Just adding that, using the first approach, it can be done as:

pd.DataFrame(list(map(list, zip(lst1,lst2,lst3))))

Answer by Abhinav Gupta

Adding to Aditya Guru's answer here: there is no need to use map. You can do it simply with:

pd.DataFrame(list(zip(lst1, lst2, lst3)))

This will set the column names to 0, 1, 2. To set your own column names, you can pass the keyword argument columns to the call above.

pd.DataFrame(list(zip(lst1, lst2, lst3)),
              columns=['lst1_title','lst2_title', 'lst3_title'])

Answer by oopsi

Adding one more solution, which scales to any number of lists:

lists = [lst1, lst2, lst3, lst4]
df = pd.concat([pd.Series(x) for x in lists], axis=1)
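
One practical difference from the zip-based answers shows up when the lists have unequal lengths. The sketch below uses two hypothetical lists to illustrate it: zip stops at the shortest list, while pd.concat pads the missing positions with NaN.

```python
import pandas as pd

# Hypothetical lists of unequal length
longer = [10, 20, 30]
shorter = [1, 2]

# zip truncates to the shortest input, silently dropping the value 30
zipped = pd.DataFrame(list(zip(longer, shorter)))

# concat aligns the Series on their index and fills the gap with NaN
concatenated = pd.concat([pd.Series(x) for x in [longer, shorter]], axis=1)

print(zipped.shape)        # (2, 2)
print(concatenated.shape)  # (3, 2)
```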

Answer by Vivek Ananthan

Adding to the above answers, we can also build the dataframe on the fly, one column at a time:

import pandas as pd

# Start from an empty dataframe and add one column per list
df = pd.DataFrame()
list1 = list(range(10))
list2 = list(range(10, 20))
df['list1'] = list1
df['list2'] = list2
print(df)

Hope it helps!
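
A caveat with the column-by-column approach, sketched with hypothetical data: once the first column fixes the number of rows, assigning a list of a different length raises a ValueError rather than padding or truncating.

```python
import pandas as pd

df = pd.DataFrame()
df['list1'] = [0, 1, 2]  # the first column sets the index to 3 rows

error_message = None
try:
    # This list has only 2 elements, so it cannot fill 3 rows
    df['list2'] = [10, 20]
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(list(df.columns))  # only 'list1' was added
```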

Answer by dabru

@oopsi used pd.concat() but didn't include the column names. You could do the following, which, unlike the first solution in the accepted answer, gives you explicit control over the column order (it avoids dicts, which were unordered before Python 3.7):

import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)

s1 = pd.Series(lst1, name='lst1Title')
s2 = pd.Series(lst2, name='lst2Title')
s3 = pd.Series(lst3, name='lst3Title')
percentile_list = pd.concat([s1, s2, s3], axis=1)

percentile_list
Out[2]: 
    lst1Title  lst2Title  lst3Title
0           0          0          0
1           1          1          1
2           2          2          2
3           3          3          3
4           4          4          4
5           5          5          5
6           6          6          6
7           7          7          7
8           8          8          8
...
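
As a possible variation that combines this with @oopsi's list comprehension, the keys argument of pd.concat can supply the column names in one step. This is a minimal sketch; the lists and names variables are introduced here for illustration.

```python
import pandas as pd

# Illustrative inputs: the lists and the column names, in the desired order
lists = [range(100), range(100), range(100)]
names = ['lst1Title', 'lst2Title', 'lst3Title']

# With axis=1, each key becomes the name of the corresponding column
percentile_list = pd.concat([pd.Series(x) for x in lists], axis=1, keys=names)

print(percentile_list.shape)          # (100, 3)
print(list(percentile_list.columns))  # ['lst1Title', 'lst2Title', 'lst3Title']
```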