Is pandas.DataFrame.columns.values.tolist() the same as pandas.DataFrame.columns.tolist()

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/45152534/

Date: 2020-09-14 04:01:40  Source: igfitidea

Tags: python, pandas

Asked by jxramos

Both of the following forms show up in our codebase:

pandas.DataFrame.columns.values.tolist()
pandas.DataFrame.columns.tolist()

Are these always identical? I'm not sure why the values variant pops up where it does; the direct columns.tolist() seems to be all that's needed to get the column names. If so, I'd like to clean up the code a bit.

A bit of introspection suggests that values is just an implementation detail, namely the underlying numpy.ndarray:

>>> import pandas
>>> d = pandas.DataFrame( { 'a' : [1,2,3], 'b' : [0,1,3]} )
>>> d
   a  b
0  1  0
1  2  1
2  3  3
>>> type(d.columns)
<class 'pandas.core.indexes.base.Index'>
>>> type(d.columns.values)
<class 'numpy.ndarray'>
>>> type(d.columns.tolist())
<class 'list'>
>>> type(d.columns.values.tolist())
<class 'list'>
>>> d.columns.values
array(['a', 'b'], dtype=object)
>>> d.columns.values.tolist()
['a', 'b']
>>> d.columns
Index(['a', 'b'], dtype='object')
>>> d.columns.tolist()
['a', 'b']
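
The same comparison can be written as a short script rather than a REPL session. This is only a minimal sketch for the two-column frame from the question; the variable names via_values and via_index are illustrative, and the check says nothing about other index types.

```python
import pandas as pd

d = pd.DataFrame({"a": [1, 2, 3], "b": [0, 1, 3]})

via_values = d.columns.values.tolist()  # Index -> underlying array -> list
via_index = d.columns.tolist()          # Index -> list

# For plain string labels, both calls should give equal lists of Python str.
print(via_values == via_index)
print([type(x).__name__ for x in via_index])
```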

Answered by jezrael

The output is the same, but for a really big df the timings differ:

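import numpy as np
import pandas as pd

np.random.seed(23)
# 5 rows x 10000 columns, so the column index is large
df = pd.DataFrame(np.random.randint(3, size=(5, 10000)))
df.columns = df.columns.astype(str)  # use string column labels
print(df)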

In [90]: %timeit df.columns.values.tolist()
10000 loops, best of 3: 79.5 μs per loop

In [91]: %timeit df.columns.tolist()
10000 loops, best of 3: 173 μs per loop
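
Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The loop counts and the candidates and results dictionaries here are assumptions chosen for illustration, and the absolute numbers will vary with the machine and with the pandas and NumPy versions, so no expected timings are shown.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(23)
df = pd.DataFrame(np.random.randint(3, size=(5, 10000)))
df.columns = df.columns.astype(str)

# The two expressions being compared, wrapped so timeit can call them.
candidates = {
    "df.columns.values.tolist()": lambda: df.columns.values.tolist(),
    "df.columns.tolist()": lambda: df.columns.tolist(),
}

results = {}
for label, func in candidates.items():
    # Best of 3 repeats of 1000 calls each, reported per call in microseconds.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    results[label] = best / 1000 * 1e6
    print(f"{label}: {results[label]:.1f} µs per call")
```

Taking the minimum over repeats mirrors the "best of 3" figure that %timeit reports above.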

They also use different functions:

Index.values with numpy.ndarray.tolist

Index.tolist
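
Because the two paths go through different conversion code, their element types need not match for every kind of index, even though they agree for the string labels in the question. The sketch below uses a hypothetical frame whose column labels are dates: Index.tolist() returns pandas Timestamp scalars, while the NumPy path returns whatever numpy.ndarray.tolist produces for a datetime64 array, which depends on the datetime resolution. The printed type names are therefore left unstated.

```python
import pandas as pd

# Hypothetical frame whose column labels are dates rather than strings.
cols = pd.to_datetime(["2020-01-01", "2020-01-02"])
df_dates = pd.DataFrame([[1, 2]], columns=cols)

from_index = df_dates.columns.tolist()          # pandas scalars (Timestamp)
from_values = df_dates.columns.values.tolist()  # NumPy's own conversion

print([type(x).__name__ for x in from_index])
print([type(x).__name__ for x in from_values])
```

If downstream code relies on the labels being Timestamp objects, columns.tolist() is the variant that preserves them.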

Thanks to Mitch for another solution:

In [93]: %timeit list(df.columns.values)
1000 loops, best of 3: 169 μs per loop

Answered by YOBEN_S

d = pandas.DataFrame( { 'a' : [1,2,3], 'b' : [0,1,3]} )

Or you can simply do:

list(d)  # same as d.columns.tolist()
Out[327]: ['a', 'b']

# Timing
%timeit list(df)  # on my machine, this was the slowest of the options
10000 loops, best of 3: 135 μs per loop
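
The reason list(d) works is that iterating over a DataFrame yields its column labels rather than its rows. A minimal sketch using the small frame from the question (the name labels is illustrative):

```python
import pandas as pd

d = pd.DataFrame({"a": [1, 2, 3], "b": [0, 1, 3]})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [name for name in d]
print(labels)

# list(d) consumes the same iterator, so it matches d.columns.tolist() here.
print(list(d) == d.columns.tolist())
```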