Is pandas.DataFrame.columns.values.tolist() the same as pandas.DataFrame.columns.tolist()
Source: http://stackoverflow.com/questions/45152534/
License: this content comes from Stack Overflow and is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors.
Asked by jxramos
Both of these forms show up in our codebase:
pandas.DataFrame.columns.values.tolist()
pandas.DataFrame.columns.tolist()
Are these always identical? I'm not sure why the values variant shows up where it does; the direct columns.tolist() seems to be all that's needed to get the column names. If that's the case, I'd like to clean up the code a bit.
A bit of introspection suggests that values is just an implementation detail, namely a numpy.ndarray:
>>> import pandas
>>> d = pandas.DataFrame( { 'a' : [1,2,3], 'b' : [0,1,3]} )
>>> d
   a  b
0  1  0
1  2  1
2  3  3
>>> type(d.columns)
<class 'pandas.core.indexes.base.Index'>
>>> type(d.columns.values)
<class 'numpy.ndarray'>
>>> type(d.columns.tolist())
<class 'list'>
>>> type(d.columns.values.tolist())
<class 'list'>
>>> d.columns.values
array(['a', 'b'], dtype=object)
>>> d.columns.values.tolist()
['a', 'b']
>>> d.columns
Index(['a', 'b'], dtype='object')
>>> d.columns.tolist()
['a', 'b']
Answered by jezrael
The output is the same, but for a really big df the timings differ:
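import numpy as np
import pandas as pd

np.random.seed(23)
# 5 rows by 10000 columns, with the column labels converted to strings
df = pd.DataFrame(np.random.randint(3, size=(5, 10000)))
df.columns = df.columns.astype(str)
print(df)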
In [90]: %timeit df.columns.values.tolist()
10000 loops, best of 3: 79.5 μs per loop
In [91]: %timeit df.columns.tolist()
10000 loops, best of 3: 173 μs per loop
The two calls also go through different functions: the first uses Index.values followed by numpy.ndarray.tolist, while the second calls Index.tolist directly.
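Because the two calls go through different functions, they are not guaranteed to agree for every kind of column label. A minimal sketch, assuming a hypothetical frame (df_dates, not from the question) whose column labels are dates: Index.tolist returns pandas Timestamp objects, whereas the NumPy route returns whatever numpy.ndarray.tolist produces for a datetime64 array, which is not a Timestamp. The exact element type on the NumPy side depends on the datetime resolution, so it is only printed here, not asserted.

```python
import pandas as pd

# Hypothetical example frame: column labels are dates rather than strings.
dates = pd.to_datetime(["2020-01-01", "2020-01-02"])
df_dates = pd.DataFrame([[1, 2]], columns=dates)

# Route 1: pandas converts each label to a pandas scalar.
via_index = df_dates.columns.tolist()

# Route 2: the labels are first exposed as a NumPy datetime64 array,
# then converted by numpy.ndarray.tolist.
via_values = df_dates.columns.values.tolist()

print(type(via_index[0]))   # a pandas Timestamp
print(type(via_values[0]))  # a NumPy-converted value, not a Timestamp
```

For plain string labels, as in the question, both routes give the same list, so the difference only matters if the codebase has frames with datetime-like column labels.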
Thanks to Mitch for another solution:
In [93]: %timeit list(df.columns.values)
1000 loops, best of 3: 169 μs per loop
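The %timeit lines above only work inside IPython. To reproduce the comparison in a plain script, something like the following sketch could be used. It relies on the standard-library timeit module; the helper best_time_us and the candidates dictionary are illustrative names, not part of the original answer, and the numbers will vary with machine and with pandas and NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

# Same shape as the frame in the answer: 5 rows, 10000 string-labelled columns.
np.random.seed(23)
df = pd.DataFrame(np.random.randint(3, size=(5, 10000)))
df.columns = df.columns.astype(str)

# The three expressions timed in the answer, wrapped as callables.
candidates = {
    "df.columns.values.tolist()": lambda: df.columns.values.tolist(),
    "df.columns.tolist()": lambda: df.columns.tolist(),
    "list(df.columns.values)": lambda: list(df.columns.values),
}


def best_time_us(func, number=200, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    runs = timeit.repeat(func, number=number, repeat=repeat)
    return min(runs) / number * 1e6


timings = {name: best_time_us(func) for name, func in candidates.items()}
for name, micros in timings.items():
    print(f"{name:<30} {micros:8.1f} us per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reported in the output above.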
Answered by YOBEN_S
d = pandas.DataFrame( { 'a' : [1,2,3], 'b' : [0,1,3]} )
Or you can simply do:
list(d)  # same result as d.columns.tolist()
Out[327]: ['a', 'b']
# Timing
%timeit list(df)  # when I ran the timings, this was the slowest option on my machine
10000 loops, best of 3: 135 μs per loop
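The reason list(d) works is that iterating over a DataFrame yields its column labels. A small sketch using the frame from the question (the variable names labels_from_iter, labels_from_list and labels_from_index are only for illustration):

```python
import pandas as pd

d = pd.DataFrame({"a": [1, 2, 3], "b": [0, 1, 3]})

# Iterating a DataFrame yields its column labels, so list(d) collects them.
labels_from_iter = [label for label in d]
labels_from_list = list(d)
labels_from_index = d.columns.tolist()

print(labels_from_iter, labels_from_list, labels_from_index)
```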