Python: How to select columns and rows in pandas without column or row names?
Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39158699/
Asked by Eka
I have a pandas DataFrame (df) like this:
                         Close      Close      Close  Close        Close
Date
2000-01-03 00:00:00        NaN        NaN        NaN    NaN    -0.033944
2000-01-04 00:00:00        NaN        NaN        NaN    NaN    0.0351366
2000-01-05 00:00:00  -0.033944        NaN        NaN    NaN   -0.0172414
2000-01-06 00:00:00  0.0351366  -0.033944        NaN    NaN  -0.00438596
2000-01-07 00:00:00 -0.0172414  0.0351366  -0.033944    NaN    0.0396476
In R, if I want to select the fifth column:
five=df[,5]
and everything except the fifth column:
rest=df[,-5]
How can I do similar operations with a pandas DataFrame?
I tried this in pandas:
five=df.ix[,5]
but it gives this error:
File "", line 1
df.ix[,5]
^
SyntaxError: invalid syntax
Accepted answer by Hanshan
If you want the fifth column:
df.ix[:,4]
The colon in the first position takes all the rows for that column.
To exclude the fifth column, you could try:
df.ix[:, (x for x in range(0, len(df.columns)) if x != 4)]
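Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the two .ix snippets above fail on current versions. A minimal sketch of the same two selections with .iloc follows; the small frame and its values are made up for illustration and are not the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: five columns, all named "Close".
df = pd.DataFrame(np.arange(15, dtype=float).reshape(3, 5), columns=["Close"] * 5)

# Fifth column (position 4), all rows. A scalar position returns a Series.
five = df.iloc[:, 4]

# All column positions except 4, built by deleting that entry from 0..n-1.
keep = np.delete(np.arange(df.shape[1]), 4)
rest = df.iloc[:, keep]
```

Positions are zero-based, so R's column 5 is position 4 here.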
Answer by piRSquared
Use iloc. It is explicitly a position-based indexer. ix can be both label-based and position-based, and will get confused if an index is integer based.
df.iloc[:, [4]]
For all but the fifth column:
slc = list(range(df.shape[1]))
slc.remove(4)
df.iloc[:, slc]
or equivalently:
df.iloc[:, [i for i in range(df.shape[1]) if i != 4]]
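Another way to express the same exclusion, sketched here on a made-up frame, is a boolean mask over column positions, which iloc also accepts. The sketch also shows that a negative position in pandas counts from the end instead of excluding a column, unlike R's df[,-5].

```python
import numpy as np
import pandas as pd

# Hypothetical frame with five identically named columns, as in the question.
df = pd.DataFrame(np.arange(15, dtype=float).reshape(3, 5), columns=["Close"] * 5)

# True for every column position except 4.
mask = np.arange(df.shape[1]) != 4
rest = df.iloc[:, mask]

# Negative positions count from the end: -5 is the first of five columns,
# not "everything but the fifth" as in R.
fifth_from_end = df.iloc[:, -5]
```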
Answer by Nehal J Wani
To select a column by its integer position:
In [19]: df
Out[19]:
                 Date      Close    Close.1    Close.2    Close.3    Close.4
0  2000-01-0300:00:00        NaN        NaN        NaN        NaN  -0.033944
1  2000-01-0400:00:00        NaN        NaN        NaN        NaN   0.035137
2  2000-01-0500:00:00  -0.033944        NaN        NaN        NaN  -0.017241
3  2000-01-0600:00:00   0.035137  -0.033944        NaN        NaN  -0.004386
4  2000-01-0700:00:00  -0.017241   0.035137  -0.033944        NaN   0.039648
In [20]: df.ix[:, 5]
Out[20]:
0 -0.033944
1 0.035137
2 -0.017241
3 -0.004386
4 0.039648
Name: Close.4, dtype: float64
In [21]: df.icol(5)
/usr/bin/ipython:1: FutureWarning: icol(i) is deprecated. Please use .iloc[:,i]
#!/usr/bin/python2
Out[21]:
0 -0.033944
1 0.035137
2 -0.017241
3 -0.004386
4 0.039648
Name: Close.4, dtype: float64
In [22]: df.iloc[:, 5]
Out[22]:
0 -0.033944
1 0.035137
2 -0.017241
3 -0.004386
4 0.039648
Name: Close.4, dtype: float64
To select all columns except the one at a given position:
In [29]: df[[df.columns[i] for i in range(len(df.columns)) if i != 5]]
Out[29]:
                 Date      Close    Close.1    Close.2    Close.3
0  2000-01-0300:00:00        NaN        NaN        NaN        NaN
1  2000-01-0400:00:00        NaN        NaN        NaN        NaN
2  2000-01-0500:00:00  -0.033944        NaN        NaN        NaN
3  2000-01-0600:00:00   0.035137  -0.033944        NaN        NaN
4  2000-01-0700:00:00  -0.017241   0.035137  -0.033944        NaN
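The label-based selection above works in this session because the column names are unique (Close, Close.1, ...). If the names repeat, as they appear to in the question's frame, a label matches every column that carries it. A small sketch with made-up data, assuming three columns all named "Close":

```python
import pandas as pd

# Hypothetical frame whose three columns share one name.
dup = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["Close", "Close", "Close"])

# Selecting by label returns every column with that name.
by_label = dup["Close"]

# Dropping by label removes every column with that name.
dropped = dup.drop(columns="Close")

# Selecting by position is unambiguous: keep positions 0 and 1, skip 2.
by_position = dup.iloc[:, [i for i in range(dup.shape[1]) if i != 2]]
```

With repeated names, positional indexing through iloc is probably the safer choice.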
Answer by theshubhagrwl
If your DataFrame does not have column/row labels and you want to select specific columns, use the iloc indexer.
For example, if you want to select the first column and all rows:
df = dataset.iloc[:,0]
Here the df variable will contain the values stored in the first column of your DataFrame.
Do remember that the result is a Series rather than a DataFrame:
type(df) -> pandas.core.series.Series
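As a rough illustration of that distinction, the sketch below (using a made-up, unlabeled dataset) shows that a scalar position yields a Series while a list of positions keeps the DataFrame shape. The same rules apply when selecting rows by position.

```python
import pandas as pd

# Hypothetical unlabeled data: default integer row and column labels.
dataset = pd.DataFrame([[1, 2], [3, 4], [5, 6]])

as_series = dataset.iloc[:, 0]     # scalar position: first column as a Series
as_frame = dataset.iloc[:, [0]]    # list of positions: one-column DataFrame

first_row = dataset.iloc[0]        # first row as a Series
some_rows = dataset.iloc[[0, 2]]   # rows at positions 0 and 2
block = dataset.iloc[1:3, 0:1]     # rows 1-2, column 0 (slice ends are exclusive)
```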
Hope it helps.