Randomly selecting columns from a DataFrame in pandas
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/45568427/
Question by ewolsen
My question is quite simple: is there any way to randomly choose columns from a DataFrame in pandas? To be clear, I want to randomly pick n columns together with their values. I know there is such a method for randomly picking rows:
import pandas as pd
df = pd.read_csv(filename, sep=',', nrows=None)
a = df.sample(n=2)  # two randomly chosen rows
So the question is: is there an equivalent method for selecting random columns?
Answer by ayhan
sample also accepts an axis parameter:
import numpy as np
df = pd.DataFrame(np.random.randint(1, 10, (10, 5)), columns=list('abcde'))
df
Out:
a b c d e
0 4 5 9 8 3
1 7 2 2 8 7
2 1 5 7 9 2
3 3 3 5 2 4
4 8 4 9 8 6
5 6 5 7 3 4
6 6 3 6 4 4
7 9 4 7 7 3
8 4 4 8 7 6
9 5 6 7 6 9
df.sample(2, axis=1)
Out:
a d
0 4 8
1 7 8
2 1 9
3 3 2
4 8 8
5 6 3
6 6 4
7 9 7
8 4 7
9 5 6
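If the draw needs to be repeatable, sample also takes a random_state argument. A minimal sketch, assuming a small made-up frame (the seed values 0 and 42 are arbitrary choices for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical 10x5 frame of random integers, seeded so the data is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 10, size=(10, 5)), columns=list("abcde"))

# axis=1 (or axis="columns") samples columns instead of rows;
# random_state fixes which columns are drawn.
picked = df.sample(n=2, axis=1, random_state=42)
again = df.sample(n=2, axis=1, random_state=42)

print(picked.columns.tolist())
print(picked.equals(again))
```

By default sample draws without replacement, so the two selected columns are always distinct.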
Answer by EdChum
You can just call df.columns.to_series().sample(n=2) to randomly sample the columns: first convert the column index to a Series by calling to_series(), then call sample as before.
In[24]:
df.columns.to_series().sample(2)
Out[24]:
C C
A A
dtype: object
Example:
In[30]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df
Out[30]:
a b c
0 -0.691534 0.889799 1.137438
1 -0.949422 0.799294 1.360521
2 0.974746 -1.231078 0.812712
3 1.043434 0.982587 0.352927
4 0.462011 -0.591438 -0.214508
In[31]:
df[df.columns.to_series().sample(2)]
Out[31]:
b a
0 0.889799 -0.691534
1 0.799294 -0.949422
2 -1.231078 0.974746
3 0.982587 1.043434
4 -0.591438 0.462011
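The two steps in this answer can be wrapped in a small helper. This is only a sketch: the name sample_columns and its seed parameter are illustrative and not part of pandas.

```python
import numpy as np
import pandas as pd

def sample_columns(df, n, seed=None):
    """Return n randomly chosen columns of df (hypothetical helper)."""
    # sample is a Series/DataFrame method, so turn the column labels into a Series first.
    names = df.columns.to_series().sample(n=n, random_state=seed)
    # Index the frame with the list of sampled labels.
    return df[names.tolist()]

df = pd.DataFrame(np.random.randn(5, 3), columns=list("abc"))
subset = sample_columns(df, 2, seed=0)
print(subset.columns.tolist())
```

The result is the same kind of object that df.sample(2, axis=1) returns; going through to_series is mainly useful when you want the sampled labels themselves.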