Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original address, and attribute it to the original authors (not me). Original Stack Overflow address: http://stackoverflow.com/questions/39835021/

Date: 2020-09-14 02:07:49  Source: igfitidea

Pandas random sample with remove

Tags: python, pandas

Asked by RockJake28

I'm aware of DataFrame.sample(), but how can I do this and also remove the sample from the dataset? (Note: AFAIK this has nothing to do with sampling with replacement)

For example, here is the essence of what I want to achieve (this does not actually work):

len(df) # 1000

df_subset = df.sample(300)
len(df_subset) # 300

df = df.remove(df_subset)
len(df) # 700

Answered by piRSquared

If your index is unique

df = df.drop(df_subset.index)
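The uniqueness condition matters because `drop` removes rows by index label, not by position. The following is a minimal sketch of what can go wrong with duplicate labels, and one possible workaround using `reset_index(drop=True)`. The data, the column name `value`, and the variable names are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical data: the index label 0 appears twice.
df_dup = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[0, 0, 1, 2])

# Pick the first row by position so the example is deterministic.
subset_dup = df_dup.iloc[[0]]

# drop() matches on labels, so BOTH rows labelled 0 are removed.
remaining_dup = df_dup.drop(subset_dup.index)
print(len(remaining_dup))  # 2, although only one row was selected

# One possible workaround: rebuild a unique index before sampling.
df_unique = df_dup.reset_index(drop=True)
subset_unique = df_unique.sample(1, random_state=0)
remaining_unique = df_unique.drop(subset_unique.index)
print(len(remaining_unique))  # 3
```

Note that `reset_index(drop=True)` discards the original labels; if they carry meaning, keeping them as a column (`reset_index()` without `drop=True`) may be preferable.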


example

df = pd.DataFrame(np.arange(10).reshape(-1, 2))


sample

df_subset = df.sample(2)
df_subset

drop

df.drop(df_subset.index)
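Put together, the steps above can be run as a single script. This is a sketch rather than the answer's exact code: the imports and the `random_state=0` argument are additions for reproducibility. It also makes explicit that `drop` returns a new DataFrame, so the result has to be assigned back to `df` for the sampled rows to actually disappear from it.

```python
import numpy as np
import pandas as pd

# 5 rows, 2 columns, default unique RangeIndex (0..4).
df = pd.DataFrame(np.arange(10).reshape(-1, 2))

# random_state is an assumption added here so the run is repeatable.
df_subset = df.sample(2, random_state=0)

# drop() is not in-place by default; reassign to keep the result.
df = df.drop(df_subset.index)

print(len(df_subset))  # 2
print(len(df))         # 3
```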

Answered by SerialDev

pandas random sample:

train = df.sample(frac=0.8, random_state=200)
test = df.drop(train.index)
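As a quick sanity check of this split, the sketch below builds a hypothetical 1000-row frame (the column name `x` and the data are assumptions) and confirms that the two parts do not overlap and together cover the original. It relies on the index being unique, as noted in the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a default, unique RangeIndex.
df = pd.DataFrame({"x": np.arange(1000)})

# 80% of the rows, sampled without replacement; a fixed seed makes it repeatable.
train = df.sample(frac=0.8, random_state=200)
# Everything whose index label was not sampled.
test = df.drop(train.index)

print(len(train), len(test))  # 800 200

# The two parts share no index labels and together account for every row.
assert train.index.intersection(test.index).empty
assert len(train) + len(test) == len(df)
```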