pandas: adding one to all the values in a dataframe
Original question: http://stackoverflow.com/questions/30794525/
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors on Stack Overflow (not to this page).
Asked by kheston Walkins
I have a dataframe like the one below, and I would like to add 1 to every value in each row. I am new to this forum and to Python, so I can't work out how to do this. I intend to use Bayesian probability, and with the zeros as they are the posterior probability will be 0 when I multiply the values together. PS: I am also new to probability, but others have applied the same method. I am using pandas to do this. Thanks in advance for your help.
Disease Gene1 Gene2 Gene3 Gene4
D1 0 0 25 0
D2 0 0 0 0
D3 0 17 0 16
D4 24 0 0 0
D5 0 0 0 0
D6 0 32 0 11
D7 0 0 0 0
D8 4 0 0 0
Accepted answer by EdChum
You can select the columns of the df whose underlying dtype is not 'object':
In [110]:
numeric_cols = [col for col in df if df[col].dtype.kind != 'O']
numeric_cols
Out[110]:
['Gene1', 'Gene2', 'Gene3', 'Gene4']
In [111]:
df[numeric_cols] += 1
df
Out[111]:
Disease Gene1 Gene2 Gene3 Gene4
0 D1 1 1 26 1
1 D2 1 1 1 1
2 D3 1 18 1 17
3 D4 25 1 1 1
4 D5 1 1 1 1
5 D6 1 33 1 12
6 D7 1 1 1 1
7 D8 5 1 1 1
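As a possible alternative to checking dtype.kind by hand, select_dtypes can pick out the numeric columns. This is only a minimal sketch, assuming a frame shaped like the one in the question; the three rows below are re-typed for illustration.

```python
import pandas as pd

# Small frame shaped like the one in the question (first three rows only).
df = pd.DataFrame({
    "Disease": ["D1", "D2", "D3"],
    "Gene1": [0, 0, 0],
    "Gene2": [0, 0, 17],
})

# Names of the columns that hold numbers; the string column is skipped.
numeric_cols = df.select_dtypes(include="number").columns

# Add one to the numeric columns only.
df[numeric_cols] += 1
print(df)
```

The Disease column is left untouched because it is not selected.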
EDIT
It looks like your df possibly has strings instead of numeric types. You can convert the dtype to numeric using convert_objects (note that this method was later deprecated and is no longer available in current pandas releases, where pd.to_numeric is the replacement for numeric conversion):
df = df.convert_objects(convert_numeric=True)
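On current pandas versions, where convert_objects is no longer available, a similar conversion can be sketched with pd.to_numeric. The frame and the gene_cols list below are hypothetical stand-ins for data whose counts were read in as strings; this is not a drop-in equivalent of the call above.

```python
import pandas as pd

# Hypothetical frame whose counts were read in as strings.
df = pd.DataFrame({
    "Disease": ["D1", "D2", "D3"],
    "Gene1": ["0", "0", "24"],
    "Gene2": ["0", "17", "0"],
})

gene_cols = ["Gene1", "Gene2"]  # assumed names of the count columns

# Convert each count column; values that cannot be parsed become NaN.
df[gene_cols] = df[gene_cols].apply(pd.to_numeric, errors="coerce")

# The columns are numeric now, so adding one works.
df[gene_cols] += 1
print(df.dtypes)
```

Listing the count columns explicitly keeps the Disease labels from being coerced to NaN.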
Answer by firelynx
With this being your dataframe:
import pandas as pd

df = pd.DataFrame({
    "Disease": ["D{}".format(i) for i in range(1, 9)],
    "Gene1": [0, 0, 0, 24, 0, 0, 0, 4],
    "Gene2": [0, 0, 17, 0, 0, 32, 0, 0],
    "Gene3": [25, 0, 0, 0, 0, 0, 0, 0],
    "Gene4": [0, 0, 16, 0, 0, 11, 0, 0]})
Disease Gene1 Gene2 Gene3 Gene4
0 D1 0 0 25 0
1 D2 0 0 0 0
2 D3 0 17 0 16
3 D4 24 0 0 0
4 D5 0 0 0 0
5 D6 0 32 0 11
6 D7 0 0 0 0
7 D8 4 0 0 0
The easiest way to do this would be:
df += 1
However, since you have a string column (the Disease column), this will not work.
But we can conveniently set the Disease column to be the index, like this:
df.set_index('Disease', inplace=True)
Now your dataframe looks like this:
Gene1 Gene2 Gene3 Gene4
Disease
D1 0 0 25 0
D2 0 0 0 0
D3 0 17 0 16
D4 24 0 0 0
D5 0 0 0 0
D6 0 32 0 11
D7 0 0 0 0
D8 4 0 0 0
And if we do df += 1 now, we get:
Gene1 Gene2 Gene3 Gene4
Disease
D1 1 1 26 1
D2 1 1 1 1
D3 1 18 1 17
D4 25 1 1 1
D5 1 1 1 1
D6 1 33 1 12
D7 1 1 1 1
D8 5 1 1 1
This works because the addition only acts on the data columns, not on the index.
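If Disease should end up as an ordinary column again, the same idea can be written as one chained expression that does not modify the original frame. A minimal sketch, assuming the column names from the question and only three re-typed rows:

```python
import pandas as pd

df = pd.DataFrame({
    "Disease": ["D1", "D2", "D3"],
    "Gene1": [0, 0, 0],
    "Gene2": [0, 0, 17],
})

# Move Disease into the index, add one to the remaining columns,
# then move Disease back out as an ordinary column.
result = df.set_index("Disease").add(1).reset_index()
print(result)
```

Because nothing here is done in place, df keeps its original counts and result holds the incremented ones.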
You can also do this on a per-column basis, like this:
df.Gene1 = df.Gene1 + 1
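The per-column form extends naturally to several columns with a loop. The sketch below uses bracket access rather than attribute access, which is an assumption about preference rather than a requirement; the two-row frame is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Disease": ["D1", "D2"],
    "Gene1": [0, 24],
    "Gene2": [17, 0],
})

# Bracket access works for any column name, including names with spaces.
for col in ["Gene1", "Gene2"]:
    df[col] = df[col] + 1
print(df)
```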