pandas, multiply all the numeric values in the data frame by a constant
Source: http://stackoverflow.com/questions/38543263/
License: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.
Asked by CentAu
How can I multiply all the numeric values in a data frame by a constant without specifying the column names explicitly? Example:
In [13]: df = pd.DataFrame({'col1': ['A','B','C'], 'col2':[1,2,3], 'col3': [30, 10,20]})
In [14]: df
Out[14]:
  col1  col2  col3
0    A     1    30
1    B     2    10
2    C     3    20
I tried df.multiply, but it also affects the string values, repeating each one several times:
In [15]: df.multiply(3)
Out[15]:
  col1  col2  col3
0  AAA     3    90
1  BBB     6    30
2  CCC     9    60
Is there a way to multiply only the numeric values by a constant while leaving the string values intact?
Answered by MaxU
You can use select_dtypes(), either including the number dtype or excluding all columns of the object and datetime64 dtypes.
Demo:
In [162]: df
Out[162]:
  col1  col2  col3       date
0    A     1    30 2016-01-01
1    B     2    10 2016-01-02
2    C     3    20 2016-01-03
In [163]: df.dtypes
Out[163]:
col1            object
col2             int64
col3             int64
date    datetime64[ns]
dtype: object
In [164]: df.select_dtypes(exclude=['object', 'datetime']) * 3
Out[164]:
   col2  col3
0     3    90
1     6    30
2     9    60
Or a much better solution (credit: ayhan), which updates the numeric columns in place:
df[df.select_dtypes(include=['number']).columns] *= 3
From the docs:
To select all numeric types use the numpy dtype numpy.number
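As a self-contained sketch of the in-place approach, the snippet below rebuilds the demo frame (including the date column) and multiplies only the columns that select_dtypes reports as numeric. The variable name numeric_cols is illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the demo frame: one string column, two integer columns, one datetime column.
df = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [1, 2, 3],
    "col3": [30, 10, 20],
    "date": pd.date_range("2016-01-01", periods=3),
})

# Column labels whose dtype is numeric; strings and datetimes are left out.
numeric_cols = df.select_dtypes(include=["number"]).columns

# Assign the scaled values back so the other columns stay untouched.
df[numeric_cols] = df[numeric_cols] * 3
```

After this runs, col2 and col3 are tripled while col1 and date keep their original values.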
Answered by Jossie Calderon
The other answer shows how to multiply only the numeric columns. Here's how to write the result back into the original data frame:
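import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['A','B','C'], 'col2':[1,2,3], 'col3': [30, 10,20]})
# Multiply only the numeric columns, then assign them back by column name.
s = df.select_dtypes(include=[np.number])*3
df[s.columns] = s
print(df)
  col1  col2  col3
0    A     3    90
1    B     6    30
2    C     9    60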
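If the original frame should stay unchanged, the same idea can be wrapped in a small function that works on a copy. This is only a sketch: scale_numeric is a hypothetical helper name, not something pandas provides.

```python
import numpy as np
import pandas as pd


def scale_numeric(frame, factor):
    """Return a copy of frame with every numeric column multiplied by factor."""
    out = frame.copy()
    cols = out.select_dtypes(include=[np.number]).columns
    out[cols] = out[cols] * factor
    return out


df = pd.DataFrame({"col1": ["A", "B", "C"], "col2": [1, 2, 3], "col3": [30, 10, 20]})
scaled = scale_numeric(df, 3)
```

Here df keeps its original values, and scaled holds the multiplied numbers next to the untouched strings.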
Answered by Divakar
One way would be to get the dtypes, match them against the object and datetime dtypes, and exclude those columns with a mask, like so:
# .loc is used here in place of .ix, which was removed in pandas 1.0
df.loc[:,~np.in1d(df.dtypes,['object','datetime'])] *= 3
Sample run:
In [273]: df
Out[273]:
  col1  col2  col3
0    A     1    30
1    B     2    10
2    C     3    20
In [274]: df.loc[:,~np.in1d(df.dtypes,['object','datetime'])] *= 3
In [275]: df
Out[275]:
  col1  col2  col3
0    A     3    90
1    B     6    30
2    C     9    60
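The dtype-mask idea can also be sketched with the helpers in pandas.api.types instead of comparing dtype names. This is an alternative formulation, not the code from the answer. The flag column is a hypothetical addition, included to show that booleans can be kept out of the mask explicitly.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype

df = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [1, 2, 3],
    "col3": [30, 10, 20],
    "flag": [True, False, True],  # hypothetical extra column
})

# One boolean per column: True where the dtype is numeric and not boolean.
mask = np.array(
    [is_numeric_dtype(t) and not is_bool_dtype(t) for t in df.dtypes],
    dtype=bool,
)

# Select the column labels with the mask and write the scaled values back.
numeric_cols = df.columns[mask]
df[numeric_cols] = df[numeric_cols] * 3
```

Building the mask from per-column predicates makes it straightforward to add further conditions, such as skipping particular columns by name.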
Answered by piRSquared
This should work even with mixed types within a column, but it is likely to be slow on large data frames because it processes one element at a time.
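def mul(x, y):
    try:
        # Multiply anything that can be parsed as a number.
        return pd.to_numeric(x) * y
    except Exception:
        # Leave values that cannot be parsed (e.g. 'A') unchanged.
        return x

# Element-wise; on pandas >= 2.1, DataFrame.map is the newer name for applymap.
df.applymap(lambda x: mul(x, 3))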
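To illustrate the mixed-type case, here is a minimal sketch on a hypothetical frame whose mixed column holds a label, an integer, and a numeric string. It uses Series.map per column in place of applymap, which is a substitution made here so the example does not depend on which pandas version is installed.

```python
import pandas as pd


def mul(x, y):
    """Multiply x by y if x parses as a number; otherwise return x unchanged."""
    try:
        return pd.to_numeric(x) * y
    except Exception:
        return x


# Hypothetical frame: "mixed" holds a label, an integer, and a numeric string.
df = pd.DataFrame({"mixed": ["A", 2, "3"], "num": [1, 2, 3]})

# Series.map applies the function to each element of a column.
result = df.apply(lambda col: col.map(lambda x: mul(x, 3)))
```

Note that the numeric string "3" is parsed and multiplied as well, so the mixed column of result holds 'A', 6 and 9. That may or may not be what you want for columns that store numbers as text.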