Original question: http://stackoverflow.com/questions/39419178/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How can I manage units in pandas data?
Asked by ajwood
I'm trying to figure out if there is a good way to manage units in my pandas data. For example, I have a DataFrame that looks like this:
   length (m)  width (m)  thickness (cm)
0         1.2        3.4             5.6
1         7.8        9.0             1.2
2         3.4        5.6             7.8
Currently, the measurement units are encoded in column names. Downsides include:
- column selection is awkward: df['width (m)'] vs. df['width']
- things will likely break if the units of my source data change
If I wanted to strip the units out of the column names, is there somewhere else that the information could be stored?
Accepted answer by chrisb
There isn't any great way to do this right now; see the GitHub issue here for some discussion.
As a quick hack, you could do something like this, maintaining a separate dict with the units.
In [3]: units = {}

In [5]: newcols = []
   ...: for col in df:
   ...:     name, unit = col.split(' ')
   ...:     units[name] = unit
   ...:     newcols.append(name)

In [6]: df.columns = newcols

In [7]: df
Out[7]:
   length  width  thickness
0     1.2    3.4        5.6
1     7.8    9.0        1.2
2     3.4    5.6        7.8

In [8]: units['length']
Out[8]: '(m)'
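
For anyone who wants to run the hack outside an IPython session, here is a self-contained sketch of the same idea. It rebuilds the sample DataFrame from the question and keeps the units in a separate dict. The helper name split_units and the header regex are my own assumptions, not part of the original answer; the regex also drops the parentheses, so the stored unit is 'm' rather than '(m)'.

```python
import re

import pandas as pd

# Sample data mirroring the DataFrame shown in the question.
df = pd.DataFrame(
    {
        "length (m)": [1.2, 7.8, 3.4],
        "width (m)": [3.4, 9.0, 5.6],
        "thickness (cm)": [5.6, 1.2, 7.8],
    }
)

# Matches headers of the form "name (unit)". This format is an assumption
# based on the column names in the question.
HEADER = re.compile(r"^(?P<name>.+?)\s*\((?P<unit>[^)]*)\)$")


def split_units(frame):
    """Hypothetical helper: return a renamed copy and a {column: unit} dict."""
    units = {}
    renames = {}
    for col in frame.columns:
        match = HEADER.match(col)
        if match is None:
            continue  # leave columns without a unit suffix untouched
        renames[col] = match["name"]
        units[match["name"]] = match["unit"]
    return frame.rename(columns=renames), units


clean, units = split_units(df)
print(list(clean.columns))
print(units["thickness"])
```

Unlike the session above, this returns a renamed copy instead of assigning to df.columns in place, so the original frame is left as it was. The units dict is still not attached to the DataFrame, so it has to be passed around alongside it.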