Original question: http://stackoverflow.com/questions/17328655/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Pandas set_index does not set the index
Asked by juniper-
Say I create a pandas DataFrame with two columns, b (a DateTime) and c (an integer). Now I want to make a DatetimeIndex from the values in the first column (b):
import pandas as pd
import datetime as dt
a=[1371215423523845, 1371215500149460, 1371215500273673, 1371215500296504, 1371215515568529, 1371215531603530, 1371215576463339, 1371215579939113, 1371215731215054, 1371215756231343, 1371215756417484, 1371215756519690, 1371215756551645, 1371215756578979, 1371215770164647, 1371215820891387, 1371215821305584, 1371215824925723, 1371215878061146, 1371215878173401, 1371215890324572, 1371215898024253, 1371215926634930, 1371215933513122, 1371216018210826, 1371216080844727, 1371216080930036, 1371216098471787, 1371216111858392, 1371216326271516, 1371216326357836, 1371216445401635, 1371216445401635, 1371216481057049, 1371216496791894, 1371216514691786, 1371216540337354, 1371216592180666, 1371216592339578, 1371216605823474, 1371216610332627, 1371216623042903, 1371216624749566, 1371216630631179, 1371216654267672, 1371216714011662, 1371216783761738, 1371216783858402, 1371216783858402, 1371216783899118, 1371216976339169, 1371216976589850, 1371217028278777, 1371217028560770, 1371217170996479, 1371217176184425, 1371217176318245, 1371217190349372, 1371217190394753, 1371217272797618, 1371217340235667, 1371217340358197, 1371217340433146, 1371217340463797, 1371217340490876, 1371217363797722, 1371217363797722, 1371217363890678, 1371217363922929, 1371217523548405, 1371217523548405, 1371217551181926, 1371217551181926, 1371217551262975, 1371217652579855, 1371218091071955, 1371218295006690, 1371218370005139, 1371218370133637, 1371218370133637, 1371218370158096, 1371218370262823, 1371218414896836, 1371218415013417, 1371218415050485, 1371218415050485, 1371218504396524, 1371218504396524, 1371218504481537, 1371218504517462, 1371218586980079, 1371218719953887, 1371218720621245, 1371218738776732, 1371218937926310, 1371218954785466, 1371218985347070, 1371218985421615, 1371219039790991, 1371219171650043]
b=[dt.datetime.fromtimestamp(t/1000000.) for t in a]
c = {'b':b, 'c':a[:]}
df = pd.DataFrame(c)
df.set_index(pd.DatetimeIndex(df['b']))
print df
Everything seems to work fine, except that when I print the DataFrame, it says that it has an Int64Index.
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 2 columns):
b 100 non-null values
c 100 non-null values
dtypes: datetime64[ns](1), int64(1)
Am I doing something wrong, or do I not understand the concept of indexes properly?
Accepted answer by Jeff
set_index is not in-place (unless you pass inplace=True): it returns a new DataFrame, so the result has to be assigned back. Otherwise everything is correct.
In [7]: df = df.set_index(pd.DatetimeIndex(df['b']))
In [8]: df
Out[8]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 100 entries, 2013-06-14 09:10:23.523845 to 2013-06-14 10:12:51.650043
Data columns (total 2 columns):
b 100 non-null values
c 100 non-null values
dtypes: datetime64[ns](1), int64(1)
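The difference between discarding and keeping the return value can be seen in a small sketch. The three-row frame below is made-up sample data (not the 100 timestamps from the question), and the variable names indexed and by_name are only illustrative:

```python
import pandas as pd

# Small made-up frame standing in for the one in the question.
df = pd.DataFrame({
    "b": pd.to_datetime([
        "2013-06-14 09:10:23",
        "2013-06-14 09:11:40",
        "2013-06-14 09:11:41",
    ]),
    "c": [1, 2, 3],
})

# The return value is discarded here, so df keeps its default integer index.
df.set_index(pd.DatetimeIndex(df["b"]))
print(type(df.index).__name__)

# Assigning the result keeps the new index; column b stays in the frame
# because an Index object was passed rather than a column name.
indexed = df.set_index(pd.DatetimeIndex(df["b"]))
print(type(indexed.index).__name__)

# Passing the column name also works; by default the column is then
# moved out of the data and into the index.
by_name = df.set_index("b")
print(list(by_name.columns))
```

Assigning the result back is generally the safer habit compared with inplace=True, since it makes the data flow explicit.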
Also, as an FYI, in the forthcoming 0.12 release (next week), you can pass unit='us' to specify units of microseconds since the epoch:
In [13]: pd.to_datetime(a,unit='us')
Out[13]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-06-14 13:10:23.523845, ..., 2013-06-14 14:12:51.650043]
Length: 100, Freq: None, Timezone: None
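As a rough sketch of how this fits together with the index, the snippet below converts three of the values from the question's list with unit='us' and uses the result directly as the index. The names idx, manual and df2 are illustrative only. Note that pd.to_datetime treats the integers as offsets from the UTC epoch, whereas datetime.fromtimestamp converts to local time, which presumably explains the four-hour difference between the two outputs shown above:

```python
import datetime as dt

import pandas as pd

# First two and last values from the question's list (microseconds since epoch).
a = [1371215423523845, 1371215500149460, 1371219171650043]

# Interpreted as microseconds since 1970-01-01 00:00:00 UTC.
idx = pd.to_datetime(a, unit="us")

# Cross-check against exact integer arithmetic in the standard library.
manual = [dt.datetime(1970, 1, 1) + dt.timedelta(microseconds=t) for t in a]

# The converted values can serve as the index right away,
# without going through set_index at all.
df2 = pd.DataFrame({"c": a}, index=idx)
print(df2.index[0], manual[0])
```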