Original question: http://stackoverflow.com/questions/38837421/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Simple way to plot time series with real dates using pandas
Asked by clstaudt
Starting from the following CSV data, loaded into a pandas data frame...
Buchung;Betrag;Saldo
27.06.2016;-1.000,00;42.374,95
02.06.2016;500,00;43.374,95
01.06.2016;-1.000,00;42.874,95
13.05.2016;-500,00;43.874,95
02.05.2016;500,00;44.374,95
04.04.2016;500,00;43.874,95
02.03.2016;500,00;43.374,95
10.02.2016;1.000,00;42.874,95
02.02.2016;500,00;41.874,95
01.02.2016;1.000,00;41.374,95
04.01.2016;300,00;40.374,95
30.12.2015;234,54;40.074,95
02.12.2015;300,00;39.840,41
02.11.2015;300,00;39.540,41
08.10.2015;1.000,00;39.240,41
02.10.2015;300,00;38.240,41
02.09.2015;300,00;37.940,41
31.08.2015;2.000,00;37.640,41
... I would like an intuitive way to plot the time series given by the dates in column "Buchung" and the monetary values in column "Saldo".
I tried
seaborn.tsplot(data=data, time="Buchung", value="Saldo")
which yields
ValueError: could not convert string to float: '31.08.2015'
What is an easy way to read the dates and values and plot the time series? I assume this is such a common problem that there must be a three line solution.
Answered by Kartik
You need to convert your date column into the correct format:
data['Buchung'] = pd.to_datetime(data['Buchung'], format='%d.%m.%Y')
Now your plot will work.
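To see what the conversion does, here is a minimal sketch on three dates copied from the sample data. The small data frame is built by hand for illustration only; in practice `data` would come from the CSV file.

```python
import pandas as pd

# Three dates copied from the question's sample data (illustration only).
data = pd.DataFrame({"Buchung": ["27.06.2016", "02.06.2016", "31.08.2015"]})

# Parse day.month.year strings into real timestamps.
data["Buchung"] = pd.to_datetime(data["Buchung"], format="%d.%m.%Y")

# Timestamps sort and compare chronologically, unlike the raw strings.
data = data.sort_values("Buchung").reset_index(drop=True)
print(data["Buchung"].min().date())  # 2015-08-31
```

Giving the format explicitly avoids any day/month ambiguity for dates such as 02.06.2016.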
Though you didn't ask, I think you will also run into a similar problem, because your numbers (in 'Betrag' and 'Saldo') seem to be strings as well. So I recommend you convert them to numeric values before plotting. Here is how you can do that with simple string manipulation:
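# Drop the thousands separator, turn the decimal comma into a point, then cast to float
data["Saldo"] = data["Saldo"].str.replace('.', '', regex=False).str.replace(',', '.', regex=False).astype(float)
data["Betrag"] = data["Betrag"].str.replace('.', '', regex=False).str.replace(',', '.', regex=False).astype(float)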
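As a quick check of the string approach, here is a small sketch on three values copied from the sample data. It assumes the replacements should be literal (hence `regex=False`, since a bare `.` is a wildcard when treated as a regular expression) and that the final result should be floats.

```python
import pandas as pd

# Three amounts copied from the question's sample data (illustration only).
saldo = pd.Series(["42.374,95", "-1.000,00", "234,54"])

converted = (
    saldo.str.replace(".", "", regex=False)    # remove thousands separators
         .str.replace(",", ".", regex=False)   # decimal comma -> decimal point
         .astype(float)                        # strings -> floats
)
print(converted.tolist())  # [42374.95, -1000.0, 234.54]
```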
Or set the locale:
import locale
# The data appears to be in a European format, so a German locale might
# fit. Try this on a Windows machine:
locale.setlocale(locale.LC_ALL, 'de')
data['Betrag'] = data['Betrag'].apply(locale.atof)
data['Saldo'] = data['Saldo'].apply(locale.atof)
# This will reset the locale to system default
locale.setlocale(locale.LC_ALL, '')
On an Ubuntu machine, follow this answer. If the above code does not work on a Windows machine, inspect locale.locale_alias to see the locale names Python knows about and pick a name from there.
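Because locale names differ between systems, a more defensive variant of the locale approach may help. The sketch below is not from the original answer: `parse_numbers` is a hypothetical helper, and the candidate locale names are assumptions that may or may not exist on a given machine. It changes only `LC_NUMERIC`, skips locales that do not use the German number conventions, and always restores the previous setting.

```python
import locale

# Hypothetical helper: parse number strings under a temporary numeric locale.
# The candidate names are assumptions; adjust them to what your system offers.
def parse_numbers(values, candidates=("de_DE.UTF-8", "de_DE", "German_Germany.1252", "de")):
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
            except locale.Error:
                continue  # this locale name is not available here
            conv = locale.localeconv()
            if conv["decimal_point"] != "," or conv["thousands_sep"] != ".":
                continue  # accepted, but not the number format we need
            return [locale.atof(v) for v in values]
        return None  # no suitable locale could be set
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)  # always restore

print(parse_numbers(["42.374,95", "-1.000,00"]))
```

The helper returns `None` when no suitable locale is installed, in which case the string-based conversion is the more portable choice.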
Output
Using matplotlib, since I cannot install Seaborn on the machine I am working from:
from matplotlib import pyplot as plt
plt.plot(data['Buchung'], data['Saldo'], '-')
_ = plt.xticks(rotation=45)
Note: this has been produced using the locale method. Hence the month names are in German.
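Putting the pieces together, the whole task can be sketched end to end. This is one possible variant under stated assumptions, not the answer's exact code: it lets `pd.read_csv` handle the number format through its `sep`, `decimal`, and `thousands` parameters, reads the date column as text, and parses it explicitly. The inline CSV holds three rows from the sample data, and the `Agg` backend is chosen only so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Three rows copied from the question's sample data (illustration only).
csv_text = """Buchung;Betrag;Saldo
27.06.2016;-1.000,00;42.374,95
01.06.2016;-1.000,00;42.874,95
31.08.2015;2.000,00;37.640,41
"""

data = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    decimal=",",              # decimal comma
    thousands=".",            # dot as thousands separator
    dtype={"Buchung": str},   # keep dates as text; they are parsed explicitly below
)
data["Buchung"] = pd.to_datetime(data["Buchung"], format="%d.%m.%Y")
data = data.sort_values("Buchung")

fig, ax = plt.subplots()
ax.plot(data["Buchung"], data["Saldo"], "-")
ax.set_xlabel("Buchung")
ax.set_ylabel("Saldo")
fig.autofmt_xdate(rotation=45)  # rotate the date labels

# Render to an in-memory PNG; pass a file name instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

With a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path.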