Python: Pandas type error when trying to plot
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/33676608/
Pandas type error trying to plot
Asked by Tom Johnson
I'm trying to create a basic scatter plot based on a Pandas dataframe. But when I call the scatter routine I get an error "TypeError: invalid type promotion". Sample code to reproduce the problem is shown below:
import pandas as pd
import matplotlib.pyplot as plt
x_size, y_size = 8, 6  # figure size in inches (assumed values; not shown in the original post)
t1 = pd.to_datetime('2015-11-01 00:00:00')
t2 = pd.to_datetime('2015-11-02 00:00:00')
Time = pd.Series([t1, t2])
r = pd.Series([-1, 1])
df = pd.DataFrame({'Time': Time, 'Value': r})
print(df)
print(type(df.Time))
print(type(df.Time[0]))
fig = plt.figure(figsize=(x_size,y_size))
ax = fig.add_subplot(111)
ax.scatter(df.Time, y=df.Value, marker='o')
The resulting output is
        Time  Value
0 2015-11-01     -1
1 2015-11-02      1
<class 'pandas.core.series.Series'>
<class 'pandas.tslib.Timestamp'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-285-f4ed0443bf4d> in <module>()
15 fig = plt.figure(figsize=(x_size,y_size))
16 ax = fig.add_subplot(111)
---> 17 ax.scatter(df.Time, y=df.Value, marker='o')
C:\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, **kwargs)
3635 edgecolors = 'face'
3636
-> 3637 offsets = np.dstack((x, y))
3638
3639 collection = mcoll.PathCollection(
C:\Anaconda3\lib\site-packages\numpy\lib\shape_base.py in dstack(tup)
365
366 """
--> 367 return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
368
369 def _replace_zero_by_x_arrays(sub_arys):
TypeError: invalid type promotion
Searching around, I found a similar post, "Pandas Series TypeError and ValueError when using datetime", which suggests that the error is caused by having multiple data types in the series. But that does not appear to be the issue in my example, as evidenced by the type information I'm printing.
Note that if I stop using pandas datetime objects and make the 'Time' a float instead this works fine, e.g.
t1 = 1.1
t2 = 1.2
Time = pd.Series([t1, t2])
r = pd.Series([-1, 1])
df = pd.DataFrame({'Time': Time, 'Value': r})
print(df)
print(type(df.Time))
print(type(df.Time[0]))
fig = plt.figure(figsize=(x_size,y_size))
ax = fig.add_subplot(111)
ax.scatter(df.Time, y=df.Value, marker='o')
with output
   Time  Value
0   1.1     -1
1   1.2      1
<class 'pandas.core.series.Series'>
<class 'numpy.float64'>
and the graph looks just fine. I'm at a loss as to why using a datetime causes the invalid type promotion error. I'm using Python 3.4.3 and pandas 0.16.2.
Accepted answer by Tom Johnson
Thanks @martinvseticka. I think your assessment is correct based on the numpy code you pointed me to. I was able to simplify your tweaks a bit more (and added a third sample point) to get
t1 = pd.to_datetime('2015-11-01 00:00:00')
t2 = pd.to_datetime('2015-11-02 00:00:00')
t3 = pd.to_datetime('2015-11-03 00:00:00')
Time = pd.Series([t1, t2, t3])
r = pd.Series([-1, 1, 0.5])
df = pd.DataFrame({'Time': Time, 'Value': r})
fig = plt.figure(figsize=(x_size,y_size))
ax = fig.add_subplot(111)
ax.plot_date(x=df.Time, y=df.Value, marker='o')
The key seems to be calling 'plot_date' rather than 'plot'. This seems to tell matplotlib not to try to concatenate the arrays.
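For readers on a newer Matplotlib, where the documentation steers users away from plot_date toward plot, the same marker-only date plot can probably be produced as in the sketch below. This is an illustration under assumptions, not the original author's code: the Agg backend, the 8 x 6 inch figure size, and the use of plain datetime.datetime objects instead of a DataFrame are all choices made here so the example is self-contained.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

# Same three sample points as in the answer above, as plain Python datetimes.
times = [dt.datetime(2015, 11, 1), dt.datetime(2015, 11, 2), dt.datetime(2015, 11, 3)]
values = [-1, 1, 0.5]

fig, ax = plt.subplots(figsize=(8, 6))  # figsize is in inches (assumed values)
# linestyle='none' draws markers only, which mimics the default look of plot_date.
(line,) = ax.plot(times, values, marker='o', linestyle='none')
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
```

Whether this avoids the error on the exact library versions from the question has not been checked; it is meant for current releases.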
Answer by Martin Vseticka
Is this what you are looking for?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
t1 = pd.to_datetime('2015-11-01 00:00:00')
t2 = pd.to_datetime('2015-11-02 00:00:00')
idx = pd.Series([t1, t2])
s = pd.Series([-1, 1], index=idx)
fig, ax = plt.subplots()
ax.plot_date(idx, s, 'v-')
plt.tight_layout()
plt.show()
I'm new to Python so hopefully I'm not wrong. Basically, I tried to adapt your example according to https://stackoverflow.com/a/13674286/99256.
The problem with your script is that numpy tries to concatenate the df.Time and df.Value series, and it can't find a suitable type for the new array because one array is numeric and the other is composed of Timestamp instances.
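The promotion failure described above can be reproduced with NumPy alone. The sketch below is a minimal illustration, assuming the Time column is stored as datetime64[ns] and the Value column as a 64-bit integer; the helper name can_promote is hypothetical and not part of NumPy.

```python
import numpy as np


def can_promote(a, b):
    """Return True if NumPy can find a common dtype for a and b (hypothetical helper)."""
    try:
        np.promote_types(a, b)
        return True
    except TypeError:  # newer NumPy raises a TypeError subclass here
        return False


time_dtype = np.dtype("datetime64[ns]")  # assumed storage dtype of df.Time
value_dtype = np.dtype("int64")          # assumed storage dtype of df.Value

floats_ok = can_promote(np.dtype("float64"), value_dtype)  # float and int share a type
dates_ok = can_promote(time_dtype, value_dtype)            # datetime and int do not

# Converting the timestamps to plain integers first removes the conflict,
# so np.dstack (the call that fails in the traceback) goes through.
times = np.array(["2015-11-01", "2015-11-02"], dtype="datetime64[ns]")
values = np.array([-1, 1])
as_numbers = times.astype("int64")  # nanoseconds since the Unix epoch
stacked = np.dstack((as_numbers, values))
print(floats_ok, dates_ok, stacked.shape)
```

This also suggests why the float version of the question's example works: floats and integers have a common type, while datetimes and integers do not.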
Answer by N.C. van Gilse
scatter plots have some properties that cannot be reproduced with plot or plot_date (such as the ability to draw markers of varying size).
Converting the Time series of type pandas.tslib.Timestamp to a list of type datetime.datetime before plotting the scatter did the trick for me:
times = [d.to_pydatetime() for d in df.Time]
ax.scatter(times, y=df.Value, marker='o')
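To illustrate the point about marker sizes, here is a small sketch that combines datetime.datetime x values with per-point sizes. It is an assumption-laden example rather than the answerer's code: the sizes list is made up, the Agg backend is chosen so it runs without a display, and the dates are converted explicitly with matplotlib.dates.date2num so that scatter only ever sees floats.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

times = [dt.datetime(2015, 11, 1), dt.datetime(2015, 11, 2), dt.datetime(2015, 11, 3)]
values = [-1, 1, 0.5]
sizes = [20, 80, 200]  # hypothetical marker areas, in points squared

x = mdates.date2num(times)  # float day numbers, so there is no datetime/number type clash

fig, ax = plt.subplots()
points = ax.scatter(x, values, s=sizes, marker='o')
ax.xaxis_date()      # format the float x values as dates on the axis
fig.autofmt_xdate()  # rotate the tick labels for readability
```

The explicit date2num step is optional on Matplotlib versions that convert datetimes themselves; it is spelled out here only to show what scatter receives.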
Answer by Gingerbread
You can also do something like this:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import datetime
df = pd.DataFrame({"Time":["2015-11-01 00:00:00", "2015-11-02 00:00:00"], "value":[ 1, -1]})
df['Time'] = pd.to_datetime(df['Time'])
fig, ax = plt.subplots()
ax.scatter(np.arange(len(df['Time'])), df['value'], marker='o')
ax.xaxis.set_ticks(np.arange(len(df['Time'])))
ax.xaxis.set_ticklabels(df['Time'], rotation=90)
plt.xlabel("Time")
plt.ylabel("value")
plt.show()
Answer by Jeff
Another way is to stop passing Series objects to scatter and pass plain lists instead.
import pandas as pd
import matplotlib.pyplot as plt
t1 = pd.to_datetime('2015-11-01 00:00:00')
t2 = pd.to_datetime('2015-11-02 00:00:00')
Time = pd.Series([t1, t2])
r = pd.Series([-1, 1])
df = pd.DataFrame({'Time': Time, 'Value': r})
print(df)
print(type(df.Time))
print(type(df.Time[0]))
x_size = 8  # figsize is in inches; the original 800 x 600 would be far too large to render
y_size = 6
fig = plt.figure(figsize=(x_size,y_size))
ax = fig.add_subplot(111)
# Pass plain lists rather than Series objects
ax.scatter(list(df.Time.values), list(df.Value.values), marker='o')
Answer by Edward Weinert
I converted the datetime column to strings on the fly:
plt.scatter(df['Date'].astype('str'), df['Category'], s=df['count'])
and the scatter plot works. Regards
Answer by hardik jamnal
All the above answers are great. In my case, though, the error was fixed by updating the libraries. You can do that from a Conda terminal with the command conda update --all