Python: Creating a Pandas DataFrame from two Numpy arrays, then drawing a scatter plot

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/29949757/

Time: 2020-08-19 05:11:53  Source: igfitidea

python numpy pandas scatter

Asked by n3utrino

I'm relatively new to numpy and pandas (I'm an experimental physicist, so I've been using ROOT for years...). A common plot in ROOT is a 2D scatter plot that, given lists of x- and y-values, draws a "heatmap"-type scatter plot of one variable versus the other.

How is this best accomplished with numpy and pandas? I'm trying to use the DataFrame.plot() function, but I'm struggling to even create the DataFrame.

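import numpy as np
import pandas as pd
x = np.random.randn(1,5)   # 2-D array of shape (1, 5), not a 1-D vector of length 5
y = np.sin(x)
# NOTE: `d` is never defined in the post; presumably a dict such as {'x': x, 'y': y}
df = pd.DataFrame(d)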

First off, this dataframe has shape (1,2), but I would like it to have shape (5,2). If I can get the dataframe into the right shape, I'm sure I can figure out the DataFrame.plot() function to draw what I want.

Accepted answer by unutbu

There are a number of ways to create DataFrames. Given 1-dimensional column vectors, you can create a DataFrame by passing it a dict whose keys are column names and whose values are the 1-dimensional column vectors:

import numpy as np
import pandas as pd
x = np.random.randn(5)
y = np.sin(x)
df = pd.DataFrame({'x':x, 'y':y})
df.plot('x', 'y', kind='scatter')
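
If the data already exists as a (1, 5) array, as in the question, it has to be flattened before the dict constructor will produce a (5, 2) frame. A minimal sketch, assuming that starting point; the name `x2d` is illustrative and does not appear in the original post:

```python
import numpy as np
import pandas as pd

x2d = np.random.randn(1, 5)      # shape (1, 5), as in the question
x = x2d.ravel()                  # flatten to a 1-D vector of shape (5,)
y = np.sin(x)

# One dict key per column, one 1-D array per value
df = pd.DataFrame({'x': x, 'y': y})
print(df.shape)
```

Creating `x` with `np.random.randn(5)` in the first place, as the answer does, avoids the extra step.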

Answer by RKD314

In order to do what you want, I wouldn't use the DataFrame plotting methods. I'm also a former experimental physicist, and based on experience with ROOT I think that the Python analog you want is best accomplished using matplotlib. In matplotlib.pyplot there is a method, hist2d(), which will give you the kind of heat map you're looking for.

As for creating the dataframe, an easy way to do it is:

df=pd.DataFrame({'x':x, 'y':y})
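
The hist2d() suggestion above might look roughly like the following. This is only a sketch: the sample size, the added noise term, the bin count, and the axis labels are assumptions chosen for illustration, not values from the answer.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = np.sin(x) + 0.1 * rng.standard_normal(10_000)  # noise added so the bins fill out

fig, ax = plt.subplots()
# hist2d returns the bin counts, the bin edges along each axis, and the drawn image
counts, xedges, yedges, image = ax.hist2d(x, y, bins=50)
fig.colorbar(image, ax=ax, label='counts')
ax.set_xlabel('x')
ax.set_ylabel('y')
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure. If the data lives in a DataFrame, the columns can be passed directly, e.g. `ax.hist2d(df['x'], df['y'], bins=50)`.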

Answer by famaral42

To complement the other answers: you can also use pandas Series, but the DataFrame must already have been created.

import numpy as np
import pandas as pd

x = np.linspace(0,2*np.pi)
y = np.sin(x)

#df = pd.DataFrame()
#df['X'] = pd.Series(x)
#df['Y'] = pd.Series(y)

# You can MIX
df = pd.DataFrame({'X':x})
df['Y'] = pd.Series(y) 

df.plot('X', 'Y', kind='scatter')
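
One caveat with assigning a Series as a column: pandas aligns it on the index labels, not by position. The snippet above works because both the DataFrame and the Series use the default 0..N-1 index. A small sketch of what happens otherwise; the letter index and the column names `Y_series` and `Y_array` are made up for illustration:

```python
import numpy as np
import pandas as pd

x = np.linspace(0, 2 * np.pi, 5)
df = pd.DataFrame({'X': x}, index=list('abcde'))

# The Series carries index 0..4, which matches none of the labels a..e,
# so every value in the new column ends up as NaN.
df['Y_series'] = pd.Series(np.sin(x))

# A plain array has no index and is assigned by position.
df['Y_array'] = np.sin(x)

print(df)
```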

This is another way that might help

import numpy as np
import pandas as pd

x = np.linspace(0,2*np.pi)
y = np.sin(x)

df = pd.DataFrame(data=np.column_stack((x,y)),columns=['X','Y'])
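
To see why column_stack gives the layout the question asked for, it can help to inspect the intermediate array. A short sketch; the name `stacked` is introduced here only for illustration:

```python
import numpy as np
import pandas as pd

x = np.linspace(0, 2 * np.pi)          # linspace defaults to 50 points
y = np.sin(x)

# column_stack turns the two 1-D arrays into the columns of one 2-D array
stacked = np.column_stack((x, y))
df = pd.DataFrame(data=stacked, columns=['X', 'Y'])

print(stacked.shape, df.shape)
```

Each row of `stacked` is one (x, y) point, so the resulting frame has one row per point and one column per variable.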

Also, I find the examples from karlijn (DataCamp) very helpful

import numpy as np
import pandas as pd

TAB = np.array([[''     ,'Col1','Col2'],
                 ['Row1' ,   1  ,   2  ],
                 ['Row2' ,   3  ,   4  ],
                 ['Row3' ,   5 ,   6  ]])

dados = TAB[1:,1:]
linhas = TAB[1:,0]
colunas = TAB[0,1:]

DF = pd.DataFrame(
    data=dados,
    index=linhas,
    columns=colunas
)

print('\nDataFrame:', DF)
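
Note that because the array above mixes strings and numbers, NumPy stores every entry as a string, so the cells of the resulting DataFrame hold text rather than integers. A minimal sketch of converting them back, assuming all data cells are integer-like; `DF_num` is an illustrative name:

```python
import numpy as np
import pandas as pd

TAB = np.array([['', 'Col1', 'Col2'],
                ['Row1', 1, 2],
                ['Row2', 3, 4],
                ['Row3', 5, 6]])
print(TAB.dtype.kind)   # 'U': every entry, including the numbers, is a string

DF = pd.DataFrame(data=TAB[1:, 1:], index=TAB[1:, 0], columns=TAB[0, 1:])

# Convert the string cells to integers so arithmetic and plotting behave as expected
DF_num = DF.astype(int)
print(DF_num.sum())
```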