Original question: http://stackoverflow.com/questions/17839973/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"
Asked by Nilani Algiriyage
This may be a simple question, but I cannot figure out how to do this. Let's say that I have two variables as follows.
a = 2
b = 3
I want to construct a DataFrame from this:
df2 = pd.DataFrame({'A':a,'B':b})
This generates an error:
ValueError: If using all scalar values, you must pass an index
I tried this also:
df2 = (pd.DataFrame({'a':a,'b':b})).reset_index()
This gives the same error message.
Accepted answer by DSM
The error message says that if you're passing scalar values, you have to pass an index. So you can either not use scalar values for the columns -- e.g. use a list:
>>> df = pd.DataFrame({'A': [a], 'B': [b]})
>>> df
   A  B
0  2  3
or use scalar values and pass an index:
>>> df = pd.DataFrame({'A': a, 'B': b}, index=[0])
>>> df
   A  B
0  2  3
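To see the two fixes side by side, here is a minimal sketch that reproduces the error and then builds the same one-row frame both ways. The names `error_message`, `df_list`, and `df_index` are illustrative and not part of the original answer.

```python
import pandas as pd

a = 2
b = 3

# All values are scalars and no index is given, so pandas cannot
# tell how many rows the frame should have and raises ValueError.
try:
    pd.DataFrame({'A': a, 'B': b})
except ValueError as exc:
    error_message = str(exc)

# Fix 1: wrap each scalar in a list, so every column has length 1.
df_list = pd.DataFrame({'A': [a], 'B': [b]})

# Fix 2: keep the scalars and state explicitly which row labels exist.
df_index = pd.DataFrame({'A': a, 'B': b}, index=[0])

print(error_message)
print(df_list)
print(df_index)
```

Both frames should hold a single row labelled 0 with A=2 and B=3.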
Answer by ely
You need to provide iterables as the values for the Pandas DataFrame columns:
df2 = pd.DataFrame({'A':[a],'B':[b]})
Answer by fAX
You can also use pd.DataFrame.from_records, which is more convenient when you already have the dictionary in hand:
df = pd.DataFrame.from_records([{ 'A':a,'B':b }])
You can also set the index, if you want, with:
df = pd.DataFrame.from_records([{ 'A':a,'B':b }], index='A')
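As a rough illustration of what the two calls produce, the sketch below runs both with the values from the question. The name `df_indexed` is made up for this example.

```python
import pandas as pd

a = 2
b = 3

# A list holding one dict is read as one record, i.e. one row.
df = pd.DataFrame.from_records([{'A': a, 'B': b}])

# Passing a field name as index moves that field out of the columns
# and uses its values as the row labels.
df_indexed = pd.DataFrame.from_records([{'A': a, 'B': b}], index='A')

print(df)
print(df_indexed)
```

With `index='A'`, only column B remains, and the row is labelled by the value of A.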
Answer by ingrid
If you have a dictionary you can turn it into a pandas data frame with the following line of code:
pd.DataFrame({"key": list(d.keys()), "value": list(d.values())})
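Note that this approach gives a different shape from the other answers: one row per dictionary entry rather than one column per key. A minimal sketch, assuming a small example dictionary `d` (hypothetical, since the answer does not define one):

```python
import pandas as pd

d = {'A': 2, 'B': 3}  # hypothetical example dictionary

# list(...) turns the dict views into plain lists of equal length,
# so each dictionary entry becomes one row with a key and a value.
long_df = pd.DataFrame({"key": list(d.keys()), "value": list(d.values())})

print(long_df)
```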
Answer by Rob
Maybe Series would provide all the functions you need:
pd.Series({'A':a,'B':b})
A DataFrame can be thought of as a collection of Series, hence you can:
Concatenate multiple Series into one data frame
Add a Series variable to an existing data frame
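Both operations can be sketched as follows. The second and third Series and all the names (`s1`, `s2`, `wide`, `'first'`, `'second'`, `'third'`) are invented for illustration; `pd.concat` and column assignment are one common way to do this, not necessarily what the linked posts used.

```python
import pandas as pd

a = 2
b = 3

s1 = pd.Series({'A': a, 'B': b}, name='first')
s2 = pd.Series({'A': 20, 'B': 30}, name='second')  # made-up second Series

# Concatenate along axis=1: each Series becomes one column,
# and the Series names become the column labels.
wide = pd.concat([s1, s2], axis=1)

# Add another Series to the existing frame as a new column;
# values are aligned on the index labels 'A' and 'B'.
wide['third'] = pd.Series({'A': 200, 'B': 300})

print(wide)
```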
Answer by danuker
This is because a DataFrame has two intuitive dimensions: the columns and the rows.
You are only specifying the columns using the dictionary keys.
If you only want to specify one-dimensional data, use a Series!
Answer by MLguy
First create a pandas Series, then convert that Series to a pandas DataFrame.
import pandas as pd
data = {'a': 1, 'b': 2}
pd.Series(data).to_frame()
You can even provide a column name.
pd.Series(data).to_frame('ColumnName')
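One thing to be aware of is that `to_frame` places the dictionary keys in the index, so the result is a single column rather than a single row. If a one-row frame is what you want, transposing is one option. The names `column_df` and `row_df` below are illustrative only.

```python
import pandas as pd

data = {'a': 1, 'b': 2}

# to_frame() keeps the dict keys as row labels: two rows, one column.
column_df = pd.Series(data).to_frame('ColumnName')

# Transposing swaps rows and columns: one row, with 'a' and 'b' as columns.
row_df = column_df.T

print(column_df)
print(row_df)
```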
Answer by k0L1081
If you intend to convert a dictionary of scalars, you have to include an index:
import pandas as pd
alphabets = {'A': 'a', 'B': 'b'}
index = [0]
alphabets_df = pd.DataFrame(alphabets, index=index)
print(alphabets_df)
Although an index is not required for a dictionary of lists, the same idea extends to one:
planets = {'planet': ['earth', 'mars', 'jupiter'], 'length_of_day': ['1', '1.03', '0.414']}
index = [0, 1, 2]
planets_df = pd.DataFrame(planets, index=index)
print(planets_df)
Of course, for the dictionary of lists, you can build the dataframe without an index:
planets_df = pd.DataFrame(planets)
print(planets_df)
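The index is what tells pandas how many rows to create from scalars, which a short sketch can make visible: with a longer index, each scalar is repeated once per label. The name `repeated_df` is made up for this example.

```python
import pandas as pd

alphabets = {'A': 'a', 'B': 'b'}

# With three index labels, each scalar is repeated for every row.
repeated_df = pd.DataFrame(alphabets, index=[0, 1, 2])

print(repeated_df)
```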
Answer by Matthew Connell
You could try:
df2 = pd.DataFrame.from_dict({'a':a,'b':b}, orient = 'index')
From the documentation on the 'orient' argument: If the keys of the passed dict should be the columns of the resulting DataFrame, pass 'columns' (default). Otherwise, if the keys should be rows, pass 'index'.
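A small sketch of what `orient='index'` yields with the values from the question: the keys become row labels and the values land in a single column. The `columns` argument and the label `'value'` are additions for illustration; the name `named` is hypothetical.

```python
import pandas as pd

a = 2
b = 3

# Keys become row labels; the single unnamed column is labelled 0.
df2 = pd.DataFrame.from_dict({'a': a, 'b': b}, orient='index')

# With orient='index', the columns argument can name that column.
named = pd.DataFrame.from_dict({'a': a, 'b': b}, orient='index', columns=['value'])

print(df2)
print(named)
```

Unlike the accepted answer, this gives two rows and one column, so it suits cases where the keys are better treated as row labels.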
Answer by S.V
The input does not have to be a list of records; it can be a single dictionary as well:
pd.DataFrame.from_records({'a':1,'b':2}, index=[0])
   a  b
0  1  2
Which seems to be equivalent to:
pd.DataFrame({'a':1,'b':2}, index=[0])
   a  b
0  1  2