Python: convert a pandas data frame to a series

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/33246771/

Time: 2020-08-19 13:05:45  Source: igfitidea

Convert pandas data frame to series

Tags: python, pandas, dataframe, series

Asked by user1357015

I'm somewhat new to pandas. I have a pandas data frame that is 1 row by 23 columns.

I want to convert this into a series. I'm wondering what the most Pythonic way to do this is.

I've tried pd.Series(myResults), but it complains: ValueError: cannot copy sequence with size 23 to array axis with dimension 1. It's not smart enough to realize it's still a "vector" in math terms.

Thanks!

Accepted answer by DSM

It's not smart enough to realize it's still a "vector" in math terms.

Say rather that it's smart enough to recognize a difference in dimensionality. :-)

I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:

>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
   a0  a1  a2  a3  a4
0   0   1   2   3   4
>>> df.iloc[0]
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>

Answer by themachinist

You can retrieve the series through slicing your dataframe using one of these two methods:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.randn(1,8))

series1 = df.iloc[0, :]
type(series1)
# pandas.core.series.Series
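
The second linked method, loc, selects by label instead of by position. The following is a minimal sketch, assuming the frame keeps its default index; the names by_position and by_label are illustrative and not from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.random.randn(1, 8))

# iloc selects the first row by integer position.
by_position = df.iloc[0, :]

# loc selects the same row by its index label (looked up here so the
# example does not depend on what the label happens to be).
by_label = df.loc[df.index[0], :]

print(type(by_label))
print(by_position.equals(by_label))
```

Both calls return a Series whose index is the frame's column labels.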

Answer by Alexander

You can transpose the single-row dataframe (which still results in a dataframe) and then squeeze the results into a series (the inverse of to_frame).

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

>>> df.T.squeeze()  # Or more simply, df.squeeze() for a single row dataframe.
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
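
To illustrate the "inverse of to_frame" remark, here is a small sketch of the round trip, assuming the same single-row frame as above. The names s and roundtrip are illustrative.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

# Squeeze the single-row frame into a Series indexed by the column labels.
s = df.squeeze()

# to_frame() turns the Series into a 5x1 frame; transposing restores 1x5.
roundtrip = s.to_frame().T

print(roundtrip.shape)
print(list(roundtrip.columns))
```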

Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test for the dataframe's size. I am assuming that df is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row, in which case the user should implement their desired functionality.

if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1):
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass

This can also be simplified along the lines of the answer provided by @themachinist.

if len(df) > 1:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]
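
The (1, 1) edge case comes up because squeeze() with no arguments collapses every length-1 axis, so a one-cell frame becomes a bare scalar. As a hedged alternative that is not part of the original answer, passing axis=0 squeezes only the row axis. The frame one_by_one is made up for illustration.

```python
import pandas as pd

# Hypothetical one-cell frame, used only to show the edge case.
one_by_one = pd.DataFrame([[42]], columns=["a0"])

# Both axes have length 1, so a plain squeeze() collapses to a scalar.
scalar = one_by_one.squeeze()

# Squeezing only the row axis keeps the column labels as the index.
series = one_by_one.squeeze(axis=0)

print(type(scalar))
print(type(series))
```

With axis=0, a frame with more than one row is returned unchanged, so the multi-row case still needs its own handling.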

Answer by Tauseef Malik

Another way -

Suppose myResult is the DataFrame that contains your data in the form of 1 column and 23 rows.

# label your columns by passing a list of names
myResult.columns = ['firstCol']

# fetch the column in this way, which will return you a series
myResult = myResult['firstCol']

print(type(myResult))

In a similar fashion, you can get a Series from a DataFrame with multiple columns.
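
A short sketch of the multi-column case, using a made-up two-column frame (frame, secondCol, first and still_frame are illustrative names). It also shows that indexing with a list of labels returns a DataFrame, not a Series.

```python
import pandas as pd

# Hypothetical two-column frame for illustration.
frame = pd.DataFrame({"firstCol": [1, 2, 3], "secondCol": [4.0, 5.0, 6.0]})

# A single label returns that column as a Series.
first = frame["firstCol"]

# A list of labels returns a DataFrame, even with only one label in it.
still_frame = frame[["firstCol"]]

print(type(first))
print(type(still_frame))
```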

Answer by user12230680

import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})
new_data = pd.melt(data)
new_data.set_index("variable", inplace=True)

This gives a dataframe whose index holds the column names of data, with all the data in the "value" column.
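
The result of melt is still a DataFrame. To end up with an actual Series, one option (an addition to this answer, not something it states) is to select the "value" column that pd.melt creates by default.

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3, 34], "b": [5, 6, 7, 8]})

# melt stacks the columns into "variable" (old column name) and "value".
melted = pd.melt(data).set_index("variable")

# Selecting the single remaining column yields a Series.
series = melted["value"]

print(type(series))
print(series.tolist())
```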
