Making a list of pandas dataframe row values from multiple columns
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/43023020/
Asked by Juho M
I have this data in a pandas.DataFrame:
Date, Team1, Team2, Team1 Score, Team2 Score, Event
8/2/17, Juventus, Milan, 2, 1, Friendly match
6/2/17, Milan, Napoli, 3, 0, Friendly match
5/1/17, Milan, Sampdoria, 1, 0, Friendly match
25/12/16, Parma, Milan, 0, 5, Friendly match
How can I make a list of the goals Milan scored?
The output should look like:
[1, 3, 1, 5]
Answered by Psidom
You can use boolean indexing on numpy arrays. Use values to get a 2D numpy array of the two score columns, then index it with a boolean mask that is True wherever the team is Milan:
df[["Team1 Score", "Team2 Score"]].values[df[["Team1", "Team2"]] == "Milan"]
# array([1, 3, 1, 5])
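To make the masking step easier to follow, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the question's data, and the variable names (mask, scores, milan_goals) are illustrative rather than taken from the answer. It converts the mask to a numpy array explicitly with to_numpy() instead of relying on numpy to convert the DataFrame implicitly.

```python
import pandas as pd

# Rebuild the question's data so the snippet runs on its own.
df = pd.DataFrame({
    "Date": ["8/2/17", "6/2/17", "5/1/17", "25/12/16"],
    "Team1": ["Juventus", "Milan", "Milan", "Parma"],
    "Team2": ["Milan", "Napoli", "Sampdoria", "Milan"],
    "Team1 Score": [2, 3, 1, 0],
    "Team2 Score": [1, 0, 0, 5],
    "Event": ["Friendly match"] * 4,
})

# Boolean mask with the same shape as the score array:
# True in the column where the team is Milan.
mask = (df[["Team1", "Team2"]] == "Milan").to_numpy()
scores = df[["Team1 Score", "Team2 Score"]].to_numpy()

# Boolean indexing walks the array row by row, so the goals
# come out in the original row order.
milan_goals = scores[mask].tolist()
print(milan_goals)  # [1, 3, 1, 5]
```

This works because the team columns and the score columns are listed in the same order, so position (row, column) in the mask lines up with the matching score.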
Answered by Miriam Farber
This will do the job:
pd.concat([df["Team1 Score"][df.Team1=='Milan'],df["Team2 Score"][df.Team2=='Milan']]).sort_index().values.tolist()
The output is [1, 3, 1, 5]
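The sort_index() call in that one-liner is what restores the original row order. The sketch below splits the expression into steps to show the difference; the intermediate names (as_team1, as_team2, combined) are made up for illustration, and the DataFrame is rebuilt from the question's data.

```python
import pandas as pd

# Only the columns needed for this example, taken from the question's data.
df = pd.DataFrame({
    "Team1": ["Juventus", "Milan", "Milan", "Parma"],
    "Team2": ["Milan", "Napoli", "Sampdoria", "Milan"],
    "Team1 Score": [2, 3, 1, 0],
    "Team2 Score": [1, 0, 0, 5],
})

# Scores from rows where Milan is listed first, then where it is listed second.
as_team1 = df.loc[df["Team1"] == "Milan", "Team1 Score"]
as_team2 = df.loc[df["Team2"] == "Milan", "Team2 Score"]

# concat simply stacks the two Series, keeping their original index labels.
combined = pd.concat([as_team1, as_team2])
unsorted_goals = combined.tolist()

# Sorting by index puts the goals back in the DataFrame's row order.
sorted_goals = combined.sort_index().tolist()

print(unsorted_goals)  # [3, 1, 1, 5]
print(sorted_goals)    # [1, 3, 1, 5]
```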
Answered by piRSquared
import numpy as np
# slice df with just team columns and get values
t = df[['Team1', 'Team2']].values
# find the row and column slices where equal to 'Milan'
i, j = np.where(t == 'Milan')
# then slice the scores array with those positions
s = df[['Team1 Score', 'Team2 Score']].values
s[i, j]
array([1, 3, 1, 5])
I can compress this further because I know where all the columns are:
v = df.values
i, j = np.where(v[:, [1, 2]] == 'Milan')
v[:, [3, 4]][i, j]
array([1, 3, 1, 5])
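For readers unsure what np.where returns here, the following sketch prints the row and column positions before using them. It rebuilds the question's data and uses illustrative names (teams, scores, rows, cols) that do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Only the columns needed for this example, taken from the question's data.
df = pd.DataFrame({
    "Team1": ["Juventus", "Milan", "Milan", "Parma"],
    "Team2": ["Milan", "Napoli", "Sampdoria", "Milan"],
    "Team1 Score": [2, 3, 1, 0],
    "Team2 Score": [1, 0, 0, 5],
})

teams = df[["Team1", "Team2"]].to_numpy()
scores = df[["Team1 Score", "Team2 Score"]].to_numpy()

# With a single condition, np.where returns one index array per dimension:
# the row positions and the column positions of every True cell.
rows, cols = np.where(teams == "Milan")
print(rows.tolist())  # [0, 1, 2, 3]
print(cols.tolist())  # [1, 0, 0, 1]

# Pairing the two index arrays picks one score per matching cell.
milan_goals = scores[rows, cols].tolist()
print(milan_goals)    # [1, 3, 1, 5]
```

Selecting the columns by name, as here, avoids depending on the fixed column positions 1 to 4 that the compressed version assumes.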
Answered by ??????
Milano squadra mia (Milan, my team)
df['tmp1'] = df.loc[df.Team1 == 'Milan', 'Team1 Score']
df['tmp2'] = df.loc[df.Team2 == 'Milan', 'Team2 Score']
df['milazzo'] = df.tmp1.fillna(0) + df.tmp2.fillna(0)
df.milazzo.tolist()
In [73]: df.milazzo.tolist()
Out[73]: [1.0, 3.0, 1.0, 5.0]
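The result above is a list of floats because the temporary columns contain NaN for the rows that do not match. A possible variation, sketched here under the assumption that Milan never appears in both team columns of the same row, uses Series.where with a fill value of 0 so that no NaN is introduced and no helper columns are added to df. The name goals is illustrative.

```python
import pandas as pd

# Only the columns needed for this example, taken from the question's data.
df = pd.DataFrame({
    "Team1": ["Juventus", "Milan", "Milan", "Parma"],
    "Team2": ["Milan", "Napoli", "Sampdoria", "Milan"],
    "Team1 Score": [2, 3, 1, 0],
    "Team2 Score": [1, 0, 0, 5],
})

# Keep each score only where Milan is the team in that column; use 0 elsewhere.
team1_part = df["Team1 Score"].where(df["Team1"] == "Milan", 0)
team2_part = df["Team2 Score"].where(df["Team2"] == "Milan", 0)

# Exactly one of the two parts is non-zero per row (by assumption),
# so the sum is Milan's score in that match.
goals = (team1_part + team2_part).tolist()
print(goals)
```

Like the original, this returns a value for every row, so it only matches the requested output when Milan plays in every match in the DataFrame.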
Answered by Tristan
You can also use apply:
outlist = df[(df['Team1'] == 'Milan') | (df['Team2'] == 'Milan')].apply(
    lambda k: k['Team1 Score'] if k['Team1'] == 'Milan' else k['Team2 Score'], axis=1
).tolist()
Answered by Stephen Rauch
You can use pandas.DataFrame.apply() with a function that returns the score for the team from whichever column it appears in.
Code:
def get_team_score(team):
    def f(row):
        if row.Team1 == team:
            return row['Team1 Score']
        if row.Team2 == team:
            return row['Team2 Score']
    return f
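One detail of this closure worth noting: when a row contains the team in neither column, the inner function falls through and returns None, so the result has a missing value for that row. The sketch below illustrates this with a team that plays only one of the four matches; the DataFrame is rebuilt from the question's data and the names result and juventus_goals are illustrative.

```python
import pandas as pd

def get_team_score(team):
    def f(row):
        if row.Team1 == team:
            return row['Team1 Score']
        if row.Team2 == team:
            return row['Team2 Score']
        # No explicit return here: rows without the team yield None.
    return f

# Only the columns needed for this example, taken from the question's data.
df = pd.DataFrame({
    "Team1": ["Juventus", "Milan", "Milan", "Parma"],
    "Team2": ["Milan", "Napoli", "Sampdoria", "Milan"],
    "Team1 Score": [2, 3, 1, 0],
    "Team2 Score": [1, 0, 0, 5],
})

# Juventus appears only in the first row, so the other three rows are missing.
result = df.apply(get_team_score("Juventus"), axis=1)

# dropna() discards the rows where the team did not play.
juventus_goals = result.dropna().tolist()
print(juventus_goals)
```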
Test Code:
import pandas as pd
from io import StringIO

# data is the StringIO buffer defined under "Test Data" below
df = pd.read_csv(data)
print(df)
print(df.apply(get_team_score('Milan'), axis=1).values)
Test Data:
import pandas as pd
from io import StringIO
data = StringIO(u"""Date,Team1,Team2,Team1 Score,Team2 Score,Event
8/2/17,Juventus,Milan,2,1,Friendly match
6/2/17,Milan,Napoli,3,0,Friendly match
5/1/17,Milan,Sampdoria,1,0,Friendly match
25/12/16,Parma,Milan,0,5,Friendly match
""")
Results:
       Date     Team1      Team2  Team1 Score  Team2 Score           Event
0    8/2/17  Juventus      Milan            2            1  Friendly match
1    6/2/17     Milan     Napoli            3            0  Friendly match
2    5/1/17     Milan  Sampdoria            1            0  Friendly match
3  25/12/16     Parma      Milan            0            5  Friendly match
[1 3 1 5]