Original question: http://stackoverflow.com/questions/43654727/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Pandas - retrieve row and column name for each element during applymap
Asked by jim basquiat
I am trying to compare two lists of strings for similarity and present the result in a pandas DataFrame for inspection, so I use one list as the index and the other as the columns. I then want to compute the "Levenshtein similarity" for each pair (a function that compares the similarity between two words).
I am trying to achieve that with applymap, which would visit every cell and compare the cell's index label to its column label. But how can I do that? Or is there a simpler way?
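import pandas as pd
import Levenshtein

things = ['car', 'bike', 'sidewalk', 'eatery']
action = ['walking', 'caring', 'biking', 'eating']
matrix = pd.DataFrame(index=things, columns=action)

def lev(x):
    # intended behaviour only: applymap passes just the cell value,
    # so x has no .index or .column to read here
    x = Levenshtein.distance(x.index, x.column)

matrix.applymap(lev)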
So far I have resorted to the following, but I find it clumsy and slow:
matrix = pd.DataFrame(data=[action for i in things], index=things, columns=action)
for i, values in matrix.iterrows():
    for j, value in enumerate(values):
        # .ix was removed in pandas 1.0; use a label-based lookup instead
        matrix.loc[i, matrix.columns[j]] = Levenshtein.distance(i, value)
Answered by jezrael
I think you can use apply, and read the column label from .name:
def lev(x):
    # replace with your own function
    return x.index + x.name

a = matrix.apply(lev)
print (a)
                  walking          caring          biking          eating
car            carwalking       carcaring       carbiking       careating
bike          bikewalking      bikecaring      bikebiking      bikeeating
sidewalk  sidewalkwalking  sidewalkcaring  sidewalkbiking  sidewalkeating
eatery      eaterywalking    eaterycaring    eaterybiking    eateryeating
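To make the placeholder concrete, here is a minimal sketch of the same column-wise apply computing an actual edit distance. The helper edit_distance is a hypothetical pure-Python stand-in for Levenshtein.distance (so the example runs without the third-party package), and lev_column is an illustrative name, not part of the original answer.

```python
import pandas as pd

def edit_distance(a, b):
    # Hypothetical stand-in for Levenshtein.distance:
    # classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

things = ['car', 'bike', 'sidewalk', 'eatery']
action = ['walking', 'caring', 'biking', 'eating']
matrix = pd.DataFrame(index=things, columns=action)

def lev_column(col):
    # col is one column as a Series: col.name is the column label,
    # col.index holds the row labels.
    return pd.Series([edit_distance(row_label, col.name) for row_label in col.index],
                     index=col.index)

distances = matrix.apply(lev_column)
print(distances)
```

The function is still called once per cell, so this mainly tidies the bookkeeping rather than changing the amount of work done.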
EDIT:
If you need an arithmetic operation, use broadcasting:
# index values as a column vector, column labels as a row:
# cell [i, j] = index[i] + columns[j]
a = pd.DataFrame(matrix.index.values[:, None] + matrix.columns.values,
                 index=matrix.index,
                 columns=matrix.columns)
print (a)
                  walking          caring          biking          eating
car            carwalking       carcaring       carbiking       careating
bike          bikewalking      bikecaring      bikebiking      bikeeating
sidewalk  sidewalkwalking  sidewalkcaring  sidewalkbiking  sidewalkeating
eatery      eaterywalking    eaterycaring    eaterybiking    eateryeating
Or:
a = pd.DataFrame(matrix.index.values[:, np.newaxis] + matrix.columns.values,
                 index=matrix.index,
                 columns=matrix.columns)
print (a)
                  walking          caring          biking          eating
car            carwalking       carcaring       carbiking       careating
bike          bikewalking      bikecaring      bikebiking      bikeeating
sidewalk  sidewalkwalking  sidewalkcaring  sidewalkbiking  sidewalkeating
eatery      eaterywalking    eaterycaring    eaterybiking    eateryeating
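String concatenation broadcasts directly, but an arbitrary Python function such as an edit distance does not. One possible way to keep the broadcasting shape is np.vectorize, sketched below. The helper edit_distance is again a hypothetical pure-Python stand-in for Levenshtein.distance. Note that np.vectorize is a convenience wrapper around a Python loop, so it is unlikely to be faster than the apply version.

```python
import numpy as np
import pandas as pd

def edit_distance(a, b):
    # Hypothetical stand-in for Levenshtein.distance (dynamic programming).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

things = ['car', 'bike', 'sidewalk', 'eatery']
action = ['walking', 'caring', 'biking', 'eating']

# Row labels as a (4, 1) column vector, column labels as a (1, 4) row vector;
# broadcasting pairs every row label with every column label.
rows = np.array(things, dtype=object)[:, None]
cols = np.array(action, dtype=object)[None, :]

pairwise = np.vectorize(edit_distance, otypes=[int])(rows, cols)
result = pd.DataFrame(pairwise, index=things, columns=action)
print(result)
```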
Answered by chaonan99
You can do that with a "nested apply" as follows:
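# LD is not defined in the original answer; it is assumed to be a
# Levenshtein distance function such as Levenshtein.distance
things = ['car', 'bike', 'sidewalk', 'eatery']
action = ['walking', 'caring', 'biking', 'eating']
matrix = pd.DataFrame(index=things, columns=action)
matrix.apply(lambda x: pd.DataFrame(x).apply(lambda y: LD(x.name, y.name), axis=1))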
Output:
          walking  caring  biking  eating
car             6       3       6       5
bike            6       5       3       5
sidewalk        7       8       7       8
eatery          6       5       6       3
The call to pd.DataFrame(x) is needed because x is a Series object, and Series.apply behaves like applymap: it passes only the cell values and does not carry index or columns information.
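The difference can be seen in a small sketch. The Series col and its values are made up purely for illustration: Series.apply receives bare values, while applying along axis=1 of a one-column DataFrame receives each row as a Series whose .name is the row label.

```python
import pandas as pd

# Illustrative column: row labels 'car' and 'bike', column label 'walking'.
col = pd.Series(["x", "y"], index=["car", "bike"], name="walking")

# Series.apply hands the function each cell value only; no labels come along.
values_seen = col.apply(lambda v: v.upper())

# Wrapping the column in a one-column DataFrame and applying along axis=1
# passes each row as a Series, so row.name is the row label.
pairs = pd.DataFrame(col).apply(lambda row: f"{row.name}/{col.name}", axis=1)

print(values_seen)
print(pairs)
```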