Python pandas DataFrame warning suggests using .loc instead?

Question

Asked by KubiK888

Hi, I would like to clean my data by removing rows with missing information and converting all letters to lowercase. But for the lowercase conversion, I get this warning:

E:\Program Files Extra\Python27\lib\site-packages\pandas\core\frame.py:1808: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
  "DataFrame index.", UserWarning)
C:\Users\KubiK\Desktop\FamSeach_NameHandling.py:18: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  frame3["name"] = frame3["name"].str.lower()

C:\Users\KubiK\Desktop\FamSeach_NameHandling.py:19: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  frame3["ethnicity"] = frame3["ethnicity"].str.lower()

import pandas as pd
from pandas import DataFrame

# Get csv file into data frame
data = pd.read_csv("C:\Users\KubiK\Desktop\OddNames_sampleData.csv")
frame = DataFrame(data)
frame.columns = ["name", "ethnicity"]
name = frame.name
ethnicity = frame.ethnicity

# Remove missing ethnicity data cases
index_missEthnic = frame.ethnicity.isnull()
index_missName = frame.name.isnull()
frame2 = frame[index_missEthnic != True]
frame3 = frame2[index_missName != True]

# Make all letters into lowercase
frame3["name"] = frame3["name"].str.lower()
frame3["ethnicity"] = frame3["ethnicity"].str.lower()

# Test outputs
print frame3

This warning doesn't seem to be fatal (at least for my small sample data), but how should I deal with this?

Sample data

Name    Ethnicity
Thos C. Martin                              Russian
Charlotte Wing                              English
Frederick A T Byrne                         Canadian
J George Christe                            French
Mary R O'brien                              English
Marie A Savoie-dit Dugas                    English
J-b'te Letourneau                           Scotish
Jane Mc-earthar                             French
Amabil?? Bonneau                            English
Emma Lef??c                                 French
C., Akeefe                                  African
D, James Matheson                           English
Marie An: Thomas                            English
Susan Rrumb;u                               English
                                            English
Kaio Chan

Answer 1

Accepted answer by Alexander

When you set frame2/3, try using .loc as follows:

frame2 = frame.loc[~index_missEthnic, :]
frame3 = frame2.loc[~index_missName, :]

I think this would fix the warning you're seeing:

frame3.loc[:, "name"] = frame3.loc[:, "name"].str.lower()
frame3.loc[:, "ethnicity"] = frame3.loc[:, "ethnicity"].str.lower()
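
Another way to approach the same cleanup, offered here only as a minimal sketch and not as part of the original answer, is to drop the incomplete rows in one step and take an explicit copy before assigning. The small frame below is a hypothetical stand-in for the asker's CSV, the name `cleaned` is made up for illustration, and the code uses Python 3 syntax rather than the Python 2.7 of the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's CSV; the values are illustrative only.
frame = pd.DataFrame({
    "name": ["Thos C. Martin", None, "Kaio Chan"],
    "ethnicity": ["Russian", "English", None],
})

# Drop rows missing either field, then take an explicit copy so that
# later assignments act on an independent DataFrame rather than on a slice.
cleaned = frame.dropna(subset=["name", "ethnicity"]).copy()

# Lowercase both text columns on the independent copy.
cleaned["name"] = cleaned["name"].str.lower()
cleaned["ethnicity"] = cleaned["ethnicity"].str.lower()

print(cleaned)
```

Because `cleaned` is its own object, the assignments leave the original `frame` unchanged, which is presumably what the asker intended.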

You can also try the following, although it doesn't answer your question:

frame3.loc[:, "name"] = [t.lower() if isinstance(t, str) else t for t in frame3.name]
frame3.loc[:, "ethnicity"] = [t.lower() if isinstance(t, str) else t for t in frame3.ethnicity]

This converts any string in the column into lowercase, otherwise it leaves the value untouched.
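
To illustrate the difference on a column that mixes strings with other values, here is a small sketch. The frame `mixed` and its contents are hypothetical and not taken from the question's data, and the code uses Python 3 syntax.

```python
import pandas as pd

# Hypothetical mixed-type column: a string, a number and a missing value.
mixed = pd.DataFrame({"name": ["Jane Mc-earthar", 42, None]})

# The .str accessor yields NaN for entries that are not strings.
via_str = mixed["name"].str.lower()

# The comprehension lowercases real strings and passes everything else through.
via_comprehension = [t.lower() if isinstance(t, str) else t for t in mixed["name"]]
mixed.loc[:, "name"] = via_comprehension

print(via_str.tolist())
print(mixed["name"].tolist())
```

In this sketch the number 42 survives the comprehension, while the `.str.lower()` result has a missing value in its place.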

Answer 2

Answer by Primer

Not sure why you need so many booleans... Also note that .isnull() does not catch empty strings. Filtering out empty strings before applying .lower() doesn't seem necessary either. But if there is a need, this works for me:

frame = pd.DataFrame({'name':['Abc Def', 'EFG GH', ''], 'ethnicity':['Ethnicity1','', 'Ethnicity2']})
print frame

    ethnicity     name
0  Ethnicity1  Abc Def
1               EFG GH
2  Ethnicity2         

name_null = frame.name.str.len() == 0
frame.loc[~name_null, 'name'] = frame.loc[~name_null, 'name'].str.lower()
print frame

    ethnicity     name
0  Ethnicity1  abc def
1               efg gh
2  Ethnicity2
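
The point that .isnull() does not catch empty strings can be checked with a short sketch. The series `names` is hypothetical, chosen to contain one ordinary value, one empty string and one missing value, and the code uses Python 3 syntax.

```python
import pandas as pd

# Hypothetical values: a normal name, an empty string and a missing entry.
names = pd.Series(["Abc Def", "", None])

# isnull() flags only the missing entry, not the empty string.
is_missing = names.isnull()

# A length check flags the empty string instead.
is_empty = names.str.len() == 0

print(is_missing.tolist())
print(is_empty.tolist())
```

If both kinds of blanks should be excluded, the two masks would need to be combined before filtering.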
