Python 使用 pandas 和 str.strip 崩溃

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/21286558/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-13 21:36:28  来源:igfitidea点击:

Python crashes using pandas and str.strip

pythonpandasstrip

提问by Yariv

This minimal code crashes my Python. (Setting: pandas 0.13.0, python 2.7.3 AMD64, Win7.)

这个最小的代码使我的 Python 崩溃。(设置:pandas 0.13.0、python 2.7.3 AMD64、Win7。)

import pandas as pd
input_file = r"c3.csv"
input_df = pd.read_csv(input_file)
for col in input_df.columns:  # strip whitespaces from string values
    if input_df[col].dtype == object:
        input_df[col] = input_df[col].apply(lambda x: x.strip())
print 'start'
for idx in range(len(input_df)):
    input_df['LL'].iloc[idx] = 3
    print idx
print 'finished'

Output:

输出:

start
0

Process finished with exit code -1073741819

What prevents the crash:

什么可以防止崩溃:

  1. Removing lines from c3.csv.
  2. Removing .strip()from the code.
  3. Changing c3.csv changes the amount of foriterations until the crash in unexpected ways.
  1. 从 c3.csv 中删除行。
  2. .strip()从代码中删除。
  3. 更改 c3.csv 会更改for迭代次数,直到以意想不到的方式崩溃。

Contents of c3.csv:

c3.csv 的内容:

 Size    , B/S , Symbol    , Type , BN , Duration , VR , Time    , SR ,LL,
0, xxxx , xxxx0 ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
00, xxxx , xxxxx ,   ,, xxx , 00000 , 00:00:00 , 000000000 , 00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,
0, xxxx , xxxxx ,   ,, xxx , 00000 , 00-00:00:00 , 000000000 , 00-00:00:00 ,

回答by Jeff

You are doing a chained assignment which can behave in unexpected ways. see here: http://pandas.pydata.org/pandas-docs/dev/indexing.html#indexing-view-versus-copy. This is fixed in master and will work in 0.13.1 (coming soon). see here: https://github.com/pydata/pandas/pull/6031

您正在执行一个可能以意想不到的方式运行的链式作业。请参见此处:http: //pandas.pydata.org/pandas-docs/dev/indexing.html#indexing-view-versus-copy。这已在 master 中修复,并将在 0.13.1(即将推出)中起作用。见这里:https: //github.com/pydata/pandas/pull/6031

This is not correct to do:

这样做是不正确的:

input_df['LL'].iloc[idx] = 3

Instead do:

而是这样做:

input_df.ix[ix,'LL'] = 3

Or even better (as you are assigning ALL rows to 3)

甚至更好(因为您将所有行分配给 3)

input_df['LL'] = 3

If you are assigning just some of the rows (and have say an integer/boolean indexer)

如果您只分配一些行(并说一个整数/布尔索引器)

input_df.ix[indexer,'LL'] = 3

You should also just do this to strip the whitespace:

您也应该这样做以去除空格:

input_df[col] = input_df[col].str.strip()