Original question: http://stackoverflow.com/questions/22792740/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Truncating column width in pandas
Asked by Luke
I'm reading large CSV files into pandas, some of them with string columns that are thousands of characters long. Is there a quick way to limit the width of a column, i.e. keep only the first 100 characters?
Answered by DSM
If you can read the whole thing into memory, you can use the str accessor for vectorized operations:
>>> import pandas as pd
>>> df = pd.read_csv("toolong.csv")
>>> df
   a                       b  c
0  1  1256378916212378918293  2

[1 rows x 3 columns]
>>> df["b"] = df["b"].str[:10]
>>> df
   a           b  c
0  1  1256378916  2

[1 rows x 3 columns]
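For the 100-character limit from the question, the same slicing can be applied to every string column in one pass. This is only a minimal sketch: the in-memory CSV, the MAX_WIDTH name, and the use of pd.api.types.is_string_dtype to pick out the text columns are my own assumptions for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for "toolong.csv".
csv_text = "a,b,c\n1," + "x" * 250 + ",2\n3,short,4\n"

df = pd.read_csv(io.StringIO(csv_text))

MAX_WIDTH = 100  # the limit mentioned in the question

# Slice every string column; numeric columns are left untouched.
for col in df.columns:
    if pd.api.types.is_string_dtype(df[col]):
        df[col] = df[col].str[:MAX_WIDTH]

print(df["b"].str.len().tolist())
```

Values shorter than the limit pass through unchanged, since slicing past the end of a string is harmless.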
Also note that you can get a Series of the string lengths using:
>>> df["b"].str.len()
0    10
Name: b, dtype: int64
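The lengths are also handy for checking which rows would be affected before truncating anything. A small sketch, assuming a made-up frame with one over-long value and one missing value (the data and the 100-character threshold are illustrative only):

```python
import pandas as pd

# Hypothetical frame: "b" holds one over-long string and one missing value.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x" * 250, "short", None]})

lengths = df["b"].str.len()   # missing values stay missing in the result
too_long = lengths > 100      # a missing length compares as False

print(int(lengths.max()))
print(df.loc[too_long, "a"].tolist())
```

Because the missing entry compares as False, it is simply left out of the mask rather than raising an error.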
I was originally wondering if
>>> pd.read_csv("toolong.csv", converters={"b": lambda x: x[:5]})
   a      b  c
0  1  12563  2

[1 rows x 3 columns]
would be better, but I don't actually know whether the converters are called row by row or after the fact on the whole column.
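One way to check how a converter is invoked is to record every argument it receives. The sketch below is an experiment rather than a statement about pandas internals: the in-memory CSV, the truncate helper, and the seen list are hypothetical names introduced here, and the behaviour may vary between pandas versions.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for "toolong.csv".
csv_text = "a,b,c\n1,1256378916212378918293,2\n3,99887766554433221100,4\n"

seen = []  # every argument the converter is called with


def truncate(value):
    # Record the argument so we can inspect what pandas passes in.
    seen.append(value)
    return value[:5]


df = pd.read_csv(io.StringIO(csv_text), converters={"b": truncate})

print(seen)
print(df["b"].tolist())
```

In the versions I have tried, seen ends up holding one raw string per cell of column b, which suggests the converter is called value by value rather than once on the whole column. So it truncates while parsing, but with one Python-level call per row.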

