pandas: convert multiple columns to string

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/37035182/

Time: 2020-09-14 01:10:17  Source: igfitidea

Tags: string, python-2.7, pandas

Asked by As3adTintin

I have some columns ['a', 'b', 'c', etc.] (a and c are float64 while b is object).

I would like to convert all columns to strings and preserve NaNs.

I tried using df[['a', 'b', 'c']] == df[['a', 'b', 'c']].astype(str) but that left blanks for the float64 columns.

Currently I am going through the columns one by one with the following:

df['a'] = df['a'].apply(str)
df['a'] = df['a'].replace('nan', np.nan)

Is the best way to use .astype(str) and then replace '' with np.nan? Side question: is there a difference between .astype(str) and .apply(str)?

Sample input: (dtypes: a=float64, b=object, c=float64)

a, b, c, etc.
23, 'a42', 142, etc.
51, '3', 12, etc.
NaN, NaN, NaN, etc.
24, 'a1', NaN, etc.

Desired output: (dtypes: a=object, b=object, c=object)

a, b, c, etc.
'23', 'a42', '142', etc.
'51', '3', '12', etc.
NaN, NaN, NaN, etc.
'24', 'a1', NaN, etc.

Answered by Alexander

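import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ["a42", "3", np.nan, "a1"],
    'c': [142.0, 12.0, np.nan, np.nan]})

# Keep NaN as NaN, leave strings unchanged, and turn floats into
# integer strings (note: int() drops any fractional part).
for col in df:
    df[col] = [np.nan if (not isinstance(val, str) and np.isnan(val)) else
               (val if isinstance(val, str) else str(int(val)))
               for val in df[col].tolist()]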

>>> df
     a    b    c
0   23  a42  142
1   51    3   12
2  NaN  NaN  NaN
3   24   a1  NaN

>>> df.values
array([['23', 'a42', '142'],
       ['51', '3', '12'],
       [nan, nan, nan],
       ['24', 'a1', nan]], dtype=object)
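
The same idea can also be written as a small per-value function applied to every column. This is only a sketch: the helper name to_str_keep_nan is hypothetical (not part of pandas or of the answer above), and it assumes that whole-number floats should print without the trailing ".0" while other numbers keep their normal str() form.

```python
import numpy as np
import pandas as pd


def to_str_keep_nan(val):
    """Hypothetical helper: stringify one value, leaving missing values as NaN."""
    if pd.isna(val):
        return np.nan
    if isinstance(val, str):
        return val
    # Whole-number floats such as 23.0 become '23'; anything else uses str().
    if isinstance(val, float) and val.is_integer():
        return str(int(val))
    return str(val)


df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ["a42", "3", np.nan, "a1"],
    'c': [142.0, 12.0, np.nan, np.nan]})

# Map the helper over each column; the original df is left unchanged.
out = df.apply(lambda col: col.map(to_str_keep_nan))
```

Unlike the list comprehension, this version does not truncate a value such as 2.5 to '2'.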

Answered by Surya

You could apply the .astype() function to the whole dataframe, or select only the columns of interest and convert those to strings, as follows.

In [41]: df1 = pd.DataFrame({
    ...:     'a': [23.0, 51.0, np.nan, 24.0],
    ...:     'b': ["a42", "3", np.nan, "a1"],
    ...:     'c': [142.0, 12.0, np.nan, np.nan]})
    ...: 

In [42]: df1
Out[42]: 
      a    b      c
0  23.0  a42  142.0
1  51.0    3   12.0
2   NaN  NaN    NaN
3  24.0   a1    NaN

### Shows current data type of the columns:
In [43]: df1.dtypes
Out[43]: 
a    float64
b     object
c    float64
dtype: object

### Applying .astype() on each element of the dataframe converts the datatype to string
In [45]: df1.astype(str).dtypes
Out[45]: 
a    object
b    object
c    object
dtype: object

### Or, you could select the column of interest to convert it to strings
In [48]: df1[["a", "b", "c"]] = df1[["a","b", "c"]].astype(str)

In [49]: df1.dtypes ### Datatype update
Out[49]: 
a    object
b    object
c    object
dtype: object
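
Since the question asks to preserve NaNs, it may be worth noting that in pandas versions where .astype(str) stringifies missing values to the literal 'nan', the conversion above does not keep them. A minimal sketch of one way to restore them, assuming the original frame is still available to build the mask (the variable name out is illustrative):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ["a42", "3", np.nan, "a1"],
    'c': [142.0, 12.0, np.nan, np.nan]})

# Convert everything to strings, then keep a cell only where the
# original value was not missing; masked cells fall back to NaN.
out = df1.astype(str).where(df1.notna())
```

Note that the floats keep their decimal form here (for example '23.0'), so this differs from the integer-style strings in the desired output.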

Answered by Raj

This gives you the list of column names:

lst = list(df)

This converts all the columns to string type:

df[lst] = df[lst].astype(str)
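
If only some columns need converting, the column list does not have to be typed by hand. The sketch below assumes that "the columns to convert" means the numeric ones, selects them with select_dtypes, and passes a column-to-dtype mapping to astype; the names numeric_cols and converted are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ["a42", "3", np.nan, "a1"],
    'c': [142.0, 12.0, np.nan, np.nan]})

# Pick only the numeric columns (here 'a' and 'c'); 'b' is left alone.
numeric_cols = list(df.select_dtypes(include="number").columns)

# astype accepts a {column: dtype} mapping and returns a new DataFrame.
converted = df.astype({col: str for col in numeric_cols})
```

How missing values come out of astype(str) depends on the pandas version, so they may still need separate handling afterwards.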

Answered by ahmad fairuz

I did it this way.

Get all the values from a specific column, e.g. 'text'.

k = df['text'].values

Then append each value to a newly declared string, e.g. 'thestring':

thestring = ""
for i in range(0,len(k)):
    thestring += k[i]
print(thestring)

Hence, all the strings in the pandas column 'text' have been put into one string variable.
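
The loop can also be expressed with str.join. This is a sketch under the assumption that missing values should simply be skipped (the loop above would raise a TypeError if the column contained a NaN); the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample column with one missing value.
df = pd.DataFrame({"text": ["foo", "bar", np.nan, "baz"]})

# Drop missing entries, make sure every value is a string, then concatenate.
thestring = "".join(df["text"].dropna().astype(str))
print(thestring)  # prints: foobarbaz
```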

Cheers, fairuz