pandas: How to create a dataframe column with a repeated string value?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/35557872/

Time: 2020-09-14 00:44:14  Source: igfitidea

How to create a dataframe column with repeated string value?

Tags: python, string, pandas, dataframe

Asked by Katie R

I'm reading in data from a bunch of files and storing it in a data frame. I want a column of the data frame to indicate which file the data came from. How do I create a column that has the same string repeated over and over without typing it out manually?

Each file I'm reading in has ~100 data points (but not the same number each time). As I read each one in, I will concat it to the data frame along axis=0. It should look like this:

import numpy as np
import pandas as pd
numbers = np.random.randn(5) # this data could be of any length, ~100
labels = np.array(['file01','file01','file01','file01','file01']) 
tf = pd.DataFrame()
tf['labels'] = labels
tf['numbers'] = numbers

In [8]: tf
Out[8]: 
   labels   numbers
0  file01 -0.176737
1  file01 -1.243871
2  file01  0.154886
3  file01  0.236653
4  file01 -0.195053

(Yes, I know I could make 'file01' a column header and append each one along axis=1, but there are reasons I don't want to do it that way.)

Accepted answer by Flavian Hautbois

There you go, your code is fixed! You can actually put a single (scalar) value in the dict passed to the DataFrame constructor, and pandas will repeat it to match the length of the other columns :).

import numpy as np
import pandas as pd
filename = 'file01'
numbers = np.random.randn(5) # this data could be of any length, ~100
tf = pd.DataFrame({'labels': filename, 'numbers': numbers})

In [8]: tf
Out[8]: 
   labels   numbers
0  file01 -0.176737
1  file01 -1.243871
2  file01  0.154886
3  file01  0.236653
4  file01 -0.195053
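
For the full workflow described in the question (several files of different lengths, stacked along axis=0), a minimal sketch might look like the following. It is only an illustration: the `fake_files` dict is a hypothetical stand-in for whatever your file-reading code returns, and the second name 'file02' is made up. It relies on the same idea as above, namely that assigning a single string to a column repeats it for every row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data read from files:
# maps a file name to an array of values, each of a different length.
rng = np.random.default_rng(0)
fake_files = {
    'file01': rng.standard_normal(5),
    'file02': rng.standard_normal(3),
}

frames = []
for filename, numbers in fake_files.items():
    part = pd.DataFrame({'numbers': numbers})
    # Assigning a scalar to a column repeats it for every row of the frame.
    part['labels'] = filename
    frames.append(part)

# Stack the per-file frames along axis=0; ignore_index=True renumbers
# the rows 0..n-1 instead of keeping each frame's own index.
tf = pd.concat(frames, axis=0, ignore_index=True)
print(tf)
```

Collecting the pieces in a list and calling `pd.concat` once at the end is usually cheaper than concatenating inside the loop, since each concat copies the data.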