Use TQDM Progress Bar with Pandas
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/47087741/
Asked by sslack88
Is it possible to use a TQDM progress bar when importing and indexing large datasets using Pandas?
Here is an example of some 5-minute data that I am importing, indexing, and converting with to_datetime. It takes a while, and it would be nice to see a progress bar.
import pandas as pd

# Import the CSV file into a pandas DataFrame, convert 'Gmt time' to datetime, and set it as the index
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))
Answered by Arjun Kava
Get the number of rows from the DataFrame's shape and pass it to tqdm as total, so the bar knows its length:
from tqdm import tqdm

# iterrows() is a generator with no len(), so pass the row count via total
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    print("index", index)
    print("row", row)
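To try the pattern above without the original CSV, here is a minimal self-contained sketch. The frame `df`, its column names, and the running sum are made up for illustration; only `tqdm(..., total=...)` and `iterrows()` come from the answer.

```python
import pandas as pd
from tqdm import tqdm

# Hypothetical stand-in data; the column names are illustrative only
df = pd.DataFrame({"open": [1.10, 1.11, 1.12], "close": [1.11, 1.12, 1.13]})

total_change = 0.0
# iterrows() is a generator, so tqdm needs total= to show a percentage
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    total_change += row["close"] - row["open"]
```

Without `total`, tqdm would still count iterations but could not show a percentage or an estimated time remaining.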
Answered by sonance207
I have used something like this when iterating over DataFrame rows.
from tqdm import tqdm

# len(Df) gives the row count without materializing iterrows() into a list
with tqdm(total=len(Df)) as pbar:
    for index, row in Df.iterrows():
        pbar.update(1)
It's not the best, but it works until they fix the issue with pandas.
Answered by Zeke Arneodo
There is a workaround for tqdm > 4.24. As per https://github.com/tqdm/tqdm#pandas-integration:
import pandas as pd
from tqdm import tqdm
# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")
# Call pd.Timestamp on each value; a bare `pd.Timestamp` would return the class itself
eurusd_ask['t_stamp'] = eurusd_ask['Gmt time'].progress_apply(lambda x: pd.Timestamp(x))
eurusd_ask.set_index(['t_stamp'], inplace=True)
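For a version that runs without the CSV file, the following is a rough sketch of the same idea on a tiny in-memory frame. The sample rows and the `%d.%m.%Y %H:%M:%S.%f` format string are assumptions about what the 'Gmt time' column looks like; adjust them to the real data.

```python
import pandas as pd
from tqdm import tqdm

# Registers progress_apply on pandas objects (see the tqdm README)
tqdm.pandas(desc="parsing timestamps")

# Hypothetical sample; the timestamp format is an assumption about the CSV
frame = pd.DataFrame({
    "Gmt time": ["01.01.2012 00:00:00.000", "01.01.2012 00:05:00.000"],
    "Close": [1.2945, 1.2947],
})

# Parse one value at a time so the bar can advance per row
parsed = frame.pop("Gmt time").progress_apply(
    lambda s: pd.to_datetime(s, format="%d.%m.%Y %H:%M:%S.%f")
)
frame.index = pd.DatetimeIndex(parsed, name="Gmt time")
```

Parsing element by element is slower than calling `pd.to_datetime` once on the whole column, so this trades some speed for visible progress.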
Answered by ZeerakW
You could fill a pandas DataFrame line by line by reading the file normally and adding each line as a new row, though this would be a fair bit slower than using pandas' own reading methods.
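One way this might look is sketched below. Note that it is a variation rather than exactly what the answer describes: the rows are collected in a plain list and the DataFrame is built once at the end, instead of appending to the frame row by row. The in-memory `raw` buffer and its contents are hypothetical stand-ins for the real file.

```python
import csv
import io

import pandas as pd
from tqdm import tqdm

# Hypothetical in-memory stand-in for the CSV file on disk
raw = io.StringIO(
    "Gmt time,Close\n"
    "01.01.2012 00:00:00.000,1.2945\n"
    "01.01.2012 00:05:00.000,1.2947\n"
)

reader = csv.reader(raw)
header = next(reader)
# Without total=, tqdm shows a running row count rather than a percentage
rows = [line for line in tqdm(reader, desc="reading rows")]

# Build the frame once at the end instead of growing it row by row;
# every value is still a string at this point and needs converting
loaded = pd.DataFrame(rows, columns=header)
```

With a real file, `raw` would be replaced by an `open(...)` handle, and the columns would still need converting to numeric and datetime types afterwards.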