Use TQDM Progress Bar with Pandas

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/47087741/

Date: 2020-09-14 04:43:27  Source: igfitidea


Tags: python, pandas, tqdm

Asked by sslack88

Is it possible to use TQDM progress bar when importing and indexing large datasets using Pandas?


Here is an example of some 5-minute data that I am importing, indexing, and converting with to_datetime. It takes a while, and it would be nice to see a progress bar.


import pandas as pd

# Import the CSV file into a pandas DataFrame, convert 'Gmt time' to datetime, and set it as the index
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))

Answered by Arjun Kava

Get the number of rows from the DataFrame's shape and pass it to tqdm as total:


from tqdm import tqdm

# iterrows() is a generator with no length, so pass the row count as total
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    print("index", index)
    print("row", row)

Answered by sonance207

I have used something like this when iterating over DataFrame rows.


    from tqdm import tqdm

    # len(Df) gives the row count without building a list of every row first
    with tqdm(total=len(Df)) as pbar:
        for index, row in Df.iterrows():
            pbar.update(1)

Not the best, but it works until the issue with pandas is fixed.


Answered by Zeke Arneodo

There is a workaround for tqdm > 4.24. As per https://github.com/tqdm/tqdm#pandas-integration:


    import pandas as pd
    from tqdm import tqdm

    # Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
    # (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
    tqdm.pandas(desc="my bar!")
    # Convert each value to a Timestamp while the bar advances
    eurusd_ask['t_stamp'] = eurusd_ask['Gmt time'].progress_apply(lambda x: pd.Timestamp(x))
    eurusd_ask.set_index(['t_stamp'], inplace=True)
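
For anyone who wants to try the pandas integration without the original CSV, here is a minimal self-contained sketch. The small DataFrame and its values are made up for illustration and only stand in for the 'Gmt time' column from the question; the ISO-style timestamp format is an assumption, since the real file's format is not shown.

```python
import pandas as pd
from tqdm import tqdm

# Hypothetical stand-in for the CSV in the question (values are invented).
df = pd.DataFrame({
    "Gmt time": [
        "2012-01-01 00:00:00",
        "2012-01-01 00:05:00",
        "2012-01-01 00:10:00",
    ],
    "Open": [1.2945, 1.2946, 1.2944],
})

# Adds progress_apply to pandas Series and DataFrame objects.
tqdm.pandas(desc="parsing timestamps")

# Works like .apply(pd.Timestamp), but draws a progress bar while it runs.
df["t_stamp"] = df["Gmt time"].progress_apply(pd.Timestamp)
df = df.set_index("t_stamp")
```

Keep in mind that an element-wise apply is generally slower than a single vectorized pd.to_datetime call, so the progress bar comes at the cost of some speed.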

Answered by ZeerakW

You could fill a pandas DataFrame line by line by reading the file normally and adding each line as a new row, though this would be a fair bit slower than using pandas' own reading methods.
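
A rough sketch of that idea, not taken from the answer itself: the file is read line by line with the standard csv module wrapped in tqdm. It differs from the answer in one respect: rows are collected in a list and the DataFrame is built once at the end, rather than appended to row by row. The in-memory sample and its column names are made up so the snippet runs on its own.

```python
import csv
import io

import pandas as pd
from tqdm import tqdm

# Hypothetical in-memory sample standing in for a large CSV file on disk.
sample = io.StringIO(
    "Gmt time,Open,Close\n"
    "2012-01-01 00:00:00,1.2945,1.2946\n"
    "2012-01-01 00:05:00,1.2946,1.2944\n"
    "2012-01-01 00:10:00,1.2944,1.2947\n"
)

reader = csv.reader(sample)
header = next(reader)  # the first line holds the column names

rows = []
# The line count of a stream is not known up front, so without `total`
# tqdm shows a running count and rate instead of a percentage.
for line in tqdm(reader, desc="reading rows", unit="rows"):
    rows.append(line)

# Build the frame once; the csv module yields strings, so the value
# columns stay as text until they are converted explicitly.
df = pd.DataFrame(rows, columns=header)
df.index = pd.to_datetime(df.pop("Gmt time"))
```

With a real file, you would open it with open(path, newline="") in place of the StringIO object; counting the lines beforehand and passing that as total would give a percentage bar at the cost of an extra pass over the file.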
