Python TensorFlow - Importing data from a TensorBoard TFEvent file?

Disclaimer: this page is a translated mirror of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute the original authors (not me). Original question: http://stackoverflow.com/questions/37304461/


TensorFlow - Importing data from a TensorBoard TFEvent file?

Tags: python, tensorflow, tensorboard

Asked by golmschenk

I've run several training sessions with different graphs in TensorFlow. The summaries I set up show interesting results in the training and validation. Now, I'd like to take the data I've saved in the summary logs and perform some statistical analysis and in general plot and look at the summary data in different ways. Is there any existing way to easily access this data?


More specifically, is there any built in way to read a TFEvent record back into Python?


If there is no simple way to do this, TensorFlow states that all its file formats are protobuf files. From my understanding of protobufs (which is limited), I think I'd be able to extract this data if I have the TFEvent protocol specification. Is there an easy way to get ahold of this? Thank you much.

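For reference, the event files appear to be TFRecord files of Event protocol buffers (the message is defined in tensorflow/core/util/event.proto), so a rough sketch of the kind of extraction I have in mind, assuming that is right and using the TF 1.x-era record iterator, would be:

# Rough sketch; the event-file path is a placeholder.
import tensorflow as tf
from tensorflow.core.util import event_pb2

for record in tf.python_io.tf_record_iterator("/path/to/events.out.tfevents.XXXX"):
    event = event_pb2.Event.FromString(record)  # decode one Event proto
    print(event.step, [v.tag for v in event.summary.value])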

Answered by mrry

As Fabrizio says, TensorBoard is a great tool for visualizing the contents of your summary logs. However, if you want to perform a custom analysis, you can use the tf.train.summary_iterator() function to loop over all of the tf.Event and tf.Summary protocol buffers in the log:


import tensorflow as tf

for summary in tf.train.summary_iterator("/path/to/log/file"):
    # Perform custom processing in here.
    pass

UPDATE for tf2:


from tensorflow.python.summary.summary_iterator import summary_iterator

You need to import it explicitly; that module is not currently imported by default (observed on 2.0.0-rc2).

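A minimal usage sketch with that import (the event-file path is a placeholder, and this assumes the scalars were written with tf.summary in TF 2.x, where values land in value.tensor rather than simple_value):

# Minimal sketch for TF 2.x; the path below is a placeholder.
import tensorflow as tf
from tensorflow.python.summary.summary_iterator import summary_iterator

for event in summary_iterator("/path/to/events.out.tfevents.XXXX"):
    for value in event.summary.value:
        if value.HasField("tensor"):
            # tf.summary.scalar in TF 2.x stores scalars as tensor protos
            print(value.tag, event.step, tf.make_ndarray(value.tensor))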

Answered by Temak

To read a TFEvent you can get a Python iterator that yields Event protocol buffers.


# This example supposes that the events file contains summaries with a
# summary value tag 'loss'.  These could have been added by calling
# `add_summary()`, passing the output of a scalar summary op created
# with: `tf.scalar_summary(['loss'], loss_tensor)`.
import tensorflow as tf

for e in tf.train.summary_iterator(path_to_events_file):
    for v in e.summary.value:
        if v.tag == 'loss' or v.tag == 'accuracy':
            print(v.simple_value)

more info: summary_iterator


Answered by fabrizioM

You can simply use:


tensorboard --inspect --event_file=myevents.out

or if you want to filter a specific subset of events of the graph:


tensorboard --inspect --event_file=myevents.out --tag=loss

If you want to create something more custom you can dig into the


/tensorflow/python/summary/event_file_inspector.py 

to understand how to parse the event files.


Answered by Duane

Here is a complete example for obtaining values from a scalar. You can see the message specification for the Event protobuf message here.


import tensorflow as tf


for event in tf.train.summary_iterator('runs/easy_name/events.out.tfevents.1521590363.DESKTOP-43A62TM'):
    for value in event.summary.value:
        print(value.tag)
        if value.HasField('simple_value'):
            print(value.simple_value)

Answered by Yodogawa Mikio

The following works as of tensorflow version 2.0.0-beta1:


import os

import tensorflow as tf
from tensorflow.python.framework import tensor_util

summary_dir = 'tmp/summaries'
summary_writer = tf.summary.create_file_writer(summary_dir)

with summary_writer.as_default():
  tf.summary.scalar('loss', 0.1, step=42)
  tf.summary.scalar('loss', 0.2, step=43)
  tf.summary.scalar('loss', 0.3, step=44)
  tf.summary.scalar('loss', 0.4, step=45)


from tensorflow.core.util import event_pb2
from tensorflow.python.lib.io import tf_record

def my_summary_iterator(path):
    for r in tf_record.tf_record_iterator(path):
        yield event_pb2.Event.FromString(r)

for filename in os.listdir(summary_dir):
    path = os.path.join(summary_dir, filename)
    for event in my_summary_iterator(path):
        for value in event.summary.value:
            t = tensor_util.MakeNdarray(value.tensor)
            print(value.tag, event.step, t, type(t))

The code for my_summary_iterator is copied from tensorflow.python.summary.summary_iterator.py; there was no way to import it at runtime.


Answered by dandelion

You can use the script serialize_tensorboard, which will take in a logdir and write out all the data in JSON format.


You can also use an EventAccumulator for a convenient Python API (this is the same API that TensorBoard uses).

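A short sketch of the EventAccumulator route (a minimal example; it assumes the standalone tensorboard package, while in older TensorFlow versions the class lived under tensorflow.python.summary.event_accumulator, and the logdir and 'loss' tag below are placeholders):

# Minimal sketch; "logdir" and the 'loss' tag are placeholders.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("logdir")
acc.Reload()                      # read the event files from disk
print(acc.Tags())                 # shows which tags are available, by kind
for scalar_event in acc.Scalars("loss"):
    print(scalar_event.step, scalar_event.value)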

Answered by Sam Shleifer

I've been using this. It assumes that you only want to see tags you've logged more than once whose values are floats and returns the results as a pd.DataFrame. Just call metrics_df = parse_events_file(path).


from collections import defaultdict
import pandas as pd
import tensorflow as tf

def is_interesting_tag(tag):
    if 'val' in tag or 'train' in tag:
        return True
    else:
        return False


def parse_events_file(path: str) -> pd.DataFrame:
    metrics = defaultdict(list)
    for e in tf.train.summary_iterator(path):
        for v in e.summary.value:
            # Collect float-valued scalars for the tags we care about.
            if isinstance(v.simple_value, float) and is_interesting_tag(v.tag):
                metrics[v.tag].append(v.simple_value)
            if v.tag == 'loss' or v.tag == 'accuracy':
                print(v.simple_value)
    # Keep only tags that were logged more than once.
    metrics_df = pd.DataFrame({k: v for k, v in metrics.items() if len(v) > 1})
    return metrics_df
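
A possible usage follow-up (a sketch only; it assumes matplotlib is installed, and the event-file path is a placeholder):

# Hypothetical usage of the helper above.
import matplotlib.pyplot as plt

metrics_df = parse_events_file("/path/to/events.out.tfevents.XXXX")
print(metrics_df.describe())   # quick per-tag statistics
metrics_df.plot()              # one line per logged tag
plt.show()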