Python: Getting bounding box coordinates in the TensorFlow Object Detection API tutorial

Question

Asked by Mandy

I am new to both Python and TensorFlow. I am trying to run the object_detection_tutorial file from the TensorFlow Object Detection API, but I cannot find where to get the coordinates of the bounding boxes when objects are detected.

Relevant code:

        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])

...

The place where I assume bounding boxes are drawn is like this:

  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean. There are a lot.

array([[ 0.56213236,  0.2780568 ,  0.91445708,  0.69120586],
       [ 0.56261235,  0.86368728,  0.59286624,  0.8893863 ],
       [ 0.57073039,  0.87096912,  0.61292225,  0.90354401],
       [ 0.51422435,  0.78449738,  0.53994244,  0.79437423],

......

       [ 0.32784131,  0.5461576 ,  0.36972913,  0.56903434],
       [ 0.03005961,  0.02714229,  0.47211722,  0.44683522],
       [ 0.43143299,  0.09211366,  0.58121657,  0.3509962 ]], dtype=float32)

I found answers for similar questions, but I don't have a variable called boxes as they do. How can I get the coordinates? Thank you!

Answer 1

Answered by MFisherKDX

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean

You can check out the code for yourself. visualize_boxes_and_labels_on_image_array is defined in the API's visualization_utils.py (imported as vis_util in the tutorial).

Note that you are passing use_normalized_coordinates=True. If you trace the function calls, you will see that your numbers [ 0.56213236, 0.2780568 , 0.91445708, 0.69120586] etc. are the values [ymin, xmin, ymax, xmax], where the image coordinates:

(left, right, top, bottom) = (xmin * im_width, xmax * im_width, 
                              ymin * im_height, ymax * im_height)

are computed by the function:

def draw_bounding_box_on_image(image,
                               ymin,
                               xmin,
                               ymax,
                               xmax,
                               color='red',
                               thickness=4,
                               display_str_list=(),
                               use_normalized_coordinates=True):
  """Adds a bounding box to an image.
  Bounding box coordinates can be specified in either absolute (pixel) or
  normalized coordinates by setting the use_normalized_coordinates argument.
  Each string in display_str_list is displayed on a separate line above the
  bounding box in black text on a rectangle filled with the input 'color'.
  If the top of the bounding box extends to the edge of the image, the strings
  are displayed below the bounding box.
  Args:
    image: a PIL.Image object.
    ymin: ymin of bounding box.
    xmin: xmin of bounding box.
    ymax: ymax of bounding box.
    xmax: xmax of bounding box.
    color: color to draw bounding box. Default is red.
    thickness: line thickness. Default value is 4.
    display_str_list: list of strings to display in box
                      (each to be shown on its own line).
    use_normalized_coordinates: If True (default), treat coordinates
      ymin, xmin, ymax, xmax as relative to the image.  Otherwise treat
      coordinates as absolute.
  """
  draw = ImageDraw.Draw(image)
  im_width, im_height = image.size
  if use_normalized_coordinates:
    (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
                                  ymin * im_height, ymax * im_height)
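
The same scaling can be applied outside the drawing code to turn one of the printed rows into pixel coordinates. This is a minimal sketch, assuming the row is in normalized [ymin, xmin, ymax, xmax] order as described above; box_to_pixels is a hypothetical helper name, not part of the Object Detection API, and the 1000x500 image size is made up for illustration.

```python
def box_to_pixels(box, im_width, im_height):
    """Scale one normalized [ymin, xmin, ymax, xmax] box to pixels.

    Hypothetical helper (not part of the API). Returns
    (left, right, top, bottom), the same order used in the snippet above.
    """
    ymin, xmin, ymax, xmax = (float(v) for v in box)
    left, right = xmin * im_width, xmax * im_width
    top, bottom = ymin * im_height, ymax * im_height
    # Round to whole pixels; drop the rounding if sub-pixel values are wanted.
    return tuple(int(round(v)) for v in (left, right, top, bottom))


# First row printed in the question, on an assumed 1000x500 (width x height) image.
first_box = [0.56213236, 0.2780568, 0.91445708, 0.69120586]
print(box_to_pixels(first_box, im_width=1000, im_height=500))
```

With the tutorial's image_np array, the size would normally come from the array itself, e.g. im_height, im_width = image_np.shape[:2].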

Answer 2

Answered by Vadim

I ran into exactly the same thing: I got an array of roughly a hundred boxes (output_dict['detection_boxes']) when only one was displayed on the image. Digging deeper into the code that draws the rectangles, I was able to extract that logic and use it in my inference.py:

# Detection has happened and you've got output_dict as the
# result of your inference.

# Assume you've got this in your inference.py in order to draw rectangles:
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    output_dict['detection_boxes'],
    output_dict['detection_classes'],
    output_dict['detection_scores'],
    category_index,
    instance_masks=output_dict.get('detection_masks'),
    use_normalized_coordinates=True,
    line_thickness=8)

# This is the way I'm getting my coordinates.
boxes = output_dict['detection_boxes']
# Consider every box in the array.
max_boxes_to_draw = boxes.shape[0]
# Scores are needed to apply a threshold.
scores = output_dict['detection_scores']
# This is the default threshold, but feel free to adjust it to your needs.
min_score_thresh = .5
# Iterate over all objects found.
for i in range(max_boxes_to_draw):
    # Keep only detections that score above the threshold.
    if scores is None or scores[i] > min_score_thresh:
        # boxes[i] is a box which will be drawn.
        class_id = output_dict['detection_classes'][i]
        class_name = category_index[class_id]['name']
        print("This box is gonna get used", boxes[i], class_id, class_name)
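
The loop above can also be wrapped in a function that returns the surviving detections instead of printing them. The following is only a sketch under stated assumptions: confident_detections is a hypothetical helper, and fake_output and fake_index are made-up stand-ins that imitate the layout of the tutorial's output_dict and category_index.

```python
def confident_detections(output_dict, category_index, min_score_thresh=0.5):
    """Return (class_name, score, box) for detections above the threshold.

    Hypothetical helper; boxes stay in normalized
    [ymin, xmin, ymax, xmax] form, exactly as stored in output_dict.
    """
    results = []
    detections = zip(output_dict['detection_boxes'],
                     output_dict['detection_classes'],
                     output_dict['detection_scores'])
    for box, class_id, score in detections:
        if score > min_score_thresh:
            # int() so NumPy integer class ids also work as dictionary keys.
            name = category_index[int(class_id)]['name']
            results.append((name, float(score), [float(v) for v in box]))
    return results


# Made-up data for illustration only, not real model output.
fake_output = {
    'detection_boxes': [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 0.3, 0.3]],
    'detection_classes': [1, 18],
    'detection_scores': [0.92, 0.07],
}
fake_index = {1: {'id': 1, 'name': 'person'}, 18: {'id': 18, 'name': 'dog'}}
print(confident_detections(fake_output, fake_index))
```

Returning a list rather than printing makes it easy to pass the boxes on to later steps such as cropping or logging.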
