Handling bounding boxes in TensorFlow

draw_bounding_boxes

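# Draws bounding boxes on a batch of images. Every image in the batch gets the same number of boxes; the coordinates are supplied per image.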

# boxes: shape [batch, num_bounding_boxes, 4]. Each box is [y_min, x_min, y_max, x_max], with values in [0.0, 1.0] relative to the image height and width.
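To make the normalized coordinate convention concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that converts a [y_min, x_min, y_max, x_max] box into pixel coordinates. The helper name to_pixel_box is hypothetical, the 180 x 267 image size is borrowed from the example in this post, and rounding to the nearest pixel is an illustrative choice rather than a statement about what TensorFlow does internally.

```python
def to_pixel_box(box, height, width):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel coordinates.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    for value in box:
        if not 0.0 <= value <= 1.0:
            raise ValueError("coordinates must lie in [0.0, 1.0]")
    y_min, x_min, y_max, x_max = box
    # y values scale with the image height, x values with the image width.
    return (round(y_min * height), round(x_min * width),
            round(y_max * height), round(x_max * width))


# The two boxes used in this post, on a 180 x 267 (height x width) image.
boxes = [[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]
pixel_boxes = [to_pixel_box(b, 180, 267) for b in boxes]
print(pixel_boxes)
```

Note that the y coordinates come first, which is the opposite of the (x, y) order many drawing libraries use.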

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw JPEG bytes; binary mode ("rb") is required for image data.
image_raw_data = tf.gfile.FastGFile("C:/Users/admin/Desktop/cat.jpg", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print("Digital type: ", img_data.dtype)
    plt.imshow(img_data.eval())
    plt.show()
    # method=1 selects nearest-neighbor interpolation.
    img_data = tf.image.resize_images(img_data, [180, 267], method=1)
    print("Digital type: ", img_data.dtype)
    # draw_bounding_boxes expects a float batch of shape [batch, height, width, channels].
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    # Two boxes for the single image in the batch.
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    result = tf.image.draw_bounding_boxes(batched, boxes)
    plt.figure(1)
    plt.imshow(result.eval().reshape([180, 267, 3]))
    plt.show()

https://www.jianshu.com/p/9de529f48d64

sample_distorted_bounding_box

Code

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw JPEG bytes; binary mode ("rb") is required for image data.
image_raw_data = tf.gfile.FastGFile("C:/Users/admin/Desktop/cat.jpg", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print("Digital type: ", img_data.dtype)
    plt.imshow(img_data.eval())
    plt.show()
    img_data = tf.image.resize_images(img_data, [180, 267], method=1)
    print("Digital type: ", img_data.dtype)
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # Sample one random crop window constrained by the annotated boxes.
    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
        tf.shape(img_data), bounding_boxes=boxes)
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    # Built for visualizing the sampled box; it is not displayed in this snippet.
    image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
    # Crop the image to the sampled window.
    distorted_image = tf.slice(img_data, begin, size)
    plt.imshow(distorted_image.eval())
    plt.show()

Understanding the parameters

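This function generates a single randomly distorted bounding box for an image. Its output is one box that can be used to crop the original image. It returns three tensors: begin, size, and bboxes. The first two can be passed to tf.slice to crop the image; the third can be passed to tf.image.draw_bounding_boxes to draw the box.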

sample_distorted_bounding_box(
image_size,
bounding_boxes,
seed=None,
seed2=None,
min_object_covered=None,
aspect_ratio_range=None,
area_range=None,
max_attempts=None,
use_image_if_no_bounding_boxes=None,
name=None
)
image_size: a 1-D tensor holding the three values [height, width, channels]. Its dtype must be one of uint8, int8, int16, int32, int64.

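bounding_boxes: a 3-D float32 tensor of shape [batch, N, 4]. The leading batch dimension is there because the function is designed around batches of images; N is the number of bounding boxes associated with the image, and each box is given by four numbers [y_min, x_min, y_max, x_max]. For example, tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]]) has shape [1, 2, 4] and describes two boxes on one image; tf.constant([[[0.0, 0.0, 1.0, 1.0]]]) has shape [1, 1, 4] and describes a single box on one image.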

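seed: (optional) an int, default 0. If either seed or seed2 is set to a non-zero value, the random number generator is seeded with the given seed. Otherwise, it is seeded with a random seed.
seed2: (optional) an int, default 0. A second seed, used to avoid seed collisions.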

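min_object_covered: (optional) a float, default 0.1. The cropped region must contain at least this fraction of at least one of the supplied bounding boxes. The value must be non-negative; when it is 0, the cropped region need not overlap any of the supplied boxes.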
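The coverage rule can be sketched in plain Python as follows. This is an illustration of the constraint as described above, not TensorFlow's internal implementation; the function names covered_fraction and satisfies_min_object_covered are hypothetical, and both boxes and crops use the normalized [y_min, x_min, y_max, x_max] layout.

```python
def covered_fraction(box, crop):
    """Fraction of the area of `box` that lies inside `crop`.

    Both arguments are [y_min, x_min, y_max, x_max] in normalized coordinates.
    """
    inter_h = max(0.0, min(box[2], crop[2]) - max(box[0], crop[0]))
    inter_w = max(0.0, min(box[3], crop[3]) - max(box[1], crop[1]))
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    return (inter_h * inter_w) / box_area if box_area > 0 else 0.0


def satisfies_min_object_covered(boxes, crop, min_object_covered=0.1):
    """True if at least one box is covered by at least the required fraction."""
    return any(covered_fraction(b, crop) >= min_object_covered for b in boxes)


box = [0.0, 0.0, 0.5, 0.5]
crop = [0.25, 0.25, 1.0, 1.0]
print(covered_fraction(box, crop))                      # a quarter of the box is inside the crop
print(satisfies_min_object_covered([box], crop, 0.1))   # enough coverage at the default threshold
print(satisfies_min_object_covered([box], crop, 0.5))   # not enough at a stricter threshold
```

With min_object_covered set to 0, the check passes even for a crop that misses every box, which matches the behavior described for that value.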

aspect_ratio_range: (optional) a list of floats, default [0.75, 1.33]. The aspect ratio (width / height) of the cropped region must fall within this range.

area_range: (optional) a list of floats, default [0.05, 1]. The area of the cropped region, as a fraction of the whole image, must fall within this range.
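As a rough sketch of how the aspect-ratio and area constraints combine, the plain-Python check below tests a candidate crop given in pixels. The function name crop_within_ranges is hypothetical, the defaults are the ones listed above, and the 180 x 267 image size comes from the example earlier in this post. It only illustrates the two conditions, not how TensorFlow samples candidates.

```python
def crop_within_ranges(crop_h, crop_w, image_h, image_w,
                       aspect_ratio_range=(0.75, 1.33),
                       area_range=(0.05, 1.0)):
    """Check a candidate crop against the aspect-ratio and area constraints."""
    aspect_ratio = crop_w / crop_h                            # width / height
    area_fraction = (crop_h * crop_w) / (image_h * image_w)   # share of the full image
    ratio_ok = aspect_ratio_range[0] <= aspect_ratio <= aspect_ratio_range[1]
    area_ok = area_range[0] <= area_fraction <= area_range[1]
    return ratio_ok and area_ok


print(crop_within_ranges(90, 100, 180, 267))   # ratio about 1.11, area about 19% of the image
print(crop_within_ranges(30, 120, 180, 267))   # ratio 4.0 is too wide
print(crop_within_ranges(10, 10, 180, 267))    # area is far below 5% of the image
```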

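max_attempts: (optional) an int, default 100. The number of attempts at generating a cropped region that satisfies the constraints. After max_attempts failures, the entire image is returned.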
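The retry-then-fall-back behavior can be pictured as a simple rejection-sampling loop. The sketch below uses Python's random module and a caller-supplied predicate; sample_crop and is_acceptable are hypothetical names, and the way candidates are drawn here (uniform height, width, and offset) is an assumption made for illustration, not TensorFlow's actual sampling scheme. Only the control flow mirrors the description above.

```python
import random


def sample_crop(image_h, image_w, is_acceptable, max_attempts=100, seed=0):
    """Try up to max_attempts random crops; fall back to the whole image.

    Returns (offset_y, offset_x, height, width) in pixels.
    """
    rng = random.Random(seed)
    for _ in range(max_attempts):
        h = rng.randint(1, image_h)
        w = rng.randint(1, image_w)
        y = rng.randint(0, image_h - h)
        x = rng.randint(0, image_w - w)
        if is_acceptable(y, x, h, w):
            return (y, x, h, w)
    # Every attempt was rejected: return the entire image.
    return (0, 0, image_h, image_w)


# A predicate that rejects everything forces the whole-image fallback.
fallback = sample_crop(180, 267, lambda y, x, h, w: False, max_attempts=5)
print(fallback)
```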

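use_image_if_no_bounding_boxes: (optional) a bool, default False. Controls the behavior when no bounding boxes are supplied. If True, an implicit bounding box covering the whole input is assumed. If False, an error is raised.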
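The decision this flag controls is small enough to write out. The sketch below is plain Python with a hypothetical helper name, effective_boxes, and uses ValueError as a stand-in for whatever error TensorFlow actually raises.

```python
def effective_boxes(bounding_boxes, use_image_if_no_bounding_boxes=False):
    """Return the boxes the sampler would work with for one image."""
    if bounding_boxes:
        return bounding_boxes
    if use_image_if_no_bounding_boxes:
        # One implicit box covering the whole image.
        return [[0.0, 0.0, 1.0, 1.0]]
    raise ValueError("no bounding boxes supplied")


print(effective_boxes([], use_image_if_no_bounding_boxes=True))
```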

name: (optional) a name for the operation.

Return

Returns a tuple of Tensor objects (begin, size, bboxes).

begin: has the same type as image_size. A 1-D tensor containing [offset_height, offset_width, 0]. Pass it as input to tf.slice.

size: has the same type as image_size. A 1-D tensor containing [target_height, target_width, -1]. Pass it as input to tf.slice.

bboxes: a 3-D float32 tensor of shape [1, 1, 4] holding the randomly distorted bounding box. Pass it as input to tf.image.draw_bounding_boxes.
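To see why begin ends in 0 and size ends in -1, it helps to emulate the slicing step by hand. The sketch below is plain Python on a nested-list "image" and relies on the tf.slice convention that a size of -1 means "everything from begin to the end of that axis". The helper slice_image and the tiny 4 x 5 x 3 test image are made up for illustration.

```python
def slice_image(image, begin, size):
    """Emulate tf.slice on an image stored as nested lists [height][width][channels]."""
    def bounds(axis_len, start, length):
        # A length of -1 keeps everything from `start` to the end of the axis.
        stop = axis_len if length == -1 else start + length
        return start, stop

    h0, h1 = bounds(len(image), begin[0], size[0])
    w0, w1 = bounds(len(image[0]), begin[1], size[1])
    c0, c1 = bounds(len(image[0][0]), begin[2], size[2])
    return [[pixel[c0:c1] for pixel in row[w0:w1]] for row in image[h0:h1]]


# A tiny 4 x 5 x 3 "image" whose pixels record their own (row, column).
image = [[[r, c, 0] for c in range(5)] for r in range(4)]

# begin = [offset_height, offset_width, 0], size = [target_height, target_width, -1]
crop = slice_image(image, begin=[1, 2, 0], size=[2, 3, -1])
print(len(crop), len(crop[0]), len(crop[0][0]))
```

The channel axis always starts at 0 and keeps all channels, so only the height and width of the image are cropped.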

Source documentation

tf.image.sample_distorted_bounding_box(image_size, bounding_boxes, seed=None, seed2=None, min_object_covered=None, aspect_ratio_range=None, area_range=None, max_attempts=None, use_image_if_no_bounding_boxes=None, name=None)

# Outputs a randomly cropped region of the image.
Generate a single randomly distorted bounding box for an image.

Bounding box annotations are often supplied in addition to ground-truth labels
in image recognition or object localization tasks. A common technique for
training such a system is to randomly distort an image while preserving
its content, i.e. data augmentation. This Op outputs a randomly distorted
localization of an object, i.e. bounding box, given an image_size,
bounding_boxes and a series of constraints.

The output of this Op is a single bounding box that may be used to crop the
original image. The output is returned as 3 tensors: begin, size and
bboxes. The first 2 tensors can be fed directly into tf.slice to crop the
image. The latter may be supplied to tf.image.draw_bounding_boxes to visualize
what the bounding box looks like.

Bounding boxes are supplied and returned as [y_min, x_min, y_max, x_max]. The
bounding box coordinates are floats in [0.0, 1.0] relative to the width and
height of the underlying image.

For example,

# Generate a single distorted bounding box.
begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
    tf.shape(image),
    bounding_boxes=bounding_boxes)

# Draw the bounding box in an image summary.
image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0),
                                              bbox_for_draw)
# tf.summary.image replaces the older tf.image_summary call.
tf.summary.image('images_with_box', image_with_box)

# Employ the bounding box to distort the image.
distorted_image = tf.slice(image, begin, size)

Note that if no bounding box information is available, setting
use_image_if_no_bounding_boxes = true will assume there is a single implicit
bounding box covering the whole image. If use_image_if_no_bounding_boxes is
false and no bounding boxes are supplied, an error is raised.