Basics of Images
Convert to Array
Using Numpy (1)
import numpy as np
import matplotlib.pyplot as plt
imgArr = np.asarray('imagepath')
plt.imshow(pic_arr)
Using Numpy (2)
Sometimes, a Type error will be prompted using cv2.rectangle()
when the input array is the
usual numpy array. We should use the following instead.
import numpy as np
imgArr = np.ascontiguousarray('imagepath')
Using OpenCV
import cv2
imgArr = cv2.imread('imagepath')
cv2.imshow('image',img)
# Wait for something on keyboard to be pressed to close window.
# 0 refers to 0 miliseconds of waiting
cv2.waitKey(0)
From base64 string
import base64
import cv2
npArr = np.fromstring(base64.b64decode(encodedImage), np.uint8)
imgArr = cv2.imdecode(npArr, cv2.IMREAD_ANYCOLOR)
Saving Images
cv2.imwrite('my_new_picture.jpg', imgArr)
Resizing Imagesw
Resizing by a specific scale
img = cv2.imread('imagepath', cv2.IMREAD_UNCHANGED)
scale = 0.6 # percent of original size
width = int(img.shape[1] * scale)
height = int(img.shape[0] * scale)
resized = cv2.resize(img, (width, height), interpolation = cv2.INTER_AREA)
Resizing by specific height
def img_scaling(frame, new_height=600):
'''
rescale image based on a fixed height, and width with same aspect ratio
Parameters
----------
frame (array): image array
Returns
-------
new_width (int): size of new width
new_height (int): size of new height
'''
width = frame.shape[1]
height = frame.shape[0]
if height > new_height:
scale = new_height/height
new_width = int(width * scale)
else:
new_width = width
new_height = height
return new_width, new_height
new_width, new_height = img_scaling(frame)
resized = cv2.resize(img, (new_width, new_height), interpolation = cv2.INTER_AREA)
Drawing on Images
One of the most important reason to draw on images is to draw bounding boxes representing the prediction output.
rectangles
# pt1 = top left
# pt2 = bottom right
cv2.rectangle(imgArr, pt1=(384,0), pt2=(510,128), \
color=(0,255,0), thickness=5)
Here's a typical example function from xiaochus's YOLO on how it is used.
def draw(image, boxes, scores, classes, all_classes):
'''Draw the boxes on the image.
Argument:
image: original image.
boxes: ndarray, boxes of objects.
classes: ndarray, classes of objects.
scores: ndarray, scores of objects.
all_classes: all classes name.
'''
for box, score, cl in zip(boxes, scores, classes):
x, y, w, h = box
top = max(0, np.floor(x + 0.5).astype(int))
left = max(0, np.floor(y + 0.5).astype(int))
right = min(image.shape[1], np.floor(x + w + 0.5).astype(int))
bottom = min(image.shape[0], np.floor(y + h + 0.5).astype(int))
cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
cv2.putText(image, '{0} {1:.2f}'.format(all_classes[cl], score),
(top, left - 6),
cv2.FONT_HERSHEY_SIMPLEX,
0.6, (0, 0, 255), 1,
cv2.LINE_AA)
print('class: {0}, score: {1:.2f}'.format(all_classes[cl], score))
print('box coordinate x,y,w,h: {0}'.format(box))
Wait & Break
This is not exactly pythonic, so it means it is not as easy to decipher.
0xFF
is an 8 bit binary mask that forces the result from waitKey()
to be an integer of maximum 255,
which is what a character in the keyboard can go till.
ord(char)
returns the character in integers which will also be of maximum 255.
Hence by comparing the integer to the ord(char)
value,
we can check for a key pressed event and break the loop.
# stop when character "q" is pressed
if cv2.waitKey(0) & 0xFF == ord('q'):
break
# stop when "ESC" key is pressed
if cv2.waitKey(20) & 0xFF == 27:
break
# Once script is done, its usually good practice to call this line
# It closes all windows (just in case you have multiple windows called)
cv2.destroyAllWindows()