
[OpenCV] 2D Convolution, MNIST, Pooling

례지 2022. 11. 29. 18:09
2D Convolution Operation
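import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# Default padding='valid': a 2x2 kernel over a 3x3 input gives a 2x2 output.
# The layer infers its input shape when called, so input_shape is not needed.
input_shape = (1, 3, 3, 1)  # (batch, height, width, channels)
x = tf.random.normal(input_shape)
y = Conv2D(1, 2, activation='relu')(x)
print(y.shape)  # (1, 2, 2, 1)

# padding='same': the input is zero-padded so the output stays 3x3.
input_shape = (1, 3, 3, 1)
x = tf.random.normal(input_shape)
y = Conv2D(1, 2, activation='relu', padding='same')(x)
print(y.shape)  # (1, 3, 3, 1)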
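
The two shapes printed above follow from simple arithmetic. The snippet below is a minimal sketch of that calculation for one spatial axis, assuming the usual Keras-style meaning of 'valid' and 'same'; the helper name conv_output_size is made up for this illustration and is not part of TensorFlow.

```python
import math


def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one axis for an input of length n (hypothetical helper)."""
    if padding == "valid":
        # No padding: the kernel has to fit entirely inside the input.
        return (n - kernel) // stride + 1
    if padding == "same":
        # Zero-padding is added, so only the stride shrinks the output.
        return math.ceil(n / stride)
    raise ValueError(f"unknown padding: {padding}")


print(conv_output_size(3, 2, padding="valid"))   # 2  -> y.shape == (1, 2, 2, 1)
print(conv_output_size(3, 2, padding="same"))    # 3  -> y.shape == (1, 3, 3, 1)
print(conv_output_size(28, 3, padding="valid"))  # 26 -> a 28x28 image becomes 26x26
```

With a stride of 1, 'valid' always loses kernel - 1 pixels per axis, while 'same' keeps the size unchanged.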
MNIST Practice
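import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load MNIST: 60,000 training and 10,000 test images of 28x28 pixels.
(X_trn, y_trn), (X_tst, y_tst) = mnist.load_data()
print(X_trn.shape, y_trn.shape, X_tst.shape, y_tst.shape)

# Add a channel axis and scale the pixel values to [0, 1].
input_shape = (-1, 28, 28, 1)
X_trn = tf.reshape(X_trn, input_shape)
print('X_trn.shape=', X_trn.shape)
print('X_trn[:10].shape=', X_trn[:10].shape)
X_trn = tf.cast(X_trn, tf.float32) / 255.0

# One 3x3 filter with padding='valid': 28x28 -> 26x26.
# The filter weights are randomly initialized, so the result differs on every run.
y = tf.keras.layers.Conv2D(1, 3, activation='relu', padding='valid')(X_trn[:10])
print('y.shape =', y.shape)        # (10, 26, 26, 1)
print('y[0].shape =', y[0].shape)  # (26, 26, 1)

# Drop the single channel axis so each feature map can be shown as an image.
y = tf.reshape(y, (-1, 26, 26))
src = X_trn[0]
dst = tf.reshape(src, (28, 28))
print('ori.shape=', dst.shape)
print('dst.shape=', y[0].shape)
print('np.max(ori)=', np.max(dst))
print('np.max(dst)=', np.max(y[0]))

# Original digit (left) next to its feature map (right).
ax1 = plt.subplot(121)
ax2 = plt.subplot(122)
ax1.axis('off')
ax2.axis('off')
ax1.imshow(dst, cmap='gray')
ax2.imshow(y[0], cmap='gray')
ax1.set_title(f'original{dst.shape}')
ax2.set_title(f'1conv{y[0].shape}')
plt.show()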
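
To make it clearer what a single-filter Conv2D layer computes on one image, here is a rough NumPy sketch of a 'valid' convolution followed by ReLU. It is an illustration only, not how TensorFlow implements the layer: the function name conv2d_valid_relu is made up, and random arrays stand in for the MNIST digit and for the layer's untrained weights. Like Keras, it slides the kernel without flipping it (cross-correlation).

```python
import numpy as np


def conv2d_valid_relu(image, kernel, bias=0.0):
    """Slide `kernel` over `image` without padding, then apply ReLU (hypothetical helper)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel, summed up.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return np.maximum(out, 0.0)  # ReLU: negative responses become 0


rng = np.random.default_rng(0)
image = rng.random((28, 28), dtype=np.float32)            # stand-in for one normalized digit
kernel = rng.standard_normal((3, 3)).astype(np.float32)   # stand-in for untrained weights

feature_map = conv2d_valid_relu(image, kernel)
print(feature_map.shape)  # (26, 26)
```

Each output pixel depends on one 3x3 window of the input, which is why the border pixels are lost and the 28x28 image shrinks to 26x26.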

# Four 3x3 filters with padding='same': the 28x28 spatial size is preserved.
y = tf.keras.layers.Conv2D(4, 3, activation='relu', padding='same')(X_trn[:10])
print('y.shape=', y.shape)        # (10, 28, 28, 4)
print('y[0].shape=', y[0].shape)  # (28, 28, 4)

# Take the first filter's feature map of the first image. Reshaping the whole
# (10, 28, 28, 4) tensor to (-1, 28, 28) would mix pixels from different channels.
fmap = y[0, :, :, 0]
src = X_trn[0]
dst = tf.reshape(src, (28, 28))
print('ori.shape=', dst.shape)
print('dst.shape=', fmap.shape)
print('np.max(ori)=', np.max(dst))
print('np.max(dst)=', np.max(fmap))

ax1 = plt.subplot(121)
ax2 = plt.subplot(122)
ax1.axis('off')
ax2.axis('off')
ax1.imshow(dst, cmap='gray')
ax2.imshow(fmap, cmap='gray')
ax1.set_title(f'original{dst.shape}')
ax2.set_title(f'4conv, filter 0 {fmap.shape}')
plt.show()

Pooling
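  • Purpose of a pooling layer: to subsample (i.e., shrink) the input image in order to reduce the computational load, the memory usage, and the number of parameters (which in turn lowers the risk of overfitting).
  • TensorFlow implementation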
CNN Architecture
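  • Typical CNN architecture
    • Stack a few convolutional layers (each followed by a ReLU layer), then a pooling layer, then a few more convolutional layers (+ReLU), then another pooling layer, and so on.
    • As the image passes through the network it gets smaller and smaller, but thanks to the convolutional layers it also typically gets deeper (that is, it has more feature maps).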
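
The "smaller but deeper" pattern described above can be traced with plain shape arithmetic. The sketch below is only an illustration: the helper trace_shapes and the filter counts (32 and 64) are assumptions chosen for the example, the convolutions are assumed to use 'same' padding with stride 1, and the pooling layers are assumed to use 'valid' padding with a stride equal to the pool size.

```python
def trace_shapes(input_shape, layers):
    """Return the (height, width, channels) shape after each layer (hypothetical helper)."""
    h, w, c = input_shape
    shapes = [(h, w, c)]
    for kind, value in layers:
        if kind == "conv":
            # 'same' padding, stride 1: only the depth changes (value = number of filters).
            c = value
        elif kind == "pool":
            # 'valid' pooling with stride == pool size (value = pool size).
            h, w = h // value, w // value
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((h, w, c))
    return shapes


# conv, conv, pool, conv, conv, pool on a 28x28 grayscale image
layers = [("conv", 32), ("conv", 32), ("pool", 2),
          ("conv", 64), ("conv", 64), ("pool", 2)]
for shape in trace_shapes((28, 28, 1), layers):
    print(shape)
# (28, 28, 1) -> (28, 28, 32) -> (28, 28, 32) -> (14, 14, 32)
#             -> (14, 14, 64) -> (14, 14, 64) -> (7, 7, 64)
```

The pooling layers halve the height and width, while the convolutional layers increase the number of feature maps.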

Pooling layers have no trainable parameters (weights), so they are fast and there is nothing in them to train.
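
As a quick check on this, the number of trainable parameters can be worked out by hand. The sketch below assumes a standard convolutional layer with one bias per filter; the helper names are made up for this illustration.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable parameters of a standard 2D convolution (hypothetical helper)."""
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


def pooling_param_count():
    """Max/average pooling only takes the max/mean of a window: nothing to learn."""
    return 0


print(conv2d_param_count(3, 3, 1, 1))  # 10: one 3x3 filter on a 1-channel image
print(conv2d_param_count(3, 3, 1, 4))  # 40: four 3x3 filters on a 1-channel image
print(pooling_param_count())           # 0
```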

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D

# 4x4 image, pool_size=(2, 2), strides=(1, 1), padding='valid'
x = tf.constant([[1., 1., 2., 4.],
                 [5., 6., 7., 8.],
                 [3., 2., 1., 0.],
                 [1., 2., 3., 4.]])
x_new = tf.reshape(x, [1, 4, 4, 1])  # (batch, height, width, channels)

max_pool_2d = MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='valid')
result = max_pool_2d(x_new)  # shape (1, 3, 3, 1)

# Drop the batch and channel axes so the result can be drawn as a 3x3 heatmap.
result_image = tf.reshape(result[0], [3, 3])
plt.figure(figsize=(12, 6))
ax1 = plt.subplot(1, 2, 1)
ax2 = plt.subplot(1, 2, 2)
sns.heatmap(pd.DataFrame(x.numpy()).astype('int64'), cbar=False, cmap='gray_r', annot=True, ax=ax1)
sns.heatmap(pd.DataFrame(result_image.numpy()).astype('int64'), cbar=False, cmap='gray_r', annot=True, ax=ax2)
plt.show()

# 4x4 image, pool_size=(2, 2), padding='valid': strides=(1, 1) vs. strides=(2, 2)
x = tf.constant([[1., 1., 2., 4.],
                 [5., 6., 7., 8.],
                 [3., 2., 1., 0.],
                 [1., 2., 3., 4.]])
x_new = tf.reshape(x, [1, 4, 4, 1])  # (batch, height, width, channels)
max_pool_2d_1 = MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='valid')
max_pool_2d_2 = MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding='valid')
result1 = max_pool_2d_1(x_new)  # shape (1, 3, 3, 1)
result2 = max_pool_2d_2(x_new)  # shape (1, 2, 2, 1)

# Drop the batch and channel axes so the results can be drawn as heatmaps.
result_image1 = tf.reshape(result1[0], [3, 3])
result_image2 = tf.reshape(result2[0], [2, 2])
plt.figure(figsize=(15, 5))
ax1 = plt.subplot(1, 3, 1)
ax2 = plt.subplot(1, 3, 2)
ax3 = plt.subplot(1, 3, 3)
sns.heatmap(pd.DataFrame(x.numpy()).astype('int64'), cbar=False, cmap='Oranges', annot=True, ax=ax1)
sns.heatmap(pd.DataFrame(result_image1.numpy()).astype('int64'), cbar=False, cmap='Oranges', annot=True, ax=ax2)
sns.heatmap(pd.DataFrame(result_image2.numpy()).astype('int64'), cbar=False, cmap='Oranges', annot=True, ax=ax3)
ax1.set_title('original_image')
ax2.set_title('strides 1 max pooling')
ax3.set_title('strides 2 max pooling')
plt.show()
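
The values in the heatmaps can be reproduced by hand. Below is a minimal NumPy sketch of 'valid' max pooling on the same 4x4 matrix; the function name max_pool_numpy is made up for this illustration, and it only handles a single 2D channel.

```python
import numpy as np


def max_pool_numpy(x, pool=2, stride=1):
    """'valid' max pooling over a single 2D array (hypothetical helper)."""
    out_h = (x.shape[0] - pool) // stride + 1
    out_w = (x.shape[1] - pool) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Keep only the largest value inside the current window.
            out[i, j] = x[r:r + pool, c:c + pool].max()
    return out


x = np.array([[1., 1., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])

print(max_pool_numpy(x, pool=2, stride=1))
# [[6. 7. 8.]
#  [6. 7. 8.]
#  [3. 3. 4.]]
print(max_pool_numpy(x, pool=2, stride=2))
# [[6. 8.]
#  [3. 4.]]
```

With a stride of 1 the windows overlap and the output only shrinks from 4x4 to 3x3; with a stride of 2 the windows do not overlap and the height and width are halved.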

 
