Small Dataset (cf. Convolutional Neural Network)

by Doldam Alice 2023. 1. 30.
Kaggle Computer Vision Competition

Dogs: 12,500, Cats: 12,500

Only a subset of the 25,000 images is extracted and used for training.

- Train Data: 2,000 (Dog: 1,000, Cat: 1,000)
- Valid Data: 1,000 (Dog: 500, Cat: 500)
- Test Data: 1,000 (Dog: 500, Cat: 500)
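The split above could be built by copying a fixed range of files per class into separate directories. The following is only a minimal sketch: the file naming (`cat.0.jpg`, `dog.0.jpg`), the directory layout, and the `make_subset` helper are all assumptions for illustration, not something the post confirms.

```python
import shutil
from pathlib import Path

# Per-class index ranges for each split (assumed): 1,000 train, 500 valid, 500 test.
SPLITS = {
    "train": range(0, 1000),
    "valid": range(1000, 1500),
    "test": range(1500, 2000),
}


def make_subset(src_dir, dst_dir, splits=SPLITS, classes=("cat", "dog")):
    """Copy a per-class subset of images into dst_dir/<split>/<class>s/.

    Assumes source files are named like 'cat.0.jpg' and 'dog.0.jpg'
    (hypothetical naming). Returns the number of files copied per split.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    counts = {}
    for split, indices in splits.items():
        counts[split] = 0
        for cls in classes:
            target = dst_dir / split / f"{cls}s"
            target.mkdir(parents=True, exist_ok=True)
            for i in indices:
                name = f"{cls}.{i}.jpg"
                shutil.copyfile(src_dir / name, target / name)
                counts[split] += 1
    return counts
```

With the default ranges, each split receives the same number of dog and cat images, giving 2,000 training, 1,000 validation, and 1,000 test files in total.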


Problems

1. Only X (the images) is given, so there is no way to tell which images are cats and which are dogs.
2. The input images all have different sizes.



In fact, the data is already preprocessed (labeled).
Each image is given a y value of 0 or 1 as it is read from the Dogs and Cats directory.
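Labeling from directories can be sketched in plain Python. Keras's `flow_from_directory` assigns class indices in alphanumeric order of the subdirectory names, so the sketch below does the same. The subdirectory names (`cats`, `dogs`) and the `label_from_directories` helper are assumptions for illustration; this is not the Keras implementation.

```python
from pathlib import Path


def label_from_directories(root_dir):
    """Return (class_indices, samples); each sample is (file_path, label).

    Every subdirectory of root_dir is treated as one class, and classes are
    numbered in sorted (alphanumeric) order of their directory names.
    """
    root = Path(root_dir)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_indices = {name: idx for idx, name in enumerate(class_names)}
    samples = [
        (path, class_indices[name])
        for name in class_names
        for path in sorted((root / name).iterdir())
        if path.is_file()
    ]
    return class_indices, samples
```

If the subdirectories are named `cats` and `dogs`, this yields `{'cats': 0, 'dogs': 1}`, so the label depends only on where a file is stored.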


The Labels Batch holds the y values, and they are drawn 20 at a time (batch_size = 20).
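Because labels and images arrive in batches of 20, the number of batches needed to see every image once follows directly from the dataset sizes. A small sketch of that arithmetic, with `batches_per_epoch` as a hypothetical helper name:

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover num_samples once."""
    return math.ceil(num_samples / batch_size)


train_steps = batches_per_epoch(2000, 20)  # 100 batches for the training set
valid_steps = batches_per_epoch(1000, 20)  # 50 batches for the validation set
```

In Keras these values are typically what gets passed as `steps_per_epoch` and `validation_steps` when training from a generator.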





Image Augmentation
Generates randomly transformed variants of each image, so that a single image acts like many more.
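from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment the training images only; each transform is applied randomly per image.
train_datagen = ImageDataGenerator(rescale = 1./255,              # scale pixel values to [0, 1]
                                   rotation_range = 40,           # rotate by up to 40 degrees
                                   width_shift_range = 0.2,       # shift by up to 20% of the width
                                   height_shift_range = 0.2,      # shift by up to 20% of the height
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   vertical_flip = True,
                                   brightness_range = [0.5, 1.5],
                                   fill_mode = 'nearest')         # fill new pixels with the nearest value

# Validation images are only rescaled, never augmented.
valid_datagen = ImageDataGenerator(rescale = 1./255)

# train_dir and valid_dir are assumed to be defined earlier,
# each containing one subdirectory per class.
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size = (150, 150),   # resize every image to 150x150
    batch_size = 20,
    class_mode = 'binary')      # one 0/1 label per image

valid_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size = (150, 150),
    batch_size = 20,
    class_mode = 'binary')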
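To make two of the options above concrete, here is a NumPy-only sketch of what `rescale = 1./255` and `horizontal_flip = True` do to one image. It is an illustration under stated assumptions, not the Keras implementation: the helper names and the random stand-in image are hypothetical.

```python
import numpy as np


def rescale(image, factor=1.0 / 255):
    """Scale uint8 pixel values into the [0, 1] range as float32."""
    return image.astype(np.float32) * factor


def random_horizontal_flip(image, rng, p=0.5):
    """Flip a (height, width, channels) image left-to-right with probability p."""
    if rng.random() < p:
        return image[:, ::-1, :]
    return image


rng = np.random.default_rng(0)
# Random stand-in for a 150x150 RGB photo (not real data).
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
augmented = random_horizontal_flip(rescale(image), rng)
```

Since the flip is decided randomly on every call, the same source image can produce a different training example each epoch, which is the point of augmentation on a small dataset.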

