Lecture 2: Image Classification

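Image classification
- A core task in computer vision
- Once this works, detection, segmentation, and image captioning all become much easier
The problem
- Semantic gap: the computer sees only numbers, not the concept "cat"
- An image is a 3D array of integers in [0, 255]
- height x width x color channels
  - challenges
    - Viewpoint variation
    - Illumination changes
    - Deformation (the object changes shape or pose)
    - Occlusion (part of the object is hidden)
    - Background clutter: the object is nearly indistinguishable from the background
    - Intraclass variation: cats come in many breeds, so how do we handle them all?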
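To make the "3D array of numbers" point concrete, here is a minimal sketch assuming NumPy and a synthetic image. The 4x4 size, the random pixel values, and the brightness offset of 40 are arbitrary illustration choices, not values from the lecture. It shows that a simple illumination change alters almost every raw number even though the content is unchanged.

```python
import numpy as np

# A synthetic RGB "image": height x width x color channels, integers in [0, 255].
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
print(image.shape)  # (4, 4, 3)

# Simulate brighter lighting: add 40 to every value, clipping at 255.
# The cast to int32 avoids uint8 wrap-around before clipping.
brighter = np.clip(image.astype(np.int32) + 40, 0, 255).astype(np.uint8)

# Fraction of raw values that changed, although the scene is "the same".
changed = np.mean(brighter != image)
print(changed)
```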
  - image classifier
    - General form
```
def predict(image):
    '''???'''
    class_label = None  # placeholder: no hand-written rule is known to go here
    return class_label
```
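      - There is no obvious algorithm to write in place of ???
      - There is a long history of attempts
        
        e.g. approaches based on local variations within the image (characteristic orientations, edges, etc.)
        
        These ran into many difficulties
        
        The time required varies wildly depending on the source image
      - Solution → data-driven approach
        
        Collect a dataset of images and their labels
        
        Train an image classifier with machine learning
        
        Evaluate the classifier on test images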
        
            def train(train_images, train_labels):
                # build a model for images -> labels
                ...
                return model

            def predict(model, test_images):
                # predict test_labels using the model
                ...
                return test_labels
        
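        First classifier: Nearest Neighbor Classifier
        
        Not used in practice today; it is for teaching purposes
        
        Load all training images and labels into memory so the computer simply remembers them.
        
        Compare the test image with every training image one by one, and take the label of the most similar training image as the label of the test image
        
        Comparison criterion
        
        Distance (L1 distance)
        
        The absolute value of the pixel-wise difference between I1 and I2, summed over all pixels.
        
        distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
        
        Drawback
        
        Slow at test time: every test image is compared against all training images, whereas training only stores the data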
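The steps above can be put together as a small class. This is a minimal sketch, assuming each image has been flattened into one row of a NumPy array. The class name `NearestNeighbor`, the method names, and the toy data are illustrative choices, not a fixed API.

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # "Training" only memorizes the data: X is N x D (one flattened image per row).
        # Cast to float so subtracting uint8 pixel values cannot wrap around.
        self.Xtr = np.asarray(X, dtype=np.float64)
        self.ytr = np.asarray(y)

    def predict(self, X):
        X = np.asarray(X, dtype=np.float64)
        y_pred = np.empty(X.shape[0], dtype=self.ytr.dtype)
        for i in range(X.shape[0]):
            # L1 distance from test row i to every training row (broadcasting).
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            # Copy the label of the closest training example.
            y_pred[i] = self.ytr[np.argmin(distances)]
        return y_pred

# Toy data (hypothetical): four 2-pixel "images" in two classes.
Xtr = np.array([[0, 0], [1, 1], [10, 10], [11, 11]])
ytr = np.array([0, 0, 1, 1])

nn = NearestNeighbor()
nn.train(Xtr, ytr)
pred = nn.predict(np.array([[2, 1], [9, 12]]))
print(pred)  # [0 1]
```

Note how the cost is distributed: `train` does no computation, while `predict` scans the whole training set for every test image, which is the opposite of what we usually want.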
        
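        Q. What is the accuracy of the nearest neighbor classifier on the training data, when using the Euclidean distance?
        
        A. 100%. Every training example we test on is already in the training set, so its nearest neighbor is itself (distance 0).
        
        The same holds for L1 and L2
        
        Q. What is the accuracy of the k-nearest neighbor classifier on the training data?
        
        A. With k = 1 it always predicts the correct class, but for k ≥ 2 the prediction is decided by a majority vote among the neighbors, so the accuracy depends on the data and cannot be stated in advance.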
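Both answers can be checked on a toy example. The sketch below is illustrative only: `knn_predict` is a hypothetical helper, the 1-D data is made up, and the tie-breaking rules (lowest training index, then lowest label) are assumptions of this sketch. It also assumes no two training points are identical with different labels, otherwise even k = 1 would not reach 100%.

```python
import numpy as np

def knn_predict(Xtr, ytr, X, k):
    """Label each row of X by majority vote among its k nearest training rows (L1)."""
    preds = []
    for x in X:
        distances = np.sum(np.abs(Xtr - x), axis=1)
        # Stable sort so equal distances are ordered by training index.
        nearest = np.argsort(distances, kind="stable")[:k]
        # bincount tallies votes per class; argmax takes the lowest label on ties.
        preds.append(np.argmax(np.bincount(ytr[nearest])))
    return np.array(preds)

# Toy 1-D data: the point at 2.0 is a class-1 outlier surrounded by class-0 points.
Xtr = np.array([[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
ytr = np.array([0, 0, 1, 0, 1, 1, 1])

acc_k1 = np.mean(knn_predict(Xtr, ytr, Xtr, k=1) == ytr)
acc_k3 = np.mean(knn_predict(Xtr, ytr, Xtr, k=3) == ytr)
print(acc_k1)  # 1.0
print(acc_k3)  # 6/7: the outlier at 2.0 is outvoted by its two class-0 neighbors
```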
        
        Q. what is the best distance to use?
        
        Q. what is the best value of k to use?
        
        i.e. how do we set the hyperparameters?
        
        → Very problem-dependent. Must try them all out and see what works best.
        
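        Q. Can we tune the hyperparameters by trying each setting on the test data?
        
        A. Absolutely not. The test set is the last line of defense: it must be kept aside until the very end, purely for evaluating performance.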
        
        → Validation data (20% of train data)
        
        use to tune hyperparameters
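A minimal sketch of tuning k on a held-out validation split follows. The synthetic two-blob data, the candidate values of k, and the helper `knn_accuracy` are assumptions made for illustration; only the 20% split comes from the notes.

```python
import numpy as np

def knn_accuracy(Xtr, ytr, Xval, yval, k):
    """Accuracy of k-NN (L1 distance, majority vote) on a held-out set."""
    correct = 0
    for x, y in zip(Xval, yval):
        distances = np.sum(np.abs(Xtr - x), axis=1)
        nearest = np.argsort(distances, kind="stable")[:k]
        correct += int(np.argmax(np.bincount(ytr[nearest])) == y)
    return correct / len(yval)

# Synthetic data (hypothetical): two Gaussian blobs standing in for flattened images.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(3.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Shuffle, then hold out 20% of the training data as the validation set.
order = rng.permutation(len(y))
X, y = X[order], y[order]
n_val = len(y) // 5
Xval, yval = X[:n_val], y[:n_val]
Xtr, ytr = X[n_val:], y[n_val:]

# Score each candidate k on the validation set only; the test set is never touched.
val_acc = {k: knn_accuracy(Xtr, ytr, Xval, yval, k) for k in (1, 3, 5, 7)}
best_k = max(val_acc, key=val_acc.get)
print(val_acc, best_k)
```

Only after `best_k` is fixed would the classifier be run once on the test set to report the final number.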
        
        → Cross-validation (when training data is scarce)
        
        cycle through the choice of which fold is the validation fold, average results.
        
        1,2,3,4 : train - 5 : validation
        
        2,3,4,5 : train - 1 : validation
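The fold rotation above can be sketched as follows. This is an illustrative sketch, not a reference implementation: the helpers `knn_accuracy` and `cross_validate`, the synthetic data, and the candidate k values are all assumptions. `np.array_split` is used to cut the shuffled data into 5 folds.

```python
import numpy as np

def knn_accuracy(Xtr, ytr, Xval, yval, k):
    """Accuracy of k-NN (L1 distance, majority vote) on a held-out set."""
    correct = 0
    for x, y in zip(Xval, yval):
        distances = np.sum(np.abs(Xtr - x), axis=1)
        nearest = np.argsort(distances, kind="stable")[:k]
        correct += int(np.argmax(np.bincount(ytr[nearest])) == y)
    return correct / len(yval)

def cross_validate(X, y, k, num_folds=5):
    """Average validation accuracy, letting each fold be the validation fold once."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    accs = []
    for i in range(num_folds):
        # Fold i validates; the remaining folds are concatenated into the training set.
        Xtr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
        ytr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        accs.append(knn_accuracy(Xtr, ytr, X_folds[i], y_folds[i], k))
    return float(np.mean(accs))

# Synthetic, shuffled data (hypothetical): two Gaussian blobs, 100 points in total.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(3.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
order = rng.permutation(len(y))
X, y = X[order], y[order]

cv_acc = {k: cross_validate(X, y, k) for k in (1, 3, 5)}
print(cv_acc)
```

Averaging over folds gives a less noisy estimate than a single split, at the price of training `num_folds` times per hyperparameter setting.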
        
        K-Nearest Neighbor (KNN)
        
        Reported to perform better than plain NN
        
        Classifies more smoothly (decision boundaries are less jagged)
  - linear classifier
    - A stepping stone toward CNNs
    - parametric approach
    Q. What does the linear classifier do?
    
    A1. It is just a weighted sum of all the pixel values in the image.
    
    A2. It counts colors at different spatial positions.
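The "weighted sum" view can be written out directly. The sketch below assumes the usual score function f(x, W) = Wx + b, which the notes do not spell out. The bias term `b`, the tiny 4-pixel image, and all weight values are hypothetical numbers chosen for illustration.

```python
import numpy as np

# Hypothetical sizes: a 2x2 grayscale image (4 pixels) and 3 classes.
x = np.array([56.0, 231.0, 24.0, 2.0])      # image flattened into a vector of pixel values

W = np.array([[0.2, -0.5, 0.1, 2.0],        # one row of weights per class
              [1.5, 1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])
b = np.array([1.1, 3.2, -1.2])              # one bias per class

# Each class score is a weighted sum of all pixel values, plus a bias.
scores = W @ x + b
predicted_class = int(np.argmax(scores))
print(scores)           # approximately [-96.8, 437.9, 60.75]
print(predicted_class)  # 1
```

This is what makes the approach parametric: once `W` and `b` are learned, the training data can be discarded, and prediction is a single matrix-vector product.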
    - It has inherent limitations.
    Q. What would be a very hard set of classes for a linear classifier to distinguish?
    
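    A. Yellow cars, grayscale images, and cases where the texture differs but the colors are the same. A dog, on the other hand, is expected to be distinguished well wherever it is located.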
    
    A loss function must be defined on top of the scores
    
    It quantifies, from the scores, how good or bad the classifier's performance is.