Researcher(s)
- Travis Deputy, Electrical Engineering, University of Delaware
Faculty Mentor(s)
- Austin Brockmeier, Electrical and Computer Engineering, University of Delaware
Abstract
Computer vision is the area of artificial intelligence (AI) that trains computers to extract and interpret meaningful data from visual sources such as photographs or video. This project explores the fundamental principles of computer vision in four steps: developing and training a custom computer vision model, validating it, determining which regions of an input image the model deems important for classification, and measuring how distorting those critical regions, as well as the whole image, affects the model’s confidence in its prediction of the image’s content. The model was designed to identify whether or not an image depicts a frog, though the principles explored apply to classifying the contents of any image. It was developed using 1700 unique images. In the testing phase of development the model achieved 93.929% accuracy (N=300); manual validation measured 97.5% accuracy (N=40). The model was then used to classify three series of frog images in which the pixel resolution of the entire image was incrementally decreased, in order to measure changes in the model’s confidence in its predictions. Gradient-weighted Class Activation Mapping (Grad-CAM) was applied to these same images to determine the regions that had the greatest impact on the classification. Those regions were then pixelated to the same resolutions as the whole images had been, and the model’s confidence in each classification was recorded. Comparing the two sets of confidence measurements shows how obscuring specific important regions of an image can have effects on model confidence similar to those of obscuring the whole image. The results provide clear visual examples that illustrate how computer vision works and how it compares to human interpretation of visual stimuli.
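The two distortions compared above, pixelating the whole image versus pixelating only the important regions, can be illustrated with a minimal NumPy sketch. This is not the project's code. The function names `pixelate` and `pixelate_region`, the block-averaging approach, and the rectangular boolean `mask` are all assumptions made for illustration; in the project the important regions came from Grad-CAM, and how the heatmap was turned into a region is not stated here.

```python
import numpy as np

def pixelate(image, block):
    """Lower the effective resolution by replacing each block x block tile
    with its mean value (assumed method, for illustration only)."""
    out = image.astype(float).copy()
    h, w = out.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = out[y:y + block, x:x + block]      # view into `out`
            tile[...] = tile.mean(axis=(0, 1), keepdims=True)
    return out

def pixelate_region(image, mask, block):
    """Pixelate only the pixels where `mask` is True; `mask` is a
    hypothetical stand-in for a thresholded Grad-CAM heatmap."""
    coarse = pixelate(image, block)
    out = image.astype(float).copy()
    out[mask] = coarse[mask]
    return out

# Tiny 4x4 grayscale example with a hypothetical "important" top-left corner.
img = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True

whole = pixelate(img, 2)                # every 2x2 tile is averaged
region = pixelate_region(img, mask, 2)  # only the masked corner is averaged
```

Feeding `whole` and `region` to the same classifier at a series of increasing `block` sizes would give the two confidence curves that the comparison relies on.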