Topic Modeling for Automated Underwater Image Analysis

Hanumant Singh and Yogesh Girdhar, Woods Hole Oceanographic Institution

Abstract

Marine robots have enabled the collection of large amounts of data. Unlike other sensor data such as CTD and sonar, image data is much harder to analyze with automatic statistical tools because of its high dimensionality. Hence, analysis of image data is quite often a painstakingly slow manual process. The proposed research aims to use state-of-the-art machine learning techniques such as topic modeling to analyze image data automatically, making it easier to find surprising observations and similar regions. Topic modeling can be used to extract high-level visual scene constructs from low-level visual features. Intuitively, topic modeling works by discovering sets of low-level visual features that commonly co-occur, which often correspond to a common underlying cause. For example, consistent co-occurrence of scale-like texture, gray color, and triangular tail shape in observed data indicates a common underlying topic: a fish. Visual data as observed by a robot has spatiotemporal context. Consecutive images observed by a robot are likely to contain similar topics. Moreover, within an image, visual words that are spatially close to each other are likely to have the same underlying topic label. Taking advantage of this correlation is important for extracting consistent topic labels from the observed visual data.
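
As a rough illustration of the co-occurrence idea, the following sketch fits a plain latent Dirichlet allocation (LDA) topic model by collapsed Gibbs sampling to toy images represented as bags of integer visual-word ids. LDA is assumed here only as one common topic model; the abstract does not name a specific one. The vocabulary, the toy images, the hyperparameters (alpha, beta), and the function name gibbs_lda are all hypothetical, and the sketch deliberately ignores the spatiotemporal context between and within images.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.1, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not the authors' method).

    docs: list of images, each a list of integer visual-word ids.
    Returns per-word topic labels plus image-topic and topic-word count tables.
    """
    rng = random.Random(seed)
    # z[d][i] is the topic label of the i-th visual word in image d.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * n_topics for _ in docs]                # words in image d with topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w has topic k
    topic_total = [0] * n_topics                              # words with topic k overall
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # A topic is likely if it is common in this image and
                # if this word commonly occurs under that topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z, doc_topic, topic_word


# Hypothetical vocabulary: ids 0-2 stand for fish-like features (scale-like
# texture, gray color, triangular tail); ids 3-5 stand for sand-like features.
FISH = [0, 1, 2]
SAND = [3, 4, 5]
data_rng = random.Random(1)
docs = (
    [[data_rng.choice(FISH) for _ in range(30)] for _ in range(5)]           # fish images
    + [[data_rng.choice(SAND) for _ in range(30)] for _ in range(5)]         # sand images
    + [[data_rng.choice(FISH + SAND) for _ in range(30)] for _ in range(2)]  # mixed images
)

z, doc_topic, topic_word = gibbs_lda(docs, n_topics=2, vocab_size=6)

for k, counts in enumerate(topic_word):
    print(f"topic {k}: word counts {counts}")
```

If the sampler mixes well on this toy data, the counts for one topic should concentrate on the fish-like words and the counts for the other on the sand-like words; which topic index ends up with which group is arbitrary.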