Digital cameras are among the most inexpensive and readily available automatic sensors, and the quantity of information they provide is enormous. This thesis studies the probabilistic relationship between objects in an image and image appearance. We give a hierarchical, probabilistic criterion for the Bayesian segmentation of photographic images that combines the visual cues of size, colour, luminance, and shape. To facilitate object-based operations on images, the segmentation should be as faithful as possible to the organization of physical objects in the image. We therefore validate the segmentation against the Berkeley Segmentation Data Set compiled by Martin, Fowlkes, Tal and Malik, in which human subjects were asked to partition digital images into segments, each representing a 'distinguished thing'. We show that there is a strong dependency between the hierarchical segmentation criterion, based on our assumptions about the visual appearance of objects, and the distribution of ground truth data.
This is significant because it means an image segmentation can be used to predict the distribution of 'distinguished things' in an image. Prediction accuracy is quantified by measuring the cross-entropy between the prediction and the ground truth distribution. Other evaluation methods based on ground truth exist; the benefit of our approach is that the cross-entropy unit of measure relates directly to the conceptual computer vision application of 'describing' the spatial extent of a ground truth segment. If the segmentation closely matches the 'distinguished things' in the image, it will be particularly efficient for this task. After training the system, we obtain a compression gain of 2 per cent over a method that disregards pixel correlations.
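As a rough illustration of the measure, the following sketch computes cross-entropy in bits between a ground truth distribution and two candidate predictions. The function name and the toy distributions are assumptions made for this example and are not taken from the thesis; it shows only the general definition, not the hierarchical model itself.

```python
import math


def cross_entropy_bits(truth, predicted, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_i p_i * log2(q_i), in bits.

    `truth` is the ground truth distribution p and `predicted` is the
    model distribution q, both over the same outcomes. `eps` is a
    hypothetical floor that avoids log(0) when the prediction assigns
    zero probability to an outcome that does occur.
    """
    if len(truth) != len(predicted):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p, q in zip(truth, predicted):
        if p > 0.0:  # outcomes that never occur contribute nothing
            total -= p * math.log2(max(q, eps))
    return total


# Toy example (made-up numbers): three possible outcomes.
truth = [0.5, 0.25, 0.25]
uninformed = [1 / 3, 1 / 3, 1 / 3]  # prediction that ignores structure
informed = [0.5, 0.25, 0.25]        # prediction that matches the truth

bits_uninformed = cross_entropy_bits(truth, uninformed)
bits_informed = cross_entropy_bits(truth, informed)
```

A prediction that matches the ground truth distribution attains the minimum, the entropy of the ground truth itself. Any mismatch costs additional bits, which is why a lower cross-entropy corresponds to a more efficient description.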
A concise and simple description of objects is important for the efficiency and robustness of computer vision applications. Moreover, from an information-theoretic perspective, a concise description demonstrates an accurate understanding of the underlying distribution, in this case that of objects in the scene. We consider the proposed method for estimating joint ground truth probability to be an important tool for future image analysis and visualization work.
The application accepts standard bitmap image types and saves results in the Portable Network Graphics (PNG) format. Background pixels are coloured 'transparent', so that the subject can be composited onto a different background.
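The role of the transparent background can be sketched as follows, assuming an image is held as rows of RGB tuples and the segmentation as a boolean mask (True for subject pixels). The helper names `make_cutout` and `composite_over` are hypothetical and are not part of the application, whose actual PNG handling is not shown here.

```python
def make_cutout(rgb_rows, mask_rows):
    """Attach an alpha channel: 255 for subject pixels, 0 for background."""
    return [
        [(r, g, b, 255 if keep else 0) for (r, g, b), keep in zip(row, mask_row)]
        for row, mask_row in zip(rgb_rows, mask_rows)
    ]


def composite_over(cutout_rows, background_rows):
    """Alpha-composite an RGBA cutout over an RGB background of equal size."""
    result = []
    for cutout_row, background_row in zip(cutout_rows, background_rows):
        out_row = []
        for (r, g, b, a), (br, bg, bb) in zip(cutout_row, background_row):
            alpha = a / 255.0
            out_row.append((
                round(r * alpha + br * (1.0 - alpha)),
                round(g * alpha + bg * (1.0 - alpha)),
                round(b * alpha + bb * (1.0 - alpha)),
            ))
        result.append(out_row)
    return result


# Toy 1x2 image: the left pixel is subject, the right pixel is background.
image = [[(10, 20, 30), (40, 50, 60)]]
mask = [[True, False]]
new_background = [[(0, 0, 0), (200, 200, 200)]]

cutout = make_cutout(image, mask)
composited = composite_over(cutout, new_background)
```

Because background pixels carry zero alpha, the new background shows through wherever the mask is False, while subject pixels are kept unchanged.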