Semantic Labelling for Prosthetic Vision




Horne, Lachlan Stuart




Low or impaired vision is a common cause of disability, with prevalence estimated at between 2.7% and 5.8%. Those with low vision report reduced independence and social function, resulting in lower overall quality of life. While many causes of vision loss can be treated, there are still no therapies that can fully restore vision lost to retinitis pigmentosa or macular degeneration. Prosthetic vision systems aim to improve quality of life by restoring recipients' ability to carry out everyday visual tasks. When parts of the visual system are affected by disease or injury, visual prostheses attempt to replicate their function by communicating visual information to the user.

However, the state of the art is limited in terms of functional outcomes for users. The visual function of prosthetic vision recipients is considered, at best, profoundly low; users may still require mobility aids such as guide dogs or long canes for orientation and mobility. Current devices can only stimulate a small number of locations in the visual field, with few discrete levels of stimulation intensity. The resulting visual stimuli have low resolution and limited perceivable contrast, which restricts functional outcomes for the user, such as navigation or object recognition, to high-contrast environments. Many prosthetic vision systems use an external camera, with an image processing system that produces stimuli by downsampling the camera image. Advances in computer vision can thus be exploited to improve functional outcomes by generating more informative stimuli.

In this thesis, our goal is to overcome the functional shortcomings of prosthetic vision. To that end, we present, to our knowledge, the first application of semantic labelling to prosthetic vision. Semantic labelling allows the simultaneous detection and localisation of objects in an image, enabling an assistive device to understand a scene and present a simplified, task-specific representation to the user.
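The conventional downsampling pipeline described above can be sketched as follows. The grid size and number of intensity levels are illustrative assumptions only, not parameters of any particular device.

```python
import numpy as np

def camera_to_phosphenes(image, grid=(8, 8), levels=4):
    """Downsample a greyscale camera frame to a coarse phosphene grid
    with a few discrete stimulation levels (illustrative values only)."""
    h, w = image.shape
    gh, gw = grid
    # Average-pool the frame into grid cells, one per stimulation site.
    pooled = image[:h - h % gh, :w - w % gw].reshape(
        gh, h // gh, gw, w // gw).mean(axis=(1, 3))
    # Quantise to the small number of intensity levels available.
    return np.round(pooled / 255.0 * (levels - 1)).astype(int)

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
stim = camera_to_phosphenes(frame)
print(stim.shape)  # (8, 8)
```

Because the stimuli are a direct low-resolution quantisation of raw brightness, any scene structure with low luminance contrast is lost, which motivates the semantic approach taken in this thesis.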
We show how this can apply to a range of real-world orientation and mobility tasks. Computational cost is a significant obstacle to implementing semantic labelling in a wearable system. We address this by contributing novel, fast pixel-wise semantic labelling techniques based on sparsely computed unary potentials. Our method allows labelling accuracy to be traded off against computational cost. We show how semantic labelling can be applied to prosthetic vision by mapping semantic classes to stimulation intensity. We also introduce a first-person dataset to show how our techniques may be applied in real-world situations to improve orientation and mobility.
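The mapping from semantic classes to stimulation intensity might look like the following sketch. The class names and intensity values here are hypothetical, chosen to illustrate a task-specific mapping for mobility (obstacles bright, traversable ground dim, background off); they are not the mapping used in the thesis.

```python
import numpy as np

# Hypothetical task-specific mapping from semantic class to
# stimulation intensity for an orientation-and-mobility task.
CLASS_TO_INTENSITY = {
    "background": 0,  # suppress irrelevant scene content
    "ground": 1,      # traversable surface: dim
    "obstacle": 3,    # hazards: maximum intensity
    "person": 3,
}

def labels_to_stimuli(label_map, class_names):
    """Convert a per-pixel semantic label map (integer class indices)
    into a stimulation-intensity map via a lookup table."""
    lut = np.array([CLASS_TO_INTENSITY[name] for name in class_names])
    return lut[label_map]

classes = ["background", "ground", "obstacle", "person"]
labels = np.array([[0, 1],
                   [2, 3]])
print(labels_to_stimuli(labels, classes))  # [[0 1], [3 3]]
```

In contrast to intensity downsampling, this representation stays informative even when the scene itself has low luminance contrast, since intensity now encodes task relevance rather than brightness.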



computer vision, semantic labeling, semantic labelling, scene understanding, bionic vision, prosthetic vision




Thesis (PhD)
