Fonseca De Santa Cruz Oliveira, Rodrigo
Visual recognition of semantically meaningful entities like objects, actions, and poses in images and videos is a long standing goal of computer vision. In the last decades, we have seen progress towards this goal with the development of machine learning models that leverage huge volumes of human annotated data to perform very accurate recognition of a predefined set of visual entities. However, moving forward, this approach presents significant limitations since annotated datasets are...[Show more]
Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.