Towards Recognizing New Semantic Concepts in New Visual Domains

Copertina anteriore
Sapienza Università Editrice, 30 nov 2022 - 280 pagine

Despite being the leading paradigm in computer vision, deep neural networks are inherently limited by the visual and semantic information contained in their training set. In this thesis, we aim to design deep models operating with previously unseen visual domains and semantic concepts. We first describe different solutions for generalizing to new visual domains, applying variants of normalization layers to multiple challenging settings e.g. where new domain data is not available but arrives online or is described by metadata. In the second part, we incorporate new semantic concepts into pretrained deep models. We propose specific solutions for different problems such as multi-task/incremental learning and open-world recognition. Finally, we merge the two challenges: given images of multiple domains and categories, can we recognize unseen concepts in unseen domains? We propose an approach that is the first, promising step, towards solving this problem.

Winner of the Competition “Prize for PhD Thesis 2020” arranged by Sapienza University Press.

 

Parole e frasi comuni

Informazioni sull'autore (2022)

Massimiliano Mancini received his BSc degree from the University of Perugia (2014), his MSc degree with honors (2016), and his Ph.D. (with honors) from the Sapienza University of Rome in 2020. During his Ph.D. he has been a member of the ELLIS Ph.D. program, the TeV lab at Fondazione Bruno Kessler, and the VANDAL lab of the Italian Institute of Technology. He is currently a postdoc at the Explainable Machine Learning lab of the University of Tübingen, working on transfer learning and compositionality.

Informazioni bibliografiche