Ph.D. Thesis


In short: Here you will find some information about my Ph.D. thesis, which I defended in Nancy on 2 April 2003 and which was published by Hermes-Lavoisier in October 2004. Original title of the thesis: Multimodal Communication Modelling: Towards a Formalization of Relevance. Title of the resulting book: Multimodal Human-Machine Dialogue (in French). ISBN 2-7462-0992-6.
Presentation: Slides (in French).
Examining board: Prof. J.-M. Pierrel, Henri Poincaré University, Nancy (president of the jury),
Prof. H. Zeevat, University of Amsterdam,
Prof. J. Siroux, Lannion IUT,
F. Alexandre, Director of Research, INRIA, Nancy,
L. Romary, Director of Research, INRIA, Nancy (Ph.D. supervisor),
N. Bellalem, Associate Professor (maître de conférences), University of Nancy 2.
Keywords: Spontaneous multimodal communication, visual perception, natural language processing, dialogue system architecture, pragmatics, cognitive modelling, reference to objects, context, salience, relevance.
Abstract: The way we see the objects around us determines the speech and gestures we use to refer to them. The gestures we produce structure our visual perception. The words we use influence the way we see. Visual perception, language and gesture thus interact with one another in multiple ways. The problem is global and has to be tackled as a whole in order to understand the complexity of reference phenomena and to derive a formal model. Such a model may be useful for any kind of human-machine dialogue system that focuses on deep comprehension.

We show how a referring act takes place within a subset of objects. This subset is called a reference domain and is implicit. It can be deduced from many clues, among them those that come from the visual context and from the utterance, and those that come from the user's intention, attention and memory. We propose a formalization of reference domains that takes these parameters into account. We focus on the notion of salience, for which we propose a formal characterization. Indeed, it seems that implicit information can most readily be retrieved from salient clues. We show how a dialogue system can exploit the resulting hypotheses with the help of a relevance criterion, and we lay the foundations for computing this criterion. Our contribution is thus directed toward the identification of implicit information in multimodal communication, in terms of object structures and of the formalization of cognitive criteria.