Natural Language Generation

Much of my work in natural language generation has focussed on the problem of generating referring expressions. This line of research began with my PhD research [published as Dale 1992], where one of my contributions was the development of an algorithm for determining the semantic content of a distinguishing description [Dale 1988, Dale 1989a]. This work aimed to address the question of how one distinguishes an intended referent from other entities in a given context; the algorithm I provided reasoned over the properties of the objects in the context to identify a set of properties that would meet that goal.
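
To make the idea of a distinguishing description concrete, here is a minimal illustrative sketch in Python. It is not the algorithm published in the papers cited above. It is a simple greedy selection written under stated assumptions: each entity in the context is represented as a set of (attribute, value) pairs, and the function name `distinguishing_description` and the example entities `d1`, `d2` and `c1` are hypothetical.

```python
def distinguishing_description(target, context):
    """Greedily choose properties of `target` that rule out all distractors.

    `context` maps entity names to sets of (attribute, value) pairs.
    Returns a list of properties, or None if the target cannot be
    distinguished from some other entity by its properties alone.
    This representation is an assumption made for illustration.
    """
    distractors = {name for name in context if name != target}
    remaining = set(context[target])
    chosen = []
    while distractors:
        # Pick the property that excludes the most remaining distractors;
        # sorting first makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda p: sum(p not in context[d] for d in distractors),
            default=None,
        )
        if best is None:
            return None  # no properties left to try
        ruled_out = {d for d in distractors if best not in context[d]}
        if not ruled_out:
            return None  # no remaining property excludes any distractor
        chosen.append(best)
        remaining.discard(best)
        distractors -= ruled_out
    return chosen


# Hypothetical context: two dogs and a cat.
context = {
    "d1": {("type", "dog"), ("colour", "black"), ("size", "small")},
    "d2": {("type", "dog"), ("colour", "white"), ("size", "small")},
    "c1": {("type", "cat"), ("colour", "black"), ("size", "small")},
}
description = distinguishing_description("d1", context)
```

In this toy context, no single property separates `d1` from both other entities, so the sketch selects two properties. A greedy choice of this kind does not guarantee the shortest possible description; it is used here only because it is easy to follow.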

The early work focussed on one-place predicates used to describe entities, and was extended and revised in work with Ehud Reiter [see Reiter and Dale 1992, Dale and Reiter 1995, Dale and Reiter 1996]. In other papers, this work was extended to cover both relational predicates (in work with Nick Haddock [Dale and Haddock 1991a, Dale and Haddock 1991b]) and references to events (in work with Jon Oberlander [Oberlander and Dale 1991]). Subsequently, a number of researchers have gone on to extend these basic algorithms in various ways, and to offer alternative algorithms. More recently I have been interested in how algorithms of this kind can be used in the context of other forms of reference, and in particular one-anaphora [Dale 1995b, Dale 2003; see also Ng et al 2005]; my contention here is that a rather different view of the process of referring expression generation is required to account for this data. Other recent work has looked at ways of consolidating the wide variety of different algorithms in the literature [Bohnet and Dale 2004, Bohnet and Dale 2005].

I've also had a long-standing interest in using natural language generation in practical domains [Reiter and Dale 1997, Reiter and Dale 2000], which generally means using real pre-existing data sources [Dale 2002]. My earliest work, on the generation of recipes, assumed as input the kinds of structures that might be produced by a planning system [Dale 1989, Dale 1990a; see also Wasko and Dale 1999]; in the early 1990s, my work focussed on generating reports from stock market data and weather report data [see Boyd and Dale 1997]; and this evolved into work on the dynamic generation of hypertextual documents [Milosavlejevic et al 1996, Milosavlejevic and Dale 1996a, 1996b, Dale and Milosavlejevic 1996], most specifically in the context of museum collection databases as part of the POWER project [Dale et al 1997, Verspoor et al 1998, Milosavlejevic et al 1998, Dale et al 1998a, 1998b, 1998c, Green et al 1999; see also Dale et al 1998]. Related to this area of interest, and in parallel with my interest in the processing of real documents, I've long been fascinated by the visual and orthographic aspects of language use [see Dale 1991, Dale 1992a], particularly as this relates to discourse structure [Knott and Dale 1996].

Most recently, the work on practical generation and referring expressions was brought together in the Coral project, where we explored how referring expressions might be generated in the context of the automatic generation of route descriptions from underlying geographical information systems (GIS) data [Geldof and Dale 2002, 2005, Dale et al 2002, 2003, 2005; see also Cheng et al 2004].

I've also had an ongoing interest in the evaluation of natural language generation, with some early views expressed in [Dale and Mellish 1998, Mellish and Dale 1998]. This topic has come to the fore again in work with Jette Viethen, where we are looking at how to best evaluate algorithms for the generation of referring expressions [Viethen and Dale 2006].

