One of the most promising approaches to machine translation consists in formulating the problem as a pattern recognition task. In some such tasks, online adaptation is needed so that the system can adjust to changing scenarios. In the present work, we perform an exhaustive comparison of four online learning algorithms combined with two adaptation strategies for online adaptation in statistical machine translation. Two of these algorithms, the perceptron and passive-aggressive algorithms, are already well known in the pattern recognition community, but here they are thoroughly analyzed for their applicability to statistical machine translation. In addition, we compare them with two novel methods, namely Bayesian predictive adaptation and discriminative ridge regression. In statistical machine translation, the most successful approach is based on a log-linear approximation to the a posteriori distribution. According to experimental results, adapting the scaling factors of this log-linear combination of models using discriminative ridge regression or Bayesian predictive adaptation yields the best performance. (Preprint submitted to Pattern Recognition, November 22, 2011.)
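As a rough illustration of what adapting the scaling factors of a log-linear combination can look like, the following is a minimal sketch of a perceptron-style update over an n-best list. It is not the method evaluated in the paper: the function names, the learning rate, and the idea of picking an "oracle" hypothesis (the one closest to the user's corrected translation) are all assumptions made for the example.

```python
from typing import List, Sequence


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Log-linear score: weighted sum of the per-model (log) feature values."""
    return sum(w * h for w, h in zip(weights, features))


def best_index(weights: Sequence[float],
               nbest_features: Sequence[Sequence[float]]) -> int:
    """Index of the hypothesis with the highest log-linear score."""
    return max(range(len(nbest_features)),
               key=lambda i: score(weights, nbest_features[i]))


def perceptron_update(weights: Sequence[float],
                      nbest_features: Sequence[Sequence[float]],
                      oracle_index: int,
                      learning_rate: float = 0.1) -> List[float]:
    """One online step (hypothetical helper, assumed for illustration).

    If the current weights do not rank the oracle hypothesis first,
    move the weights towards the oracle's features and away from the
    features of the hypothesis that was wrongly ranked first.
    """
    predicted = best_index(weights, nbest_features)
    if predicted == oracle_index:
        return list(weights)  # already correct: no change
    oracle = nbest_features[oracle_index]
    wrong = nbest_features[predicted]
    return [w + learning_rate * (o - p)
            for w, o, p in zip(weights, oracle, wrong)]


# Two hypotheses, two model scores each (illustrative numbers only).
nbest = [[-1.0, -4.0], [-3.0, -1.0]]
weights = [1.0, 1.0]
new_weights = perceptron_update(weights, nbest, oracle_index=0)
```

In this toy case the initial weights prefer the second hypothesis; after one update the first (oracle) hypothesis scores higher. A passive-aggressive variant would replace the fixed learning rate with a step size computed from the margin violation.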
We conducted a field trial in computer-assisted professional translation to compare Interactive Translation Prediction (ITP) against conventional post-editing (PE) of machine translation (MT) output. In contrast to the conventional PE set-up, where an MT system first produces a static translation hypothesis that is then edited by a professional translator (hence "post-editing"), ITP constantly updates the translation hypothesis in real time in response to user edits. Our study involved nine professional translators and four reviewers working with the web-based CasMaCat workbench. Various new interactive features aiming to assist the post-editor were also tested in this trial. Our results show that even with little training, ITP can be as productive as conventional PE in terms of the total time required to produce the final translation. Moreover, in the ITP setting translators require fewer keystrokes to arrive at the final version of their translation.
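The core loop of interactive translation prediction, in which the hypothesis is refreshed after each user edit, can be sketched as a prefix-constrained search. The example below is a simplified assumption rather than the CasMaCat implementation: it searches a small scored list of full hypotheses instead of a word graph, and `predict_suffix` is a hypothetical helper.

```python
from typing import List, Optional, Tuple


def predict_suffix(prefix: str,
                   hypotheses: List[Tuple[str, float]]) -> Optional[str]:
    """Return the completion of the best hypothesis that starts with `prefix`.

    `hypotheses` holds (translation, score) pairs; a higher score is better.
    Returns None when no hypothesis is compatible with the typed prefix.
    """
    compatible = [(text, s) for text, s in hypotheses
                  if text.startswith(prefix)]
    if not compatible:
        return None
    best_text, _ = max(compatible, key=lambda pair: pair[1])
    return best_text[len(prefix):]


# Illustrative hypotheses and scores (made up for the example).
hyps = [
    ("the house is green", -1.0),
    ("the home is green", -1.5),
    ("the house was green", -2.0),
]

initial = predict_suffix("", hyps)          # full best translation
after_edit = predict_suffix("the hom", hyps)  # user typed "the hom"
```

Each keystroke extends the validated prefix, and the system re-runs the search to propose a new suffix. A real system would also need a fallback, such as error-tolerant matching, for prefixes that no stored hypothesis covers.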
Although Machine Translation (MT) is a very active research field that is receiving increasing attention from the research community, the output of current MT systems is still far from perfect. Because of this, and in order to build systems that yield correct translations, human knowledge must be integrated into the translation process, which in our case is carried out in an Interactive-Predictive (IP) framework. In this paper, we show that treating Mouse Actions as a significant information source for the underlying system improves the productivity of the human translator involved. In addition, we show that an expert can quickly improve the initial translations provided by the MT system by performing only additional Mouse Actions. In this work, we use word graphs as an efficient interface between a phrase-based MT system and the IP engine.
In many pattern recognition problems, learning from training samples requires large amounts of training data and considerable computational effort. Sometimes only limited training data and/or limited computational resources are available, but a previous system, trained for a closely related task with enough training material, is also available. This scenario is very frequent in statistical machine translation, and adaptation can be a solution to this problem. In this paper, we present an adaptation technique for (state-of-the-art) log-linear modelling based on the well-known Bayesian learning paradigm. This technique has been applied to statistical machine translation and can be easily extended to other pattern recognition areas in which log-linear models are used. We show empirical results in which a small amount of adaptation data is able to improve both the non-adapted system and a system that optimises the log-linear weights only on the adaptation set.