A hyphenation procedure is described wherein the search list includes exceptional words, prefixes and suffixes, as well as a probabilistic Break-Value-Table. The list of prefixes and suffixes is augmented with what are termed root words to achieve greater flexibility and accuracy. Importance is given to a number of ways whereby the overall algorithm can be speeded up; in this connection a number of rejection rules are formulated so that only the likely candidates are processed. The order in which the various data tables are searched is also considered. A further refinement is tried wherein the common suffixes and common prefixes are given preferential treatment. The algorithm developed was tested on approximately 2,700 common English technical words, and an attempt is made to analyse the incorrectly handled words.
KEY WORDS  Affixes, Exception dictionary, Hyphenation, Break-Value-Table, Rejection rule, Root word

THE HYPHENATION PROCESS

The need for a hyphenation procedure exists in every text composition (typesetting) system. All text composition systems are basically word processing systems. That is, the input text is generally read word by word, and the output line is filled by adding successive words until it can hold no more words without exceeding the right margin. The word that would cross the right margin is termed 'overset'. In order to determine the location of the overset word (i.e. as the last word on the current output line, as the first word on the next output line, or hyphenated and extended over both output lines), the following procedure is adopted:

1. All the interword spacings in the line are reduced up to a lower limit, to see if the overset word can be accommodated within the right margin.
2. All the interword spacings in the line are extended up to an upper limit, to see if the line can appear justified without this word.
3. If 1 and 2 fail, then the hyphenation procedure is applied to this word. If hyphenation yields a successful result then the word is hyphenated.
4. If 1, 2 and 3 fail, then the overset word is put at the start of the next output line, and this line is output after increasing the interword spacing to the maximum.

In short, the hyphenation procedure is not invoked unless it is absolutely necessary, as hyphenation is computationally very expensive.

Consequently, in general, the input to any hyphenation procedure in a text formatting system is the input word and the range in which a hyphenation point is acceptable.
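The four-step placement of the overset word can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in this paper: every character is taken to be one unit wide, a hyphen costs one unit, and the names `place_overset_word`, `Hyphenator` and `toy_hyphenate` are hypothetical. The hyphenator is passed in as a function that receives the word and the acceptable range of break positions, and the toy version simply looks up a hand-written table.

```python
import math
from typing import Callable, List, Optional, Tuple

# Hypothetical hyphenator interface: (word, lo, hi) -> break position p or None,
# where lo <= p <= hi and the word splits as word[:p] + "-" and word[p:].
Hyphenator = Callable[[str, int, int], Optional[int]]


def place_overset_word(
    line_words: List[str],
    overset: str,
    line_width: float,
    min_space: float,
    max_space: float,
    hyphenate: Hyphenator,
) -> Tuple[str, List[str], str]:
    """Return (decision, words on the current line, text carried to the next line)."""
    text = sum(len(w) for w in line_words)      # width of the words already placed
    gaps_with = len(line_words)                 # gaps if something is appended
    gaps_without = max(len(line_words) - 1, 0)  # gaps if nothing is appended

    # Step 1: shrink the spacing to the lower limit and see if the word fits.
    if text + len(overset) + gaps_with * min_space <= line_width:
        return "fit", line_words + [overset], ""

    # Step 2: stretch the spacing to the upper limit and see if the line
    # can be justified without the word.
    if text + gaps_without * max_space >= line_width:
        return "next_line_justified", list(line_words), overset

    # Step 3: hyphenate. The fragment word[:p] plus a hyphen must fit at
    # minimum spacing (upper bound on p) and must still be justifiable at
    # maximum spacing (lower bound on p).
    room = line_width - text - 1  # one unit reserved for the hyphen
    hi = min(math.floor(room - gaps_with * min_space), len(overset) - 1)
    lo = max(math.ceil(room - gaps_with * max_space), 1)
    if lo <= hi:
        p = hyphenate(overset, lo, hi)
        if p is not None:
            return "hyphenated", line_words + [overset[:p] + "-"], overset[p:]

    # Step 4: give up; the word starts the next line and the current line
    # is output with the spacing at its maximum.
    return "next_line_loose", list(line_words), overset


def toy_hyphenate(word: str, lo: int, hi: int) -> Optional[int]:
    """Stand-in hyphenator: pick the largest known break inside [lo, hi]."""
    breaks = {"hyphenation": [2, 6]}  # hy-phen-ation (hand-written toy table)
    candidates = [p for p in breaks.get(word, []) if lo <= p <= hi]
    return max(candidates) if candidates else None


result = place_overset_word(["a", "simple"], "hyphenation", 14, 1.0, 3.0, toy_hyphenate)
# result == ("hyphenated", ["a", "simple", "hy-"], "phenation")
```

Because steps 1 and 2 are simple arithmetic, the expensive hyphenation call in step 3 is reached only when both cheaper options have failed, and it is skipped altogether when the acceptable range is empty.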