This development responds to a challenge. Text mining software can conveniently generate very large sets of terms or phrases. Our examples draw from use of VantagePoint (or, equivalently, Thomson Data Analyzer, TDA) software [ ] to analyze abstract record sets. A typical search on an ST&I topic of interest might yield a large set of records. One approach is to apply VantagePoint's Natural Language Processing (NLP) to the titles, and also to the abstracts and/or claims. We also take advantage of available topic-rich fields such as keywords and index terms. Merging these fields could well yield a very large number of terms and phrases in one field (list). That list, unfortunately, will surely contain much noise and redundancy. The text clumping aim is to clean and consolidate such a list to provide rich, usable content information. As described, the text field of interest can contain terms (i.e., single words or unigrams) and/or phrases (i.e., multi-word noun + modifiers term sets). Herein, we focus on such NLP phrases, typically including many single words also. Some of the algorithms pertain especially to multi-word phrases, but, in general, many steps can usefully be applied to single-word term sets. Here we focus on analyzing NLP English noun-phrases (to be called simply "phrases").

Our larger mission is to generate effective Competitive Technical Intelligence (CTI). We want to answer basic questions of "Who is doing What, Where and When?" In turn, that information can be used to build "innovation indicators" that address users' CTI needs [ ]. Typically, those users might be:
• Information professionals compiling most relevant information resources
• Researchers seeking to learn about the nearby "research landscape"
• R&D managers wanting to invest in the most promising opportunities