concollate.pl takes search strings and looks for them either in documents in a local directory or in a document base downloaded from Google or Medivista search hits. The Perl script produces lists of words collocated with the search strings.
concollate.pl is a command-line tool that allows a flexible definition of the search space and the collocation rules. Its collocation output is an important data preprocessing step for domain ontology enrichment research at KOM, because the ontology enrichment approach defines conceptual similarity through the similarity of word usage in domain-specific text corpora.
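To illustrate what a collocation list is, here is a minimal sketch in Python. It is not the code of concollate.pl and does not reflect its options or output format. The function name `collocates`, the `window` parameter, and the sample sentence are assumptions made for this example only. The sketch counts every word that appears within a fixed number of tokens of a search term, which is one simple form a collocation rule could take.

```python
import re
from collections import Counter


def collocates(text, term, window=2):
    """Count words found within `window` tokens of each occurrence of `term`.

    Hypothetical helper for illustration; not part of concollate.pl.
    """
    # Lowercase the text and split it into simple alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    term = term.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        # Clamp the window to the bounds of the token list.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Made-up sample text, standing in for a domain-specific corpus.
sample = "The heart valve controls blood flow. A damaged heart valve may leak."
result = collocates(sample, "valve", window=1)
print(result.most_common(1))  # [('heart', 2)]
```

Two terms whose collocation lists overlap strongly are used in similar contexts, which is the kind of evidence the ontology enrichment approach described above relies on.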
Download at SourceForge: