Concept Suite
The TextTech Concept Suite comprises the following packages:
- TextTech Corpora Builder: Software for the creation and maintenance of customer-specific corpora
- TextTech Term Extractor: Software for the automated extraction of relevant keywords (incl. the use of corpora)
- TextTech Classifier: Software for the automated classification of text documents
- TextTech Recommender: Software for the automated recommendation of objects (e.g. texts)
- TextTech Toolbox: Workbench comprising tools for compound deconstruction, lemmatisation, phrase extraction (as expansion to term extraction) and similarity counting
The first step in introducing text mining to a company is to lay the linguistic foundations. The most important procedure is the creation of customer-specific corpora, which decisively influence the quality of the text mining results.
The quality of these results is crucial to the accuracy of the indexing. It should therefore be ensured both that the data pool used to create the corpora is representative and that the algorithms used to create them are optimally configured. A fundamental decision must also be made as to whether a single corpus is sufficient for the customer or whether topic-specific corpora are necessary. The corpora created here can also be used in other text mining cases and are therefore not application-specific.
The result of this project phase is a customer-specific corpus (or several topic-specific and customer-specific corpora), which forms the basis for the next steps. The TextTech Corpora Builder is used to assemble the corpora. This software supports both the initial creation of the corpora and the updating of the lexicons, which must be carried out at intervals still to be defined.
The TextTech Term Extractor facilitates the automated indexing of documents using various corpora. As a rule, one-off manual preparatory work is also carried out at this point, in particular to configure the tool optimally.
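To illustrate how a reference corpus can inform keyword extraction, the following is a minimal sketch in Python. It is not the TextTech Term Extractor's method, which is not described here. It assumes a simple scoring rule: a term is relevant if it is relatively more frequent in the document than in the corpus. The names `tokenize` and `extract_keywords`, the smoothing and the sample texts are all illustrative assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def extract_keywords(document, corpus_counts, corpus_total, top_n=3):
    """Rank document terms by document frequency relative to corpus frequency.

    Hypothetical scoring rule, chosen only for illustration.
    """
    doc_counts = Counter(tokenize(document))
    doc_total = sum(doc_counts.values())
    scores = {}
    for term, count in doc_counts.items():
        doc_rel = count / doc_total
        # Add-one smoothing so that terms missing from the corpus
        # do not cause a division by zero.
        corpus_rel = (corpus_counts.get(term, 0) + 1) / (corpus_total + 1)
        scores[term] = doc_rel / corpus_rel
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Toy reference corpus and document (invented sample data).
corpus_tokens = tokenize(
    "the invoice was sent to the customer and the customer paid the invoice"
)
corpus_counts = Counter(corpus_tokens)
keywords = extract_keywords(
    "the turbine blade failed and the turbine was replaced",
    corpus_counts,
    len(corpus_tokens),
)
print(keywords)
```

In this sketch, common words such as "the" score low because they are also frequent in the corpus, which is one reason a representative, customer-specific corpus matters for the quality of the extracted terms.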

For classification tasks, we supply the TextTech Classifier, which is used both for text mining analysis and for the actual real-time classification. The text mining models are created using current classification algorithms.
TextTech Recommender is used for the automated recommendation of objects. It is an extension of classification: on the basis of real-time classification, the system is able to supply context-sensitive and content-sensitive recommendations (e.g. of text objects).
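The relationship between classification and recommendation can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the TextTech Classifier or Recommender: the classifier here is a plain term-frequency profile per class, standing in for whatever classification algorithms are actually used, and the functions `train`, `classify` and `recommend` as well as the sample data are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_texts):
    """Build one term-frequency profile per class from (label, text) pairs."""
    profiles = {}
    for label, text in labelled_texts:
        profiles.setdefault(label, Counter()).update(tokenize(text))
    return profiles


def classify(text, profiles):
    """Return the class whose profile best matches the tokens of the text."""
    tokens = tokenize(text)

    def score(label):
        profile = profiles[label]
        total = sum(profile.values())
        # Sum of relative frequencies of the text's tokens in this profile.
        return sum(profile[t] / total for t in tokens)

    # Sorting the labels makes the result deterministic when scores tie.
    return max(sorted(profiles), key=score)


def recommend(text, profiles, catalogue, limit=2):
    """Classify the text, then return catalogue objects of the same class."""
    label = classify(text, profiles)
    return [title for title, item_label in catalogue if item_label == label][:limit]


# Invented training data and catalogue, for illustration only.
profiles = train([
    ("sports", "football match goal team"),
    ("finance", "stock market shares bank"),
])
catalogue = [
    ("Cup final report", "sports"),
    ("Quarterly earnings", "finance"),
    ("Transfer news", "sports"),
]
print(recommend("the team scored a late goal", profiles, catalogue))
```

The point of the sketch is the layering: the recommendation step does no text analysis of its own, it only reuses the class assigned by the classifier to select matching objects.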
The TextTech Toolbox comprises a number of tools for the enrichment and improvement of the text mining results:
- TextTech Compound Deconstructor: Splits a compound into its constituent parts and then reduces each part to its base form
- TextTech Lemmatisation Tool: Determines the base form of a lexeme and maps it to, or back from, a full form
- TextTech Phrase Extractor: Phrase extraction as an expansion to term extraction
- TextTech Similarity Tool: For counting similar words
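As an illustration of compound deconstruction, the following Python sketch splits a compound using a lexicon of known words. It is an assumption-laden example and not the algorithm of the TextTech Compound Deconstructor: it assumes German-style compounds, a hypothetical function `split_compound`, a tiny invented lexicon, and support for a single linking element ("s") only. The subsequent reduction of the parts to their base forms is not shown.

```python
def split_compound(word, lexicon, min_len=3):
    """Split a compound into lexicon words; return [word] if no split is found.

    Hypothetical helper: tries every split point from left to right and
    accepts the first one whose parts are all known to the lexicon.
    """
    word = word.lower()
    for i in range(min_len, len(word) - min_len + 1):
        head, tail = word[:i], word[i:]
        if head not in lexicon:
            continue
        # Also try the tail without a leading linking "s"
        # (assumed linking element, as in "Arbeit-s-platz").
        candidates = [tail]
        if tail.startswith("s") and len(tail) > min_len:
            candidates.append(tail[1:])
        for rest in candidates:
            if rest in lexicon:
                return [head, rest]
            # The rest may itself be a compound.
            parts = split_compound(rest, lexicon, min_len)
            if len(parts) > 1:
                return [head] + parts
    return [word]


# Invented mini-lexicon; a real one would come from the corpora.
lexicon = {"text", "analyse", "arbeit", "platz", "werk", "zeug"}
print(split_compound("Textanalyse", lexicon))
print(split_compound("Arbeitsplatz", lexicon))
```

A lexicon-driven approach like this depends directly on the coverage of the underlying word list, which is why the quality of the corpora also affects the toolbox results.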