Abstract
Manually formalizing a domain is a tedious and cumbersome process that is constrained by the knowledge acquisition bottleneck. Many researchers have therefore developed algorithms and systems to help automate it, including systems that incorporate text corpora into the knowledge acquisition process. Here, we present a novel method for unsupervised bottom-up ontology generation that uses lexico-semantic structures and Bayesian reasoning to expedite the process. To illustrate our approach, we provide three examples that generate ontologies in diverse domains, and we validate the resulting ontologies using qualitative and quantitative measures. The examples comprise a description of high-throughput screening data relevant to drug discovery and two custom text corpora. Our unsupervised method produces viable results, sometimes with unexpected content. Because it is complementary to the typical top-down ontology development process, our approach may also be useful to domain experts.