HESML is a large, efficient, and scalable Java software library of ontology-based
similarity measures and Information Content (IC) models based on WordNet, SNOMED-CT, MeSH, or any
OBO-based ontology such as the Gene Ontology. It also implements the evaluation of pre-trained
word embedding models.
HESML is a self-contained experimentation platform for word/concept similarity and relatedness. It is especially well suited to large experimental surveys because it can automatically run reproducible word similarity experiments defined in an XML-based file format. The HESML software library was developed entirely in NetBeans 8 and Java 8, and it is distributed with three WordNet versions and the Gene Ontology. SNOMED-CT and MeSH ontology files should be obtained from the United States National Library of Medicine or from other institutions with authority over these knowledge resources. For more information on HESML, see the HESML introductory paper.
The core innovation of HESML is PosetHERep, a linearly scalable in-memory representation for large taxonomies that allows the real-time computation of any topological query without any memory overhead.
PosetHERep is a new, linearly scalable representation model for taxonomies. It adapts the well-known half-edge representation from computational geometry, also known as a doubly-connected edge list, to represent and query large ontologies efficiently. For more information on PosetHERep, see the HESML introductory paper.
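
To make the half-edge idea more concrete, the following is a minimal, simplified sketch of how a taxonomy could be stored with paired half-edges. It is not the PosetHERep implementation and does not use the HESML API: the class HalfEdgeTaxonomy and all of its methods are hypothetical names chosen for illustration. The sketch assumes that every is-a link is stored as two opposite half-edges and that each vertex keeps only the index of its first outgoing half-edge, so storage grows linearly with the number of vertices and links, and parent or child queries walk the links of a single vertex without building extra indexes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not part of HESML.
final class HalfEdgeTaxonomy {
    // Per vertex: index of its first outgoing half-edge, or -1 if it has none.
    private final List<Integer> firstOut = new ArrayList<>();
    // Per half-edge: target vertex, next outgoing half-edge of the same
    // source vertex (-1 at the end), and direction (true = child to parent).
    private final List<Integer> target = new ArrayList<>();
    private final List<Integer> next = new ArrayList<>();
    private final List<Boolean> upward = new ArrayList<>();

    // Adds a new vertex (concept) and returns its index.
    int addVertex() {
        firstOut.add(-1);
        return firstOut.size() - 1;
    }

    // Adds one is-a link as a pair of opposite half-edges stored at
    // consecutive indexes 2k and 2k+1.
    void addIsA(int child, int parent) {
        addHalfEdge(child, parent, true);
        addHalfEdge(parent, child, false);
    }

    private void addHalfEdge(int source, int tgt, boolean up) {
        int h = target.size();
        target.add(tgt);
        upward.add(up);
        next.add(firstOut.get(source)); // prepend to the source's edge list
        firstOut.set(source, h);
    }

    // The source of a half-edge is the target of its opposite half-edge.
    int sourceOf(int halfEdge) {
        return target.get(halfEdge ^ 1);
    }

    List<Integer> parents(int v) { return neighbours(v, true); }

    List<Integer> children(int v) { return neighbours(v, false); }

    // Walks the outgoing half-edges of v and keeps those in the given direction.
    private List<Integer> neighbours(int v, boolean up) {
        List<Integer> result = new ArrayList<>();
        for (int h = firstOut.get(v); h != -1; h = next.get(h)) {
            if (upward.get(h) == up) {
                result.add(target.get(h));
            }
        }
        return result;
    }

    // Collects all ancestors of v by repeatedly following upward half-edges.
    Set<Integer> ancestors(int v) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(v);
        while (!pending.isEmpty()) {
            int current = pending.pop();
            for (int p : parents(current)) {
                if (seen.add(p)) {
                    pending.push(p);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        HalfEdgeTaxonomy taxonomy = new HalfEdgeTaxonomy();
        int entity = taxonomy.addVertex();
        int animal = taxonomy.addVertex();
        int pet = taxonomy.addVertex();
        int dog = taxonomy.addVertex();
        taxonomy.addIsA(animal, entity);
        taxonomy.addIsA(pet, entity);
        taxonomy.addIsA(dog, animal);
        taxonomy.addIsA(dog, pet); // a concept may have several parents
        System.out.println("parents(dog) = " + taxonomy.parents(dog));
        System.out.println("ancestors(dog) = " + taxonomy.ancestors(dog));
    }
}
```

In this sketch, a query only follows indexes that are already stored with the vertices and half-edges, which is the property that the half-edge approach relies on. The actual PosetHERep model is described in the HESML introductory paper and may differ in its details.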