What is HESML?

HESML is a large, efficient and scalable Java software library of ontology-based semantic similarity measures and Information Content (IC) models. It works with WordNet, SNOMED-CT, MeSH, and any OBO-based ontology such as the Gene Ontology, and it also implements the evaluation of pre-trained word embedding models.
HESML is a self-contained experimentation platform for word and concept similarity and relatedness. It is especially well suited to large experimental surveys because it runs automatic, reproducible word similarity experiments defined in an XML-based file format. The HESML software library was developed entirely in NetBeans 8 and Java 8, and it is distributed with three WordNet versions and the Gene Ontology. SNOMED-CT and MeSH ontology files should be obtained from the National Library of Medicine of the United States or from other institutions with authority over these knowledge resources. For more information on HESML, you can read the HESML introductory paper.
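
To give a feel for what an ontology-based similarity measure and an IC model compute, here is a minimal sketch in plain Java 8. It does not use the HESML API: the class IcSimilaritySketch, its methods, and the toy taxonomy are hypothetical names chosen for illustration. The sketch assumes the intrinsic IC model of Seco et al., IC(c) = 1 - log(hypo(c) + 1) / log(N), together with the well-known Resnik and Lin measures, both based on the most informative common ancestor.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy taxonomy with an intrinsic IC model.
// This is NOT the HESML API; every name here is hypothetical.
final class IcSimilaritySketch {
    // concept -> direct parents; every concept must be registered, roots with no parents
    private final Map<String, List<String>> parents = new HashMap<>();

    void addConcept(String concept, String... parentConcepts) {
        parents.put(concept, Arrays.asList(parentConcepts));
    }

    // The concept itself plus all of its ancestors.
    Set<String> ancestorsInclusive(String concept) {
        Set<String> seen = new HashSet<>();
        collect(concept, seen);
        return seen;
    }

    private void collect(String concept, Set<String> seen) {
        if (seen.add(concept)) {
            for (String p : parents.get(concept)) {
                collect(p, seen);
            }
        }
    }

    // Number of proper descendants (hyponyms) of a concept; naive scan, fine for a toy example.
    int hyponymCount(String concept) {
        int count = 0;
        for (String c : parents.keySet()) {
            if (!c.equals(concept) && ancestorsInclusive(c).contains(concept)) {
                count++;
            }
        }
        return count;
    }

    // Intrinsic IC (Seco et al.): 1 - log(hypo(c) + 1) / log(total number of concepts).
    double ic(String concept) {
        return 1.0 - Math.log(hyponymCount(concept) + 1.0) / Math.log(parents.size());
    }

    // Resnik: IC of the most informative common ancestor.
    double resnik(String a, String b) {
        Set<String> common = ancestorsInclusive(a);
        common.retainAll(ancestorsInclusive(b));
        double best = 0.0;
        for (String c : common) {
            best = Math.max(best, ic(c));
        }
        return best;
    }

    // Lin: 2 * IC(MICA) / (IC(a) + IC(b)); defined here as 1 for identical concepts.
    double lin(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        double denominator = ic(a) + ic(b);
        return denominator == 0.0 ? 0.0 : 2.0 * resnik(a, b) / denominator;
    }
}
```

With a five-concept toy taxonomy (entity as root, animal and plant below it, dog and cat below animal), lin("dog", "cat") evaluates to 1 - log(3) / log(5), while resnik("dog", "plant") is 0 because the only shared ancestor is the root.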

Core HESML innovation

The core innovation of HESML is PosetHERep, a linearly scalable in-memory representation for large taxonomies that allows the real-time computation of any topological query without any memory overhead.

PosetHERep is a new and linearly scalable representation model for taxonomies. It adapts the well-known half-edge representation from computational geometry, also known as a doubly-connected edge list, to represent and query large ontologies efficiently. For more information on PosetHERep, you can read the HESML introductory paper.
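
The half-edge idea can be sketched in a few lines of Java 8. The following is a simplified illustration, not the PosetHERep implementation: the class HalfEdgeTaxonomySketch and its methods are hypothetical. It assumes that each child-parent edge is stored as a pair of twin half-edges (one ascending, one descending), that each vertex keeps a single outgoing half-edge, and that every half-edge stores its target vertex and a next pointer. All neighbours of a vertex are then enumerated by repeatedly following next(twin(h)), so memory grows linearly with the number of vertices and edges and no per-vertex adjacency lists are needed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative half-edge taxonomy; NOT the HESML PosetHERep code, all names are hypothetical.
final class HalfEdgeTaxonomySketch {
    // Half-edges are created in pairs, so twin(h) == h ^ 1.
    // Even index = ascending (child to parent), odd index = descending (parent to child).
    private final List<Integer> target = new ArrayList<>();   // vertex each half-edge points to
    private final List<Integer> next = new ArrayList<>();     // outgoing half-edge from the target vertex
    private final List<Integer> firstOut = new ArrayList<>(); // one outgoing half-edge per vertex, -1 if none

    int addVertex() {
        firstOut.add(-1);
        return firstOut.size() - 1;
    }

    void addEdge(int child, int parent) {
        int up = target.size();   // ascending half-edge: child -> parent
        int down = up + 1;        // descending half-edge: parent -> child
        target.add(parent); next.add(-1);
        target.add(child);  next.add(-1);
        attach(child, up);
        attach(parent, down);
    }

    // Splice the outgoing half-edge 'out' into the circular rotation around vertex v.
    private void attach(int v, int out) {
        int in = out ^ 1;                       // twin of 'out', which points back to v
        int first = firstOut.get(v);
        if (first < 0) {
            firstOut.set(v, out);
            next.set(in, out);                  // rotation of length one
        } else {
            int firstTwin = first ^ 1;
            next.set(in, next.get(firstTwin));  // out is followed by the old successor of first
            next.set(firstTwin, out);           // first is now followed by out
        }
    }

    // Walk the rotation around v and collect targets in the requested direction.
    private List<Integer> neighbours(int v, boolean ascending) {
        List<Integer> result = new ArrayList<>();
        int first = firstOut.get(v);
        if (first < 0) {
            return result;
        }
        int h = first;
        do {
            if (((h & 1) == 0) == ascending) {
                result.add(target.get(h));
            }
            h = next.get(h ^ 1);                // next outgoing half-edge around v
        } while (h != first);
        return result;
    }

    List<Integer> parents(int v)  { return neighbours(v, true); }
    List<Integer> children(int v) { return neighbours(v, false); }

    // All proper ancestors of v, found by following ascending half-edges.
    Set<Integer> ancestors(int v) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(v);
        while (!stack.isEmpty()) {
            for (int p : parents(stack.pop())) {
                if (seen.add(p)) {
                    stack.push(p);
                }
            }
        }
        return seen;
    }
}
```

For brevity the sketch stores boxed integers in lists; a version aimed at very large ontologies would more likely use primitive arrays. Topological queries such as parents, children, or ancestors are answered by traversal alone, without caching any derived sets.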

Learn more about the software architecture

More than 60 reproducible experiments with automated evaluation benchmarks

Reproducibility guidelines

XML benchmarks

Multi domain



Open source code


Contact Us

UNED - Universidad Nacional de Educación a Distancia - ETSI Informática
Juan del Rosal, 16
28040 Madrid, Spain