This function extends a given list of keywords by adding synonyms, hypernyms, hyponyms, and related words, using a WordNet-like logic and word embeddings from pre-trained language models. For instance, if the word "car" is provided as input, the software may return related terms such as "automobile," "bicycle," "SUV," and "dealer," depending on the selected parameters. This function is typically useful for defining brand clusters.
List of words to extend: this is the original lexicon to be extended. Use only lowercase letters and white spaces, without quotation marks, and separate words with commas. E.g.: home,climate change,sun
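The input rules above can be checked before a list is submitted. The following is a minimal sketch, not the software's own parser: the function name parse_keywords is hypothetical, and the choices to lowercase the input automatically, drop duplicates, and raise an error on invalid characters are assumptions made for illustration.

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated string into clean keywords, keeping input order.

    Hypothetical helper: the tool's actual parsing logic is not documented here.
    """
    keywords = []
    for item in raw.split(","):
        # Lowercase and collapse repeated white space inside multi-word terms.
        word = " ".join(item.lower().split())
        if not word:
            continue  # ignore empty entries, e.g. after a trailing comma
        # Only letters and single spaces are allowed (no digits, quotes, etc.).
        if not all(ch.isalpha() or ch == " " for ch in word):
            raise ValueError(f"invalid keyword: {word!r}")
        if word not in keywords:
            keywords.append(word)
    return keywords


print(parse_keywords("home,climate change,sun"))
# ['home', 'climate change', 'sun']
```

Using str.isalpha rather than an ASCII-only pattern keeps accented letters valid, which matters when the selected language is not English.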
Language: the language of the words to extend. Please note that some language models provide better results than others. By default, all synsets of a word will be considered, leaving the user the final task of dropping unnecessary words.
Add hypernyms and hyponyms: if selected, the software will extend each of the provided words with their direct hypernyms and hyponyms.
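To illustrate what "direct" hypernyms and hyponyms means, here is a minimal sketch that uses a tiny hand-made taxonomy in place of a real WordNet-like resource. The dictionary HYPERNYM_OF and all function names are hypothetical and are not part of the software; only one level up (the parent) and one level down (the children) is returned for each word.

```python
# Toy taxonomy mapping each word to its direct hypernym (child -> parent).
# This is an illustrative stand-in, not the resource used by the software.
HYPERNYM_OF = {
    "car": "motor vehicle",
    "truck": "motor vehicle",
    "suv": "car",
    "cab": "car",
}


def direct_hypernyms(word):
    """Return the word's parent in the taxonomy, if any (one level up)."""
    parent = HYPERNYM_OF.get(word)
    return [parent] if parent else []


def direct_hyponyms(word):
    """Return the word's children in the taxonomy (one level down), sorted."""
    return sorted(child for child, parent in HYPERNYM_OF.items() if parent == word)


def extend_with_taxonomy(words):
    """Collect direct hypernyms and hyponyms for every input word."""
    return {
        w: {"hypernyms": direct_hypernyms(w), "hyponyms": direct_hyponyms(w)}
        for w in words
    }


print(extend_with_taxonomy(["car"]))
# {'car': {'hypernyms': ['motor vehicle'], 'hyponyms': ['cab', 'suv']}}
```

Note that "suv" is returned for "car" but not for "motor vehicle", because it sits two levels below the latter.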
Max number of related words to extract: the maximum number of related words generated by a pre-trained language model for each input word.
Skip related words: if selected, the function will skip the search for related words that is carried out using word embeddings.
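The two options governing related words can be pictured as a nearest-neighbour search in the embedding space. The sketch below is an assumption about the general technique (ranking by cosine similarity), not the software's actual implementation: the three-dimensional vectors in EMBEDDINGS are made up, and the parameter names max_related and skip_related are hypothetical.

```python
import math

# Made-up toy vectors; a real pre-trained model has far more dimensions.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.88, 0.12, 0.02],
    "dealer": [0.6, 0.5, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_words(word, max_related=2, skip_related=False):
    """Return up to max_related nearest neighbours of word, most similar first."""
    if skip_related or word not in EMBEDDINGS:
        return []  # search disabled, or word unknown to the model
    target = EMBEDDINGS[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:max_related]]


print(related_words("car", max_related=2))
# ['automobile', 'dealer']
print(related_words("car", skip_related=True))
# []
```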
A CSV file containing the input words and their related words, each classified as a synonym, hypernym, hyponym, or a word that is close in the embedding space (near).
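As an illustration of such an output, the sketch below writes one row per input word and related word pair. The column names (input_word, related_word, relation) and the relation labels other than "near" are assumptions; the actual file produced by the software may be laid out differently.

```python
import csv
import io


def to_csv(rows):
    """Serialise (input word, related word, relation) tuples as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical header; the real column names are not documented here.
    writer.writerow(["input_word", "related_word", "relation"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ("car", "automobile", "synonym"),
    ("car", "motor vehicle", "hypernym"),
    ("car", "suv", "hyponym"),
    ("car", "dealer", "near"),
]
print(to_csv(rows))
```

A long format like this, with one relation per row, makes it easy to filter out unwanted words or relation types in a spreadsheet before reusing the extended list.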