Just a heads up if you’re on the DS path and have gotten to the NLP-Word Embeddings-Gensim portion:
The lesson uses an older version of Gensim (3.x). The current version is 4.0.
So if you practice in your own notebook with code similar to the lesson’s, certain attributes won’t work and will throw an error.
Ex:
vocabulary = list(model.wv.vocab.items())
print(vocabulary)
>>--> 734 raise AttributeError(
735 "The vocab attribute was removed from KeyedVector in Gensim 4.0.0.\n"
736 "Use KeyedVector's .key_to_index dict, .index_to_key list, and methods "
AttributeError: The vocab attribute was removed from KeyedVector in Gensim 4.0.0.
Use KeyedVector's .key_to_index dict, .index_to_key list, and methods .get_vecattr(key, attr) and .set_vecattr(key, attr, new_val) instead.
See https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4
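For anyone hitting this, here is a minimal sketch of a replacement for the `vocab` lookup, going by the attribute names in the error message above. The helper name `vocab_words` is my own and not part of Gensim. It expects `model.wv` and falls back to the old `vocab` dict when `index_to_key` is missing, which I'm assuming means a 3.x install.

```python
def vocab_words(keyed_vectors):
    """Return the vocabulary words of a KeyedVectors-like object as a list.

    Note: vocab_words is a hypothetical helper, not a Gensim function.
    """
    # Gensim 4.x: index_to_key is a list of the words in the vocabulary.
    if hasattr(keyed_vectors, "index_to_key"):
        return list(keyed_vectors.index_to_key)
    # Assumed Gensim 3.x fallback: vocab is a dict keyed by word.
    return list(keyed_vectors.vocab.keys())
```

With the lesson's model that would be `print(vocab_words(model.wv))`. On 4.x, `model.wv.key_to_index` should give you a word-to-index dict if you want pairs like the original `.items()` call.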
And most_similar doesn’t work when called directly on the model:
similar_to_blah = model.most_similar("blah", topn=20)
>>similar_to_blah = model.most_similar("blah", topn=20)
AttributeError: 'Word2Vec' object has no attribute 'most_similar'
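The fix here, as far as I can tell, is to call `most_similar` on the model's `wv` attribute (the KeyedVectors) instead of on the model itself. A small sketch: `similar_words` is a made-up wrapper name, and it assumes the object you pass in is either a trained Word2Vec model or already a KeyedVectors object.

```python
def similar_words(model_or_vectors, word, topn=20):
    """Look up the topn most similar words (hypothetical wrapper, not a Gensim API)."""
    # A Word2Vec model keeps its vectors on .wv; if there is no .wv,
    # assume we were handed the KeyedVectors object itself.
    vectors = getattr(model_or_vectors, "wv", model_or_vectors)
    return vectors.most_similar(word, topn=topn)
```

In a notebook the one-liner would be `similar_to_blah = model.wv.most_similar("blah", topn=20)`.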
Changes are noted here: