LEARN NATURAL LANGUAGE PROCESSING
Step #9: Uncomment the final print statement and save your code to see who your mystery friend was all along!
When I do so, I get the following error:
Traceback (most recent call last):
File "c:/Users/scott/Documents/Coding/Tutorials/Codecademy/Python/Learn Natural Language Processing/00 - LEARN NATURAL LANGUAGE PROCESSING/02 - Mystery Friend/main.py", line 44, in <module>
friends_classifier.fit(friends_labels, friends_vectors)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\naive_bayes.py", line 615, in fit
X, y = self._check_X_y(X, y)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\naive_bayes.py", line 480, in _check_X_y
return self._validate_data(X, y, accept_sparse='csr')
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\base.py", line 432, in _validate_data
X, y = check_X_y(X, y, **check_params)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\utils\validation.py", line 73, in inner_f
return f(**kwargs)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\utils\validation.py", line 803, in check_X_y
estimator=estimator)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\utils\validation.py", line 73, in inner_f
return f(**kwargs)
File "C:\Users\scott\miniconda3\lib\site-packages\sklearn\utils\validation.py", line 624, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
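To make the message above concrete, here is a minimal sketch of the two shapes involved. The toy documents and labels are made up for illustration and are not the course data; the assumption is only that `CountVectorizer.fit_transform` returns a 2D matrix with one row per document, while a plain list of labels is 1D.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data, standing in for friends_docs and friends_labels.
toy_docs = [
    "the ship sailed north",
    "the ice held the ship",
    "we wrote many letters",
]
toy_labels = [1, 1, 2]

# Bag-of-words vectors: a 2D sparse matrix, one row per document.
toy_vectors = CountVectorizer().fit_transform(toy_docs)

print(toy_vectors.shape)             # (number of documents, vocabulary size)
print(np.asarray(toy_labels).shape)  # 1D: one label per document
```

The array printed in the traceback is made up of 1s, 2s and 3s, which looks like the labels list rather than the document vectors.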
The traceback seems to point to this line in my code:
# Train friends_classifier on friends_vectors and friends_labels using the classifier’s .fit() method.
friends_classifier.fit(friends_labels, friends_vectors)
That line was written for step #7:
Train friends_classifier on friends_vectors and friends_labels using the classifier’s .fit() method.
Did I do step #7 correctly? I assumed I did, based on what I found on Google, but now it appears to want an array and I am feeding it a string?
sklearn.naive_bayes.MultinomialNB
I’m not sure if that is the issue, or something else I did wrong. Can someone look over my code (below) and see if they see an issue please?
# import CountVectorizer from sklearn.feature_extraction.text.
# import MultinomialNB from sklearn.naive_bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from goldman_emma_raw import goldman_docs
from henson_matthew_raw import henson_docs
from wu_tingfang_raw import wu_docs
# Setting up the combined list of friends' writing samples
friends_docs = goldman_docs + henson_docs + wu_docs
# Setting up labels for your three friends
friends_labels = [1] * 154 + [2] * 141 + [3] * 166
# Print out a document from each friend:
mystery_postcard = """
My friend,
From the 10th of July to the 13th, a fierce storm raged, clouds of
freeing spray broke over the ship, incasing her in a coat of icy mail,
and the tempest forced all of the ice out of the lower end of the
channel and beyond as far as the eye could see, but the _Roosevelt_
still remained surrounded by ice.
Hope to see you soon.
"""
# Define bow_vectorizer as an implementation of CountVectorizer.
bow_vectorizer = CountVectorizer()
# Use your newly minted bow_vectorizer to both fit (train) and
# transform (vectorize) all your friends’ writing (stored in the variable friends_docs).
# Save the resulting vector object as friends_vectors.
friends_vectors = bow_vectorizer.fit_transform(friends_docs)
# Create a new variable mystery_vector.
# Assign to it the vectorized form of [mystery_postcard] using the vectorizer’s .transform() method.
mystery_vector = bow_vectorizer.transform([mystery_postcard])
# Implement a Naive Bayes classifier using MultinomialNB. Save the result to friends_classifier.
friends_classifier = MultinomialNB()
# Train friends_classifier on friends_vectors and friends_labels using the classifier’s .fit() method.
friends_classifier.fit(friends_labels, friends_vectors)
# Change predictions value from ["None Yet"] to the classifier’s prediction about which friend wrote the postcard.
# You can do this by calling the classifier’s .predict() method on the mystery_vector.
predictions = friends_classifier.predict(mystery_vector)
mystery_friend = predictions[0] if predictions[0] else "someone else"
# Uncomment the print statement:
print("The postcard was from {}!".format(mystery_friend))
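For comparison, scikit-learn documents the classifier's training method as `fit(X, y)`, with the feature matrix first and the labels second, which is the opposite order from the call in the script above. Below is a minimal sketch of that order on made-up toy data. The names `toy_docs`, `toy_labels` and `toy_postcard` are hypothetical, and this is not the course's official solution.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data: two "friends" with two short documents each.
toy_docs = [
    "storm ice ship",
    "ice ship channel",
    "tea garden letters",
    "garden tea visit",
]
toy_labels = [1, 1, 2, 2]
toy_postcard = "the ship was surrounded by ice"

vectorizer = CountVectorizer()
toy_vectors = vectorizer.fit_transform(toy_docs)         # 2D features (X)
postcard_vector = vectorizer.transform([toy_postcard])   # 2D, single row

classifier = MultinomialNB()
classifier.fit(toy_vectors, toy_labels)  # X first, then y

prediction = classifier.predict(postcard_vector)
print("Predicted label: {}".format(prediction[0]))
```

If the same order applies to the script above, the call would be `friends_classifier.fit(friends_vectors, friends_labels)`.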