Date-a-Scientist ML Capstone Project: Input Variables with Inconsistent Samples

Hi! I’ve been working on the Date-a-Scientist capstone project for the Machine Learning course for a while, but I cannot get past an error in my code. The error reads: “Found input variables with inconsistent numbers of samples…”

I previously had trouble clearing NaN values out of the data, and I’m worried I did something wrong there that caused this. I’m probably missing a step in cleaning my data. So far my data-cleaning code looks like this:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# scale every feature column to the [0, 1] range
x = features_data.values
min_max_scaler = MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
features_data = pd.DataFrame(x_scaled, columns=features_data.columns)
# replace missing sign labels with empty strings
signs_labels = df['sign'].replace(np.nan, '', regex=True)

#split data into train and test sets
X_train, y_train, X_test, y_test = train_test_split(features_data, signs_labels, test_size=0.2, random_state=1)
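
One thing worth checking in the line above is the unpacking order. `train_test_split` returns both feature splits first and then both label splits, so unpacking into `X_train, y_train, X_test, y_test` would bind the test features to the name `y_train`. If a model is later fit on `X_train` and `y_train`, that alone could produce the "inconsistent numbers of samples" error. Below is a minimal sketch of the documented order, using small made-up data (`demo_features` and `demo_labels` are hypothetical stand-ins, not the course data):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for features_data and signs_labels (10 rows each)
demo_features = pd.DataFrame({"age": np.arange(10), "height": np.arange(10, 20)})
demo_labels = pd.Series(["leo", "virgo"] * 5, name="sign")

# train_test_split returns the feature splits first, then the label splits
X_train, X_test, y_train, y_test = train_test_split(
    demo_features, demo_labels, test_size=0.2, random_state=1
)

# Each feature split should have as many rows as its matching label split
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

With this order, `X_train` and `y_train` come from the same rows, so their lengths match.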

I tried putting the data into an array, but each time it came back with an error that read “too many indices for array”…
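
A second possibility, given the earlier NaN cleanup, is that rows were dropped from the features but not from the labels, which would leave the two with different lengths. The sketch below shows one way to drop incomplete rows from both in a single step and then compare shapes. The data and the column names in `feature_columns` are made up for illustration. It also shows that the labels are one-dimensional, and indexing a 1-D array with two indices is what raises "too many indices for array", so that error may be related:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the course DataFrame `df`
demo_df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 41.0, 29.0],
    "height": [65.0, np.nan, 70.0, 68.0, 72.0],
    "sign": ["leo", "virgo", "aries", np.nan, "gemini"],
})
feature_columns = ["age", "height"]  # assumed feature names

# Drop incomplete rows from features and labels together so they stay aligned
clean_df = demo_df.dropna(subset=feature_columns + ["sign"])
demo_features = clean_df[feature_columns]
demo_labels = clean_df["sign"]

print(demo_features.shape)  # 2-D: (rows, columns)
print(demo_labels.shape)    # 1-D: (rows,)
print(len(demo_features) == len(demo_labels))
```

Printing both shapes right before the split is a quick way to confirm that the features and labels have the same number of rows.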

I’m a beginner with coding and Python, so the error might be super obvious and I’m just missing it. Thanks for any help!