import codecademylib3_seaborn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
data = load_breast_cancer()
x = data.target
y = data.target_names
x_train, y_train, x_test, y_test = train_test_split(x, y, train_size = 0.8, test_size = 0.2, random_state = 10)
Could someone tell me what’s wrong with the train_test_split? I am trying to create a training set and a validation set, but it did not work.

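The problem is in what you pass as features and labels. data.target holds the labels, not the features, and data.target_names holds only the two class names, so your x and y do not even have the same length. Also note that train_test_split returns its results in the order x_train, x_test, y_train, y_test. From the load_breast_cancer documentation: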

Dictionary-like object, the interesting attributes are:
‘data’, the data to learn, ‘target’, the classification labels,
‘target_names’, the meaning of the labels, ‘feature_names’, the
meaning of the features, and ‘DESCR’, the full description of
the dataset, ‘filename’, the physical location of
breast cancer csv dataset

To access the features (X), I used data.data. For the labels that go with those features (y), I used data.target.
Within data.data there are 30 features, including ‘mean radius’, ‘mean texture’,
‘mean perimeter’, ‘mean area’, ‘mean smoothness’, ‘mean compactness’, etc.
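
If you want to check this yourself, here is a quick sketch (assuming scikit-learn is installed; the same_length variable is just an illustrative name) that prints the shapes of the attributes and shows why data.target_names cannot serve as y:

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# Features: one row per sample, one column per feature
print(data.data.shape)        # (569, 30)
# Labels: one entry per sample
print(data.target.shape)      # (569,)
# Class names: only two entries, so this cannot be used as y
print(data.target_names)      # ['malignant' 'benign']

# train_test_split needs X and y to have the same number of rows
same_length = len(data.target) == len(data.target_names)
print(same_length)            # False
```

Since data.target has 569 entries and data.target_names has only 2, passing them together to train_test_split fails with an error about inconsistent numbers of samples.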

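from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data  # features; you could also store a subset of the columns in X to train your model with
y = data.target  # labels, one per row of X
label_names = data.target_names  # names of the labels
feature_names = data.feature_names  # names of the feature columns

# train_test_split returns the splits in this order: X train, X test, y train, y test
x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.8, test_size=0.2)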
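
To confirm the split came out the way you expect, something like the following may help. It is only a sketch: random_state=10 is carried over from the question, and the StandardScaler pipeline is my own assumption (not something the question asked for) so that LogisticRegression converges with its default settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target

# random_state=10 is taken from the question so the split is reproducible
x_train, x_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, test_size=0.2, random_state=10
)

# 569 samples split 80/20
print(x_train.shape, x_test.shape)  # (455, 30) (114, 30)
print(y_train.shape, y_test.shape)  # (455,) (114,)

# Assumption: scale the features first so LogisticRegression converges with defaults
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(x_train, y_train)

# Fraction of validation samples classified correctly
accuracy = model.score(x_test, y_test)
print(accuracy)
```

If the shapes of x_train and y_train agree on the first dimension, and likewise for x_test and y_test, the variables were unpacked in the right order.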
