FAQ: Support Vector Machines - Support Vectors and Margins

This community-built FAQ covers the “Support Vectors and Margins” exercise from the lesson “Support Vector Machines”.


FAQs on the exercise Support Vectors and Margins

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.


Hello,

I am unclear about the justification of the following sentence:

“If you are using n features, there are at least n+1 support vectors.”

Any help is appreciated :slight_smile:

3 Likes

I have the same question here :face_with_raised_eyebrow:

It’s just a guess, but I think at least n+1 support vectors are required to identify the hyperplane uniquely. Let the features be represented by variables x_1, ..., x_n. A hyperplane is then given by an equation of the form:

c_1 * x_1 + ... + c_n * x_n + b = 0

The method described in this lesson leads to a quadratic programming problem that finds the coefficients c_1, ..., c_n and the intercept b (n+1 unknowns) maximizing a certain distance under constraints given by many linear inequalities. If we keep only the inequalities that are actually binding at the solution, I think they correspond to the support vectors, and there would be at least n+1 of them.

I think this is similar to the fact that a system with n+1 unknowns needs at least n+1 equations to have a unique solution.
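
To see those n+1 unknowns concretely, here is a minimal sketch, assuming scikit-learn is available (the data is made up for illustration): fit a linear SVM on 2D data and inspect the n coefficients, the intercept, and the support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2D, so n = 2 features
X = np.array([[1, 1], [2, 1], [1, 2],    # class 0
              [4, 4], [5, 4], [4, 5]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1000)  # large C approximates a hard margin
clf.fit(X, y)

print(clf.coef_)       # the n coefficients c_1, c_2 of the hyperplane
print(clf.intercept_)  # the intercept b -- the "+1" unknown
print(len(clf.support_vectors_))  # expect at least n + 1 = 3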

I assume it is due to the following:

The sentence says “at least n+1 support vectors” for n features, which means you can have more.

Imagine a binary classification case with only 2 features, so n = 2. As shown in the lecture, to draw the margin lines you need AT LEAST a line touching one of the classes and a point touching the other class. From those you can measure the distance between the two classes and draw a margin band between them. That was a line and a point: the point requires one support vector, and the line, since it takes a minimum of two points to define a line, requires two support vectors. In total, 3 support vectors are required for a case with two features. With more features the same argument repeats in higher dimensions: a hyperplane in n dimensions is pinned down by n points on one margin plus one point on the other, which is why you need at least n+1 support vectors for a case with n features. You might end up needing more support vectors depending on the distribution and geometry of the data points.
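
To illustrate the “line plus a point” picture, here is a hedged sketch (again assuming scikit-learn; the points are invented for illustration): two points of one class pin down the margin line on their side, one point of the other class pins down the parallel margin line, and the remaining points sit safely away from the margin band.

```python
import numpy as np
from sklearn.svm import SVC

# The minimal 2D picture: a "line" of two points for one class,
# a single point for the other class, plus points far from the margin.
X = np.array([[0.0, 0.0], [4.0, 0.0],    # class 0: these two define the lower margin line
              [2.0, 3.0],                # class 1: this one defines the upper margin line
              [1.0, -2.0], [3.0, 5.0]])  # extra points well away from the margin band
y = np.array([0, 0, 1, 0, 1])

clf = SVC(kernel="linear", C=1000).fit(X, y)  # large C approximates a hard margin

print(clf.support_vectors_)  # expect only the first three points: n + 1 = 3
```

The extra points don’t show up as support vectors because removing them wouldn’t move the margin; only the points that the margin band actually touches constrain the solution.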