Why is the end index one higher than the index we want?

So - why is the end index one more than the last index that we want to include?

This has me stumped, I guess I could just accept it and move on but it bothers me!

In the example given:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[1:6]
print(sublist)

would yield:

['b', 'c', 'd', 'e', 'f']

But why 6, when ‘g’ is at index 6?

It is a numbering convention in many (nearly all?) computer languages, and, yes, you could just accept it and move on, but consider: what if you didn’t know how long the list was?

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[1:len(letters)]   # no "magic numbers" needed!
print(sublist)

# prints ['b', 'c', 'd', 'e', 'f', 'g']

If you haven’t learned the range() function yet, you soon will, and you will note a very similar numbering convention.
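For what it's worth, here is a minimal sketch of how range() lines up with slicing. The variable names indices and picked are just illustrative, not from the lesson.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# range(1, 6) produces 1, 2, 3, 4, 5: the stop value 6 is excluded,
# just as the end index is excluded from a slice.
indices = list(range(1, 6))
print(indices)

# Picking the items at those indices gives the same result as letters[1:6].
picked = [letters[i] for i in indices]
print(picked)
print(picked == letters[1:6])
```

Both range(1, 6) and letters[1:6] stop just before 6, so the two conventions can be used together without any adjustment.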


I’m a hardware engineer and have no issue with numbering systems starting at zero; I’m just stumped by this particular example. Logically:

letters[1:6]

would yield:

['b', 'c', 'd', 'e', 'f', 'g']

Well, the example I showed (that the length of the list is compatible with that second index) is enough for me, but between the two of us, we have only two votes against many decades of computer programming syntax.

So, better get used to it. Having learned and used the rule for a while, the “logic” will look a bit different to you.

… or not: your call.


I appreciate your help; I’m just trying to understand this particular syntax.

Your

letters[1:len(letters)]

example returns

['b', 'c', 'd', 'e', 'f', 'g']

as I would expect. In the exercise

letters[1:6]

returns

['b', 'c', 'd', 'e', 'f']

that’s my struggle here…


Hey @analogue-anomaly

Good question, I think it is/was explained why this happens in the Python 2 material. Not sure whether it’s also in the Python 3 stuff.

Anyway, please refer to the Python Docs, specifically note 4 under Common Sequence Operations.

s[i:j]    slice of s from i to j    (3)(4)

  (3) If i or j is negative, the index is relative to the end of sequence s: len(s) + i or len(s) + j is substituted. But note that -0 is still 0.
  (4) The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
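As a rough illustration of the "i <= k < j" rule quoted above, the sketch below rebuilds a slice by hand. The helper slice_by_definition is a made-up name for this example, and it only handles non-negative i and j; it is not how Python implements slicing internally.

```python
def slice_by_definition(s, i, j):
    """Rebuild s[i:j] for non-negative i and j using the rule i <= k < j."""
    n = len(s)
    i = min(i, n)  # an index greater than len(s) is treated as len(s)
    j = min(j, n)
    return [s[k] for k in range(n) if i <= k < j]

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

print(slice_by_definition(letters, 1, 6))
print(slice_by_definition(letters, 1, 6) == letters[1:6])      # k runs over 1..5
print(slice_by_definition(letters, 1, 100) == letters[1:100])  # j is clamped to len(s)
print(slice_by_definition(letters, 4, 2) == letters[4:2])      # i >= j gives an empty slice
```

Index 6 fails the test k < 6, which is why 'g' is left out of letters[1:6].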

Hope that explains why you’re seeing that behaviour.


Yes, because len(letters) is 7, not 6. You can always make use of that without needing to count anything. The length of the list is always the correct end index for a slice that runs to the end of the list, and the same holds for the stop value in range().


Thanks again guys - I found this useful thread also, and I think it’s finally sinking in.


I find it helpful to think about these by counting the commas.

So if:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[0:3]

it would yield

['a', 'b', 'c']

To visualize it differently, number the commas (the gaps between items):

letters = 0 'a' 1 'b' 2 'c' 3

Everything between comma 0 and comma 3 is in the list.

The whole of letters[0:7] looks like this:

0 'a' 1 'b' 2 'c' 3 'd' 4 'e' 5 'f' 6 'g' 7


Thank you!
It helped me a lot.

I am pretty new to coding, so take what I say with a grain of salt. I had the same question as you, but then I thought about it a different way:

Imagine that your list (letters ‘a’ to ‘g’) is printed in one row on a piece of paper. You then want to use a pair of scissors to cut letters ‘b’ to ‘f’ out of your list. You will then have a new paper list that only has the letters ‘b’ to ‘f.’

Then imagine that you are trying to precisely explain this to someone who doesn’t understand letters or why they are important to you, but is able to recognize what they are (e.g. like a small child). You have to tell them exactly where to make each cut. To keep things simple, you tell them to cut before a specific letter for each cut. So, going back to our example, you would tell them to cut before the letter ‘b’ and before the letter ‘g.’ That would result in a paper list of ‘b’ through ‘f.’

Now, going back to our code, the computer is like our small child who can recognize the letters, but doesn’t understand what they mean to us. We have to tell it precisely where to slice (cut) our list. So, we tell it to cut before index 1 and before index 6, which will result in our list ['b', 'c', 'd', 'e', 'f'].
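Here is a small sketch of the scissors idea in code. The names start, stop, left, middle and right are only illustrative.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# The two cut points: cut before index 1 and before index 6.
start, stop = 1, 6
middle = letters[start:stop]

# The number of items between two cut points is simply stop - start.
print(len(middle) == stop - start)

# Reusing the same cut points for the leftover pieces leaves
# no gaps and no overlaps.
left = letters[:start]
right = letters[stop:]
print(left, middle, right)
print(left + middle + right == letters)
```

Because each cut falls before an index, the same number can end one piece and start the next without any +1 or -1 adjustments.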


Could be so that [:4] = first 4 elements of a list and [-4:] = last 4 elements of a list. That consistency between selecting first n and last n elements in a list gives some value to doing it the way they did.
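A quick sketch of that symmetry, assuming a list with at least n items (the names n, first_n and last_n are just for illustration):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
n = 4

first_n = letters[:n]    # the first 4 elements
last_n = letters[-n:]    # the last 4 elements

print(first_n)
print(last_n)
print(len(first_n) == len(last_n) == n)

# One caveat: -0 is just 0, so letters[-0:] is the whole list,
# not an empty "last 0 elements" slice.
print(letters[-0:] == letters)
```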

Or maybe there’s some technical reason hardly anyone knows about.

The way I think about it is letters[1:6] = letters[start of included index: start of excluded index]


It is actually a rule that exploits the convention that indices always start at 0.
Think about this: if you had the following list

fruits = ["apple", "cherry", "pineapple", "orange", "mango"]

and you want to select the first 3 elements, you must write:

fruits[0:3]

or

fruits[:3]

Does that make sense?


I am not sure either, but if I were to speculate, I would guess it has to do with statistics. In stats we learn that to find the range of a data series or sequence, we subtract the lowest value from the highest value in the series. For example, say you have the following:

1, 2, 3, 4, 7, 9

In this case the highest value is 9 and the lowest value is 1, so the range would be 8. If you compare this with the way range and length are calculated, you may see why it might be so.

There’s a great section of the docs for understanding slicing and indexes at: https://docs.python.org/3/tutorial/introduction.html#strings (you’ll need to scroll down a bit till it mentions slicing/indexing).

Here’s a direct quote from that section-

One way to remember how slices work is to think of the indices 
as pointing *between* characters, with the left edge of the first
character numbered 0. Then the right edge of the last character
of a string of n characters has index n, for example:
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

There’s more useful info on slicing in that document (it’s the tutorials on the Python site itself) so have a wee view if you wanted more info.
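To try the diagram out yourself, here is a short sketch using the same "Python" string (the variable name word is an assumption for this example):

```python
word = "Python"

# Indices point at the edges between characters, so a slice takes
# everything between the two edges.
print(word[0:2])   # between edge 0 and edge 2
print(word[2:6])   # between edge 2 and edge 6 (the right edge of 'n')

# Negative numbers label the same edges counted from the right.
print(word[-2:])                 # from edge -2 to the end
print(word[-6:-4] == word[0:2])  # edges -6 and -4 are edges 0 and 2
```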


I think of it this way. I imagine slicing the list just before each of the two numbers, leaving you with the desired section of the list.

An alternative convention would be to slice just before the first number and just after the second number, which I see nothing wrong with, other than it being a little inconsistent.

I think maybe it’s so we can compensate for the 0-index of the lists without resorting to n-1 and do this cleanly, as in the next lesson, for instance:

If we want to select the first n elements of a list, we could use the following code:

fruits[:n]