Why is the end index one higher than the index we want?

So - why is the end index one more than the last index that we want to include?

This has me stumped. I guess I could just accept it and move on, but it bothers me!

In the example given:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[1:6]
print(sublist)

would yield:

['b', 'c', 'd', 'e', 'f']

But why 6 when ‘g’ is in the sixth index?

It is a numbering convention in many (nearly all?) computer languages, and, yes, you could just accept it and move on, but consider: what if you didn’t know how long the list was?

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[1:len(letters)]   # no "magic numbers" needed!
print(sublist)

# prints ['b', 'c', 'd', 'e', 'f', 'g']

If you haven’t learned the range() function yet, you soon will, and you will note a very similar numbering convention.
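To make the range() comparison concrete, here is a small sketch (the variable names indexes and picked are just mine for illustration). It shows that range(1, 6) stops before 6 in the same way the slice [1:6] does:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# range(1, 6) produces 1, 2, 3, 4, 5: the stop value 6 is excluded
indexes = list(range(1, 6))

# Picking those indexes one by one gives the same items as the slice
picked = [letters[i] for i in indexes]
sublist = letters[1:6]

print(indexes)   # [1, 2, 3, 4, 5]
print(picked)    # ['b', 'c', 'd', 'e', 'f']
print(sublist)   # ['b', 'c', 'd', 'e', 'f']
```

Both use a start that is included and a stop that is not.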


So, I’m a hardware engineer and have no issue with numbering systems starting at zero. I’m just stumped by this particular example. Logically:

letters[1:6]

would yield:

['b', 'c', 'd', 'e', 'f', 'g']

Well, the example I showed - that the length of the list is compatible with that second index - is enough for me, but between the two of us, we have only two votes against many decades of computer programming syntax.

So, better get used to it. Having learned and used the rule for a while, the “logic” will look a bit different to you.

… or not: your call.


I appreciate your help, I’m just trying to understand this particular syntax :+1:

Your

letters[1:len(letters)]

example returns

['b', 'c', 'd', 'e', 'f', 'g']

as I would expect. In the exercise

letters[1:6]

returns

['b', 'c', 'd', 'e', 'f']

that’s my struggle here…


Hey @analogue-anomaly

Good question, I think it is/was explained why this happens in the Python 2 material. Not sure whether it’s also in the Python 3 stuff.

Anyway, please refer to the Python Docs, specifically note 4 under Common Sequence Operations.

s[i:j]: slice of s from i to j (notes 3 and 4)

  3. If i or j is negative, the index is relative to the end of sequence s: len(s) + i or len(s) + j is substituted. But note that -0 is still 0.
  4. The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
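If it helps, the rules quoted above can be checked directly in a few lines. This is only a sketch; the names s, i, j and by_definition are mine, chosen to mirror the wording of the docs:

```python
s = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
i, j = 1, 6

# The slice is every item with index k such that i <= k < j
by_definition = [s[k] for k in range(len(s)) if i <= k < j]
print(by_definition == s[i:j])        # True

# An end index greater than len(s) is treated as len(s)
print(s[1:100])                       # ['b', 'c', 'd', 'e', 'f', 'g']

# A negative index is relative to the end: -1 stands for len(s) - 1
print(s[1:-1] == s[1:len(s) - 1])     # True

# If i is greater than or equal to j, the slice is empty
print(s[4:2])                         # []
```

The "k < j" part is the whole answer to the original question: index 6 itself is never included in s[1:6].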

Hope that explains why you’re seeing that behaviour. :slight_smile:


Yes, because len(letters) is 7, not 6. You can always make use of that without needing to count anything. The length of the list will always give you the correct second index for a slice, or when using range().
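A quick sketch of that point, using the same list as the exercise: 'g' sits at index 6, the length is 7, and 7 is fine as a slice end even though it is not a valid index on its own.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

print(len(letters))    # 7
print(letters[6])      # g  (the last valid index is len(letters) - 1)
print(letters[1:7])    # ['b', 'c', 'd', 'e', 'f', 'g']
print(letters[1:7] == letters[1:len(letters)])   # True

# Indexing past the end raises an error, but len(letters) works as a slice end
try:
    letters[7]
except IndexError as error:
    print("IndexError:", error)
```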


Thanks again guys - I found this useful thread also, and I think it’s finally sinking in :+1:


I find it helpful to think about these by counting the commas.

So if:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
sublist = letters[0:3]

would yield

'a', 'b', 'c'

To visualize it differently, number the commas (the boundaries between items):

letters = 0 'a' 1 'b' 2 'c' 3

Everything between comma 0 and comma 3 is in the slice :slight_smile:

The whole letters[0:7] looks like this:

0 'a' 1 'b' 2 'c' 3 'd' 4 'e' 5 'f' 6 'g' 7
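One nice consequence of numbering the boundaries rather than the items is that a cut at any boundary splits the list cleanly. A small sketch (left, right and k are just illustrative names):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Cut at boundary 3: everything to its left, and everything to its right
left = letters[0:3]
right = letters[3:7]

print(left)    # ['a', 'b', 'c']
print(right)   # ['d', 'e', 'f', 'g']

# No item is lost or duplicated at the cut, whichever boundary k we choose
for k in range(len(letters) + 1):
    assert letters[:k] + letters[k:] == letters
```

The same number 3 ends one piece and starts the next, with no +1 or -1 needed.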


Thank you!
It helped me a lot.

I am pretty new to coding, so take what I say with a grain of salt. I had the same question as you, but then I thought about it a different way:

Imagine that your list (letters ‘a’ to ‘g’) is printed in one row on a piece of paper. You then want to use a pair of scissors to cut letters ‘b’ to ‘f’ out of your list. You will then have a new paper list that only has the letters ‘b’ to ‘f.’

Then imagine that you are trying to precisely explain this to someone who doesn’t understand letters or why they are important to you, but is able to recognize what they are (e.g. like a small child). You have to tell them exactly where to make each cut. To keep things simple, you tell them to cut before a specific letter for each cut. So, going back to our example, you would tell them to cut before the letter ‘b’ and before the letter ‘g.’ That would result in a paper list of ‘b’ through ‘f.’

Now, going back to our code, the computer is like our small child who can recognize the letters, but doesn’t understand what they mean to us. We have to tell it precisely where to slice (cut) our list. So, we tell it to cut before index 1 and before index 6, which will result in our list ['b', 'c', 'd', 'e', 'f'].


It could be so that [:4] gives the first 4 elements of a list and [-4:] gives the last 4 elements. That consistency between selecting the first n and the last n elements of a list gives some value to doing it the way they did.

Or maybe there’s some technical reason hardly anyone knows about.

The way I think about it is letters[1:6] = letters[start of included index: start of excluded index]
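Here is a sketch of the first-n / last-n symmetry mentioned above (n, first_n and last_n are illustrative names). One caveat I am fairly sure of: it breaks down for n = 0, because -0 is just 0.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
n = 4

first_n = letters[:n]
last_n = letters[-n:]

print(first_n)   # ['a', 'b', 'c', 'd']
print(last_n)    # ['d', 'e', 'f', 'g']

# Caveat: with n = 0, letters[:0] is empty but letters[-0:] is the whole list
print(letters[:0])    # []
print(letters[-0:])   # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```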


It is actually a rule that exploits the convention that indices always start at 0.
Think about this: if you had the following list

fruits = ["apple", "cherry", "pineapple", "orange", "mango"]

and you want to select the first 3 elements, you must write:

fruits[0:3]

or

fruits[:3]

Does that make sense?
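A short sketch of why this is convenient, using the same fruits list (n, start and stop are illustrative names): the number of items you get back is simply stop minus start, as long as both stay within the list.

```python
fruits = ["apple", "cherry", "pineapple", "orange", "mango"]

n = 3
first_n = fruits[:n]
print(first_n)             # ['apple', 'cherry', 'pineapple']
print(len(first_n) == n)   # True

# With a start other than 0, the length is still just stop - start
start, stop = 1, 4
middle = fruits[start:stop]
print(middle)                        # ['cherry', 'pineapple', 'orange']
print(len(middle) == stop - start)   # True
```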


I am not sure either, but if I were to speculate, I would guess it has to do with statistics. In stats we learn that to find the range of a data series, we subtract the lowest value from the highest value. E.g., say you have the following:

1, 2, 3, 4, 7, 9

In this case the highest value is 9 and the lowest value is 1, so the range is 8. If you compare this with the way range and length are calculated, you may see why it’s so.

There’s a great section of the docs for understanding slicing and indexes at: https://docs.python.org/3/tutorial/introduction.html#strings (you’ll need to scroll down a bit till it mentions slicing/indexing).

Here’s a direct quote from that section-

One way to remember how slices work is to think of the indices 
as pointing *between* characters, with the left edge of the first
character numbered 0. Then the right edge of the last character
of a string of n characters has index n, for example:
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

There’s more useful info on slicing in that document (it’s the tutorials on the Python site itself) so have a wee view if you wanted more info.
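For anyone who wants to try the diagram out, here is a small sketch with the same word. The variable name word is my choice; each slice takes the characters between the two numbered edges in the picture:

```python
word = 'Python'

print(word[0:2])    # Py   (between edge 0 and edge 2)
print(word[2:5])    # tho  (between edge 2 and edge 5)
print(word[-2:])    # on   (from edge -2 to the end)

# Cutting at one edge and gluing the halves back gives the original string
print(word[:2] + word[2:])   # Python
```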


I think of it this way. I imagine slicing the list just before each of the two numbers, leaving you with the desired section of the list.

An alternative convention would be to slice just before the first number and just after the second number, which I see nothing wrong with, other than it being a little inconsistent.
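To see what that alternative would look like in practice, here is a sketch with a made-up helper, inclusive_slice. It is not a Python built-in, just my illustration of an "end included" convention built on top of normal slicing:

```python
def inclusive_slice(seq, first, last):
    """Hypothetical helper: items from index first through index last, both included."""
    return seq[first:last + 1]

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Under this convention, 1 and 5 would select 'b' through 'f'
print(inclusive_slice(letters, 1, 5))   # ['b', 'c', 'd', 'e', 'f']

# Splitting the list in two now needs k - 1 on one side and k on the other
k = 3
left = inclusive_slice(letters, 0, k - 1)
right = inclusive_slice(letters, k, len(letters) - 1)
print(left + right == letters)          # True
```

It works, but the -1 adjustments show up wherever two pieces meet, which is probably the inconsistency being described.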


I think maybe it’s so we can compensate for the 0-index of the lists without resorting to n-1 and do this cleanly, as in the next lesson, for instance:

If we want to select the first n elements of a list, we could use the following code:

fruits[:n]

For anyone still struggling with this convention, I find that relating it to video editing can help.

If you are editing a short video that has all the shots lined up: shot 1, shot 2, shot 3; but you want to keep shot 1 and 2 only, you would need to place your cutter (slicing tool) on the 1st frame of shot 3. If you slice at the end of shot 2, you will get shot 3 + 1 frame of shot 2. So, you really need to cut at the 1st frame of the next shot, leaving shot 1 and shot 2 intact.

Not sure if it helps unless people have experience in video editing, but hopefully it does.

Here is my interpretation:
From “Slicing Lists II”:
If we want to select the first n elements of a list, we could use the following code:

fruits[:n]

note: starting point is not defined, just how many first elements

So based on that in

letters[1:6]

Take first 6 elements (up to letter ‘f’), slice the first (1) off, you’re left with b - f. It is not index 1 to index 6 I think.
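That reading can be checked in a couple of lines. In this sketch (first_six and without_first are illustrative names), taking the first 6 elements and then dropping the first one gives the same list as the single slice:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

first_six = letters[:6]
without_first = first_six[1:]

print(first_six)       # ['a', 'b', 'c', 'd', 'e', 'f']
print(without_first)   # ['b', 'c', 'd', 'e', 'f']
print(without_first == letters[1:6])   # True
```

So both descriptions agree: "the first 6 minus the first 1" selects the same items as "from index 1 up to, but not including, index 6".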

That is telling enough so long as we recognize 0 as the first index. (If we don’t by now, we’ve been sleeping.)

Clearly, [1:6] is excluding the first index so there will only be five items in the slice.

>>> 'abcdefghijklmnopqrstuvwxyz'[1:6]
'bcdef'
>>>

At any length, you are right to have some doubt as long as you rectify it before carrying forward. That is good. Your interpretation is well grounded and meaningful. Take something away from this one.
