How to select only last names?

Goal:

In this lesson, the second task requests that you take the list of author names and convert it to just a list of last names. Here I will break down and explain each step in that process to help those struggling with this task.

Note that this explanation spoils the solution, so if you want to solve the task yourself, you may prefer not to read on until you have done so. For that reason, each code block is spoilered out and can be clicked to see the code.

Explanation

To break down how to solve this, we should first identify the format our data is already in, then work out what steps are necessary to get it into the form we want. After the first task, you should have a list that looks like this:
['Audre Lorde', 'Gabriela Mistral', 'Jean Toomer', 'An Qi', 'Walt Whitman', 'Shel Silverstein', 'Carmen Boullosa', 'Kamala Suraiyya', 'Langston Hughes', 'Adrienne Rich', 'Nikki Giovanni']
It is a list of strings each containing a first and last name with a space in between. So we want to isolate the last names. There are other solutions but I will use the one hinted at. The steps involved:

  1. Break the first and last name apart using .split()
  2. Isolate the last name using a negative index
  3. Create a new list of last names using .append()

So let’s take the first step. We are already familiar with using .split() to break strings apart. This time we want to run it once on each item in the list, so we will use a for loop. To split the first and last name, the dividing character is a space ' ', but splitting on whitespace is the default behaviour of .split(), so we can call it without any arguments. This is what our code will look like so far:

for name in author_names:
  name.split()

This won’t do anything useful as is, since we aren’t actually putting the split name anywhere. This will be addressed in step 3.
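
For anyone who wants to see what step 1 produces before moving on, here is a small sketch. The two-name sample list and the names parts and split_results are only for illustration; they are not part of the exercise.

```python
# Illustrative sample; the exercise's real list is longer.
sample_names = ['Audre Lorde', 'Gabriela Mistral']

split_results = []
for name in sample_names:
    parts = name.split()  # no argument: split on any run of whitespace
    split_results.append(parts)
    print(parts)
# ['Audre', 'Lorde']
# ['Gabriela', 'Mistral']
```
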
On to step two. In each iteration of the loop, name.split() returns a list that looks like ['firstname', 'lastname']. We don’t need the first name, so we can use indexing to select the last name. Either [1] or [-1] works here. The advantage of [-1] is that if a name also had a middle name, or additional family names, only the final one would be selected. Since name.split() returns the split list, we can add [-1] directly after it to get the chosen item instead, taking the result from ['firstname', 'lastname'] to just 'lastname'. Note that we are not indexing the original string, which would return a single character; we are indexing the list of two names.
Our code looks like this now:

for name in author_names:
  name.split()[-1]
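
To make the difference between [1] and [-1] concrete, here is a rough sketch. The three-part name is borrowed from the authors string posted later in this thread; the variable names are my own.

```python
# Sample names chosen to show the difference between the two indexes.
two_part = 'Audre Lorde'
three_part = 'William Carlos Williams'

print(two_part.split()[1])     # Lorde
print(two_part.split()[-1])    # Lorde
print(three_part.split()[1])   # Carlos (the middle name, not what we want)
print(three_part.split()[-1])  # Williams

# Indexing the string itself gives a single character instead.
print(three_part[-1])          # s
```
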

Finally we need to send the names into a new list. This can be done by creating an empty list outside the for loop (so it isn’t rewritten each time) and using .append() as below:

author_last_names = []
for name in author_names:
  author_last_names.append(name.split()[-1])
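
If you want to check the result yourself, the three steps can be put together into one runnable sketch. It assumes author_names holds the list shown near the top of this post.

```python
author_names = ['Audre Lorde', 'Gabriela Mistral', 'Jean Toomer', 'An Qi',
                'Walt Whitman', 'Shel Silverstein', 'Carmen Boullosa',
                'Kamala Suraiyya', 'Langston Hughes', 'Adrienne Rich',
                'Nikki Giovanni']

author_last_names = []  # created once, outside the loop
for name in author_names:
    # split into words, keep the last word, add it to the new list
    author_last_names.append(name.split()[-1])

print(author_last_names)
# ['Lorde', 'Mistral', 'Toomer', 'Qi', 'Whitman', 'Silverstein', 'Boullosa', 'Suraiyya', 'Hughes', 'Rich', 'Giovanni']
```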

I hope you’ve found this explanation useful. Please discuss below and ask any followup questions.


What does the negative index mean in author_last_names.append(i.split()[-1])?

I tried the variations below in my script, but I still don’t understand why and how it works. Please explain. Thank you.
[0] returns the first name; [-1] or [1] returns the last name.

We can use negative indexes to access a list from the right-hand side, so -1 is the rightmost element of a list or string:

print(['a', 'b', 'c', 'd'][-1])  # output: d

so -2 would give c
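
A slightly longer sketch of the same idea, in case it helps. The names letters and word are just for illustration.

```python
letters = ['a', 'b', 'c', 'd']

print(letters[-1])  # d
print(letters[-2])  # c
print(letters[-4])  # a (same element as letters[0])

# For k from 1 to len(seq), seq[-k] is the same as seq[len(seq) - k].
print(letters[len(letters) - 1])  # d

word = 'abcd'
print(word[-1])  # d (strings support the same indexing)
```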


I see. Thank you so much

Can someone explain what the [-1] contributes to the code?

author_last_names = []
for name in author_names:
  author_last_names.append(name.split()[1])

print(author_last_names)

-1 gets the last element from a list. You use negative values to access the list from the right-hand side.

authors = "Audre Lorde, William Carlos Williams, Gabriela Mistral, Jean Toomer, An Qi, Walt Whitman, Shel Silverstein, Carmen Boullosa, Kamala Suraiyya, Langston Hughes, Adrienne Rich, Nikki Giovanni"

author_names = authors.split(",")
author_last_names = []
for i in author_names:
  temp = i.split()[::-1]  # split on whitespace, then reverse so the last name comes first
  author_last_names.append(temp[0])
print(author_last_names)

Why doesn’t the -1 reference only the very last letter in the author’s names? Earlier in the lesson, we used -1 to select the very last letter in a string, but here it means the last name entirely. I must be missing something.


If you have a list of names and you get the last item, then you’ll get a name.
Your confusion seems to be more about what’s in your list (or whatever it is) than about what -1 means as an index; i.e. you would see the same difference for index 0, which is a simpler case.


Thanks for the quick response! After playing around a little bit, I think I figured it out.

authors = "Audre Lorde, William Carlos Williams, Gabriela Mistral, Jean Toomer, An Qi, Walt Whitman, Shel Silverstein, Carmen Boullosa, Kamala Suraiyya, Langston Hughes, Adrienne Rich, Nikki Giovanni"

author_names = authors.split(", ")

print(author_names)

author_last_names = []
for name in author_names:
  author_last_names.append(name.split()[-1])
  
print(authors[-1])
  
print(author_names[-1])
  
print(author_last_names)

Those final three print lines helped me distinguish between the different kinds of selection. authors[-1] selects the last letter only, because authors is a single string.


I wouldn’t call them different kinds of selection. You’re asking differently typed values for their value at index -1, and they return some value they think is cool.
You’re asking the same way and they behave the same way. You’re just asking from different things.


Yes, I’m asking for the very last element in those lists in the same way each time, but an element in the authors list is a single letter, whereas an element in the author_last_names list is the entire last name, correct?

Thanks again!


authors isn’t a list at all, and it’s not really a container either (it’s a bunch of text), but it can produce substrings so it does behave a bit like a container

The elements of a list are whatever you put in it, and the “elements” of a string are … more strings (characters, but there’s no character type).


I said list because early in this lesson, this was written:

A string can be thought of as a list of characters.

Like any other list, each character in a string has an index.


What they mean or should mean is that they both implement behaviours like iteration, access by index, concatenation, and whatever else.
But you can pretty quickly dismiss the idea that they are or can be thought of as lists if you try to do something that only the other one is capable of. Try to upcase a list. Or try to append to a string.

Many types implement those behaviours, but they won’t implement the same set of behaviours and they won’t necessarily mean the same things due to being different things.
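
A quick sketch of that experiment, if you’d like to try it. The sample values are made up for illustration, and the exact wording of the error messages can vary between Python versions, so none is shown here.

```python
names = ['audre', 'lorde']
text = 'audre lorde'

print(text.upper())  # AUDRE LORDE

try:
    names.upper()  # lists have no upper() method
except AttributeError as error:
    print('list:', error)

try:
    text.append('!')  # strings have no append() method
except AttributeError as error:
    print('str:', error)

# Both still support the shared behaviours, such as access by index.
print(names[-1], text[-1])  # lorde e
```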


Thank you very much for your clarification. That makes sense. That original tip (“They’re all lists!”) must have confused me slightly.

You can implement these behaviours yourself, for example, iteration and subscription (access by index)

class Derp:
    def __iter__(self):
        for _ in range(5):
            yield 'Meow'

    def __getitem__(self, key):
        return 'received key: {}'.format(key)


myvalue = Derp()  # create a Derp
print(myvalue[3])  # a string saying 3 was the key

for value in myvalue:
    print(value)  # 'Meow' 5 times in total

For Python 2, the first line should instead be:

class Derp(object):

The main thing to note is that it’s Derp that defines what happens. These things aren’t done to the values; rather, the values are asked, “please do this and that.”

It’s also the main idea in object-oriented programming (values that are in charge of the data they contain and the behaviour they have). You can go right ahead and ignore all the other things that nobody really seems able to explain the usefulness of but will still say are useful.


Amazing… but how does it work?
[::-1]
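
In case it helps: as far as I understand it, [::-1] is a slice with no start, no stop and a step of -1, so it walks the sequence backwards and returns a reversed copy. In the earlier post that stores the reversed split in temp, this puts the last name first, which is why temp[0] picks it out. A small sketch (the variable names are mine):

```python
# The leading space mimics what splitting the authors string on "," leaves behind.
parts = ' William Carlos Williams'.split()
print(parts)  # ['William', 'Carlos', 'Williams']

reversed_parts = parts[::-1]  # step of -1: a new list in reverse order
print(reversed_parts)     # ['Williams', 'Carlos', 'William']
print(reversed_parts[0])  # Williams
print(parts[-1])          # Williams (same result without building a reversed copy)

print('abc'[::-1])  # cba (slicing a string works the same way)
```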

If you are referencing -1 of a string, it will return the last letter in the string. For example: str1 = 'hello there', str1[-1] = 'e'
If you are referencing -1 of a list containing strings, it will return the last string in the list. For example: lst1 = ['hello', 'there'], lst1[-1] = 'there'


Hi, first time poster. I had to see the solution for this one, and was wondering how an alternative would work so I tried the below:

author_last_names = []
for name in author_names:
  author_last_names += name.split()[-1]

When I print author_last_names, it prints each letter of the last names separately, like: ['L', 'o', 'r', 'd', 'e', 'M', 'i', 's', 't', 'r', 'a', 'l', 'T', 'o', 'o', 'm', 'e', 'r', 'Q', 'i', 'W', 'h', 'i', 't', 'm', 'a', 'n', 'S', 'i', 'l', 'v', 'e', 'r', 's', 't', 'e', 'i', 'n', 'B', 'o', 'u', 'l', 'l', 'o', 's', 'a', 'S', 'u', 'r', 'a', 'i', 'y', 'y', 'a', 'H', 'u', 'g', 'h', 'e', 's', 'R', 'i', 'c', 'h', 'G', 'i', 'o', 'v', 'a', 'n', 'n', 'i']

How would I correct this code? Why does it split at each letter?

Thanks!
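
A possible explanation, as far as I understand it: += on a list behaves like .extend(), which adds each item of the right-hand side one by one. The right-hand side here is a string, and iterating over a string gives its characters, so each letter lands in the list separately. Wrapping the last name in its own one-item list (or going back to .append()) keeps it whole. A sketch with a shortened list; the name letters is mine:

```python
author_names = ['Audre Lorde', 'An Qi']  # shortened sample for illustration

letters = []
for name in author_names:
    letters += name.split()[-1]  # extends the list with each character of the string
print(letters)  # ['L', 'o', 'r', 'd', 'e', 'Q', 'i']

author_last_names = []
for name in author_names:
    author_last_names += [name.split()[-1]]  # one-item list, so the name stays whole
print(author_last_names)  # ['Lorde', 'Qi']
```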
