How can I find all indices where a substring appears in a string?

Question

In the context of this exercise, how can I find all indices where a substring appears in a string?

Answer

In this exercise, we were introduced to the .find() method, but it will only return the first index where the substring appears in a string.

To obtain all the indices where a substring appears, you can use a loop to iterate over the entire string, and keep track of each match’s starting index.

target = "abc"
string = "abcdababcd"

indices = []
for i in range(len(string)):
  if string[i:i+len(target)] == target:
    indices.append(i)

print(indices) # [0, 6]

Can you explain what [i:i…] is? I don’t understand what the i:i means.

string[start:end] is string slicing. Has string or list slicing been taught yet?

Feel free to use print to see the values. For example, you could do:

for i in range(len(string)):
   print(i, i+len(target), string[i:i+len(target)])

So i:i is a method to select just one specific index?

i is just a variable in this example. As I mentioned, the general syntax is:

string[start:end]

which results in a slice. A slice can be a single character, but it can also be multiple characters.

To get a single character, you can do the following:

string[index]

Again, this is just the general syntax.
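
To make the difference concrete, here is a small sketch that reuses the string from the answer above. The positions 6, 7 and 9 are picked only for illustration.

```python
string = "abcdababcd"

# Indexing returns exactly one character.
print(string[6])      # a

# Slicing returns the characters from start up to, but not including, end.
print(string[6:9])    # abc
print(string[6:7])    # a  (a one-character slice)

# When start and end are equal, the slice is empty.
print(string[6:6] == "")  # True
```

So a literal i:i would select nothing. The original code uses i:i+len(target), which selects len(target) characters starting at i.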

target = "abc"
string = "abcdababcd"

indices = []
for i in range(len(string)):
  if string[i:i+len(target)] == target:
    indices.append(i)

I would make one change to the above code to avoid iterating over the entire string, which is unnecessary: once fewer than len(target) characters remain, no match is possible.

for i in range(len(string) - len(target) + 1):
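
Put together with the rest of the loop, the suggested change might look like the sketch below. It should give the same result as the original version for this input.

```python
target = "abc"
string = "abcdababcd"

indices = []
# The last position where target can still fit is len(string) - len(target).
for i in range(len(string) - len(target) + 1):
    if string[i:i + len(target)] == target:
        indices.append(i)

print(indices)  # [0, 6]
```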

Thank you… It’s now making sense!

I know this is an old thread, but why doesn’t this code throw an index error? Wouldn’t string[i+len(target)] exceed the last index of the string? I don’t see how, for example, i = 8 wouldn’t throw an error.

Slicing, unlike indexing, tolerates out-of-range bounds: a slice that runs past the end is simply cut short. There is a good explanation here:

python - Why does substring slicing with index out of range work? - Stack Overflow

(the second answer)
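
A quick way to see this for yourself, using the same string and the i = 8 case from the question (this is only an illustrative sketch):

```python
string = "abcdababcd"   # length 10, valid indices 0 through 9
target = "abc"
i = 8

# A slice that runs past the end is cut off at the end of the string.
print(string[i:i + len(target)])   # cd

# Indexing a single position past the end raises IndexError.
try:
    string[i + len(target)]
except IndexError as error:
    print("IndexError:", error)
```

Because the shortened slice "cd" is not equal to "abc", the comparison is just False and the loop moves on.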

QUERY:-
1.

In the code editor is the first line of Gabriela Mistral’s poem God Wills It.

At what index place does the word “disown” appear? Save that index place to the variable disown_placement.

Checkpoint

Here’s my code:

god_wills_it_line_one = "The very earth will disown you"
disown_placemet = god_wills_it_line_one

print(disown_placemet.find("disown"))

I got the output 20, so why am I still unable to continue to the next unit? The code in the solution was different, but both methods give the answer 20, and I still can’t continue to the next unit of Strings part 2.

Codecademy runs a series of checks to make sure that you performed the tasks as directed. The directions ask you to:

  • “Save the index place to the variable disown_placement.”

In the code you shared above, you did print the correct index, but you did not save the index to the variable as directed. The Codecademy solution checker is looking at the contents of your disown_placement variable and finding the original string instead of the index value, 20.
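
A minimal sketch of what the directions seem to ask for is below. It assumes the checker only inspects disown_placement, as described above; note that the variable name must also be spelled exactly as in the instructions.

```python
god_wills_it_line_one = "The very earth will disown you"

# Store the result of .find() in the variable the exercise names.
disown_placement = god_wills_it_line_one.find("disown")

print(disown_placement)  # 20
```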

wow, that’s so smart.

Loop around slices of the string starting at [0:] and moving forward after each found location.

string = 'abababa long section without substring ababab'
target ='abab'

def find_all(string, substring):
    index = 0
    found = []
    while index + len(substring) <= len(string):
        i = string[index:].find(substring)
        if -1 == i:
            print('break')
            break
        # i is only relative to the slice, make sure to add index
        index += i
        found.append(index)
        # don't find the same one again
        index += 1
    return found

print(find_all(string,target))
#   [0, 2, 39, 41]

With the same index incrementing style as you’re using you could use the optional start parameter of the .find method, e.g. string.find(substring, index). This saves you from the cost of creating lots of new strings from each slice and also simplifies the expression used in the while loop (.find returns -1 if the substring is not found).

Please do note the date on replies in FAQ posts; if a reply is more than a few months old, it’s not normally worth revisiting.

Good point with the find start parameter.

string = 'abababa long section without substring ababab'
target ='abab'

def find_all(string, substring):
    index = 0
    found = []
    while string.find(substring,index) != -1:
        index = string.find(substring,index)
        found.append(index)
        # don't find the same one again
        index += 1
    return found

print(find_all(string,target))

And if you have Python 3.8 or later, the walrus operator (assignment expressions) lets you call .find() only once per iteration:

string = 'abababa long section without substring ababab'
target ='abab'

def find_all(string, substring):
    index = 0
    found = []
    while (index := string.find(substring,index)) != -1:
        found.append(index)
        # don't find the same one again
        index += 1
    return found

print(find_all(string,target))

Since this is an FAQ, a fair number of people other than the original poster will be reading the replies, so I think there is value in the expanded answer.

def search_substr(string, substr, method):
    if method == 1:
        # Repeated .find() calls, each starting just past the previous match
        indices = []
        i = string.find(substr)
        while i != -1:
            indices.append(i)
            i = string.find(substr, i+1)
        return indices
    elif method == 2:
        # Compare a slice at every position where substr can still fit
        indices = [i for i in range(len(string) - len(substr) + 1) if string[i:i+len(substr)] == substr]
        return indices
    elif method == 3:
        # .startswith() with a start offset avoids building slices
        indices = [i for i in range(len(string)) if string.startswith(substr, i)]
        return indices
    else:
        print('Invalid method.')

text = '123456123456123456123456123456'
substr = '345'

indices = search_substr(text, substr, 1)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]
indices = search_substr(text, substr, 2)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]
indices = search_substr(text, substr, 3)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]

Could you explain why a while loop is used here rather than a for loop?