In the context of this exercise, how can I find all indices where a substring appears in a string?
Answer
In this exercise, we were introduced to the .find() method, but it will only return the first index where the substring appears in a string.
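As a quick illustration of that limitation, here is a minimal sketch using a sample string chosen for this example (it is not part of the exercise itself):

```python
string = "abcdababcd"

first = string.find("abc")       # only the first match is reported
later = string.find("abc", 1)    # optional start argument: search from index 1 onward
missing = string.find("xyz")     # .find() returns -1 when there is no match

print(first)    # 0
print(later)    # 6
print(missing)  # -1
```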
To obtain all the indices where a substring appears, you can use a loop to iterate over the entire string, and keep track of each match’s starting index.
target = "abc"
string = "abcdababcd"
indices = []
for i in range(len(string)):
    if string[i:i+len(target)] == target:
        indices.append(i)
print(indices) # [0, 6]
I know this is an old thread, but why doesn’t this code throw an index error? Wouldn’t string[i+len(target)] exceed the length of the string? I don’t see how, for example, i = 8 wouldn’t throw an error.
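On the index error question: the loop uses a slice, string[i:i+len(target)], rather than a single index, and Python slices do not raise IndexError. A slice end that runs past the end of the string is clamped, so the slice just comes back shorter (or empty). A small sketch, reusing the string and target from the answer above:

```python
string = "abcdababcd"
target = "abc"

i = 8
# The slice end (11) is past len(string) (10), so it is clamped: no error.
tail = string[i:i + len(target)]
print(tail)            # cd
print(tail == target)  # False, so index 8 is simply not recorded

# Indexing a single out-of-range position is what raises IndexError.
try:
    string[i + len(target)]
    raised = False
except IndexError:
    raised = True
print(raised)          # True
```

So the comparison near the end of the string is just a short slice compared against the target, which can never be equal.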
Codecademy runs a series of checks to make sure that you performed the tasks as directed. The directions ask you to:
“Save the index place to the variable disown_placement.”
In the code you shared above, you did print the correct index, but you did not save the index to the variable as directed. The Codecademy solution checker is looking at the contents of your disown_placement variable and finding the original string instead of the index value, 20.
Loop around slices of the string starting at [0:] and moving forward after each found location.
string = 'abababa long section without substring ababab'
target = 'abab'
def find_all(string, substring):
    index = 0
    found = []
    while index + len(substring) <= len(string):
        i = string[index:].find(substring)
        if -1 == i:
            print('break')
            break
        # i is only relative to the slice, make sure to add index
        index += i
        found.append(index)
        # don't find the same one again
        index += 1
    return found
print(find_all(string, target))
# [0, 2, 39, 41]
With the same index incrementing style as you’re using you could use the optional start parameter of the .find method, e.g. string.find(substring, index). This saves you from the cost of creating lots of new strings from each slice and also simplifies the expression used in the while loop (.find returns -1 if the substring is not found).
Please do note the date on replies in FAQ posts; if a reply is more than a few months old, it’s not normally worth revisiting.
string = 'abababa long section without substring ababab'
target = 'abab'
def find_all(string, substring):
    index = 0
    found = []
    while string.find(substring, index) != -1:
        index = string.find(substring, index)
        found.append(index)
        # don't find the same one again
        index += 1
    return found
print(find_all(string, target))
string = 'abababa long section without substring ababab'
target = 'abab'
def find_all(string, substring):
    index = 0
    found = []
    # the := assignment expression requires Python 3.8 or later
    while (index := string.find(substring, index)) != -1:
        found.append(index)
        # don't find the same one again
        index += 1
    return found
print(find_all(string, target))
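One thing worth noting about the versions above: stepping forward by 1 after each match is what lets them report overlapping matches (0 and 2 in 'abababa'). If you only want non-overlapping matches, you could step by the length of the substring instead. A rough sketch, where the overlapping flag is a hypothetical parameter added just for this illustration:

```python
def find_all(string, substring, overlapping=True):
    # `overlapping` is a hypothetical flag, not part of the versions above.
    # Guard: an empty substring would otherwise match at every position.
    if not substring:
        return []
    step = 1 if overlapping else len(substring)
    found = []
    index = string.find(substring)
    while index != -1:
        found.append(index)
        # resume the search one step past the start of the last match
        index = string.find(substring, index + step)
    return found

string = 'abababa long section without substring ababab'
print(find_all(string, 'abab'))                     # [0, 2, 39, 41]
print(find_all(string, 'abab', overlapping=False))  # [0, 39]
```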
Since it is a FAQ, a fair number of people other than the original poster will be reading the replies, so I think there is value in the expanded answer.
def search_substr(string, substr, method):
    if method == 1:
        # repeated .find(), restarting one past the previous match
        indices = []
        i = string.find(substr)
        while i != -1:
            indices.append(i)
            i = string.find(substr, i+1)
        return indices
    elif method == 2:
        # compare a slice at every start position where substr can still fit
        indices = [i for i in range(len(string) - len(substr) + 1) if string[i:i+len(substr)] == substr]
        return indices
    elif method == 3:
        # .startswith() accepts a start offset, so no slices are built
        indices = [i for i in range(len(string)) if string.startswith(substr, i)]
        return indices
    else:
        print('Invalid method.')
text = '123456123456123456123456123456'
substr = '345'
indices = search_substr(text, substr, 1)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]
indices = search_substr(text, substr, 2)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]
indices = search_substr(text, substr, 3)
print(len(indices)) # => 5
print(indices) # => [2, 8, 14, 20, 26]