Hello there. I’m taking a Python course and had this problem come up in a practice session. I got the correct answer but am looking for a second opinion on making the code more efficient. What would you remove or replace?
Write a function called m_word_count that takes a string as an input and returns a count of the words in the string that start with the uppercase or lowercase letter M.
    my_sentence = "My gosh, what a beautiful Monday morning this is."

    def m_word_count(string):
        string_list = string.split()
        m_count = []
        for i in string_list:
            if "M" in i:
                m_count.append(i)
            if "m" in i:
                m_count.append(i)
        count = len(m_count)
        return count

    m_word_count(my_sentence)
Two observations.
Since you’re only counting, you don’t need to store the matching words in an m_count list; you can just increment a counter. This has memory implications: if your string were a large document, the extra space taken by that list would, in the worst case, grow linearly with the size of the document. So that’s one thing that can improve space efficiency.
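As a minimal sketch of the counter idea (not the only way to write it), the version below keeps the original function name and drops the m_count list. It also swaps the "M" in i test for str.startswith, because in checks whether the letter appears anywhere in the word, while the exercise asks about the first letter; a word like "Mom" would otherwise be counted twice and a word like "hammer" would be counted once by mistake.

```python
def m_word_count(string):
    """Count the words that start with 'M' or 'm'."""
    count = 0
    for word in string.split():
        # str.startswith accepts a tuple of prefixes to try.
        if word.startswith(("M", "m")):
            count += 1
    return count


my_sentence = "My gosh, what a beautiful Monday morning this is."
print(m_word_count(my_sentence))  # 3 ("My", "Monday", "morning")
```

This still calls split(), so it only removes the second of the two temporary lists.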
Ah, but of course, that brings up another point: by splitting into string_list we already have a duplicate of the document anyway. And the question is, can we avoid that?
Consider scanning the string for _m and _M instead (where _ stands for a space), remembering to also check the very first character, since the first word has no space in front of it. This would save the space taken by m_count and string_list. In situations where space matters this is not trivial. For example, if you’re doing operations on huge text files and you pay for the extra temporary scratch space on the cloud (through AWS, GCP, Azure, etc.), this type of thing may incur additional costs. Just something to think about.
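Here is one possible sketch of that scanning approach, under a few assumptions: the name m_word_count_scan is made up for illustration, and "word" is taken to mean what split() means, i.e. anything that follows whitespace or the start of the string. It walks the string once and keeps only a counter and a flag, so no list of words is built.

```python
def m_word_count_scan(string):
    """Count words starting with 'M' or 'm' without building a word list."""
    count = 0
    at_word_start = True  # the start of the string counts as a word boundary
    for ch in string:
        if at_word_start and ch in "Mm":
            count += 1
        # The next character begins a word only if this one is whitespace.
        at_word_start = ch.isspace()
    return count


my_sentence = "My gosh, what a beautiful Monday morning this is."
print(m_word_count_scan(my_sentence))  # 3
```

For a short sentence the difference is negligible, and the split() version is arguably easier to read; the scan only starts to pay off when the input is large enough that the temporary copies matter.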