Solution Sharing

3 Likes

This is how I did it. I think it’s a lot more intuitive than the provided solution:

def frequency_dictionary(words):
frequency = {}
for word in words:
if word in frequency:
frequency[word] += 1
else:
frequency[word] = 1
return frequency

13 Likes

I did it in one line
def frequency_dictionary(words):
return{key:(words.count(key)) for key in words}

12 Likes

That’s quadratic though, should be linear.
for example, a string of length 1000000 should take about 1000000 steps. Yours takes 1000000000000 steps

You can test this yourself:

# should complete in at the very most a second
frequency_dictionary('a' * 1000000)
2 Likes

Hi, ionatan,
My code is inefficient too right? How should it be solved, better to run a normal for loop and update a empty dict with key and count?

# Write your frequency_dictionary function here:
def frequency_dictionary(words):
  return {word:words.count(word) for word in words}
3 Likes

It’s exactly the same. Consider what count does. Expand its equivalent code:

def frequency_dictionary(words):
    counts = {}
    for word in words:
        count = 0
        for word_ in words:
            if word == word_:
                count += 1
        counts[word] = count
    return counts

What frequency_dictionary should be doing while iterating is to keep multiple running sums so the dict has to exist before you start iterating. For each value, add 1 to the corresponding sum.

If you really wanted to one-line this then what you’re looking for is functools.reduce where you’d have a dict as the accumulator and as the function you’d have “increase the sum for this key”… but it isn’t pretty.
…also collections.Counter already does this, but the point is to implement it

5 Likes

def frequency_dictionary(words):
return {word: words.count(word) for word in words}

The point of the function being linear and not quadratic is so that it takes less time to execute?

1 Like

Less clock ticks, as it were.

1 Like

That’s a description of how the amount of work scales with the size of the input.
For example, looking through a yellow pages to find a phone number by name can be done either by reading it from start to finish, or by doing binary search (start in the middle), which is more work? Reading from the start takes O(N) (linear) amount of work where N is the amount of names, and binary search takes O(log2 N) (logarithmic) amount of work

4 Likes

Thank you all for sharing your beautiful code and many approaches. It really helps.

Now to the very entry-level question…the first code on this post (a),
is it a linear code? (b) is a quadratic code…hope I understood correctly.

How do you see at once if the code is a linear or quadratic?

a)

def frequency_dictionary(words):
  frequency = {}
  for word in words:
    frequency[word] = words.count(word)
  return frequency

b)

 def frequency_dictionary(words):
  return {word:words.count(word) for word in words}
1 Like

One easy thing to look for are nested loops. That would make it quadratic since the values in the inner loop are revisited with each iteration of the outer loop.

Your code examples above only visit upon each piece of data one time. That would make it linear.

However, …

.count() is an iterator so can be viewed as an inner loop that revisits the data in words multiple times. So, … One could say that it is quadratic. This will still need more bearing out so keep drilling down this Big O rabbit hole.

5 Likes

That makes sense…LMAO. I think I probably had the wierdest solution…that worked.

def frequency_dictionary(words):
  new_dict = {}
  for word in range(len(words)):
    if word == 0:
      new_dict[words[word]] = 1
    else:
      if words[word] == words[word - 1]:
        new_dict[words[word]] += 1
      else:
        new_dict[words[word]] = 1
  return new_dict

Not nearly as efficient…but kinda shows how my mind works…lol…I don’t recall ever going over .count()…was that something that was ever reviewed and I forgot it?


def frequency_dictionary(words):
  
  #Establish a blank list to reference words that have been added already
  reference = []
  
  #Establish a dictionary to hold the words and associated count
  frequency = {}
  
  #starts the loop to iterate through the list entered as a parameter
  for word in words:
    
    #checks to see if the word has already been counted by using reference list
    if word in reference:
      
      #if the word has been counted already it moves to the next word in the parameter list
      continue
      
    #if the word has not been counted yet, it gets added to the reference list and then the .count() has to only run once through the parameter list
    else:
      reference.append(word)
      frequency[word] = words.count(word)
      
  # returns the created dictionary after the loop has terminated
  return frequency

With this construction the code doesn’t have to iterate fully through the list and overwrite the entries in the created dictionary multiple times, nor does it need to keep track of the count with a incremented counter, and it only has to call the .count function multiple times. It also gives flexibility in terms of pre-selecting for certain words if need be, since you can hardcode values in the reference list to skip them.

You have to iterate through the whole list, you’re counting them all, so you have to look at them all.
Using list.count will iterate through that again. So since you’re using it several times you are iterating through the whole list several times.
You’re also iterating through reference many times, its information is already available in your frequency dict and is available without iteration.

Ideally what you would do is iterate through the list once, adding 1 for each item encountered.

This might be easier to realize if you implement in and .count yourself, so that you are aware of what they are doing.

Thank you for the reply, yet I need your answer broken down which many people don’t do. I’m working under the assumption that this quoted code from the beginning of the page is the ideal answer. When you say that I’m using .count several times, I thought what I was doing is using it less times than this code since the line

for word in words: frequency [word] = words.count(word)

appears to be calling .count() to assign a count for every single word in the parameter dictionary. If the solution uses .count() anyway, then why does my solution not help the problem by using it only several times as opposed to every single time?

To your second point, how am I able to utilize the information in my frequency dict without iteration? Using .count() gets the count of all like elements in the list but that is iteration, and using manual incrementation by += 1 still requires iterating through the created dictionary in order to find a value to increment if it exists. If possible could you explain how the “adding 1 for each item encountered” doesn’t require a second level of iteration?

And I don’t understand what you mean by “implement in and .count() yourself” because that is what I thought I was doing and the online code viewer doesn’t show things like a debug console or display the time it takes to actually run my code in ms. What can I do get a clearer understanding of the inner workings of these keywords/methods?

list.count searches through the whole list, so if you do that once for each word then what you have is one loop nested inside another. if you for example have 100 unique words then you would be making a total of 10000 iterations, there should be 100.

dict is lookup by key, it does not involve iteration. otherwise there would be no point you’d use list instead.

Thank you but this answer is still not clear. When I said in the previous post

I’m working under the assumption that the quoted code from the beginning of the page is the ideal answer.
…
since the line appears to be calling .count() to assign a count for every word in the parameter dictionary.

The first part of your reply:

list.count() searches through the whole list, so if you fo that once for each word then what you have is one loop nested inside another.

reinforces what I was initially asking: If the ideal code appears to start a nested loop because .count() does appear to be called for each word, then how can this be avoided. Secondly, is the quoted code from the beginning of the page by prosowski the ideal solution?

The second part of your reply:

dict is lookup by key, it does not involve iteration. otherwise there would be no point you’d use list instead.

is this referring to the use of the keyword “in”? If so, then is the code posted by mwail583 on Mar’19 the type of solution you meant by

Ideally what you would do is iterate through the list once, adding 1 for each item encountered.

My reply disqualifies that code as ideal.

There should be a total of one iteration through the list. Each time count is used, it iterates through the list, so if you need to use count multiple times, then you’ve exceeded that limit of one, that’s no longer ideal.


How you would avoid a nested loop is something you can answer by considering what a dict at all does. What does a dict provide, what is a dict?


the operator in does not itself do anything, it is a request for membership testing, dict and list will do different things in response to those requests because they’re different structures and therefore obtain that information through different strategies


Additionally, you can do better then O(n^2) with lists alone. You can get that down to O(n log n) which is much closer to linear than square. But you’ve got dict, so therefore you can do it in O(n), linear amount of time, for each unit of input you add you add one unit of time cost.

Can somebody explain to me how it works : Why Apple returns 2 not 1 ? I did thru count method.

# Write your frequency_dictionary function here:
def frequency_dictionary(words):
  freqs = {}
  for word in words:
    if word not in freqs: **<________So first loop, apple is not in freqs      | 2nd loop, apple is in freqs**
    	freqs[word] = 0    <________apple = 0 ?                                        | 
    freqs[word] += 1                                                                                  | 1 = 0 + 1 ?
  return freqs      ***_______ returns freqs apple = 0 ?                           |  returns 1 ?* 

# Uncomment these function calls to test your  function:
print(frequency_dictionary(["apple", "apple", "cat", 1]))
# should print {"apple":2, "cat":1, 1:1}
print(frequency_dictionary([0,0,0,0,0]))
# should print {0:5}