Solution Sharing

if word not in freqs:
    freqs[word] = 0

Consider what the above does.

* test for membership
* initialize new word entry

When the next line is encountered the key does exist in the dictionary, with a number value that can be increased by 1.

If you look at the list that is sent in the first call, 'apple' appears twice, and thus has a count (frequency) of 2.
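
Putting the two steps together, a minimal sketch of the whole function might look like this. The increment line is an assumption about the "next line" mentioned above, and the sample list is made up for illustration rather than taken from the exercise.

```python
def frequency_dictionary(words):
    freqs = {}
    for word in words:
        # First sighting: create the entry with a count of zero.
        if word not in freqs:
            freqs[word] = 0
        # Runs on every pass, including the first sighting, so 0 becomes 1.
        freqs[word] += 1
    return freqs


# Hypothetical sample input, chosen only for illustration.
sample = ["apple", "apple", "cat", "1"]
print(frequency_dictionary(sample))
```

Because the increment sits outside the if block, a word seen for the first time is initialized and counted in the same pass.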


There are likely a dozen ways to create a frequency table, the above being one approach.

Following is a musing that uses structure as opposed to logic.

>>> p = "star light, star bright, first star we see tonight"
>>> q = p.split()
>>> r = set(q)
>>> s = dict(zip(r, [0] * len(r)))
>>> for t in q: s[t] += 1

>>> s
{'tonight': 1, 'star': 3, 'see': 1, 'bright,': 1, 'light,': 1, 'we': 1, 'first': 1}
>>> 
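
As a variation on the same idea, the zeroed table can probably be built in one call with dict.fromkeys, which makes the intermediate set optional. This is only a sketch of an alternative, not a replacement for the session above; the name counts is made up here.

```python
p = "star light, star bright, first star we see tonight"
words = p.split()

# dict.fromkeys creates one key per distinct word, each starting at 0;
# duplicate words simply collapse into the same key.
counts = dict.fromkeys(words, 0)
for word in words:
    counts[word] += 1

print(counts)
```

Note that split() leaves punctuation attached, so 'light,' and 'bright,' are counted with their commas, just as in the original output.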

A very simple solution using a dict comprehension:

def frequency_dictionary(words):
  return {w:words.count(w) for w in words}

My Solution:

def frequency_dictionary(words):
  new_dict = {}

  for word in words:
    new_dict[word] = words.count(word)
  return new_dict

# Inside frequency_dictionary(words); requires: from functools import reduce
return reduce(lambda d, word: d.update({word: d.get(word, 0) + 1}) or d, words, {})

vs.

dictionary = {}
for word in words:
  dictionary[word] = dictionary.get(word, 0) + 1
return dictionary
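
For anyone who wants to run the two side by side, here is a sketch that wraps each fragment in a function. The function names and the sample list are made up for the comparison.

```python
from functools import reduce  # reduce is not a builtin in Python 3


def frequency_reduce(words):
    # dict.update returns None, so "or d" hands the updated dict to the next step.
    return reduce(
        lambda d, word: d.update({word: d.get(word, 0) + 1}) or d,
        words,
        {},
    )


def frequency_loop(words):
    dictionary = {}
    for word in words:
        dictionary[word] = dictionary.get(word, 0) + 1
    return dictionary


sample = ["a", "b", "a"]  # hypothetical input
print(frequency_reduce(sample) == frequency_loop(sample))
```

Both make a single pass over the list; the loop version is arguably easier to read because it does not lean on the update-returns-None trick.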

Mine is long. I created a list of unique values. I like the idea of creating lists and combining them with a comprehension.

def frequency_dictionary(words):
 unique_list = [] 
 for x in words: 
  if x not in unique_list: 
   unique_list+=[x]
  count=[]
  for u in unique_list:
   count+=[words.count(u)]
   lc={key:value for key,value in zip(unique_list,count)}
 return lc

Isn’t lc something you would create once? You are creating it MANY times.
For a list of 100 unique values, you create lc on the order of 10,000 times, and each time it makes up to 100 iterations, for a total on the order of 1,000,000 steps. That’s a bit more work than I’d expect to count 100 things.

You’d be better off if you didn’t nest your loops within one another:

def frequency_dictionary(words):
    unique_list = []
    for x in words:
        if x not in unique_list:
            unique_list += [x]
    count = []
    for u in unique_list:
        count += [words.count(u)]
    return {key: value for key, value in zip(unique_list, count)}

But that’s still 10,000 steps for 100 unique values, because you’re using list.count 100 times, and each call iterates through the whole list of 100 elements.

Similarly, your method of finding the unique values takes on the order of 10,000 steps, because each not in test iterates through unique_list, and you run it 100 times.

Can you manually find the unique values in a list of size 100? Does it take 10,000 steps to do that?
A better strategy could be to sort the values and then remove consecutive duplicates. You can do even better than that, though, if you use a dict.
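
Both strategies might be sketched roughly as follows. The function names and the sample list are made up for illustration.

```python
def unique_sorted(values):
    # Sorting puts equal values next to each other, so one pass can drop repeats.
    result = []
    for value in sorted(values):
        if not result or result[-1] != value:
            result.append(value)
    return result


def unique_with_dict(values):
    # Dict keys are unique, and each insertion takes constant time on average.
    return list(dict.fromkeys(values))


sample = ["b", "a", "b", "c", "a"]  # hypothetical input
print(unique_sorted(sample))
print(unique_with_dict(sample))
```

The sorted version returns the values in sorted order, while the dict version keeps them in first-seen order and avoids the sort entirely.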


I agree. I’m still confused by their solution. If we assign a value of 0 on the first iteration, as the Codecademy solution does, I would expect all the frequencies to be underestimated by 1. That is not the case: it works, but I don’t quite get how or why.

Follow the execution order of statements and it should make a little more sense. A pen and paper doodle might help too.
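
If pen and paper is not handy, a traced version can stand in for the doodle. This sketch assumes the solution increments unconditionally after the membership test; the function name and input are made up.

```python
def trace_frequency(words):
    freqs = {}
    for word in words:
        if word not in freqs:
            freqs[word] = 0
            print(f"{word!r}: new key, initialized to {freqs[word]}")
        # Not inside the if block, so this also runs on the first sighting.
        freqs[word] += 1
        print(f"{word!r}: incremented to {freqs[word]}")
    return freqs


trace_frequency(["apple", "cat", "apple"])  # hypothetical input
```

Each new word prints two lines, one for the initialization and one for the increment, which is why no count ends up short by 1.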

def frequency_dictionary(words):
  frequency = {}
  for key in words:
    if key not in frequency:
      frequency[key] = 1
    elif key in frequency:
      frequency[key] += 1
  return frequency

# Write your frequency_dictionary function here:

def frequency_dictionary(words):
  dic = {}
  for i in words:
    dic[i] = words.count(i)
  return dic

I am quite confused about why one of my variables changes. I pass it to another variable so that the original stays unchanged, but when I change the new variable the original changes too. I do not understand how to fix this.

def frequency_dictionary(words):
  dictionary = {}
  def reduce(set, counter):
    while len(set) > 0:
      if set[0] == word:
        counter += 1
      set.pop(0)
    dictionary[word] = counter
    return counter
  for word in words:
    counter = reduce(words, 0)
  return dictionary

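I’d highly suggest renaming set and the function reduce, since both shadow well-known Python names: set is a built-in type, and reduce is the name of functools.reduce (a built-in function in Python 2). Neither is a keyword, so the code still runs, and the names are short-lived inside the function, but shadowing them is still bad practice.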

As for the unexpected changes, note that lists, like dictionaries, are mutable, and passing words to your inner function hands over the same list object rather than a copy. If you remove a value with pop, it is removed from the original as well. Try printing the contents of words inside your loops and observe what changes on each iteration and why that gives you unexpected output.
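
A small sketch of the effect, with made-up names: the helper below empties whatever list it is handed, so the caller only keeps its data if it passes a copy.

```python
def count_and_consume(items, target):
    # Pops from the list it is given, so the caller's list shrinks too.
    counter = 0
    while len(items) > 0:
        if items[0] == target:
            counter += 1
        items.pop(0)
    return counter


words = ["a", "b", "a"]  # hypothetical input
alias = words            # same list object, not a copy
print(count_and_consume(alias, "a"), words)

words = ["a", "b", "a"]
duplicate = list(words)  # shallow copy; words[:] or words.copy() also work
print(count_and_consume(duplicate, "a"), words)
```

In the first call words ends up empty; in the second it is untouched because only the copy was consumed.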

Thank you. I now have a working solution that is shorter and, I am pretty sure, linear.

def frequency(lst):
    dictionary = {}
    for x in lst:
        if x in dictionary:
            dictionary[x] += 1
        else:
            dictionary[x] = 1
    return dictionary

def frequency_dictionary(words):
  dic = {}
  for w in words:
    if w not in dic:
      dic[w] = 1
    else:
      dic[w] += 1
  return dic

This is my original code, and I came here to see how others did it. At first I was surprised that it can be simplified into a single line. But as I read through everyone’s comments, I realized that shorter code doesn’t always mean a quicker or better process. Calling count for every word rescans the whole list each time, which makes the shorter code more expensive to execute. Thanks for everyone’s contributions.
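
One rough way to see the difference is to time both approaches with timeit. The data below is made up, the function names are hypothetical, and the actual timings will vary by machine, so treat this as a sketch rather than a benchmark.

```python
import timeit


def with_count(words):
    # One full scan of the list for every word in it.
    return {word: words.count(word) for word in words}


def single_pass(words):
    # One dictionary update per word.
    freqs = {}
    for word in words:
        freqs[word] = freqs.get(word, 0) + 1
    return freqs


# Hypothetical data: 2,000 words drawn from 500 distinct values.
words = [str(i % 500) for i in range(2000)]

print(with_count(words) == single_pass(words))
print("count:      ", timeit.timeit(lambda: with_count(words), number=3))
print("single pass:", timeit.timeit(lambda: single_pass(words), number=3))
```

Both return the same dictionary; the gap in running time should widen as the list grows.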

def frequency_dictionary(words):
  return {word: words.count(word) for word in set(words)}

You’re given a list containing duplicate words. Using set(words) removes all duplicates. This eliminates extraneous calculations.

Could you elaborate on why you used set()? What is the problem with this?

def frequency_dictionary(words):
  return {word:words.count(word) for word in words}

Can anyone explain it step by step, substituting each value as you break down the problem?

Hi Geovanny! Welcome to the forums.

The simplest, naivest solution so far (as far as I can tell) is this one:

def frequency_dictionary(words):
    dicF = {}
    for word in words:
        if word in dicF:
            dicF[word] += 1
        else:
            dicF[word] = 1
    return dicF

With this solution you only iterate through the list words once. The function enters the loop, takes a word from that list and checks the conditionals: is this word already in the dicF dictionary? (looking up things in dictionaries is much faster than in lists, so not too much processing time is added there). If the answer is yes, the first conditional statement is entered and it adds 1 to the value of the already existing word.
If the answer is no, the function continues and enters the else statement: here it adds the new word to the dictionary and adds its first value of 1.
The function loops through all the words in the list, builds the dictionary, and finally returns it.

The, let’s say, fanciest or most-knowledgeable-of-python-tools for this use case so far leverages the collections.Counter() class, as it seems to have been built for this purpose (counting the frequency of stuff in an iterable). You can learn more about that here.

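Solution with Counter():

from collections import Counter

def frequency_dictionary5(words):
  # Counter is a dict subclass that tallies each item in the iterable.
  count = Counter(words)
  return count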
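
A few things a Counter can do beyond a plain dict, as a quick sketch with a made-up input list:

```python
from collections import Counter

words = ["a", "b", "a"]  # hypothetical input
count = Counter(words)

print(count["a"])            # lookup works like a regular dict
print(count["missing"])      # absent keys report 0 instead of raising KeyError
print(count.most_common(1))  # highest counts first
print(dict(count))           # convert if a plain dict is required
```

Since Counter is a dict subclass, returning it directly will usually work wherever a dictionary is expected.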

It is really helpful to sprinkle the code with print() statements if you want to check what is happening in a solution. If you want a step-by-step view, add one print() statement at a step, execute the function once, and see what happens. Delete that statement, add it to the next step, check again, and so on. Alternatively, you can use Python Tutor to help you visualize the execution.

Hope it helps!


If we take the naive solution and swap out the if statement with some simple logic…

>>> def frequency(s):
	d = {}
	for x in s:
		d[x] = (x in d and d[x] or 0) + 1
	return d

>>> 

A little more refinement and we can sort by count…

>>> def frequency(s):
	d = {}
	for x in s:
		d[x] = (x in d and d[x] or 0) + 1
	return dict(sorted(d.items(), key=lambda x: x[1], reverse=True))

>>> 
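
The and/or expression works here because a stored count is never 0, so d[x] is always truthy. A sketch of the same function using dict.get, which does not depend on that, might look like this (the function name is made up):

```python
def frequency_sorted(s):
    d = {}
    for x in s:
        # d.get(x, 0) returns the current count, or 0 when x is not yet a key.
        d[x] = d.get(x, 0) + 1
    # sorted() is stable, so words with equal counts keep their first-seen order.
    return dict(sorted(d.items(), key=lambda item: item[1], reverse=True))


print(frequency_sorted("star light star bright".split()))
```

Returning a sorted dict relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.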

Say you have a list of words like words = ['a', 'a', 'b']. Ask yourself how many times the count method will be run by your code versus mine.

your_code = {word: words.count(word) for word in words}
my_code = {word: words.count(word) for word in set(words)}

The answer:

Yours will call .count() three times, once for each word, whereas mine will call .count() twice because set() eliminates the duplicate 'a'. The difference is negligible unless your word list is large and has many duplicates.

I hope that helps!
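
To check the call counts rather than take them on faith, one option is a small list subclass that records every .count() call. CountingList is a made-up helper for this illustration only.

```python
class CountingList(list):
    """A list that records how many times .count() is called."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = 0

    def count(self, value):
        self.calls += 1
        return super().count(value)


words = CountingList(["a", "a", "b"])
your_code = {word: words.count(word) for word in words}
calls_without_set = words.calls

words.calls = 0
my_code = {word: words.count(word) for word in set(words)}
calls_with_set = words.calls

print(calls_without_set, calls_with_set)
print(your_code == my_code)
```

Both comprehensions build the same dictionary; only the number of .count() calls differs.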
