[Challenge] Anagram Detector

#15

Hi, here is my solution in JavaScript.

I’m sorry if my english is malformed somewhere, but it is not my native language.

This works by looping one time through both strings. Inside this loop I collect the number of occurrences of each character on each string. I collect these counts in two arrays, one for each string. In these arrays the indexes correspond to the char codes, and the values correspond to the counts.

After the loop I compare the two arrays and if they have the same count for the same characters, the strings are anagrams.

I’m assuming the strings will only contain uppercase letters and spaces.

``````function isAnagram (string1, string2) {
// These arrays will collect the counts.
var arr1 = [], arr2 = [];
var i = 0;
// One loop to traverse both strings.
// This loop will run n times, where n is the length of the longest string.
while (i < string1.length || i < string2.length) {
// It could happen that 'i' increases beyond the length of one of the two strings so first I check if there is a character at this index in the strings, and if this character is not a space. If it is a space it will just jump this character.
if (string1.charAt(i) != '' && string1.charAt(i) != ' ') {
// Then I increase the count for this character in the corresponding array.
// The first time a character occurs I asign 1, the next times I increase 1.
if (arr1[string1.charCodeAt(i)]) {
arr1[string1.charCodeAt(i)] += 1;
}
else {
arr1[string1.charCodeAt(i)] = 1;
}
}
// Do the same for the other string.
if (string2.charAt(i) != '' && string2.charAt(i) != ' ') {
if (arr2[string2.charCodeAt(i)]) {
arr2[string2.charCodeAt(i)] += 1;
}
else {
arr2[string2.charCodeAt(i)] = 1;
}
}
// Increase 'i'
i++;
}
// After the loop I compare the two arrays, and if they have the same counts for the same characters, the strings are anagrams.
i = ('A').charCodeAt(0); // Get the char code of the character 'A'
while (arr1[i] === arr2[i] && i <= ('Z').charCodeAt(0)) {
i++;
}
// After the loop if 'i' has increased beyond the char code of 'Z', all the counts were equal.
return (i > ('Z').charCodeAt(0) ? '1' : '0')
}
``````

#16

Hey all,

Just a reminder to please try and include a link to the code you submit in https://repl.it/, if possible, for easier testing - this will help us test the solutions you propose and choose the best solution out of those that work.

You may still paste in raw, markdown-formatted code in your post, but we ask that you try and give us a link to your code in repl.it for ease of testing and evaluation also.

Happy coding!
~CC mods

#17

Here's my entry, using pure Javascript. My thinking is that each word or phrase is really just a set of letters, and two anagrams have matching sets of letters. My function arranges the letters so they can be compared, and then sees if they are a match.

``````var is_anagram = function(str1,str2){
str1 = str1.replace(/\s/g,'').toUpperCase().split('').sort().join('');
str2 = str2.replace(/\s/g,'').toUpperCase().split('').sort().join('');
if (str1 === str2){
//It is an anagram!  Return 1
return "1";
}else{
//It's not a match.
return "0";
};
};``````

On each string:
1. remove whitespace with regex
2. convert to uppercase
3. split into an array to make it possible to sort
4. sort the array (alphabetically)
5. convert back to a string (because it's harder to compare arrays without looping)
After this process, strings that are anagrams should produce the same string. If the produced strings don't match, then you don't have an anagram.

My working example is here: https://jsfiddle.net/clintongalloway/cfb0Lhxc/
I just saw that you prefer this so I took a few seconds to upload this https://repl.it/H2oZ/1

#18

Hi everyone,

Here I present you my Python code for solving the challenge so you can read it and test it yourselves. Right after it you will find a brief explanation on how the code works and how I came up with it:

https://repl.it/H2tf/4

So, if you have looked at the code you would have noticed that the most substantial part is contained in the `anagrams` function, which contains the logic that can tell if two strings are anagrams in O(n+m).

In order to be able to comply the time complexity constraint I assumed that both input arguments satisfied the precondition of being strings composed of uppercase ASCII letters. Then, if I wanted to accomplish the time complexity goal, I knew that I needed to iterate each argument one time at most. So, I came up with this idea of iterate over the first string and count the occurrences of each letter, and then iterate over the second string while trying to “undo” the counting I did before. This way, if the strings were actually anagrams, all my letters counts would be zero by the time I had finished iterating over the second string, and I wouldn’t have found any letter apart from the ones I encountered in the first string.

So, if we inspect the code of the `anagrams` function you will see that basically I use a dictionary for counting the letters occurrences. The first `for` loop is where I define each letter from the first word as a key in the dictionary, and I increment its value as I see them along the string:

``````for letter in first_string:
letters_count[letter] = letters_count.get(letter, 0) + 1
``````

Later on, in the second `for` loop, I walk through each letter of the second string and I retrieve from the dictionary its number of occurrences:

``````    for letter in second_string:
occurrences = letters_count.get(letter, None)
``````

If that value is higher than 1, then I decrease it and store it back, as I already managed to match a pair of the same letter in both strings:

``````if occurrences > 1:
letters_count[letter] -= 1
``````

Otherwise, if that value is exactly 1, that means I have just matched the last occurrence of that letter in the first string. Hence, I remove the letter from the dictionary as I shouldn’t be expecting to encounter this letter again while I finish my second string iteration.

``````if occurrences == 1:
letters_count.pop(letter, None)
``````

Lastly, if that value is None, it means that I have found a letter that I cannot match to an occurrence of the same letter in the first string. Therefore, I can immediately conclude that the strings are note anagrams and return `False`:

``````        if occurrences is None:
return False

``````

By the end of the loop, if no keys are left in the dictionary (meaning that I was able to match each letter from the second string with a letter from the first string and no letter from the first string remained without match), then the words are anagrams. Otherwise, they are not.

``````return True if not letters_count else False
``````

On top of that, I defined a `validate_string` function that runs some checks to the guarantee those assumptions I made for the `anagram` function inputs, and it raises some custom made exceptions in case one of those checks fails.

Then I assembled both of the previous mentioned functions inside a wrapper function called `anagram_detector`, which also is responsible for translating the boolean value returned by `anagram` into a 1 or 0 accordingly.

Finally, I added some tests cases and an interactive routine that allows the user to run the whole test suite or just type in a pair of strings for detecting if they are anagrams.

Well, that is all about the code. I hope you liked it or you find it useful.

Carlos.

Exception handling
#19

A post was split to a new topic: Read those docs

#20

A post was split to a new topic: Exception handling

#21

``````function isAnagram(word1, word2){
var word2Arr = word2.split("");
for(var letter of word1){
var index = word2Arr.indexOf(letter);
if(index == -1){
return "0";
}
word2Arr.splice(index,1);
}

if(word2Arr.length == 0){
return "1";
}else{
return "0";
}
}``````

The function isAnagram takes two arguments: word1 and word2.
Then word2 gets spilled into an array of charactes called word2Arr. Then word1 gets looped through. Each letter gets queried in the word2Arr array. If a match is found, that letter gets removed from the word2Arr array. If a match is NOT found, it means word1 contains a letter that word2 does not, and is thus not an anagram of word2, and returns "0". Once the loop has finished, the length of word2Arr is checked. If all the characters have been removed (an empty array), it means the word is an anagram and "1" is returned.

#22

My solution in Java

``````import java.util.Arrays;
class Input{
static int find(String x, String y){
String s1=x.replaceAll("\\s","");
String s2=y.replaceAll("\\s","");
boolean state=true;
if(s1.length()!=s2.length()){
state=false;
}else{
char[] s1Array=s1.toCharArray();
char[] s2Array=s2.toCharArray();
Arrays.sort(s1Array);
Arrays.sort(s2Array);
state=Arrays.equals(s1Array,s2Array);
}
if(state){
System.out.print("Anagrams");
return 1;
}else{
System.out.print("Not anagrams");
return 0;
}
}
}
class Anagram{
public static void main(String...s){
Input.find("TOM MARVOLO RIDDLE","I AM LORD VOLDEMORT");
}
}``````

Following steps are involved in the code:
1. Remove the white spaces using Regex
2. Convert the Strings into character arrays
3. Sort the characters in the array
4. Use the equals() method to check for matches
5. Return "1" if they are anagrams and "0" if they are not anagrams

#23

Since the input is guaranteed to be strings of spaces and uppercase letters, removing the spaces and comparing the sorted strings will do in Python 3:

``````def isanagram_slow(s1, s2):
if [a for a in sorted(s1) if a!=" "]==[b for b in sorted(s2) if b!=" "]:
return 1
return 0
#    return 1 if sorted(s1)==sorted(s2) else 0``````

This is inefficient but compact easy version. Should whitespace be treated like any other char, the hashed line is the whole function in an even more compact form.

Intermediate difficulty version:

``````def isanagram(s1, s2):
A=[0 for i in range(27)]
B=[0 for i in range(27)]
#    A=[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
#    B=[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
for i in s1:
if i == " ": pass
#        if i == " ": A[0]+=1
else: A[ord(i)-64]+=1
for i in s2:
if i == " ": pass
#        if i == " ": B[0]+=1
else: B[ord(i)-64]+=1
return 1 if A == B else 0``````

A & B will hold the amount of char occurencies in each string. In this case, position 0 counts spaces, others count letters. According to wikipedia, whitespaces don't count (eg "A B C" == "ABC") so I pass them, but they can be enabled by uncommenting the hashed tests.

Should list comprehension be considered to put a dent in the O(m+n) time for shorter strings, A&B arrays can be defined like the hashed examples. The code could easily be scaled to handle all the chars recognized by Python at the memory cost of 2*1,114,111 bytes (valid chr() range according to docs) for the A & B arrays.

#24

https://repl.it/H42L/0

#26

Hi, my first challenge Solution on python:

``````def anagram_tester(s1,s2):
if len(s1)!=len(s2):
return 0
s = [0]*23
for a in s1:
s[ord(a)-ord('A')]+=1
for a in s2:
s[ord(a)-ord('A')]-=1
if s[ord(a)-ord('A')]<0:
return 0
if sum(s)==0:
return 1
else:return 0

s1,s2 = "ANAGRAMANAGRAMAMANAGR","RAMANAGRGRAMANAGAMANA"
print ("Anagram? %d" % anagram_tester(s1,s2))
``````

It’s basically creating a list of the number of presences of each letter and increment on the first word and decrement on the second, the result should be that first there’s no negative numbers, and second the sum should be zero.

https://repl.it/H5NZ/3

#27

Link to Python code on repl.it:
Anagram Tester from Anupama.

Explanation:

ifelse loop logic within the function:

# 1) if length of the two strings do not match, then the strings are NOT anagrams. Stop checking. (iteration = 1)
# 2) if the number of unique letters in the two strings are NOT equal, then strings are NOT anagrams. Stop checking. (iteration = 2)
# 3) Convert to upper case and Sort the strings. The two strings should match at every index. If not, they are not anagrams. Stop checking. (iteration = 2 + length(string1) )

O = (number of iterations) = length(string1) + 2

Raw code:

``````def anagramfn(a, b, strflag):
if(len(a) != len(b) ):
strflag = strflag + 1
elif len(''.join(sorted(set(a.lower()))) ) != len(''.join(sorted(set(a.lower()))) ):
strflag = strflag + 1
else:
a = a.upper()
a = ''.join(sorted(a))

b = b.upper()
b = ''.join(sorted(b))

ptr = 0

while(ptr < len(a)):
if(a[ptr] != b[ptr]):
strflag = strflag + 1
ptr = ptr + 1

if strflag > 0:
print("NOT Anagrams")
else :
print("Anagrams")

return;

# initialize test strings:
a = 'practi ce'
b = 'ce racpti;'

# initialize flag to indicate whether the two strings are anagrams or not.
# strflag = 1 implies the strings are true anagrams.
strflag = 0

# accept the two strings:
print("Anagram Tester")
a = input("Input String1: ")
b = input("Input String2: ")

anagramfn(a, b, strflag) ;``````

#28

A post was split to a new topic: Validation for incorrect inputs

#29

Like last week, I thought I would take a slightly more functional approach to this challenge, and like most people, I figured comparing two sorted lists is the way to go. While a lot of people just used Python’s sorted() function, I figured that if I want to make things more interesting, I could implement my own quicksort function. The quicksort I used is inspired by the elegant Haskell implementation of quicksort. Here it is:

https://repl.it/H54D/2

Now for a quick explanation:

First, I define the quickSort function, which takes a pivot (the first element in a word or list) and then creates two lists, one of which contains all the values in the list after the pivot greater than or equal to the pivot, and one of those values less than the pivot. Then, it puts the pivot between these lists and sorts both of these lists using the same method recursively.

The actual anagram function is pretty straightforward. All that this does is convert the supplied words into a usable format, sorts them, and compares them to see if they are the same. If they are, then they are anagrams, otherwise they are not.

Finally, let’s talk about efficiency. I realize that this is definitely not O(M + N), since it uses quicksort, which is at best O(NlogN). This program is probably slower than this, though, due to the other stuff involved, but the purpose of this exercise for me is to practice my functional programming skills with Python, so I don’t mind the slowness.

Note: here is the wikipedia article on quicksort for anyone looking to read about it in more depth https://en.wikipedia.org/wiki/Quicksort

#32

A post was merged into an existing topic: Validation for incorrect inputs

#34

A solution in Python which has a O(n+m) time complexity. It iterates through the first string and counts up the occurences of each character, and then does the same for the second string but decrements the count instead. If the two strings are anagrams then the count for every character should be 0.

Decided to handle all 128 ASCII codes so that the function works with all ASCII characters and not just capital letters, and it removes the need to subtract a base value from each of the ASCII values to convert it to a 0-based index.

``````def is_anagram(s1, s2):
# can return 0 immediately if the strings are different lengths
if len(s1) != len(s2):
return 0
# for counting the ASCII characters 0 to 127
counts = [0]*128
# increment the count for each character in the first string
for c in s1:
counts[ord(c)] += 1
# decrement the count for each character in the second string
for c in s2:
counts[ord(c)] -= 1
# the strings are anagrams if all the counts cancel each other out
return 0 if any(counts) else 1``````

There is a simple optimisation to check that the strings are the same length at the start of the function; if the strings are different lengths then they cannot be anagrams. Would be significant improvement for very big strings.

A totally inconsequential optimisation (but why not) is the use of the any() function instead of sum(). It is ever so slightly faster as it only needs to do a comparison with 0 instead of addition, and it does lazy evaluation, meaning it will return as soon as a non-zero value is found.

``````%timeit sum([0]*128)
100000 loops, best of 3: 2.48 µs per loop

%timeit any([0]*128)
100000 loops, best of 3: 2.07 µs per loop

%timeit sum([1]*128)
100000 loops, best of 3: 2.47 µs per loop

%timeit any([1]*128)
1000000 loops, best of 3: 764 ns per loop``````

Out of interest, I compared the runtime with a solution that use sorting to compare the strings (shown below). The time complexity for sorting is O(n log(n)), but for short strings the function below is faster. The point where the function above is faster is about strings of length 70 on my laptop.

``````def is_anagram(s1, s2):
return 1 if len(s1) == len(s2) and sorted(s1) == sorted(s2) else 0``````

#35

I had this idea as well, but I don't think it works! Just adding up the ASCII values would mean it would see these two strings as anagrams, "AD" and "BC".

I thought maybe you can use the product but it would only work if each letter was represented by a prime number and the calculation of that product would probably make the function slow compared to the other methods, especially for large strings.

#36

Here is my entry using Python 3.6.1 hosted on repl.it:

https://repl.it/H6EZ/0

My first bit of thinking was to be sure to get the input from the user and prompt them as to what words they would like to compare. I then assigned another set of variables equal to the original input ran through the built-in .upper() function, to guarantee that the function would be handling uppercase letters, as per the challenge rules. Next up was converting the strings into lists with a for loop to iterate through each of the words and append each iterated item to the appropriate list. After assigning the lists to sorted versions of themselves, I then compared the sorted lists to each other and returned 1 for anagrams, and 0 for not anagrams.

``````def anagram():
#ask the user for words to compare
print('What two words would you like to compare and see if they are anagrams?')
print('First word: ')
word1 = str(input())
print('Second word: ')
word2 = str(input())

#converts the strings from the user to uppercase
word1Upper = word1.upper()
word2Upper = word2.upper()

#creation of empty lists for conversion from strings to lists
word1u = []
word2u = []
#for loops do the work of making lists from the strings
for char in word1Upper:
word1u.append(char)

for char in word2Upper:
word2u.append(char)

#sorting the letters into sequential order so that Python can compare the lists
word1u = sorted(word1u)
word2u = sorted(word2u)

#condition that compares the sorted lists to each other to see if they match
if word1u == word2u:
#returns 1 if they match
""" print('anagram')"""
return 1

else:
""" print('not an anagram')"""
#returns 0 if they don't
return 0

#print(word1u, word2u)
``````

Time complexity was not taken into consideration when devising this solution as I’m still definitely a beginner.

#37

Here is my simple code in PHP.

1. Then the string length, if it is same then it will proceed to next step.
2. Then using a for loop, checked each character of the string 1 against string 2 in the reverse order.

``````<?php
function find_anagram (\$sample_string1, \$sample_string2)
{
\$string1 = str_replace(' ', '', \$sample_string1);
\$string2 = str_replace(' ', '', \$sample_string2);

\$str_len1 = strlen(\$string1);
\$str_len2 = strlen(\$string2);

\$flag=1;
if(\$str_len1!=\$str_len2)
{
\$flag=0;
}
else {
\$c = \$str_len2-1;
for(\$i = 0; \$i<\$str_len1; \$i++)
{
If (\$string1[\$i] != \$string2[\$c])
{
\$flag=0;
break;
}
\$c--;
}
}

echo \$flag . "<br />";
}
find_anagram("amma", "amma");
find_anagram("asdf", "asdf");
?>``````

#38

Just a note to anyone using a SORTING FUNCTION this is not `O(n + m)`, so you might want to look for a different solution.