RPA Fraud Detection Challenge 7

Hi all,
This is my code for challenge number 7 in the RPA FRAUD DETECTION project but it does not return any values.

SELECT full_name, zip_code
FROM transaction_data
WHERE zip_code BETWEEN '300%' AND '319%'

Is this correct ?

After looking the forums I found this solution:

ALTER TABLE transaction_data
ADD COLUMN zip_code VARCHAR;

UPDATE transaction_data

SET zip_code = zip;

SELECT full_name, zip_code FROM transaction_data
WHERE zip_code BETWEEN '300%' AND '319%' OR  zip_code  LIKE '398%';

I’m not sure why do we have to create a new column and set it LIKE ‘398%’

Thank you

1 Like

Could you please link the lesson, some context might help :slightly_smiling_face:.

Sorry about that this is the lesson

https://www.codecademy.com/paths/data-analyst/tracks/dacp-data-acquisition/modules/dscp-sql-challenge-projects/projects/reputable-company

Ta. There are far too many lessons to remember off-hand and there an awful lot of lessons to try and look through otherwise.

Do you have the details for challenge 7?

Is it written as a comment or something? I don’t believe it was there when I did the course and I can only see the 6 main targets and the solution I wrote.

Yes it is written as a comment.

– 7

– Challenge

– Return only those customers residing in GA. Use the list of ZIP CODE prefixes

– (List of ZIP Code prefixes - Wikipedia)

– to determine the best query for zip codes belonging to Georgia(GA).

1 Like

The original zip column is an integer type so you could shift this query a little to get a similar output-

BETWEEN 30000 AND 31999

So I don’t think you have to add additional columns; perhaps a simpler route if you want the string patterns instead of the explicit integers might be to CAST the type before the filtering (though it’d need something like a CTE/subquery or HAVING due to the execution order of WHERE).

I’m not all that familiar with US zip codes but so far as I know they aren’t all perfectly grouped, your links suggests there are codes for “GA”-Georgia at 398** and 399**. I’m not really sure if the 399** range is public but perhaps the original poster is more familiar with them and the 398** range are valid Georgia zip codes.

3 Likes

I had a facepalm moment when I read your answer. Thanks so much! :smiley:

2 Likes

Nice answer, @tgrtim.
But what if we don’t know how many digits we have after 300n?
I guess that’s what @emersonla was asking about. Wildcard % can be really useful here but it doesn’t work. Is there a way to use it like this: WHERE zip_code BETWEEN ‘300%’ AND ‘319%’

You’re quite right that the LIKE operator and wildcards can be be very useful and it’s well worth knowing how to use them. In this particular example the zip_code column is an INTEGER type and you do know how many digits are involved (so far as I’m aware basic zip codes are 5 digits) which I think makes the use of BETWEEN quite effective (readable and should be relatively efficient).

You could write BETWEEN "30000" AND "31999" and sqlite will happily interpret it but it seems odd to try and use strings when the data type is already an INTEGER.

If you’re curious about a general solution to this for something other than zip codes then the original post shows how you’d use wildcards by first converting the data type to a string which seems like a decent option.

Arguably I think you can skip this step, I don’t know exactly how SQLite uses the LIKE operator but it seems like you can try to match against any data type (I’d assume there’s some hidden type casting involved but don’t quote me on it) so you could try to use the wildcard solution without first converting the integer to text to save the creation of a new column.

Bear in mind there are more powerful text matching operations than LIKE if you run into a pattern matching problem in the future so have a web search if you’re curious.

Since this post is quite old in the future rather than reviving an old thread it’d be worth starting a new question and simply adding a URL back to the original question if it was relevant.

3 Likes

Thanks for the solution, but do we know why we created a new column?

I’m guessing it’s something to do with it being an INTEGER

1 Like

SELECT full_name, zip

FROM transaction_data

WHERE zip BETWEEN ‘30000’ AND ‘31999’

ORDER BY zip DESC;

An approach I took.

1 Like

I see your issue, and it’s great to see you reaching out for help and trying to find a solution.I think adding a new column and copying the values from the zip column into the zip_code column ensures that you have a correct data type for the zip_code column which can be used for string comparison operations like the one in your query. You could also use zaptest free edition, maybe it’ll work better there.
If you still have any questions or need clarification, don’t hesitate to ask!

I just finished this and also used another combo. Also, ZIP is set as an INTEGER so it doesn’t need to be in quotes.

SELECT full_name, zip

FROM transaction_data

WHERE zip >= 30000

AND zip <= 31999;