FAQ: Overview of Data Acquisition Methods - Making API request in Python

This community-built FAQ covers the “Making API request in Python” exercise from the lesson “Overview of Data Acquisition Methods”.

Paths and Courses
This exercise can be found in the following Codecademy content:

[Beta] Data Science Foundations


The exercise asks:

Can you see the advantage working with r_json has over r_text?

I’m still wondering what the advantage is. I printed both the text version and the json version right next to each other, and the only difference I could see was that the r_text used double-quotes around each string, while r_json used single-quotes. But I thought that was a stylistic choice in Python?

I feel like I must be missing something, but hitting the hint button only showed the code you were supposed to type, which I had already done correctly.

Any insights would be greatly appreciated.

Hi emly3430,

My (newbie) take on the difference is size and speed. The JSON response used single quotes but also saved space by chaining the sublists together; the text response was ordered (each entry appeared on a new line). So, I think this implies that the JSON response will be quicker than a text response (as it has less formatting), which could be very helpful when these requests scale up.

Hello there, I’m having some trouble with this exercise. When I initially started making the request, the part with .text worked fine, but after printing I could see that I got a message saying I had exceeded my attempts using the census API. Whilst it was a weird thing to see, I could put two and two together and see how the concept works. However, I cannot complete the exercise now because the .json() part keeps throwing an error, and after checking my code against the solution code, it’s the same. Any ideas? Is there a time period after which I can make another request on the census website?

Any help would be really appreciated. Thanks.

Edit: I’ve also realized it affects the rest of the exercises that are part of the lesson. Please help :)

Hi! Having the exact same problem. Using the .json() method results in a huge error:

Traceback (most recent call last):
  File "script.py", line 9, in <module>
    r_json = r.json()
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I also got this error about exceeding the attempts with the census API.

I went ahead and requested an API key through the https://www.census.gov/data/developers.html site and incorporated that key in my GET request by adding &key= and then the API key to the end of the API call. However, the automated grader is identifying the use of the key portion of the API call as incorrect despite being able to pull the data. Unsure of how to proceed from here

Eventually I just used VPN to reset my attempts

I tried using a VPN as well but it didn’t work for me. Kept getting the same message.

I’m also having this problem. I even copied the solution and pasted it into the code area and I still don’t get the right answer

Hi, I don’t understand how to pull the appropriate data in step 2. Can anyone help?

I had the same question. I found this video, which gave me some insights: “Working with APIs in Python - Code in 10 Minutes” (YouTube).

Hi there code0673255412!

I don’t know precisely what your question is about. But since I see that you posted it 30 days ago, I don’t want you to wait any longer, so I’ll try to be as complete as possible. If this doesn’t answer your question or you have remaining questions, let me know!

Question 2 is as follows:

Make a GET request to the Census API to request county-level data containing

  • the NAME variable,
  • the total commuters count, and
  • the count for commuters who travel 90 or more minutes
  • for all counties
  • within the state of New York.

Let’s reverse engineer the URL needed for the solution: https://api.census.gov/data/2020/acs/acs5?get=NAME,B08303_001E,B08303_013E&for=county:*&in=state:36

The first part is https://api.census.gov. When you look this URL up with your browser, it will change into https://www.census.gov/data/developers.html.

When you go to this website, you see in the left corner “Available APIs.” When you click this, the URL will change to https://www.census.gov/data/developers/data-sets.html. As you can see in this URL, the “/data” got added, which is also in our URL solution.

Now we are looking at the “/2020/acs/acs5?” part. 2020 is the year of the data, but I don’t see 2020 on the webpage. What I do see is ACS (the first row in the table, also called the American Community Survey).

When you click on the ACS, there will be multiple options. But we are looking for ACS 5 and the year 2020. The “American Community Survey 5-Year Data (2009-2020)” is our best option.

After clicking on this option, you get to the page where all the information regarding the database is listed. We are looking for a table that describes all the different variables in the dataset. This will be under the “Detailed Information”-header on row “2020 ACS Detailed Tables Variables”. Click on the HTML link to see what is in there.

The next part of the URL solution is get=NAME. As you can see in the table, the first column’s header is NAME. This is the column to search for the other elements of the URL.

Go all the way down to B08303_001E in column NAME. Tip: use Ctrl+F to search for B08303_001E. In the second column, you see “Estimate!!Total:” and in the third, “TRAVEL TIME TO WORK.” So this variable makes the API return the information for the second bullet: “the total commuters count.”

This is how you can find all the different elements for the bullets in question 2.
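To make the structure of that URL concrete, here is a minimal sketch that assembles it from its parts using only the standard library. The helper name build_census_url and its parameters are made up for illustration; the optional key parameter mirrors the "&key=" approach mentioned earlier in the thread, which the exercise grader reportedly rejects.

```python
from urllib.parse import urlencode

# Base endpoint taken from the solution URL discussed above.
BASE_URL = "https://api.census.gov/data/2020/acs/acs5"


def build_census_url(variables, for_clause, in_clause, api_key=None):
    """Hypothetical helper: join the query pieces into one request URL."""
    params = {
        "get": ",".join(variables),  # e.g. NAME,B08303_001E,B08303_013E
        "for": for_clause,           # e.g. county:* (all counties)
        "in": in_clause,             # e.g. state:36 (New York)
    }
    if api_key is not None:
        params["key"] = api_key      # appended last, as "&key=..."
    # Keep ',', ':' and '*' unescaped so the URL matches the one above.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


url = build_census_url(
    ["NAME", "B08303_001E", "B08303_013E"], "county:*", "state:36"
)
print(url)
```

Passing the resulting string to requests.get(url) should then be equivalent to typing the full URL by hand.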

Hopefully this was of some help.

Kind regards,
Benjamin

Hi emly3430,

A JSON response often uses key and value pairings, which makes it very similar to the dictionary object in Python. This means that the JSON response has similar benefits in different use cases, for example:

  • Extract data by key (i.e. name of field, rather than position)
  • Easier to work into dataframes for large datasets
  • Requires less formatting of data (compared to a long text string, which you would have to parse to extract individual elements, which could be rather complicated)
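As a rough illustration of the "extract by key" point: in this particular exercise the parsed response is a list of lists with a header row rather than a dictionary, but the header can be zipped with each row to get key-based access. The sample string below copies a few rows posted elsewhere in this thread; json.loads stands in for r.json(), and the variable names are just for illustration.

```python
import json

# A few rows as they appear in r.text (copied from output posted in this thread).
sample_text = (
    '[["NAME","B08303_001E","B08303_013E","state","county"],'
    '["Allegany County, New York","18308","497","36","003"],'
    '["Cattaraugus County, New York","31039","629","36","009"]]'
)

rows = json.loads(sample_text)    # parsed: a list of lists
header, data = rows[0], rows[1:]  # first row holds the column names

# Pair each row with the header to get one dictionary per county.
records = [dict(zip(header, row)) for row in data]

first = records[0]
print(first["NAME"])

# Values arrive as strings, so convert before doing arithmetic.
long_commute_share = int(first["B08303_013E"]) / int(first["B08303_001E"])
print(long_commute_share)
```

Doing the same from the raw text would mean splitting and stripping quotes and brackets by hand.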

Couple of other resources:
https://www.codecademy.com/learn/learn-python/modules/learn-python-lists--dictionaries-u-3
https://docs.python.org/3/tutorial/datastructures.html

Hi All,

I hit a couple of issues with this step that I thought I’d share:

  1. Creating request URL: if you’re wondering where on earth to start, the previous few pages had info on the API that lets you build the request url for each of the required criteria. Best place to start is the documentation that was linked to: Census Data API: /data/2020/acs/acs5/groups/B08303

  2. JSONDecodeError: In my case, this error comes from the request being rejected by census.gov. I get a 429 response, “too many attempts” (visible with print(r)). This is probably because lots of people are using the Codecademy service, and all of them making requests to census dot gov causes the IP to be restricted. The API service then returns HTML, which you can see when you call print(r.text). That HTML is not in JSON format, so when you attempt to decode it using r.json() your code falls over.

  3. Stuck getting API key: One way to get around the “too many attempts” is to register for a developer key and include it in the request URL. I can’t find where to register on the census dot gov site (despite the error message including a link), and as method4035792217 commented above the auto-grader marks it as incorrect, so I’m totally stuck.
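For anyone debugging point 2 locally, here is a minimal sketch of checking the response before decoding it, so that a rejected request produces a readable message instead of a JSONDecodeError traceback. The function name parse_api_body is made up for illustration, and it works on a plain status code and body string so it can be tried without hitting the API.

```python
import json


def parse_api_body(status_code, body_text):
    """Hypothetical helper: decode the body only if the request succeeded."""
    if status_code != 200:
        # e.g. 429 when the API rejects the request for too many attempts
        raise RuntimeError(
            f"Request failed with status {status_code}: {body_text[:80]!r}"
        )
    try:
        return json.loads(body_text)
    except json.JSONDecodeError as err:
        # The server answered, but not with JSON (for example an HTML page).
        raise RuntimeError(f"Body is not JSON: {body_text[:80]!r}") from err


# A successful body parses into a list of lists.
print(parse_api_body(200, '[["NAME","state"],["Allegany County, New York","36"]]'))

# A rejected request gives a clear error instead of a decode traceback.
try:
    parse_api_body(429, "<html><body>Too many attempts</body></html>")
except RuntimeError as err:
    print(err)
```

With the requests library, the call would look like parse_api_body(r.status_code, r.text). This won't make the exercise pass while the API is refusing requests, but it does show what is going wrong.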

I’ve raised a bug using the link in the opening admin post at the top of this thread and suggest anyone else who is stuck does the same

I found that r_text is just a string while r_json is a 2D list.

print(type(r_text))        # <class 'str'>
print(type(r_json))        # <class 'list'>
print(type(r_json[0]))     # <class 'list'>

So I think one advantage of working with r_json over r_text is that r_json is structured data. For example, you can convert it to a pandas DataFrame (which we’ll learn in later lessons) for further usage.

import pandas as pd

r_df = pd.DataFrame(r_json[1:], columns=r_json[0])

The only difference I saw between the two was that r.text has the list of lists on a new line for every value, while r.json() has the list of lists all together. See below:

r.text results:

[["NAME","B08303_001E","B08303_013E","state","county"], ["Allegany County, New York","18308","497","36","003"], ["Cattaraugus County, New York","31039","629","36","009"],...]

r.json() results:

[['NAME', 'B08303_001E', 'B08303_013E', 'state', 'county'], ['Allegany County, New York', '18308', '497', '36', '003'], ['Cattaraugus County, New York', '31039', '629', '36', '009'], ['Chemung County, New York', '35006', '736', '36', '015'], ['Columbia County, New York', '25990', '813', '36', '021'], ['Dutchess County, New York', '132346', '10044', '36', '027'],...]

There aren’t any indications of a dictionary.

Can someone please explain to me what the advantage is if the only differences I find are: one uses single ’ vs double " and that r.text shows every list within the list on a new line vs r.json() grouping the list of lists?

Thank you for your help.

r.text is a string, not a list. It’s hard to tell just by printing values, but you can figure it out by checking types.

Hi,

I agree it’s not immediately obvious why one is better than the other by simply printing. In fact, r_text is easier for humans to read when printed.
It would be useful to add additional instructions: print the class of the r_text and r_json objects, and try accessing different parts of them using an index (e.g. r_text[0:10] vs r_json[0:10]). The JSON is a list, while the text is a string, so the advantage of the JSON is that it is already structured and the data can be accessed and analysed easily. It seems like the text is just a version of the JSON that isn’t “parsed” yet.
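A small sketch of that suggestion, using a shortened sample string in place of the live response (the traceback earlier in the thread shows that r.json() ends up calling a JSON loads function on the text, so json.loads is used here as a stand-in):

```python
import json

# Shortened stand-in for r.text; the real response has many more rows.
r_text = '[["NAME","B08303_001E"],["Allegany County, New York","18308"]]'
r_json = json.loads(r_text)  # stand-in for r.json()

print(type(r_text).__name__)  # str
print(type(r_json).__name__)  # list

# Slicing the string returns characters ...
print(r_text[0:10])

# ... while slicing the list returns whole rows.
print(r_json[0:1])

# Individual values are reachable by row and column position.
print(r_json[1][0])
```

The string slice cuts through the middle of the header row, while the list slice returns the header row intact.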

Fun note: with the r_json I was able to do this little bit of ‘cleanup’:

import requests

r = requests.get('https://api.census.gov/data/2020/acs/acs5?get=NAME,B08303_001E,B08303_013E&for=county:*&in=state:36')
r_text = r.text
r_json = r.json()
r_json[0][1] = 'Total number of commuters'
r_json[0][2] = 'Number of commuters who take >90min'
print(r_json)

which you can’t do with r_text.
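The reason is that Python strings are immutable while lists are not. A minimal sketch, again with a shortened sample string standing in for the real response:

```python
import json

r_text = '[["NAME","B08303_001E"]]'  # stand-in for r.text
r_json = json.loads(r_text)          # stand-in for r.json()

# Lists can be edited in place.
r_json[0][1] = "Total number of commuters"
print(r_json)

# Strings cannot: item assignment raises a TypeError.
try:
    r_text[0] = "("
except TypeError as err:
    error_message = str(err)
    print(error_message)
```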

I was gonna ask the same question. I don’t see the advantage either.