FAQ: Web Scraping with Beautiful Soup - Select for CSS Selectors

This community-built FAQ covers the “Select for CSS Selectors” exercise from the lesson “Web Scraping with Beautiful Soup”.

FAQs on the exercise Select for CSS Selectors

There are currently no frequently asked questions associated with this exercise. That’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Looks like this issue was due to a browser-based bug. Refreshing the page allowed me to proceed.

When writing turtle_name = turtle.select(".name")[0], why are we using the [0]?

Thanks

Hi @teoxd,

Link to Exercise: Learn Web Scraping with Beautiful Soup: Select for CSS Selectors

The expression turtle.select(".name") without the [0] gives us a list of all the tags with the class "name" on the page to which we have linked. It turns out that each page that we access within the for loop has only one tag assigned to the "name" class, but still, that single tag is contained within a list. To retrieve that tag from the list, we index the result of the expression with [0].

Edited on August 1, 2019 to add a link to the exercise
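
The indexing described above can be tried on a small, made-up snippet. This is only a sketch: the HTML and the turtle name "Aesop" are invented for illustration and are not taken from the exercise pages. It also shows select_one, which Beautiful Soup provides for the case where only the first match is wanted.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for one turtle page.
html = """
<div class="turtle">
  <p class="name">Aesop</p>
  <ul><li>AGE: 7 Years Old</li></ul>
</div>
"""
turtle = BeautifulSoup(html, "html.parser")

# select() always returns a list-like result, even for a single match.
matches = turtle.select(".name")
print(len(matches))

# Indexing with [0] pulls the one Tag out of that list.
turtle_name = matches[0]
print(turtle_name.get_text())

# select_one() returns the first matching Tag directly (or None if no match).
first_match = turtle.select_one(".name")
print(first_match.get_text())
```

Note that [0] raises an IndexError when nothing matches, while select_one returns None, so which one to use depends on how a missing tag should be handled.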

Hello all,

I am writing a response to this step:
(
First, before the loop that goes through the turtle_links, create an empty dictionary called turtle_data.
)

I am wondering if others are catching the exception that running the code more than once will throw.

I like this very Pythonic-looking approach:
(
import errno
import os

def mkdir_p(path):
    """Create path; ignore the error if the directory already exists."""
    try:
        os.makedirs(path)
    except OSError as exc:  # Python >2.5
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise
)
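
For comparison, on Python 3.2 and later the same "create it unless it already exists" behavior is available through the exist_ok parameter of os.makedirs. The sketch below is an alternative to the helper above, not part of the exercise; the name ensure_dir and the temporary directory are assumptions made so the example can run anywhere.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists."""
    # exist_ok=True suppresses the error only when path is already a directory.
    os.makedirs(path, exist_ok=True)

# Work inside a throwaway temporary directory (hypothetical location).
base = tempfile.mkdtemp()
target = os.path.join(base, "turtle_data")

ensure_dir(target)
ensure_dir(target)  # running it a second time does not raise

print(os.path.isdir(target))
```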

The weird part is that when I run without catching the exception, just calling os.mkdir, the error still pops up…even when I specifically cleared the dir away before running.

No worries. I have my work-around…NOPE…just checked…

Using the exception handler, when the dir already exists…I see:

If I clobber the dir and rerun, I see:

(Note that there was no output to the console…the dir was not found to already exist…and it was indeed created, as I can see, but I still see that error and cannot proceed.)

Well, all apologies for this mess!

Oh goodness…

I got a wild hair and thought that maybe it just wants to see something called turtle_data to be created…i.e. a variable name…

So I tried running:

# Define turtle_data:
turtle_data = "turtle_data"
mkdir_p(turtle_data)

but that errored with “expected turtle_data to be a dict” (!)

So, I added:

# Define turtle_data:
turtle_data = {}
mkdir_p("turtle_data")

…and I am All Green. LOLOL!
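
For anyone who hits the same confusion: the step asks for a Python dictionary, not a directory on disk, which is why the checker reported “expected turtle_data to be a dict”. Here is a rough sketch of how such a dictionary might be filled inside the loop. The page contents, the turtle names, and the choice to store the li text under each name are all assumptions for illustration, not the exercise's actual data or solution.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the pages reached through turtle_links.
pages = [
    '<p class="name">Aesop</p><ul><li>AGE: 7 Years Old</li><li>SEX: Female</li></ul>',
    '<p class="name">Caesar</p><ul><li>AGE: 2 Years Old</li><li>SEX: Male</li></ul>',
]

# An empty dictionary, created once before the loop.
turtle_data = {}

for page in pages:
    turtle = BeautifulSoup(page, "html.parser")
    # Each page has a single tag with class "name", hence the [0].
    turtle_name = turtle.select(".name")[0].get_text()
    # Store the text of every <li> tag under that turtle's name.
    turtle_data[turtle_name] = [li.get_text() for li in turtle.select("li")]

print(turtle_data)
```

No directory needs to be created for this step, so the mkdir_p call is not required to pass it.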

How does one typically approach an error that does not happen when one runs code outside of the web interface?

i.e., this error: “invalid syntax (”

But the code runs fine from Visual Studio and the command line.

chahn@DESKTOP-GE7BAOA MINGW64 /c/ROOT/study/python/scratch
$ python soupy4.py

|AGE: 1.5 Years Old|
|WEIGHT: 4.5 lbs|
|SEX: Female|
|BREED: African Aquatic Sideneck Turtle|
|SOURCE: found in Lake Erie|

chahn@DESKTOP-GE7BAOA MINGW64 /c/ROOT/study/python/scratch
$