Could anyone tell me what the difference is between using .select() and .find_all() when doing web scraping? And what are the pros and cons of each one?
Welcome to the forums.

If you think of the structure of an HTML document, which is like a tree (from top to bottom)…you have parent tags, children, siblings, descendants. :evergreen_tree:

The .find_all() method finds all instances of whatever you’re searching for. You can pass filters to that method as well (strings, regular expressions, and lists, for example).

Ex: soup.find_all("p", "title")

This finds all <p> tags whose CSS class is "title" (the second positional argument is matched against the class attribute).
See the documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all
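
Here is a rough sketch of a few of those filters in action. The HTML snippet and the variable names are made up for illustration, and it assumes the bs4 package is installed:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample document, only for illustration.
html = """
<html><body>
<p class="title"><b>Heading</b></p>
<p class="story">Once upon a time
<a href="http://example.com/one" id="link1">One</a>
<a href="http://example.com/two" id="link2">Two</a>
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# String filter plus CSS class: <p> tags with class "title".
title_paragraphs = soup.find_all("p", "title")

# List filter: any tag whose name is "a" or "b".
a_and_b = soup.find_all(["a", "b"])

# Regular expression filter: tag names starting with "b" (<body> and <b> here).
starts_with_b = soup.find_all(re.compile("^b"))

# Keyword filter on an attribute.
by_id = soup.find_all(id="link2")

print([tag.name for tag in a_and_b])
print([tag.name for tag in starts_with_b])
```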

You can use the .select() method to locate elements with a CSS selector, so you can match by tag name, class, ID, attribute, etc.

Ex: soup.select('a[href]') finds all <a> tags that have an href attribute.
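
A small sketch of .select() with a few different selectors, again on a made-up HTML snippet (the ids and class names are assumptions for the example, and bs4 needs to be installed). The last lines show the same query written both ways:

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, only for illustration.
html = """
<html><body>
<p class="title"><b>Heading</b></p>
<p class="story">
<a href="http://example.com/one" id="link1">One</a>
<a href="http://example.com/two" id="link2">Two</a>
<a id="nolink">No href here</a>
</p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# Attribute selector: only <a> tags that actually have an href attribute.
links_with_href = soup.select("a[href]")

# ID selector.
second_link = soup.select("#link2")

# Child combinator: <a> tags that are direct children of <p class="story">.
story_links = soup.select("p.story > a")

# select_one() returns just the first match (or None if nothing matches).
first_link = soup.select_one("a[href]")

# The same search written both ways.
via_select = soup.select("p.title")
via_find_all = soup.find_all("p", class_="title")

print([a["id"] for a in links_with_href])
print(len(story_links))
```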


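Both methods give you back a list of matching tags. .select() is basically a more concise way to be specific when what you’re looking for is easy to describe as a CSS selector (nested tags, classes, IDs), while .find_all() is handy when you want to filter with strings, regular expressions, or lists. It can be a little confusing, but I think the documentation really helps.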