from urllib.request import urlopen as urq # webclient
from bs4 import BeautifulSoup as bs # HTML structure
# url to scrape from
url="https://www.newegg.com/global/ph-en/p/pl?d=GTX&N=-1&IsNodeId=1&bop=And&Page=1&PageSize=36&order=BESTMATCH"
# open the connection and download the page
uclient = urq(url)
pghtml = uclient.read()
# parse the downloaded HTML
soup=bs(pghtml,"html.parser")
uclient.close()
# find product in store page...
containers=soup.findAll("div",{"class":"item-container"})
for container in containers:
brand = container.div.div.img["title"] # grab brand name
print("Manufacturer:" + brand)
Hello, your code is not formatted, so it's hard to answer your question. Please read the article "How to ask good questions (and get good answers)".
The description of the problem suggests that the line that prints the brand is probably placed outside the loop, something like this:
for container in containers:
    brand = container.div.div.img["title"]
print("Manufacturer:" + brand)
The last line of this code is executed only once, after the loop ends. At that point brand still holds the value from the final iteration, so you will only see the brand name of the last product on the page.
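To see the difference without touching the network, here is a minimal sketch of both layouts. The list of brands and the two function names are made up for illustration, and the functions collect values into a list instead of printing so the result is easy to compare.

```python
# Hypothetical stand-in for the brand names scraped from the page.
brands = ["ASUS", "MSI", "GIGABYTE"]


def collect_outside_loop(names):
    """Mimic the suspected bug: the output step sits after the loop."""
    seen = []
    for name in names:
        brand = name
    seen.append(brand)  # not indented under the loop: runs once, after it ends
    return seen


def collect_inside_loop(names):
    """Mimic the fix: the output step is part of the loop body."""
    seen = []
    for name in names:
        brand = name
        seen.append(brand)  # indented under the loop: runs on every iteration
    return seen


print(collect_outside_loop(brands))
print(collect_inside_loop(brands))
```

The only difference between the two functions is the indentation of one line, which is why the formatting of the posted code matters so much here.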
This is how the code should be formatted:
from urllib.request import urlopen as urq
from bs4 import BeautifulSoup as bs
url="https://www.newegg.com/global/ph-en/p/pl?d=GTX&N=-1&IsNodeId=1&bop=And&Page=1&PageSize=36&order=BESTMATCH"
uclient = urq(url)
pghtml = uclient.read()
soup=bs(pghtml,"html.parser")
uclient.close()
containers=soup.findAll("div",{"class":"item-container"})
for container in containers:
    brand = container.div.div.img["title"]
    print("Manufacturer:" + brand)
Running this code prints the manufacturer of every product found on the page, one per line.
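One more thing to watch for: container.div.div.img["title"] raises a TypeError if any container has no img at that position. Below is a rough sketch of a more forgiving loop. The sample HTML is invented for this example and only assumes the nesting implied by your code, so the real Newegg markup may differ. The extract_brands helper is also just an illustrative name.

```python
from bs4 import BeautifulSoup

# Invented sample markup: three containers, the second has no brand image.
SAMPLE_HTML = """
<div class="item-container">
  <div class="item-info"><div class="item-branding"><img title="ASUS"></div></div>
</div>
<div class="item-container">
  <div class="item-info"><div class="item-branding"></div></div>
</div>
<div class="item-container">
  <div class="item-info"><div class="item-branding"><img title="MSI"></div></div>
</div>
"""


def extract_brands(html):
    """Return the title of the first titled <img> in each item container."""
    soup = BeautifulSoup(html, "html.parser")
    brands = []
    for container in soup.find_all("div", {"class": "item-container"}):
        # find() returns None instead of raising when nothing matches.
        img = container.find("img", title=True)
        if img is None:
            continue  # skip containers without a brand image
        brands.append(img["title"])
    return brands


for brand in extract_brands(SAMPLE_HTML):
    print("Manufacturer:" + brand)
```

Here the container without an image is skipped instead of stopping the whole script. Whether skipping is the right behaviour depends on what you want to do with the data.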