I can’t figure out how to get Scrapy to crawl multiple links under the same base URL. This is for data mining congressional legislation from the Government Printing Office, specifically this webpage: https://www.gpo.gov/fdsys/bulkdata/BILLSTATUS/115/hr.
In any given Congress there are around 10,000 bills introduced, so I need Python code that tries bill numbers well beyond 10,000 to ensure that all possible bills are mined. Setting the upper limit at 20,000 would ensure that happens.
Notice the end of the URL. I’d need it to go from /115/hr1 to /115/hr20000.
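To make the goal concrete, here is a minimal sketch of the URL range I have in mind. It assumes each bill lives at the base URL with the bill number appended directly (hr1, hr2, and so on), which I haven't verified against the site. The names `BASE_URL` and `bill_urls` are just placeholders I made up for illustration.

```python
# Assumption: every bill page is the base URL plus the bill number.
BASE_URL = "https://www.gpo.gov/fdsys/bulkdata/BILLSTATUS/115/hr"


def bill_urls(base_url=BASE_URL, first=1, last=20000):
    """Yield one URL per bill number, from first to last inclusive."""
    for number in range(first, last + 1):
        yield f"{base_url}{number}"


# Materialize the list so it can be inspected or handed to a spider.
start_urls = list(bill_urls())

print(len(start_urls))
print(start_urls[0])
print(start_urls[-1])
```

As far as I understand, a list like this could be assigned to the `start_urls` attribute of a `scrapy.Spider` subclass, and Scrapy by default does not pass non-2xx responses (such as a 404 for a bill number that doesn't exist) to the spider's callback, so overshooting to 20,000 should be harmless apart from the extra requests.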