
How To Scrape Paginated Links In Scrapy?

The code for my scraper is:

import scrapy

class DummymartSpider(scrapy.Spider):
    name = 'dummymart'
    allowed_domains = ['www.dummymart.com/product']
    start_urls = ['htt

Solution 1:

You can try this method (it looks for the link that follows the current page in the pagination list):

next_page_url = response.xpath('//li[ ./a[@class="curr"] ]/following-sibling::li[1]/a/@href').extract_first()
# next_page_url = response.urljoin(next_page_url)
if next_page_url:
    yield scrapy.Request(url = next_page_url, callback = self.parse)
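The commented-out `response.urljoin(...)` line matters whenever the extracted `href` is relative rather than absolute: Scrapy resolves it against the response's URL using the same rules as the standard library's `urllib.parse.urljoin`. A minimal sketch, assuming a hypothetical page URL on the site from the question:

```python
from urllib.parse import urljoin

# Assumed current page URL; response.urljoin(href) in Scrapy behaves
# like urljoin(response.url, href) shown here.
current_url = "https://www.dummymart.com/product?page=1"

# A query-only relative href replaces the query string of the current URL.
print(urljoin(current_url, "?page=2"))
# -> https://www.dummymart.com/product?page=2

# An absolute href is returned unchanged.
print(urljoin(current_url, "https://www.dummymart.com/product?page=3"))
# -> https://www.dummymart.com/product?page=3
```

If you skip this resolution step and the site emits relative links, `scrapy.Request` will reject the URL for missing a scheme.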

UPDATE: According to your new HTML, you need this code:

next_page_url = response.xpath('//li/a[@aria-label="Next"]/@href').extract_first()
if next_page_url:
    yield scrapy.Request(url = next_page_url, callback = self.parse)
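To see what that XPath is actually matching, here is a framework-free illustration using only the standard library's `html.parser`: it pulls the `href` of the first anchor whose `aria-label` is `"Next"`. The HTML snippet is an assumption standing in for the real page's pagination markup:

```python
from html.parser import HTMLParser

# Hypothetical pagination markup; the real site's HTML may differ.
SAMPLE_HTML = """
<ul class="pagination">
  <li><a href="?page=1" class="curr">1</a></li>
  <li><a href="?page=2">2</a></li>
  <li><a href="?page=2" aria-label="Next">&raquo;</a></li>
</ul>
"""

class NextLinkFinder(HTMLParser):
    """Records the href of the first <a> tag carrying aria-label="Next"."""
    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("aria-label") == "Next" and self.next_href is None:
            self.next_href = attrs.get("href")

finder = NextLinkFinder()
finder.feed(SAMPLE_HTML)
print(finder.next_href)  # -> ?page=2
```

In the actual spider you would stay with the XPath one-liner, since Scrapy's selectors handle this in a single expression; note the extracted `?page=2` here is relative, so it would still need `response.urljoin` before being requested.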
