
Python: Scrapy Csv Exports Incorrectly?

I am simply trying to write to a CSV file. However, I have two separate for loops, so the data from each loop is exported independently and the row order breaks. Suggestions? de

Solution 1:

The order of rows in the CSV file follows the order in which you export the elements: first you export all the titles, then all the subtext elements. I guess you are trying to scrape HN articles; here is my suggestion:

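from scrapy.selector import HtmlXPathSelector  # legacy Scrapy selector API

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select('//td[@class="title"]')
    items = []
    # Build one item per title cell so each CSV row holds all fields together.
    for title in titles:
        item = HackernewsItem()  # assumed to be the project's own Item subclass
        item["title"] = title.select("a/text()").extract()
        item["url"] = title.select("a/@href").extract()
        # Relative XPath: assumes the subtext cell shares the title's row (untested).
        item["score"] = title.select('../td[@class="subtext"]/span/text()').extract()
        items.append(item)
    return items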

I didn't test it, but it will give you an idea.
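To see why a single loop fixes the ordering, here is a minimal sketch in plain Python, independent of Scrapy. The sample titles and scores and all function names are made up for illustration; they only stand in for the values the XPath queries would return.

```python
import csv
import io

# Hypothetical scraped values, standing in for the XPath results.
titles = ["First story", "Second story"]
scores = ["10 points", "25 points"]


def rows_from_separate_loops(titles, scores):
    """Mimic the question's approach: one loop per field."""
    rows = []
    for title in titles:
        rows.append({"title": title, "score": ""})
    for score in scores:
        rows.append({"title": "", "score": score})
    return rows


def rows_from_single_loop(titles, scores):
    """Build one row per article, filling both fields together."""
    return [{"title": t, "score": s} for t, s in zip(titles, scores)]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "score"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Separate loops: titles and scores land on different rows.
print(to_csv(rows_from_separate_loops(titles, scores)))
# Single loop: each row carries the title and its score.
print(to_csv(rows_from_single_loop(titles, scores)))
```

The first output has four data rows with half-empty columns, which is the symptom described in the question; the second has one complete row per article.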

Solution 2:

The csv module in Python 2.7 does not support Unicode, so it is suggested to use unicodecsv instead.

$ pip install unicodecsv

unicodecsv is a drop-in replacement for Python 2's csv module that supports Unicode strings without hassle.

Then use this instead of import csv:

import unicodecsv as csv
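For comparison, on Python 3 the standard csv module works with Unicode text directly, so the replacement is not needed there. The following is only a sketch under that assumption; the helper names, the sample rows, and the items.csv file name are hypothetical.

```python
import csv
import os
import tempfile


def write_unicode_csv(path, rows):
    """Write rows as UTF-8 CSV; newline="" lets csv manage line endings."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def read_unicode_csv(path):
    """Read the rows back as lists of str."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Sample rows containing non-ASCII characters.
rows = [["title", "score"], ["Café déjà vu", "10 points"]]
path = os.path.join(tempfile.mkdtemp(), "items.csv")
write_unicode_csv(path, rows)
print(read_unicode_csv(path))
```

Passing the encoding explicitly to open() avoids depending on the platform's default encoding.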
