Python: Scrapy Csv Exports Incorrectly?
I am simply trying to write to a CSV file. However, I have two separate for-statements, so the data from each loop is exported independently and the rows come out in the wrong order. Any suggestions?
Solution 1:
The order in the CSV file matches the order in which you export the elements: first all the titles, then all the subtext elements. I guess you are trying to scrape HN articles, so here is my suggestion:
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select('//td[@class="title"]')
    items = []
    for title in titles:
        item = HackernewsItem()
        item["title"] = title.select("a/text()").extract()
        item["url"] = title.select("a/@href").extract()
        item["score"] = title.select('../td[@class="subtext"]/span/text()').extract()
        items.append(item)
    return items
I haven't tested it, but it should give you the idea: build one item per title row, so that all the fields of an item are exported together.
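To see why a single loop keeps the columns aligned, here is a minimal sketch that uses only the standard csv module, with no Scrapy involved. The sample rows and the write_rows helper are made up for illustration; they stand in for whatever the spider actually scrapes.

```python
import csv
import io

# Hypothetical sample data standing in for the scraped rows.
rows = [
    {"title": "First post", "url": "http://example.com/1", "score": "10 points"},
    {"title": "Second post", "url": "http://example.com/2", "score": "25 points"},
]


def write_rows(rows):
    """Write one CSV line per row so the fields of a row stay together."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "url", "score"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:  # a single loop: every field of a row is written at once
        writer.writerow(row)
    return buffer.getvalue()


output = write_rows(rows)
print(output)
```

Because each row carries its own title, URL and score, there is no second loop whose output could end up in a separate block of the file.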
Solution 2:
The csv module in Python 2.7 does not support Unicode, so it is better to use unicodecsv instead.
$ pip install unicodecsv
unicodecsv is a drop-in replacement for Python 2's csv module that supports Unicode strings without any hassle.
Then use this in place of import csv:
import unicodecsv as csv
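For comparison, on Python 3 the built-in csv module works with Unicode strings directly, so the replacement above should only be needed on Python 2. A minimal sketch, assuming Python 3; the sample rows and the items.csv file name are made up for illustration:

```python
import csv
import os
import tempfile

# Hypothetical rows containing non-ASCII text.
rows = [
    ["title", "score"],
    ["Café déjà vu", "12 points"],
    ["日本語のタイトル", "7 points"],
]

# Hypothetical output path; a temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "items.csv")

# newline="" lets the csv module control line endings; the encoding is explicit.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read the file back to check that the text survived the round trip.
with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))
```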