How Can I Check If Either Xpath Exists And Then Return The Value If Text Is Present?

I'm having trouble with the second r.html.xpath request. When there is a special deal on an item, the second Xpath changes from //*[@id='priceblock_ourprice'] to //*[@id='priceb

Solution 1:

# xpath() without first=True returns a list, which is empty (falsy) when nothing matches.
if r.html.xpath('//*[@id="priceblock_ourprice"]'):
    productprice = str(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True).text)
elif r.html.xpath('//*[@id="priceblock_dealprice"]'):
    productprice = str(r.html.xpath('//*[@id="priceblock_dealprice"]', first=True).text)
else:
    productprice = None  # neither price element is on the page

product = {
    'title': str(r.html.xpath('//*[@id="productTitle"]', first=True).text),
    'price': productprice,
    'details': str(r.html.xpath('//*[@id="detailBulletsWrapper_feature_div"]', first=True).text)
}


Something like that. I am not exactly sure if the syntax is totally correct.
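
The same idea can be written as a loop over candidate XPaths that returns the first one with non-empty text. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree on a made-up page fragment so that it runs without a network connection, and the helper name first_text is hypothetical. ElementTree only supports a subset of XPath and needs relative paths, hence the leading ".". With requests_html, the lookup inside the loop would presumably be r.html.xpath(xpath, first=True) instead of root.find(xpath).

```python
import xml.etree.ElementTree as ET


def first_text(root, xpaths):
    """Return the text of the first XPath that matches an element with text, else None."""
    for xpath in xpaths:
        element = root.find(xpath)  # None when nothing matches
        if element is not None and element.text and element.text.strip():
            return element.text.strip()
    return None


# Made-up fragment for illustration: only the deal-price element is present.
SAMPLE = (
    '<div>'
    '<span id="productTitle">Example item</span>'
    '<span id="priceblock_dealprice">$19.99</span>'
    '</div>'
)

PRICE_XPATHS = [
    ".//*[@id='priceblock_ourprice']",
    ".//*[@id='priceblock_dealprice']",
]

root = ET.fromstring(SAMPLE)
price = first_text(root, PRICE_XPATHS)
print(price)
```

Putting the XPaths in a list keeps the fallback order in one place, so a third price variant only needs one more entry.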


Solution 2:

Why don't you use a try/except statement to check whether the value exists? You probably get the error because the XPath matches nothing on that page, so there is no element to read the text from.

I haven't got requests_html, but I will show the code using the selenium module.

from time import sleep

import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

urls = ['http://amazon.com/dp/B01KZ6V00W', 'http://amazon.com/dp/B089FBPFHS']

# Absolute XPaths copied from the rendered page; they break if the layout changes.
TITLE_XPATH = "/html/body/div[2]/div[2]/div[7]/div[5]/div[4]/div[1]/div/h1/span"
OLD_PRICE_XPATH = "/html/body/div[2]/div[2]/div[7]/div[5]/div[4]/div[10]/div[1]/div/table/tbody/tr[1]/td[2]/span[1]"
PRICE_XPATH = "/html/body/div[2]/div[2]/div[7]/div[5]/div[1]/div[5]/div/div/div/div/div/form/div/div/div/div/div[1]/div/span[1]"

# Named "driver" so that the selenium.webdriver module is not shadowed.
driver = webdriver.Chrome()


def getPrice(url):
    driver.get(url)

    sleep(5)  # give the page time to render

    title = driver.find_element(By.XPATH, TITLE_XPATH).text

    try:
        # The old-price element is only present when the item is on a deal.
        old_price = driver.find_element(By.XPATH, OLD_PRICE_XPATH).text
        price = driver.find_element(By.XPATH, PRICE_XPATH).text
        if old_price[1:] == price[1:]:
            deal_type = "normal"
        else:
            deal_type = "deal"

    except NoSuchElementException:
        # No old price on the page: treat the item as a normal listing.
        old_price = ""
        price = driver.find_element(By.XPATH, PRICE_XPATH).text
        deal_type = "normal"

    print(old_price)
    print(title)
    print(price)
    print(deal_type)

    return price

prices = []

for url in urls:
    prices.append(getPrice(url))

driver.quit()  # close the browser once all pages are scraped

print(prices)

df = pd.DataFrame(prices)
print(df.head(15))
df.to_csv("testfile.csv", index=False)
print(len(prices))

Let me explain:

The imports at the top load the necessary modules, such as selenium and pandas. The next line stores the URLs, and the three constants hold the XPaths. After that, driver = webdriver.Chrome() starts a Chrome browser.

Then, in getPrice, we open the URL using driver.get(url) and wait five seconds for the page to load.

Then we read the title from its XPath.

The try block checks whether the XPath that only appears for a deal exists. If it does, it reads the old and new prices and marks the product as a deal when the two differ. If the XPath for a deal does NOT exist, execution moves on to the except block and the product is saved as a normal one.
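
The comparison old_price[1:] == price[1:] in the code drops the first character, which only works when the currency symbol is exactly one character long and both strings are formatted identically. A slightly more forgiving comparison could parse the numbers first. This is a minimal sketch, assuming US-style prices such as "$1,299.00" with "." as the decimal separator; the helper names parse_price and classify_deal are hypothetical.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a price string such as '$1,299.00' to a Decimal."""
    # Keep digits and the decimal point; drop currency symbols, commas and spaces.
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return Decimal(cleaned)


def classify_deal(old_price_text, price_text):
    """Return 'normal' when both prices are numerically equal, otherwise 'deal'."""
    if parse_price(old_price_text) == parse_price(price_text):
        return "normal"
    return "deal"


print(classify_deal("$24.99", "$19.99"))
```

Decimal is used rather than float so that prices compare exactly.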

It then prints the old price, title, price and deal type.

Finally, the script runs the function for every URL and saves the collected prices to a CSV file.

I explained the code so that you can adapt it to requests_html.
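
For reference, here is a rough sketch of how the same try/except idea might look with the XPaths from the question. It relies on xpath(..., first=True) in requests_html returning None when nothing matches, so reading .text raises AttributeError. The names text_or_none, get_price and StubHTML are hypothetical; StubHTML only stands in for r.html so that the example runs offline.

```python
from types import SimpleNamespace

OUR_PRICE = '//*[@id="priceblock_ourprice"]'
DEAL_PRICE = '//*[@id="priceblock_dealprice"]'


def text_or_none(html, xpath):
    """Return the matched element's text, or None when the XPath matches nothing."""
    try:
        # A failed lookup yields None, and None has no .text attribute.
        return html.xpath(xpath, first=True).text
    except AttributeError:
        return None


def get_price(html):
    """Return (price, deal_type), trying the regular price before the deal price."""
    price = text_or_none(html, OUR_PRICE)
    if price is not None:
        return price, "normal"
    price = text_or_none(html, DEAL_PRICE)
    if price is not None:
        return price, "deal"
    return None, "unknown"


class StubHTML:
    """Hypothetical stand-in for r.html, for demonstration only."""

    def __init__(self, texts):
        self.texts = texts  # maps an XPath string to the element's text

    def xpath(self, xpath, first=False):
        if xpath in self.texts:
            return SimpleNamespace(text=self.texts[xpath])
        return None


print(get_price(StubHTML({DEAL_PRICE: "$19.99"})))
```

In a real script, get_price(r.html) would take the place of the stub call.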