How To Tell Beautifulsoup To Extract The Content Of A Specific Tag As Text? (without Touching It)
I need to parse an HTML document which contains 'code' tags. I'm getting the code blocks like this:
soup = BeautifulSoup(str(content))
code_blocks = soup.findAll('code')
The proble
Solution 1:
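Add the code tag to the QUOTE_TAGS dictionary. BeautifulSoup 3 treats the contents of any tag listed there as literal text rather than markup, as it already does for script and textarea.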
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3
content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"
# Register 'code' as a quote tag so its contents are kept as literal text
BeautifulSoup.QUOTE_TAGS['code'] = None
soup = BeautifulSoup(str(content))
code_blocks = soup.findAll('code')
print(code_blocks)
Output:
[<code class="csharp">List<Person> persons = new List<Person>();</code>]
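If a parser offers no equivalent of QUOTE_TAGS, one possible fallback is to pull the raw text out of the code tags before any HTML parsing happens. The sketch below is not the answer's method and does not use BeautifulSoup at all; it relies only on the standard library's re module. The names CODE_RE and raw_code_blocks are hypothetical, and the pattern assumes code tags are not nested and are closed with a literal </code>.

```python
import re

# Hypothetical helper, not part of BeautifulSoup: captures whatever sits
# between <code ...> and </code> without interpreting it as markup.
# Assumes <code> elements are not nested.
CODE_RE = re.compile(r"<code\b[^>]*>(.*?)</code>", re.IGNORECASE | re.DOTALL)


def raw_code_blocks(html):
    """Return the untouched inner text of every <code> element."""
    return CODE_RE.findall(html)


content = "<code class='csharp'>List<Person> persons = new List<Person>();</code>"
print(raw_code_blocks(content))
```

Because the inner text is never handed to an HTML parser, generics such as List<Person> cannot be mistaken for tags. The trade-off is that a regular expression will not cope with nested or malformed code tags.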