Add Own Text Inside Nested Braces
Solution 1:
Pyparsing has a helper method called nestedExpr that makes it easy to match strings of nested open/close delimiters. Since you have PHP tags nested within your <h1> tag, I would use a nestedExpr like:
from pyparsing import nestedExpr, originalTextFor
nested_angle_braces = nestedExpr('<', '>')
However, this will match every tag in your input HTML source:
for match in nested_angle_braces.searchString(html):
    print(match)
gives:
[['html']]
[['head']]
[['title']]
[['?php', 'echo', '"title here"', ';', '?']]
[['/title']]
[['head']]
[['body']]
[['h1', ['?php', 'echo', '"class=\'big\'"', '?']]]
[['/h1']]
[['/body']]
[['/html']]
You want to match only those tags whose opening text is 'h1'. We can add a condition to an expression in pyparsing using addCondition:
nested_angle_braces_with_h1 = nested_angle_braces().addCondition(
lambda tokens: tokens[0][0].lower() == 'h1')
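To check that the condition filters as intended, a small sketch can run searchString with and without it and compare the number of matches. The html value here is an assumption, reconstructed from the output shown in this answer rather than copied from the question.

```python
from pyparsing import nestedExpr

# Assumed sample input, reconstructed from the output shown in this answer.
html = ('<html><head><title><?php echo "title here"; ?></title><head><body>'
        '<h1 <?php echo "class=\'big\'" ?>>foo</h1></body></html>')

nested_angle_braces = nestedExpr('<', '>')

# Calling the expression with no arguments returns a copy, so the condition
# is attached to the copy rather than to the shared nested_angle_braces.
nested_angle_braces_with_h1 = nested_angle_braces().addCondition(
    lambda tokens: tokens[0][0].lower() == 'h1')

all_matches = list(nested_angle_braces.searchString(html))
h1_matches = list(nested_angle_braces_with_h1.searchString(html))

# Every tag matches the plain expression; only the <h1 ...> tag passes the condition.
print(len(all_matches), len(h1_matches))
for match in h1_matches:
    print(match)
```

Because the closing tag's first token is '/h1' rather than 'h1', it is rejected by the same condition without any extra handling.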
Now we will match only the desired tag. Just a few more steps...
First of all, nestedExpr gives back nested lists of matched items. We want the original text that was matched. Pyparsing includes another helper for that, unimaginatively named originalTextFor; we combine this with the previous definition to get:
nested_angle_braces_with_h1 = originalTextFor(
nested_angle_braces().addCondition(lambda tokens: tokens[0][0].lower() == 'h1')
)
Lastly, we have to add one more parse-time callback action, to append "MY_TEXT" to the tag:
nested_angle_braces_with_h1.addParseAction(lambda tokens: tokens[0] + 'MY_TEXT')
Now that we are able to match the desired <h1> tag, we can use the transformString method of the expression to do the search-and-replace work for us:
print(nested_angle_braces_with_h1.transformString(html))
With your original sample saved as a variable named html, we get:
<html><head><title><?php echo "title here"; ?></title><head><body><h1 <?php echo "class='big'" ?>>MY_TEXTfoo</h1></body></html>
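Put together, the steps above might look like the following script. It is a sketch rather than a verbatim copy of the question's setup: the html value is an assumption, reconstructed from the output shown in this answer.

```python
from pyparsing import nestedExpr, originalTextFor

# Assumed sample input, reconstructed from the output shown in this answer.
html = ('<html><head><title><?php echo "title here"; ?></title><head><body>'
        '<h1 <?php echo "class=\'big\'" ?>>foo</h1></body></html>')

# Match any <...> tag, including tags nested inside it.
nested_angle_braces = nestedExpr('<', '>')

# Keep only tags whose first token is 'h1', and return the matched source
# text instead of the nested token lists.
nested_angle_braces_with_h1 = originalTextFor(
    nested_angle_braces().addCondition(lambda tokens: tokens[0][0].lower() == 'h1')
)

# Append the marker text directly after the matched opening tag.
nested_angle_braces_with_h1.addParseAction(lambda tokens: tokens[0] + 'MY_TEXT')

result = nested_angle_braces_with_h1.transformString(html)
print(result)
```

transformString leaves everything outside the matched tag untouched, so removing 'MY_TEXT' from the result should give back the input string.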
Note: this will add "MY_TEXT" after every <h1> tag. If you want this to be applied only after <h1> tags containing PHP, then write the appropriate condition and add it to nested_angle_braces_with_h1.
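One possible form of that extra condition is sketched below. It assumes that "containing PHP" can be detected by looking for the literal text '<?php' in the matched tag, and it uses a hypothetical two-heading sample and a hypothetical name, h1_with_php, for illustration.

```python
from pyparsing import nestedExpr, originalTextFor

nested_angle_braces = nestedExpr('<', '>')

h1_with_php = originalTextFor(
    nested_angle_braces().addCondition(lambda tokens: tokens[0][0].lower() == 'h1')
)

# Extra condition on the matched text (an assumption about what counts as
# "containing PHP"). It is added before the parse action, so it sees the
# original tag text.
h1_with_php.addCondition(lambda tokens: '<?php' in tokens[0])
h1_with_php.addParseAction(lambda tokens: tokens[0] + 'MY_TEXT')

# Hypothetical input: one plain <h1> and one <h1> with embedded PHP.
sample = '<h1>plain</h1><h1 <?php echo "class=\'big\'" ?>>foo</h1>'

result = h1_with_php.transformString(sample)
print(result)
```

A text-based check is used here because originalTextFor has already replaced the nested token lists with the matched string by the time this condition runs.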