Define Default Namespace (unprefixed) In Lxml
Solution 1:
Use ElementMaker and give it an nsmap that maps None to your default namespace.
#!/usr/bin/env python
# dogeml.py
from lxml.builder import ElementMaker
from lxml import etree

E = ElementMaker(
    nsmap={
        None: "http://wow/"  # <--- This is the special sauce
    }
)

doge = E.doge(
    E.such('markup'),
    E.many('very namespaced', syntax="tricks")
)

options = {
    'pretty_print': True,
    'xml_declaration': True,
    'encoding': 'UTF-8',
}

serialized_bytes = etree.tostring(doge, **options)
print(serialized_bytes.decode(options['encoding']))
As you can see in the output from this script, the default namespace is defined, but the tags do not have a prefix.
<?xml version='1.0' encoding='UTF-8'?>
<doge xmlns="http://wow/">
  <such>markup</such>
  <many syntax="tricks">very namespaced</many>
</doge>
I have tested this code with Python 2.7.6, 3.3.5, and 3.4.0, combined with lxml 3.3.1.
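A related variation, offered as a sketch rather than as part of the tested script above: ElementMaker also accepts a namespace argument, which puts every element it builds into that namespace in memory, so qualified lookups such as find('{http://wow/}such') work without reparsing. With only nsmap set, my understanding is that the tags stay unqualified in memory until the serialized document is parsed again. The NS constant is just a name chosen for this example.

```python
from lxml import etree
from lxml.builder import ElementMaker

NS = "http://wow/"  # same namespace URI as in the script above

# namespace= qualifies each tag in memory; nsmap={None: NS} keeps it unprefixed.
E = ElementMaker(namespace=NS, nsmap={None: NS})

doge = E.doge(
    E.such('markup'),
    E.many('very namespaced', syntax="tricks"),
)

# Tags are stored in Clark notation: {namespace}localname
print(doge.tag)
# Qualified lookups work directly on the freshly built tree
print(doge.find('{%s}such' % NS).text)
# The serialized form still uses the unprefixed default namespace
print(etree.tostring(doge).decode('ascii'))
```

Attributes such as syntax are not affected by a default namespace, so they stay unqualified in both variants.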
Solution 2:
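This XSL transformation strips the prefix from every XHTML element in content by making XHTML the default namespace. Elements in other namespaces (ml:foo here) keep their prefix, and their namespace declaration moves to where it is needed: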
import lxml.etree as ET

content = '''\
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<h:html xmlns:h="http://www.w3.org/1999/xhtml" xmlns:ml="http://foo">
  <h:head>
    <h:title>MathJax Test Page</h:title>
    <h:script type="text/javascript"><![CDATA[
function test() {
alert(document.getElementsByTagName("p").length);
};
]]></h:script>
  </h:head>
  <h:body onload="test();">
    <h:p>test</h:p>
    <ml:foo></ml:foo>
  </h:body>
</h:html>
'''

# lxml rejects unicode strings that carry an encoding declaration,
# so parse bytes instead.
dom = ET.fromstring(content.encode('utf-8'))

xslt = '''\
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>

  <!-- identity transform for everything else -->
  <xsl:template match="/|comment()|processing-instruction()|*|@*">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>

  <!-- remove NS from XHTML elements -->
  <xsl:template match="*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>

  <!-- remove NS from XHTML attributes -->
  <xsl:template match="@*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="." />
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
'''

xslt_doc = ET.fromstring(xslt)
transform = ET.XSLT(xslt_doc)
dom = transform(dom)

# tostring() returns bytes when an encoding is given; decode before printing.
print(ET.tostring(dom, pretty_print=True, encoding='utf-8').decode('utf-8'))
yields
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>MathJax Test Page</title>
    <script type="text/javascript">
function test() {
alert(document.getElementsByTagName("p").length);
};
</script>
  </head>
  <body onload="test();">
    <p>test</p>
    <ml:foo xmlns:ml="http://foo"/>
  </body>
</html>
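To check that a transform like this only changes prefixes and not the namespaces themselves, one option is to compare the qualified tag names before and after. The sketch below is an illustration, not the answer's original code: SOURCE is a trimmed-down stand-in for the document above, and STYLESHEET is a simplified stylesheet (a plain identity template plus the XHTML rule), both written for this example.

```python
from lxml import etree as ET

# Trimmed-down stand-in for the document above (bytes, so no encoding issues).
SOURCE = b'''<h:html xmlns:h="http://www.w3.org/1999/xhtml" xmlns:ml="http://foo">
<h:body onload="test();"><h:p>test</h:p><ml:foo/></h:body>
</h:html>'''

# Simplified stylesheet: identity copy plus one rule that re-creates
# XHTML elements under the stylesheet's default namespace.
STYLESHEET = b'''<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>'''

source = ET.fromstring(SOURCE)
transform = ET.XSLT(ET.fromstring(STYLESHEET))
result = transform(source).getroot()

# Qualified names ({namespace}localname) before and after the transform
before = [el.tag for el in source.iter()]
after = [el.tag for el in result.iter()]

print(before == after)               # the namespaces themselves are unchanged
print(source.prefix, result.prefix)  # only the prefix on the XHTML elements differs
```

Because lxml stores tags as {namespace}localname, code that looks elements up by qualified name should behave the same on the input and on the transformed tree.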
Solution 3:
To expand on @neirbowj's answer, here is the same idea using ET.Element and ET.SubElement, rendering a document with a mix of namespaces, where the root is explicitly namespaced and a subelement (channel) is in the default namespace:
import lxml.etree as ET

# I declare the default namespace on the root, but the root itself doesn't use it:
root = ET.Element('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF', nsmap={None: 'http://purl.org/rss/1.0/'})
# I use the default namespace by including its URL in curly braces:
e = ET.SubElement(root, '{http://purl.org/rss/1.0/}channel')
print(ET.tostring(root, xml_declaration=True, encoding='utf8').decode())
This will print out the following:
<?xml version='1.0' encoding='utf8'?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><channel/></rdf:RDF>
It automatically uses rdf as the prefix for the RDF namespace. I'm not sure how it figures it out. If I want to choose the prefix myself, I can add it to the nsmap on the root element:
nsmap = {None: 'http://purl.org/rss/1.0/',
         'doge': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'}
root = ET.Element('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF', nsmap=nsmap)
e = ET.SubElement(root, '{http://purl.org/rss/1.0/}channel')
print(ET.tostring(root, xml_declaration=True, encoding='utf8').decode())
...and I get this:
<?xml version='1.0' encoding='utf8'?>
<doge:RDF xmlns:doge="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/"><channel/></doge:RDF>
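On the automatic rdf prefix: as far as I know, lxml keeps a small built-in table of prefixes for well-known namespaces, which would explain it. For namespaces it does not know, it generates a prefix, and lxml.etree.register_namespace() lets you set a preferred prefix once instead of repeating it in every nsmap. A rough sketch follows; the EXAMPLE URI and the ex prefix are made up for this illustration.

```python
from lxml import etree as ET

RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
# Hypothetical namespace URI, used only for this illustration.
EXAMPLE = 'http://example.com/hypothetical-ns'

# No nsmap given: lxml still picks "rdf" for the RDF namespace.
print(ET.Element('{%s}RDF' % RDF).prefix)

# A namespace lxml has no prefix for gets a generated one.
print(ET.Element('{%s}thing' % EXAMPLE).prefix)

# register_namespace() sets a process-wide preferred prefix for
# elements created afterwards in that namespace.
ET.register_namespace('ex', EXAMPLE)
thing = ET.Element('{%s}thing' % EXAMPLE)
print(ET.tostring(thing).decode('ascii'))
```

The registration is global to the process, so an explicit nsmap, as in the example above, is probably the better fit when the prefix should apply to a single document only.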