Converting Between XML and JSON
Stefan Goessner
May 31, 2006
More and more web service providers seem to be interested in offering JSON APIs beneath their XML APIs. One considerable advantage of using a JSON API is its ability to provide cross-domain requests while bypassing the restrictive same-domain policy of the XMLHttpRequest object. On the client side, JSON comes with a native language-compliant data structure, with which it performs much better than the corresponding DOM calls required for XML processing. Finally, transforming JSON structures to presentational data can be easily achieved with tools such as JSONT.
So if you're working in this space, you probably need to convert an existing XML document to a JSON structure while preserving the following:
- structure
- order
- information
In an ideal world, the resulting JSON structure can be converted back to its original XML document easily. Thus it seems worthwhile to discuss some common patterns as the foundation of a potentially bidirectional conversion process between XML and JSON. A similar discussion can be found at BadgerFish and Yahoo, though without the reversibility aspect.
A Pragmatic Approach
A single structured XML element might come in seven flavors:
- an empty element
- an element with pure text content
- an empty element with attributes
- an element with pure text content and attributes
- an element containing elements with different names
- an element containing elements with identical names
- an element containing elements and contiguous text
The following table shows the corresponding conversion patterns between XML and JSON.
Pattern | XML | JSON | Access |
1 | <e/> | "e": null | o.e |
2 | <e>text</e> | "e": "text" | o.e |
3 | <e name="value"/> | "e": { "@name": "value" } | o.e["@name"] |
4 | <e name="value">text</e> | "e": { "@name": "value", "#text": "text" } | o.e["@name"] o.e["#text"] |
5 | <e> <a>text</a> <b>text</b> </e> | "e": { "a": "text", "b": "text" } | o.e.a o.e.b |
6 | <e> <a>text</a> <a>text</a> </e> | "e": { "a": ["text", "text"] } | o.e.a[0] o.e.a[1] |
7 | <e> text <a>text</a> </e> | "e": { "#text": "text", "a": "text" } | o.e["#text"] o.e.a |
Please note that all patterns are considered to describe structured elements, despite the fact that the element of pattern 7 is commonly understood as a semistructured element. A pragmatic approach to convert an XML document to a JSON structure and vice versa can be based on the seven patterns above. It always assumes a normalized XML document for input and doesn't take into consideration the following:
- XML declaration
- processing instructions
- explicit handling of namespace declarations
- XML comments
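The seven patterns can be sketched in Python roughly as follows. This is a minimal illustration built on the standard library's xml.etree.ElementTree, not the author's JavaScript implementation; the function names element_to_json and xml_to_json are hypothetical, and the sketch assumes normalized input, so surrounding whitespace is stripped and interleaved mixed content is rejected.

```python
import xml.etree.ElementTree as ET


def element_to_json(elem):
    """Return the JSON-ready value for one element, following patterns 1-7."""
    children = list(elem)
    text = (elem.text or "").strip()
    # Text after a child element (its "tail") means interleaved mixed
    # content, which the seven patterns do not cover.
    if any((child.tail or "").strip() for child in children):
        raise ValueError("mixed content is not covered by patterns 1-7")

    if not children and not elem.attrib:
        return text or None  # pattern 1 (empty) or pattern 2 (pure text)

    # Patterns 3 and 4: attributes become "@name" members.
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    if text:
        obj["#text"] = text  # patterns 4 and 7
    for child in children:
        value = element_to_json(child)
        if child.tag not in obj:
            obj[child.tag] = value  # pattern 5: first use of a name
        elif isinstance(obj[child.tag], list):
            obj[child.tag].append(value)  # pattern 6: third or later
        else:
            obj[child.tag] = [obj[child.tag], value]  # pattern 6: second
    return obj


def xml_to_json(xml_text):
    """Parse a document and wrap the root value under the root's name."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_json(root)}
```

Passing the result to json.dumps gives the textual JSON form. Element values are never lists themselves, which is why a list under a child name can safely be read as "this name has repeated".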
Preserving order
JSON is built on two internal structures:
- A collection of name/value pairs with unique names (associative array)
- An ordered list of values (array)
An attempt to map a structured XML element in which a subelement name such as 'a' occurs more than once directly to a JSON object with one member per subelement yields an invalid result, since the name 'a' is then not unique in the associative array. So we need to collect all elements of identical names in an array. Using patterns 5 and 6 above yields an object in which the repeated elements share a single array-valued member.
Now we have a structure that doesn't preserve element order. This may or may not be acceptable, depending on whether the above XML element order matters.
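The loss of order can be made concrete with a small Python sketch. The sample document below is hypothetical (it is not the article's original listing), as is the helper name group_children; the point is only that grouping by name discards the a, b, a sequence.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample: the name "a" occurs twice, with "b" in between.
SAMPLE = "<e><a>some</a><b>textual</b><a>content</a></e>"


def group_children(elem):
    """Collect child text by tag name; repeated names become arrays."""
    grouped = {}
    for child in elem:
        grouped.setdefault(child.tag, []).append(child.text)
    # Unwrap single-item lists so that unique names follow pattern 5.
    return {tag: values[0] if len(values) == 1 else values
            for tag, values in grouped.items()}


root = ET.fromstring(SAMPLE)
original_order = [child.tag for child in root]
converted = {root.tag: group_children(root)}

print(original_order)         # document order of the subelements
print(json.dumps(converted))  # grouped form; the a, b, a sequence is gone
```

Nothing in the grouped object records that 'b' sat between the two 'a' elements, so converting back can only emit the 'a' elements next to each other.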
So, our general rules of thumb are:
A structured XML element can be converted to a reversible JSON structure, if
- all subelement names occur exactly once, or …
- subelements with identical names are in sequence.
and
A structured XML element can be converted to an irreversible but semantically equivalent JSON structure, if
- multiple homonymous subelements occur nonsequentially, and …
- element order doesn't matter.
If neither of these two conditions applies, there is no pragmatic way to convert XML to JSON using the patterns above. Here, SVG and SMIL documents, which implicitly rely on element order, come to mind.
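The first rule of thumb can be checked mechanically. The following Python sketch, with the hypothetical helper name is_order_reversible, tests whether every repeated subelement name forms one contiguous run; it looks only at the direct children of a single element and would need to be applied recursively for a whole document.

```python
import xml.etree.ElementTree as ET


def is_order_reversible(elem):
    """True if every repeated child name of elem forms one contiguous run."""
    finished = set()  # names whose run has already ended
    previous = None
    for child in elem:
        if child.tag != previous:
            if child.tag in finished:
                return False  # the name reappears after another name
            if previous is not None:
                finished.add(previous)
            previous = child.tag
    return True
```

An element with children a, a, b passes the check, while a, b, a does not, matching the distinction between sequential and nonsequential homonymous subelements.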
Semi-Structured XML
XML documents can contain semi-structured elements, which are elements with mixed content of text and child elements, usually seen in documentation markup. If the textual content is contiguous, that is, it appears in a single run, we can apply pattern 7 for this special case.
But how do we convert textual content that is interleaved with elements? It obviously doesn't make sense in most cases to collect all text nodes in an array, since that preserves neither order nor semantics.
So the best pragmatic solution is to treat mixed semi-structured content in JSON the same way as XML treats CDATA sections -- as unknown markup.
Another rule is that XML elements with
- mixed content of text and element nodes and
- CDATA sections
are converted to a reversible JSON string containing the complete XML markup according to pattern 2 or 4.
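A possible Python sketch of this rule follows. It assumes xml.etree.ElementTree, where text that follows a child element is stored as that child's tail; the helper names has_interleaved_text, inner_markup and mixed_to_json are hypothetical. CDATA sections are not handled separately here, because the parser has already merged them into ordinary text.

```python
import xml.etree.ElementTree as ET


def has_interleaved_text(elem):
    """True if any text follows a child element (stored as that child's tail)."""
    return any((child.tail or "").strip() for child in elem)


def inner_markup(elem):
    """Serialize everything between the start tag and end tag of elem."""
    parts = [elem.text or ""]
    for child in elem:
        # tostring() includes the child's tail text by default.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def mixed_to_json(elem):
    """Apply pattern 2 (no attributes) or pattern 4 (with attributes)."""
    markup = inner_markup(elem)
    if not elem.attrib:
        return {elem.tag: markup}
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    obj["#text"] = markup
    return {elem.tag: obj}
```

Because the markup is kept verbatim as a string, the reverse conversion only has to parse that string again to restore the original content.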
Examples
Now let's look at two examples using the insight we've gained thus far. Microformats are well suited because they are an open standard and short enough for a brief discussion.
XOXO, as a simple XHTML-based outline format, is one of several microformats. The slightly modified sample from the Draft Specification reads:
Now we apply the patterns above to convert this XML document fragment to a JSON structure.
- The outer list with two list items is converted using pattern 6.
- The first list item contains a single textual content 'Subject 1' and an inner list element. So, it can be treated according to pattern 7.
- The first inner list is converted with pattern 6 again.
- Pattern 5 is applied to the second item of the outer list.
- The second inner list is converted using a combination of patterns 3 and 6.
Here is the resulting JSON structure, which is reversible without losing any information.
hCalendar is another microformat based on the iCalendar standard. We'll just ignore the fact that the iCalendar format could be more easily converted to JSON, and will look at an hCalendar event example, which is also slightly modified so that it is a structured, rather than mixed, semi-structured document fragment.
Here, patterns 2, 3, 4, 5 and 6 are used to generate the following JSON structure:
This example demonstrates a conversion that does not preserve the original element order. Even if this may not change semantics here, we can do the following:
- state that a conversion isn't sufficiently possible.
- tolerate the result if order doesn't matter.
- try to make our XML document more JSON-friendly.
In many cases the last point may not be acceptable, at least when the XML document is based on existing standards. But in other cases, it may be worth the effort to consider some subtle XML changes, which can make XML and JSON play nicely together. Changing the elements to
elements in the hCalendar example would be an improvement.
XML is a document-centric format, while JSON is a format for structured data. This fundamental difference may be irrelevant, as XML is also capable of describing structured data. If XML is used to describe highly structured documents, these may play very well together with JSON.
Problems may arise if XML documents do the following:
- implicitly rely on element order
- contain a lot of semi-structured data
As proof of this concept, I have implemented two JavaScript functions, xml2json and json2xml, based on the seven patterns above, which can be used for the following:
- client-side conversion
- a parsed XML document via DOM to a JSON structure
- a JSON structure to a (textual) XML document
- implementing converters in other server side languages
Future XML document design may be influenced by these or similar patterns in order to get the best of both the XML and JSON worlds.
Converts XML into JSON/Python dicts/arrays and vice-versa.
Project description
This library is not actively maintained. Alternatives are xmltodict and untangle. Use only if you need to parse using specific XML to JSON conventions.
xmljson converts XML into Python dictionary structures (trees, like in JSON) and vice-versa.
About
XML can be converted to a data structure (such as JSON) and back. For example:
can be converted into this data structure (which is also a valid JSON object):
This uses the BadgerFish convention that prefixes attributes with @. The conventions supported by this library are:
- Abdera: Use 'attributes' for attributes, 'children' for nodes
- BadgerFish: Use '$' for text content, @ to prefix attributes
- Cobra: Use 'attributes' for sorted attributes (even when empty), 'children' for nodes, values are strings
- GData: Use '$t' for text content, attributes added as-is
- Parker: Use tail nodes for text content, ignore attributes
- Yahoo: Use 'content' for text content, attributes added as-is
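To show how the key names differ between three of these conventions, here is a minimal standard-library sketch. It is not the xmljson API: the names CONVENTIONS and to_data are hypothetical, only the text key and attribute prefix from the list above are modelled, and the library's type parsing and simplifications are left out.

```python
import xml.etree.ElementTree as ET

# (text key, attribute prefix) per convention, taken from the list above.
CONVENTIONS = {
    "badgerfish": ("$", "@"),
    "gdata": ("$t", ""),
    "yahoo": ("content", ""),
}


def to_data(elem, convention="badgerfish"):
    """Convert an element to nested dicts using the chosen key names."""
    text_key, attr_prefix = CONVENTIONS[convention]
    node = {attr_prefix + name: value for name, value in elem.attrib.items()}
    text = (elem.text or "").strip()
    if text:
        node[text_key] = text
    for child in elem:
        value = to_data(child, convention)[child.tag]
        if child.tag in node:
            # Repeated names are collected into a list.
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = existing = [existing]
            existing.append(value)
        else:
            node[child.tag] = value
    return {elem.tag: node}
```

With attributes added as-is, an attribute and a child element of the same name would collide in this sketch; a real converter has to decide how to handle that case.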
Convert data to XML
To convert from a data structure to XML using the BadgerFish convention:
This returns an array of etree.Element structures. In this case, the result is identical to:
The result can be inserted into any existing root etree.Element:
This includes lxml.html as well:
For ease of use, strings are treated as node text. For example, both the following are the same:
By default, non-string values are converted to strings using Python's str, except for booleans, which are converted into true and false (lowercase). Override this behaviour using xml_tostring:
If the data contains invalid XML keys, these can be dropped via invalid_tags='drop' in the constructor:
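The data-to-XML direction described in this section can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in, not the xmljson implementation: to_text and to_elements are hypothetical names, only the BadgerFish keys '@' and '$' are understood, and invalid tag names are not checked.

```python
import xml.etree.ElementTree as ET


def to_text(value):
    """Render a scalar as XML text; booleans become lowercase true/false."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_elements(data):
    """Build a list of elements from a BadgerFish-style dict."""
    result = []
    for tag, value in data.items():
        # Lists produce one element per item under the same tag.
        for item in value if isinstance(value, list) else [value]:
            elem = ET.Element(tag)
            if isinstance(item, dict):
                for key, sub in item.items():
                    if key.startswith("@"):
                        elem.set(key[1:], to_text(sub))
                    elif key == "$":
                        elem.text = to_text(sub)
                    else:
                        elem.extend(to_elements({key: sub}))
            else:
                # Bare scalars are treated as node text.
                elem.text = to_text(item)
            result.append(elem)
    return result
```

In this sketch {'x': 'text'} and {'x': {'$': 'text'}} produce the same element, and a boolean True is written as the text true.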
Convert XML to data
To convert from XML to a data structure using the BadgerFish convention:
To convert this to JSON, use:
To preserve the order of attributes and children, specify the dict_type as OrderedDict (or any other dictionary-like type) in the constructor:
By default, values are parsed into boolean, int or float where possible (except in the Yahoo method). Override this behaviour using xml_fromstring:
xml_fromstring can be any custom function that takes a string and returns a value. In the example below, only the integer 1 is converted to an integer. Everything else is retained as a float:
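As a rough illustration of what such a string-to-value function might look like (this is a standalone sketch, not the library's own example or its internal parser), the hypothetical parse_value below tries boolean, int and float in turn, and leaf_values shows a hook being swapped for one that keeps every value as a string.

```python
import xml.etree.ElementTree as ET


def parse_value(text):
    """Turn XML text into bool, int or float where possible, else keep it."""
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def keep_strings(text):
    """Alternative hook: leave every value as a string."""
    return text


def leaf_values(xml_text, fromstring=parse_value):
    """Map each child tag of the root to its converted text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: fromstring(child.text or "") for child in root}
```

Any callable that accepts a string can be passed as fromstring, which is the same shape of hook the text describes for xml_fromstring.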
Conventions
To use a different conversion method, replace BadgerFish with one of the other classes. Currently, these are supported:
Options
Conventions may support additional options.
The Parker convention absorbs the root element by default. parker.data(preserve_root=True) preserves the root instance:
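The effect of absorbing or preserving the root can be sketched as follows. This is a simplified stand-in for the Parker convention written against the standard library; parker_value and parker_data are hypothetical names, values stay as strings, and only the preserve_root flag mirrors the option named above.

```python
import xml.etree.ElementTree as ET


def parker_value(elem):
    """Leaf elements become their text; others become dicts keyed by tag."""
    children = list(elem)
    if not children:
        return elem.text  # attributes are ignored in this convention
    node = {}
    for child in children:
        value = parker_value(child)
        if child.tag in node:
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = existing = [existing]
            existing.append(value)
        else:
            node[child.tag] = value
    return node


def parker_data(root, preserve_root=False):
    """Drop the root name by default; keep it when preserve_root is true."""
    value = parker_value(root)
    return {root.tag: value} if preserve_root else value
```

For a document with root x and children a and b, the default result contains only the a and b members, while preserve_root=True wraps them under 'x'.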
Installation
This is a pure-Python package built for Python 2.7+ and Python 3.0+. To set up:
Simple CLI utility
After installation, you can also use this package as a simple CLI utility. Currently, only XML to JSON conversion is supported. Example:
This is a typical UNIX filter program: it reads a file (or stdin), processes it (converting XML to JSON in this case), then prints the result to stdout (or a file). Example with a pipe:
$ some-xml-producer | python -m xmljson | some-json-processor
There is also pip's console_script entry point, so you can call this utility as xml2json.
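The filter behaviour itself is easy to picture in code. The sketch below is not the package's command-line module; it is a standard-library approximation with hypothetical names (element_to_data, run_filter, main) that reads one XML document from a stream and writes JSON to another, using a BadgerFish-like mapping in which every child name maps to a list.

```python
import io
import json
import sys
import xml.etree.ElementTree as ET


def element_to_data(elem):
    """Minimal BadgerFish-like mapping: '@' for attributes, '$' for text."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    text = (elem.text or "").strip()
    if text:
        node["$"] = text
    for child in elem:
        # Every child name maps to a list, which keeps the sketch short.
        node.setdefault(child.tag, []).append(element_to_data(child))
    return node


def run_filter(infile, outfile):
    """Read one XML document from infile and write JSON to outfile."""
    root = ET.parse(infile).getroot()
    json.dump({root.tag: element_to_data(root)}, outfile)
    outfile.write("\n")


def main():
    """Entry point for use in a pipe: stdin in, stdout out."""
    run_filter(sys.stdin, sys.stdout)


# In-memory demonstration; a script would call main() instead.
demo_out = io.StringIO()
run_filter(io.StringIO('<p id="1"><b>hi</b></p>'), demo_out)
print(demo_out.getvalue(), end="")
```

Keeping the conversion in run_filter and the stream wiring in main makes the same function usable with files, pipes or in-memory buffers.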
Roadmap
- Test cases for Unicode
- Support for namespaces and namespace prefixes
- Support XML comments
History
0.2.1 (25 Apr 2020)
- Bugfix: Don't strip whitespace in xml text values (@imoore76)
- Bugfix: Yahoo convention should convert <x>0</x> into {x: 0}. Empty elements become '' not {}
- Suggest alternate libraries in documentation
0.2.0 (21 Nov 2018)
- xmljson command line script converts from XML to JSON (@tribals)
- invalid_tags='drop' in the constructor drops invalid XML tags in .etree() (@Zurga)
- Bugfix: Parker converts {'x': null} to <x></x> instead of <x>None</x> (@jorndoe #29)
0.1.9 (1 Aug 2017)
- Bugfix and test cases for multiple nested children in Abdera convention
Thanks to @mukultaneja
0.1.8 (9 May 2017)
- Add Abdera and Cobra conventions
- Add Parker.data(preserve_root=True) option to preserve root element in Parker convention.
Thanks to @dagwieers
0.1.6 (18 Feb 2016)
- Add xml_fromstring= and xml_tostring= parameters to constructor to customise string conversion from and to XML.
0.1.5 (23 Sep 2015)
- Add the Yahoo XML to JSON conversion method.
0.1.4 (20 Sep 2015)
- Fix GData.etree() conversion of attributes. (They were ignored. They should be added as-is.)
0.1.3 (20 Sep 2015)
- Simplify {'p': {'$':'text'}} to {'p': 'text'} in BadgerFish and GData conventions.
- Add test cases for .etree() – mainly from the MDN JXON article.
- dict_type/list_type do not need to inherit from dict/list
0.1.2 (18 Sep 2015)
- Always use the dict_type class to create dictionaries (which defaults to OrderedDict to preserve order of keys)
- Update documentation, test cases
- Remove support for Python 2.6 (since we need collections.Counter)
- Make the Travis CI build pass
0.1.1 (18 Sep 2015)
- Convert true, false and numeric values from strings to Python types
- xmljson.parker.data() is compliant with Parker convention (bugs resolved)
0.1.0 (15 Sep 2015)
- Two-way conversions via BadgerFish, GData and Parker conventions.
- First release on PyPI.