![]()
XML (eXtensible Markup Language) is a widely used format for storing and transporting structured data. Python provides several libraries to read, write, modify, and parse XML files.
Why Use XML?
- Stores structured data in a human-readable format.
- Used in web services, APIs, and configuration files.
- Easily integrates with databases and data exchange formats.
Python Libraries for XML Handling:
xml.etree.ElementTree– Built-in, easy-to-use, lightweight.lxml– Faster, supports XPath & XSLT (Requires installation).xml.dom.minidom– Provides DOM-like XML manipulation.
1. Reading an XML File (xml.etree.ElementTree)
Sample XML File (data.xml):
<employees>
<employee id="101">
<name>John Doe</name>
<position>Software Engineer</position>
<salary>80000</salary>
</employee>
<employee id="102">
<name>Jane Smith</name>
<position>Data Analyst</position>
<salary>75000</salary>
</employee>
</employees>
Reading and Parsing XML:
import xml.etree.ElementTree as ET
# Load XML file
tree = ET.parse("data.xml")
root = tree.getroot()
# Print root tag
print(f"Root element: {root.tag}")
# Iterate through child elements
for employee in root.findall("employee"):
emp_id = employee.get("id") # Get attribute value
name = employee.find("name").text
position = employee.find("position").text
salary = employee.find("salary").text
print(f"ID: {emp_id}, Name: {name}, Position: {position}, Salary: {salary}")
Key Methods:
.parse("file.xml")→ Loads XML file..getroot()→ Returns the root element..find("tag")→ Finds the first occurrence of a tag..findall("tag")→ Finds all occurrences of a tag..get("attribute")→ Gets an attribute value.
2. Writing an XML File
We can create an XML structure and save it to a file.
Creating an XML File (employees.xml):
import xml.etree.ElementTree as ET
# Create root element
root = ET.Element("employees")
# Create child elements
emp1 = ET.SubElement(root, "employee", id="101")
ET.SubElement(emp1, "name").text = "John Doe"
ET.SubElement(emp1, "position").text = "Software Engineer"
ET.SubElement(emp1, "salary").text = "80000"
emp2 = ET.SubElement(root, "employee", id="102")
ET.SubElement(emp2, "name").text = "Jane Smith"
ET.SubElement(emp2, "position").text = "Data Analyst"
ET.SubElement(emp2, "salary").text = "75000"
# Convert to XML tree
tree = ET.ElementTree(root)
tree.write("employees.xml")
print("XML file created successfully!")
Key Functions:
ET.Element("tag")→ Creates a root element.ET.SubElement(parent, "tag")→ Adds child elements..text = "value"→ Sets element text content.ET.ElementTree(root).write("file.xml")→ Saves XML file.
3. Modifying an XML File
Modify an existing XML file by updating element values.
Example: Updating Salary of an Employee
import xml.etree.ElementTree as ET
tree = ET.parse("employees.xml")
root = tree.getroot()
# Find employee with id=101 and update salary
for emp in root.findall("employee"):
if emp.get("id") == "101":
emp.find("salary").text = "85000" # Update salary
# Save changes
tree.write("employees.xml")
print("XML file updated successfully!")
Use .find(tag).text to update element values.
4. Deleting an Element
Remove an employee record from XML.
import xml.etree.ElementTree as ET
tree = ET.parse("employees.xml")
root = tree.getroot()
# Find and remove employee with id=102
for emp in root.findall("employee"):
if emp.get("id") == "102":
root.remove(emp)
# Save changes
tree.write("employees.xml")
print("Employee removed successfully!")
Use .remove(element) to delete a node.
5. Searching and Filtering XML Data
Find employees earning more than $75,000.
import xml.etree.ElementTree as ET
tree = ET.parse("employees.xml")
root = tree.getroot()
for emp in root.findall("employee"):
salary = int(emp.find("salary").text)
if salary > 75000:
print(f"Employee: {emp.find('name').text}, Salary: {salary}")
Convert text to integer before numerical comparisons.
6. Pretty Printing XML (xml.dom.minidom)
The default XML output is compact. Use xml.dom.minidom for formatted output.
import xml.etree.ElementTree as ET
import xml.dom.minidom
tree = ET.parse("employees.xml")
xml_str = ET.tostring(tree.getroot(), encoding="utf-8")
pretty_xml = xml.dom.minidom.parseString(xml_str).toprettyxml()
print(pretty_xml)
Use toprettyxml() to format XML for better readability.
7. Parsing Large XML Files (iterparse)
For large XML files, use iterparse() to process data incrementally.
import xml.etree.ElementTree as ET
for event, elem in ET.iterparse("large_file.xml", events=("start", "end")):
if event == "end" and elem.tag == "employee":
print(f"Employee: {elem.find('name').text}")
elem.clear() # Free memory
Use .clear() to release memory after processing each node.
8. Using lxml for XPath Queries
lxml supports XPath, allowing advanced XML queries.
from lxml import etree
tree = etree.parse("employees.xml")
employees = tree.xpath("//employee[salary>75000]/name/text()")
print("High salary employees:", employees)
Install lxml:
pip install lxml
XPath Example Queries:
//employee/name→ Selects all<name>elements.//employee[@id='101']→ Selects employee withid=101.//employee[salary>75000]→ Selects employees earning more than 75,000.
9. Handling XML Errors (try-except)
Catch errors when working with XML files.
import xml.etree.ElementTree as ET
try:
tree = ET.parse("employees.xml")
root = tree.getroot()
except FileNotFoundError:
print("Error: XML file not found!")
except ET.ParseError:
print("Error: XML file is malformed!")
Common Errors:
FileNotFoundError→ XML file does not exist.ET.ParseError→ XML is malformed or corrupted.
10. Summary Table
| Operation | Method | Example |
|---|---|---|
| Read XML File | ET.parse("file.xml") | root = tree.getroot() |
| Find Element | .find("tag") | name = emp.find("name").text |
| Find All Elements | .findall("tag") | for emp in root.findall("employee") |
| Modify Element | .text = "new value" | emp.find("salary").text = "85000" |
| Delete Element | .remove(element) | root.remove(emp) |
| Write XML | ET.ElementTree(root).write("file.xml") | tree.write("employees.xml") |
| Pretty Print | xml.dom.minidom.parseString() | toprettyxml() |
| Parse Large XML | ET.iterparse("file.xml") | Efficient memory usage |
