Mapping XML to Objects With NSXMLParser

Event-based XML parsers, like NSXMLParser, have great performance but they can be difficult to work with because the parser itself discards nearly all state while parsing the document. This leaves all that state handling up to you, the programmer. Even a moderately complicated XML document can easily result in a mess of conditionals, state flags, and temporary variables. So while it's probably not applicable to every XML scenario, I've found the following object-mapper pattern to be an effective way to map XML data to arbitrary objects such as Core Data models.

SAX Is Better With a Threesome

There are three parts to this pattern. The first is the mapper itself. This object is reusable and not specific to any XML source. The second is the mapper delegate. This object is designed around the XML source and controls how individual elements are mapped to objects. Finally, we have the objects themselves. These can be instances of any class that conforms to the mapped object protocol.

The Mapper

The XMLObjectMapper class acts as our NSXMLParser delegate. It's responsible for setting up the parser and responding to events. It also handles accumulating text and maintaining two stacks, one for elements and one for objects. These two stacks are the heart of this solution. When the parser starts a new element, the mapper adds that element to the end of the elements stack and then asks its delegate which object, if any, that element maps to. The delegate returns either a new object or the current object, which is pushed onto the end of the objects stack. When an element ends, the object at the end of the objects stack is passed the text contents of the element (if any), and then both the element and the object are popped off their stacks.

Here's the code for when a new element is encountered:

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    // First we add the element to our elements stack...
    [self.elements addObject:elementName];

    // Next we ask our delegate for an object to map this element to...
    id <XMLMappedObject> object = [self.delegate mapper:self startElementNamed:elementName withAttributes:attributeDict currentObject:[self.objects lastObject]];

    // We may enter several elements before we reach the first element that maps to an object in our domain
    if (object) {
        // Add the object to the end of the objects stack
        [self.objects addObject:object];
    }
}

One thing to note here: we pass the current object to our delegate along with the new element name. The delegate's default response is to return that same object, because most elements don't map to an object of their own. Instead, certain elements and their children map to a single object. By returning the same object and pushing it onto the stack again, we allow a single object to span multiple levels of elements.

Now let's look at what happens when we encounter the end of an element:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // First we pop the element off our stack
    [self.elements removeLastObject];

    // Next we grab the current object and if we've accumulated any text
    // we pass that to the object through the `mapper:foundString:forElementNamed:` protocol method
    id <XMLMappedObject> currentObject = [self.objects lastObject];
    NSString *str = strip(self.characters);
    if ([str length] > 0) {
        [currentObject mapper:self foundString:str forElementNamed:elementName];
    }

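    // Do a little housecleaning. Elements seen before the first mapped object
    // never pushed anything, so only pop when the objects stack is non-empty.
    [self clearString];
    if ([self.objects count] > 0) {
        [self.objects removeLastObject];
    }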

    // And let our delegate know we've closed an element
    [self.delegate mapper:self endElementNamed:elementName currentObject:currentObject];
}

As you can see, when an element is closed we pop the element from its stack, give the current object a chance to store the element's contents, pop the object, and notify our delegate.
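The end-element handler relies on text accumulated in `self.characters` and on two helpers, `strip` and `clearString`, that aren't shown in this post. Here is a minimal sketch of how that side might look, not the original implementation. The class name `TextAccumulator`, the `found` array, the `collectText` function, and the whitespace-trimming behavior of `strip` are all assumptions made for illustration, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Assumed behavior for the strip() helper: trim surrounding whitespace and
// newlines. The original helper may do something different.
static NSString *strip(NSString *string)
{
    return [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Hypothetical class showing only the text-handling slice of a mapper.
// The element and object stacks are left out.
@interface TextAccumulator : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *characters;
@property (nonatomic, strong) NSMutableArray *found; // "element=text" entries, for demonstration
@end

@implementation TextAccumulator

- (instancetype)init
{
    self = [super init];
    if (self) {
        _characters = [NSMutableString string];
        _found = [NSMutableArray array];
    }
    return self;
}

// NSXMLParser may report an element's text in several pieces, so we append
// rather than assign.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    [self.characters appendString:string];
}

- (void)clearString
{
    [self.characters setString:@""];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // Indentation between tags accumulates too; stripping reduces it to an empty string.
    NSString *str = strip(self.characters);
    if ([str length] > 0) {
        [self.found addObject:[NSString stringWithFormat:@"%@=%@", elementName, str]];
    }
    [self clearString];
}

@end

// Parses an XML string and returns the stripped text found in each element.
static NSArray *collectText(NSString *xml)
{
    TextAccumulator *accumulator = [[TextAccumulator alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = accumulator;
    [parser parse];
    return accumulator.found;
}

int main(void)
{
    @autoreleasepool {
        NSLog(@"%@", collectText(@"<people>\n  <person id=\"1\">\n    <name>John Doe</name>\n    <age>35</age>\n  </person>\n</people>"));
    }
    return 0;
}
```

Because nothing is cleared when an element starts, the whitespace that precedes a child element lands in the same buffer as that child's text, which is why the trimming step matters in this sketch.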

A Contrived Example

Let's say we have the following bit of (simplified) XML:

<people>
    <person id="1">
        <name>John Doe</name>
        <age>35</age>
    </person>
</people>

And we also have a Person model:

@interface Person : NSObject <XMLMappedObject>
@property (readonly, assign) int personID;
@property (copy) NSString *name;
@property (assign) int age;
- (instancetype)initWithID:(int)personID;
@end

The Mapper Delegate

It's the mapper delegate's job to know how the XML elements in the document map to objects in the application. Here's the protocol:

@protocol XMLObjectMapperDelegate
- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject;
- (void)mapper:(XMLObjectMapper *)mapper endElementNamed:(NSString *)elementName currentObject:(id <XMLMappedObject>)currentObject;
@end

In simple cases you may not even need the second method. It's mostly there to allow the delegate to maintain its own state. For example, you might have an XML document with Products nested within a Category. The delegate could store the current Category and automatically assign that to the Product objects it initializes before passing them back to the mapper.
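The Category scenario could be sketched roughly as follows. Everything specific here is hypothetical: the `ProductCategory` and `Product` classes, the `category`, `title`, `product`, and `name` element names, and the `replayCatalog` function, which calls the delegate by hand in the order a mapper would instead of running a real parse. The two protocols are repeated from this post so the sketch compiles on its own, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

@class XMLObjectMapper; // the mapper from this post, forward-declared only

@protocol XMLMappedObject
- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName;
@end

@protocol XMLObjectMapperDelegate
- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject;
- (void)mapper:(XMLObjectMapper *)mapper endElementNamed:(NSString *)elementName currentObject:(id <XMLMappedObject>)currentObject;
@end

// Hypothetical models for <category><title/><product><name/></product></category>.
@interface ProductCategory : NSObject <XMLMappedObject>
@property (nonatomic, copy) NSString *name;
@end

@implementation ProductCategory
- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName
{
    if ([elementName isEqualToString:@"title"]) {
        self.name = string;
    }
}
@end

@interface Product : NSObject <XMLMappedObject>
@property (nonatomic, copy) NSString *name;
@property (nonatomic, strong) ProductCategory *category;
@end

@implementation Product
- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName
{
    if ([elementName isEqualToString:@"name"]) {
        self.name = string;
    }
}
@end

// The delegate remembers which category is open and hands it to new products.
@interface CatalogMapperDelegate : NSObject <XMLObjectMapperDelegate>
@property (nonatomic, strong) ProductCategory *currentCategory;
@end

@implementation CatalogMapperDelegate
- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject
{
    if ([elementName isEqualToString:@"category"]) {
        self.currentCategory = [[ProductCategory alloc] init];
        return self.currentCategory;
    }
    if ([elementName isEqualToString:@"product"]) {
        Product *product = [[Product alloc] init];
        product.category = self.currentCategory;
        return product;
    }
    return currentObject;
}

- (void)mapper:(XMLObjectMapper *)mapper endElementNamed:(NSString *)elementName currentObject:(id <XMLMappedObject>)currentObject
{
    // Forget the category once its element closes.
    if ([elementName isEqualToString:@"category"]) {
        self.currentCategory = nil;
    }
}
@end

// Replays by hand the calls a mapper would make for one category holding one product.
static Product *replayCatalog(CatalogMapperDelegate *delegate)
{
    id <XMLMappedObject> category = [delegate mapper:nil startElementNamed:@"category" withAttributes:@{} currentObject:nil];
    id <XMLMappedObject> title = [delegate mapper:nil startElementNamed:@"title" withAttributes:@{} currentObject:category];
    [title mapper:nil foundString:@"Books" forElementNamed:@"title"];
    [delegate mapper:nil endElementNamed:@"title" currentObject:title];
    id <XMLMappedObject> product = [delegate mapper:nil startElementNamed:@"product" withAttributes:@{} currentObject:category];
    [delegate mapper:nil endElementNamed:@"product" currentObject:product];
    [delegate mapper:nil endElementNamed:@"category" currentObject:category];
    return (Product *)product;
}

int main(void)
{
    @autoreleasepool {
        Product *product = replayCatalog([[CatalogMapperDelegate alloc] init]);
        NSLog(@"Product category: %@", product.category.name);
    }
    return 0;
}
```

The end-element callback is what lets the delegate drop its reference once the category closes, so a product that appears outside any category would get no category at all.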

Here's an example of what the first method might look like given our sample data:

- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject
{
    if ([elementName isEqualToString:@"person"]) {
        // initWithID: takes an int, so use intValue rather than integerValue
        return [[Person alloc] initWithID:[attributes[@"id"] intValue]];
    }

    return currentObject;
}

The Mapped Object

The mapped objects themselves only need to conform to a simple protocol. It's up to the object to know what to do with the data.

@protocol XMLMappedObject
- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName;
@end

And we might implement this method on our Person object like this:

- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName
{
    if ([elementName isEqualToString:@"name"]) {
        self.name = string;
    }
    else if ([elementName isEqualToString:@"age"]) {
        self.age = [string intValue]; // age is declared as int
    }
}

Divide and Conquer

We've now broken up the job of the NSXMLParser delegate into three parts: handling events, deciding which elements map to which objects, and deciding how those objects use the contents of those elements. I believe this way we stand a much better chance of keeping our code maintainable. We can also do things like map the same XML to different objects by swapping out the mapper delegate.
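To illustrate swapping the delegate, here is a sketch of a second delegate that maps the same `person` elements to a generic, dictionary-backed object instead of `Person`. The `XMLRecord` and `RecordMapperDelegate` classes and the `replaySamplePerson` function are hypothetical names, and the function calls the delegate by hand in the order a mapper would rather than running a real parse. The protocols are repeated from this post so the sketch stands alone, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

@class XMLObjectMapper; // the mapper from this post, forward-declared only

@protocol XMLMappedObject
- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName;
@end

@protocol XMLObjectMapperDelegate
- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject;
- (void)mapper:(XMLObjectMapper *)mapper endElementNamed:(NSString *)elementName currentObject:(id <XMLMappedObject>)currentObject;
@end

// Hypothetical alternative to Person: keeps every element's text, keyed by element name.
@interface XMLRecord : NSObject <XMLMappedObject>
@property (nonatomic, strong, readonly) NSMutableDictionary *fields;
@end

@implementation XMLRecord
- (instancetype)init
{
    self = [super init];
    if (self) {
        _fields = [NSMutableDictionary dictionary];
    }
    return self;
}

- (void)mapper:(XMLObjectMapper *)mapper foundString:(NSString *)string forElementNamed:(NSString *)elementName
{
    self.fields[elementName] = string;
}
@end

// Hypothetical second delegate: same elements, different target class.
@interface RecordMapperDelegate : NSObject <XMLObjectMapperDelegate>
@end

@implementation RecordMapperDelegate
- (id <XMLMappedObject>)mapper:(XMLObjectMapper *)mapper startElementNamed:(NSString *)elementName withAttributes:(NSDictionary *)attributes currentObject:(id <XMLMappedObject>)currentObject
{
    if ([elementName isEqualToString:@"person"]) {
        XMLRecord *record = [[XMLRecord alloc] init];
        if (attributes[@"id"]) {
            record.fields[@"id"] = attributes[@"id"];
        }
        return record;
    }
    return currentObject;
}

- (void)mapper:(XMLObjectMapper *)mapper endElementNamed:(NSString *)elementName currentObject:(id <XMLMappedObject>)currentObject
{
    // This delegate keeps no state of its own.
}
@end

// Replays by hand the calls a mapper would make for the sample <person> element.
static XMLRecord *replaySamplePerson(id <XMLObjectMapperDelegate> delegate)
{
    id <XMLMappedObject> person = [delegate mapper:nil startElementNamed:@"person" withAttributes:@{@"id": @"1"} currentObject:nil];
    id <XMLMappedObject> name = [delegate mapper:nil startElementNamed:@"name" withAttributes:@{} currentObject:person];
    [name mapper:nil foundString:@"John Doe" forElementNamed:@"name"];
    [delegate mapper:nil endElementNamed:@"name" currentObject:name];
    id <XMLMappedObject> age = [delegate mapper:nil startElementNamed:@"age" withAttributes:@{} currentObject:person];
    [age mapper:nil foundString:@"35" forElementNamed:@"age"];
    [delegate mapper:nil endElementNamed:@"age" currentObject:age];
    [delegate mapper:nil endElementNamed:@"person" currentObject:person];
    return (XMLRecord *)person;
}

int main(void)
{
    @autoreleasepool {
        XMLRecord *record = replaySamplePerson([[RecordMapperDelegate alloc] init]);
        NSLog(@"%@", record.fields);
    }
    return 0;
}
```

Nothing in the mapper would need to change for this; only the delegate and the class it instantiates differ.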

Here's the full source for the mapper object. It's just a single class and two protocols. It was not originally written as a generic solution, so you may very well need to modify it to fit your requirements (if it fits at all), for instance to support encoded data.

"Mapping XML to Objects With NSXMLParser" was originally published on 26 Apr 2014.