=head1 NAME

XML::Compile::Schema::XmlReader - bricks to translate XML to HASH

=head1 INHERITANCE

=head1 SYNOPSIS

 my $schema = XML::Compile::Schema->new(...);
 my $code   = $schema->compile(READER => ...);

=head1 DESCRIPTION

The translator understands schemas, but does not encode that knowledge
into actions.  This module implements those actions, to translate XML
into a (nested) Perl HASH structure.

=head1 DETAILS

=head2 Processing Wildcards

If you want to collect information from the XML structure which is
permitted by C and C specifications in the schema, you have to
implement that yourself.  The problem is that C has less knowledge
than you about the possible data.

=head3 anyAttribute

By default, the C specification is ignored.  When C is given, all
attributes which fulfill the namespace requirement are added to the
returned data structure.  The key is the absolute element name, the
value is the related unparsed XML element.

In the current implementation, if an explicit attribute is also
covered by the namespaces permitted by the anyAttribute definition,
then it will also appear in that list (and hence the handler will
be called for it as well).

Use L to write your own handler, to influence the behavior.  The
handler is called for each attribute, and you must return a list of
pairs of derived information.  When the returned list is empty, the
attribute data is lost.  The value may be a complex structure.

example: anyAttribute in XmlReader

Say your schema looks like this:

Then, in an application, you write:

 my $r = $schema->compile
  ( READER       => pack_type('http://mine', 'el')
  , anyAttribute => 'ALL'
  );
 # or lazy: READER => '{http://mine}el'

 my $h = $r->( <<'__XML' );
 42 everything
 __XML

 use Data::Dumper 'Dumper';
 print Dumper $h;

The output is something like

 $VAR1 =
  { a => 42
  , '{http://mine}a' => ... # XML::LibXML::Node with 42
  , '{http://mine}b' => ... # XML::LibXML::Node with everything
  };

You can improve the reader with a callback.
When you know that the extra attribute is always of type C, then you
can write:

 my $read = $schema->compile
  ( READER       => '{http://mine}el'
  , anyAttribute => \&filter
  );

 my $anyAttRead = $schema->compile
  ( READER => '{http://mine}non-empty'
  );

 sub filter($$$$)
 {   my ($fqn, $xml, $path, $translator) = @_;
     # only handle the one name we know how to decode
     return () if $fqn ne '{http://mine}b';
     (b => $anyAttRead->($xml));
 }

 my $h = $read->($xml);   # $xml holds the same document as above
 print Dumper $h;

which results in

 $VAR1 =
  { a => 42
  , b => 'everything'
  };

The filter is called twice, but returns nothing in the first case.
You can implement any kind of complex processing in the filter.

=head3 any element

By default, the C definition in a schema will ignore all elements
from the container which are not used.  Also in this case, C is
required to produce C results.  C will ignore all results, although
these are still processed for validation needs.

The C and C of C are ignored: the number of elements is always
unbounded.  Therefore, you will get an array of elements back per
type.

=head2 Schema hooks

=head3 hooks executed before the XML is being processed

The C hook receives an XML::LibXML::Node object and the path string.
It must return a new (or the same) XML node, which will be used from
then on.  It is best to modify a clone of the node, not the original
as provided by the user.  When C is returned, the whole node will
disappear.

This hook offers a predefined C.

example: to trace the paths

 $schema->addHook(path => qr/./, before => 'PRINT_PATH');

=head3 hooks executed as replacement

Your C hook should return a list of key-value pairs.  To produce it,
the hook gets the XML::LibXML::Node, the translator settings as HASH,
the path, and the localname.

This hook has a predefined C, which will not process the found
element, but simply return the string C as value.  This way, a whole
tree of unneeded translations can be avoided.

=head3 hooks for post-processing, after the data is collected

The data is collected and passed as second argument, after the XML node.
The third argument is the path.  Be aware that the collected data
might be a SCALAR (for simpleType).

This hook also offers a predefined C.  Besides, it has C, C, and C,
which result in additional fields in the HASH, containing respectively
the CODE which was processed, the element names, and the attribute
names.  The keys start with an underscore C<_>.

=head1 SEE ALSO

This module is part of XML-Compile distribution version 0.55,
built on September 26, 2007.  Website: F

=head1 LICENSE

Copyrights 2006-2007 by Mark Overmeer.  For other contributors see
ChangeLog.

This program is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
See F