[PHPTAL] back to the dom parser idea

David Zülke dz at bitxtender.com
Mon Jan 28 12:22:00 CET 2008


On 27.01.2008 at 23:08, Kornel Lesiński wrote:

>>>
>>> There are issues with DOM itself:
>>> - it doesn't report line numbers for elements, so error messages  
>>> (other than parse errors) can't include them. Not very user- 
>>> friendly :(
>>
>> It does! Just need to disable libxml error reporting, so no  
>> warnings etc are generated, then pull the errors by hand.
>
> But is there a way to get line number of any element if there were  
> no parse errors? Things like invalid TALES expressions need to  
> report line number, but aren't libxml's concern.

Then we validate using XML Schema or RelaxNG ;)
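
A rough sketch of the two cases discussed above, not actual PHPTAL code. Parse errors can be pulled by hand with libxml_use_internal_errors() and libxml_get_errors(). For a document that parses cleanly, DOMNode::getLineNo() reports an element's line, but only in PHP 5.3 and later, which is newer than this thread. The sample templates and variable names are made up for illustration.

```php
<?php
// Keep libxml from emitting PHP warnings; errors are collected manually below.
libxml_use_internal_errors(true);

// Case 1: a malformed document. Pull the errors by hand.
$broken = "<root>\n  <p>unclosed\n</root>";
$doc = new DOMDocument();
$ok = $doc->loadXML($broken);

$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each LibXMLError carries the line number and a message.
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
libxml_clear_errors();

// Case 2: a well-formed template with no parse errors at all.
$template = "<root xmlns:tal=\"http://xml.zope.org/namespaces/tal\">\n"
          . "  <p tal:content=\"oops\">x</p>\n"
          . "</root>";
$doc2 = new DOMDocument();
$doc2->loadXML($template);

// Ask the node itself where it came from (PHP 5.3+).
$p = $doc2->getElementsByTagName('p')->item(0);
$line = $p->getLineNo();
```

With something like this, a message about an invalid TALES expression could include $line even though libxml itself had nothing to complain about.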


>>> - non-validating parser can't support named entities (like  
>>> &nbsp;), but users will expect them to work. Switching to
>>> validating parser is not an option, because it will reject  
>>> incomplete HTML fragments and PHPTAL elements.
>>
>> That is possible as well. Just need to tell DOM to resolve  
>> externals and validate against them (= doctype). I think we can  
>> expect people to write well-formed XML.
>
> What DOCTYPE do you suggest? Document with XHTML DTD and TAL  
> attributes won't be valid, so something else is necessary.

No... well... could in theory append an HTML doctype to the document,  
or an inline DTD with the entity definitions. Not a big problem :)
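
A minimal sketch of the inline DTD idea, assuming the template has no XML prolog (otherwise the DOCTYPE would have to go after it). The helper withEntityDeclarations() is hypothetical, and only two entities are declared here; a real list would be much longer. No validation takes place, the internal subset only supplies the entity definitions.

```php
<?php
// Hypothetical helper: put an inline DTD with entity definitions
// in front of the template source.
function withEntityDeclarations(string $fragment): string
{
    $dtd = "<!DOCTYPE root [\n"
         . "  <!ENTITY nbsp \"&#160;\">\n"
         . "  <!ENTITY pound \"&#163;\">\n"
         . "]>\n";
    return $dtd . $fragment;
}

$source = withEntityDeclarations('<root>Price:&nbsp;&pound;5</root>');

$doc = new DOMDocument();
// LIBXML_NOENT tells libxml to substitute entity references with their values.
$doc->loadXML($source, LIBXML_NOENT);

// The text now holds the literal characters (as UTF-8), not the entity names.
$text = $doc->documentElement->textContent;
```

Since TAL attributes are never validated against this DTD, the document does not need to be valid XHTML, only well-formed.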


>>> - parser decodes numeric entities and represents them as literal  
>>> characters, possibly exposing encoding issues.
>>
>> Eh? what do you mean. As long as the charset info is correct,  
>> everything is fine.
>
> I meant that someone who didn't care about encoding, but just used  
> named entities like &pound;, would have to add/correct all necessary  
> declarations. It's not a big problem.

In XML, you need to care about encodings ;) Tell the users that they  
should write proper stuff, and problem solved <:


>> IIRC libxml will even read a charset value from a meta tag in an  
>> XML document that it recognizes as XHTML.
>
> Fortunately it does it only for HTML. XHTML user-agents must not  
> read charset from <meta> element.

Correct, but if someone delivers XHTML as application/xhtml+xml, he'll  
also have the XML prolog with the charset, so again, no issue!
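
To illustrate the encoding points above, a small sketch with a made-up ISO-8859-1 sample: the charset comes from the XML prolog, numeric entities come back as literal characters, and DOM hands every string out as UTF-8 regardless of the source encoding.

```php
<?php
// A Latin-1 document: \xA3 is the raw pound sign byte, &#163; the same
// character written as a numeric entity.
$latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
        . "<p>\xA3 &#163;</p>";

$doc = new DOMDocument();
$doc->loadXML($latin1);

// The encoding declared in the prolog is remembered on the document.
$declared = $doc->encoding;

// Both pound signs are now literal characters, returned as UTF-8.
$text = $doc->documentElement->textContent;
```

If the prolog were missing or wrong, the raw \xA3 byte would not be valid UTF-8 and the parse would fail, which is the "care about encodings" point.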


>> On a different topic, I really think we should use this DOM parser  
>> approach as the foundation for a completely new PHPTAL. There are  
>> some things that I think would be worth changing, for instance we  
>> should mandate the use of braces in function calls. That not only  
>> allows passing of arguments properly, it also eliminates the need  
>> of runtime evaluation of the type of the given element (array  
>> index, object property, object method). Not sure why it is as it is  
>> right now, but the "original" TAL uses braces for method calls,  
>> too, IIRC.
>
>
> I'm pretty sure Zope's TAL does not require use of braces/parens. It  
> even has nocall: modifier that prevents automatic calling of  
> functions, which would be unnecessary if calls required special  
> syntax.
>
> However optional parentheses with arguments for function calls sound  
> like a good idea. This can be done without a complete rewrite of  
> PHPTAL :)

But we should still do that! ;)

Seriously, I'll be able to help with this stuff. Can dedicate a good  
amount of time if we decide to start over, making proper plans,  
roadmaps, feature sets etc. It would be an excellent effort I think.


David



