
Comments on Your forthcoming feed errors

a digital magpie

Updated: 2016-10-24T13:44:47Z


By: Geof’s Relentless Kvetching About WordPress » Blog Archive » Owen on Testing


[…] Owen, I don’t know enough about the ins and outs of these feeds to be of much use, but there are people who do. [Phil Ringnalda seems to care, and you could probably talk Sam Ruby into helping with test cases—and if nothing else, Sam’s focus on it would get some interesting comments from Mark Pilgrim on the subject, and that would make enough of a kerfuffle to get some attention.] But I am certainly arguing that we need to take time and figure out what testing needs to be done—fishbone diagrams, whatever. […]

By: wordpress wank » 57 varieties


[…] Apparently, 2.0 is going to ship with an invalid feed on account of Atom 0.3 is no longer a valid or supported spec and Matt is refusing to support the new spec in the next release, or explain his reasons for this decision. […]

By: Manuzhai


Well, yes, the coupling of email-addresses to sessions in Trac should provide for enough information, except that maybe not everyone commenting on a ticket might want to also receive email. I’ve done some Trac development, though; I might look into getting this fixed. It’s enough of a problem that it bugs me too, sometimes.

By: Roger Benningfield


Kellan: I feel your pain, brother. Contrary to some folks’ predictions, I’m actually feeling pretty positive about Atom these days… but supporting it ain’t as simple as supporting RSS.

For example, my xml:base implementation in ColdFusion… it’s around 150 lines of code, and only allows the publisher to set the base on feed, entry, content, and summary elements. Just to make things more difficult, it won’t work at all for users whose webhosts have their accounts sandboxed, since sandboxing locks out access to the JVM, and I’m relying on to make it possible at all. Fun fun.
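
To give a rough idea of what xml:base support asks of the consuming side, here is a minimal Python sketch, not the ColdFusion code described above. It assumes relative URLs live in `href` attributes and that a base can be set on any ancestor element; the function name `resolve_links` is made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_links(elem, base=""):
    """Yield (element, absolute URL) pairs for every href in the tree.

    Each xml:base is itself resolved against the base inherited from
    the parent, so nested relative bases accumulate.
    """
    base = urljoin(base, elem.get(XML_BASE, ""))
    href = elem.get("href")
    if href is not None:
        yield elem, urljoin(base, href)
    for child in elem:
        yield from resolve_links(child, base)


if __name__ == "__main__":
    feed = ET.fromstring(
        '<feed xml:base="http://example.org/blog/">'
        '<entry xml:base="2005/"><link href="post.html"/></entry>'
        '<link href="/about"/>'
        "</feed>"
    )
    for _, url in resolve_links(feed):
        print(url)
```

A real implementation would also have to rewrite relative URLs inside the content and summary markup, which this sketch leaves out.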

By: Phil Ringnalda


Nearly right, it’s not well-formed or it would have gone through as type="xhtml", and it includes an unescaped ]]>, for an unknown reason.

It might have that ]]> not as an actual CDATA section end delimiter, just as an accidental part of some ASCII art or as an example: in that case, it should have been escaped for the user some time prior to feed generation, because that’s invalid in anything descended from SGML, and escaping it now won’t hurt.

It might have that ]]> as the end of an intentional CDATA section. If so, WTF is the user up to?

If he serves application/xhtml+xml, then that gets treated like CDATA should, but nobody knows whether his XHTML fragment will eventually be served as text/html or not, so his CDATA section should have already been replaced with entity escaping so it would work either way (plus, remember, he’s not well-formed, so he’s got bigger problems already, which we expect will soon be fixed). Escaping his end delimiter and then wrapping the post in a CDATA section will break it, but it’s already broken, and when he unbreaks it, it will go through as type="xhtml" again.

Or he’s serving text/html, where using a CDATA section is a rather odd way of commenting things out, and escaping his end delimiter ought to alert him to how broken his behavior is (particularly since if his fragment is later Tidied and served as application/xhtml+xml, his CDATA-as-comments suddenly becomes CDATA-as-text).

Or, in either MIME type, he’s using a combination of SGML comments, JavaScript comments, and a CDATA section to hide JavaScript, in which case he should be looked at sternly while his entire script element is stripped out, because one of the reasons it shouldn’t appear in a feed is that you don’t know what will happen to it once it’s neutralized on the other end (Gregarius seems to display it, so the result of those Structured Blogging plugins that try to tunnel XML in is a blob of crap at the end of the post).

I don’t really see a case where you would get down to the entity escaping alternative that doesn’t involve utterly broken garbage that should have already been dealt with anyway.
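
The "escape his end delimiter, then wrap the post in a CDATA section" step is small enough to sketch. This is an illustration in Python, not anyone's actual feed code; the function names are invented for the example.

```python
CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def escape_cdata_end(html: str) -> str:
    """Entity-escape every literal CDATA end delimiter in the post body."""
    return html.replace(CDATA_END, "]]&gt;")


def wrap_in_cdata(html: str) -> str:
    """Wrap a post in one CDATA section that cannot be closed early."""
    return CDATA_START + escape_cdata_end(html) + CDATA_END


if __name__ == "__main__":
    print(wrap_in_cdata("<p>ASCII art: ]]> oops"))
```

After the escaping, the only ]]> left in the output is the one that closes the wrapper, so the feed stays well-formed whatever the user typed.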

By: Aristotle Pagaltzis


I’m still not sure I follow. Is the following case what you’re trying to cover?

  • The user wants to include a CDATA section in the content.
  • The content is not well-formed.

By: Ben de Groot


No what’s that supposed to mean?



Ben, you could at least try working with us, rather than against us.

By: Robert Sayre


I thought I presented an uncontroversial list of things that add complexity, at which point my competence was questioned.

I’m sorry you got offended, but you’re the one who turned the discussion to the parser you wrote. After looking it over, I think it’s a very nice library with a parser that reached the limits of its current design a long time ago. Bad software with bad Atom support doesn’t worry me, but Magpie is good software, so it needs fixing. Expect patches.

By: Phil Ringnalda


I think maybe you are: there are lots of ways you can make a valid, well-formed feed, some of which (simply remove every <, >, and &) are less likely to please the user than others (use CDATA sections so they can pretend there’s no escaping).

If your goal is to use inline XML when you can, and when you can’t to prefer CDATA escaping to entity escaping, I don’t think there’s a situation where you can’t use CDATA escaping. Either your user has already blown his chance of using a CDATA section in the post, so you won’t be illegally nesting CDATA sections, or he wasn’t trying to use one at all and just happened to use the end delimiter unescaped, which he’s not allowed to do in HTML, or XHTML, or XML, or SGML, so you can get away with escaping it even though it won’t then be unescaped in the output. If you are code in a protocol client (or editing code on the server), that’s probably overstepping your bounds, but for feed-generating code I don’t think that lack of fidelity can ever make things worse.
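
Put together, the "inline XML when you can, CDATA when you can't" policy could look something like the Python sketch below. It is only an illustration under a few assumptions: the post body is a bare markup fragment, well-formedness is tested by wrapping it in an XHTML div and handing it to the standard library parser, and the content element inherits the Atom namespace from the surrounding feed. The names `is_well_formed` and `atom_content` are invented for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside an XHTML div."""
    try:
        ET.fromstring(f'<div xmlns="{XHTML_NS}">{fragment}</div>')
    except ET.ParseError:
        return False
    return True


def atom_content(fragment: str) -> str:
    """Serialize a post body as an Atom content element.

    Well-formed fragments go through inline as type="xhtml". Anything
    else is sent as type="html" in a CDATA section, with any literal
    end delimiter entity-escaped so the section cannot close early.
    """
    if is_well_formed(fragment):
        return (
            f'<content type="xhtml"><div xmlns="{XHTML_NS}">'
            f"{fragment}</div></content>"
        )
    escaped = fragment.replace("]]>", "]]&gt;")
    return f'<content type="html"><![CDATA[{escaped}]]></content>'


if __name__ == "__main__":
    print(atom_content("<p>fine</p>"))
    print(atom_content("<p>unclosed ]]> here"))
```

Once the user fixes his markup, the same post goes back through the first branch as type="xhtml" without any change to the generator.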