RSS, OPML and the XML platform.
Copyright 2003-5 Randy Charles Morin
Mihai Parparita: Here are the top XML errors that we have encountered when parsing all of the feeds that our users have added to Reader.
|% of errors||Error description|
|15.6%||Input claims to be UTF-8 but contains invalid characters.|
|14.9%||Opening and ending tags mismatch|
|13.9%||An undefined entity is used|
|7.8%||Document expected to begin with a start tag, but none was found|
|5.7%||Disallowed control characters present|
|5.5%||Extra content at the end of the document|
|4.2%||Unterminated entity reference (missing semi-colon)|
|4.2%||Unquoted attribute value|
|3.8%||Premature end of data in tag (truncated feed)|
|3.3%||Naked ampersand (should be represented as &amp;)|
|2.1%||XML declaration allowed only at the start of the document|
|1.8%||Namespace prefix is used but not defined|
|0.75%||Comment not terminated|
|0.64%||Attribute without value|
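
To get a feel for how these error classes surface in practice, here is a minimal sketch that runs a few hand-made snippets through Python's standard `xml.etree.ElementTree` parser. The `SAMPLES` snippets and the `check` helper are my own illustrations, not data from Reader, and the messages come from the underlying expat parser, so they will not match the wording in the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets, one per error class from the table above.
# They are illustrations only, not taken from real feeds.
SAMPLES = {
    "invalid UTF-8": b"<title>caf\xff</title>",
    "mismatched tags": b"<rss><channel></rss>",
    "undefined entity": b"<title>a&nbsp;b</title>",
    "no start tag": b"hello",
    "control character": b"<title>a\x01b</title>",
    "extra content at end": b"<rss/><rss/>",
    "unquoted attribute": b"<link href=http://example.com/>",
    "truncated feed": b"<rss><channel><title>x",
    "naked ampersand": b"<title>Q & A</title>",
    "late XML declaration": b" <?xml version='1.0'?><rss/>",
    "undefined namespace prefix": b"<rss><dc:date/></rss>",
    "unterminated comment": b"<rss><!-- oops</rss>",
    "attribute without value": b"<option selected/>",
    "well-formed": b"<rss><channel/></rss>",
}


def check(data: bytes):
    """Return the parser's error message, or None if data is well-formed."""
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    for label, data in SAMPLES.items():
        print(f"{label:28} {check(data) or 'ok'}")
```

A strict parser stops at the first error, so a feed with several problems is only counted under whichever one appears first.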
Randy: Some interesting data would be the likelihood that a feed has ill-formed XML, broken down by generator (Blogger, WordPress, TypePad, MT, etc.). Anybody got that data?
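
I don't have that data, but here is a rough sketch of how one might collect it. It assumes each feed names its tool in a `<generator>` element and pulls that out with a regular expression, since an XML parser is no help on a feed that is already broken. The names `generator_of`, `is_well_formed` and `ill_formed_rate` are hypothetical helpers of mine, and the input is whatever raw feed bytes you have on hand.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: the feed names its tool in a <generator> element.
# A regex is used because ill-formed feeds cannot be parsed as XML.
GENERATOR_RE = re.compile(rb"<generator\b[^>]*>([^<]*)</generator>", re.IGNORECASE)


def generator_of(data: bytes) -> str:
    """Best-effort generator name, or 'unknown' if none is found."""
    match = GENERATOR_RE.search(data)
    if not match:
        return "unknown"
    return match.group(1).decode("utf-8", "replace").strip() or "unknown"


def is_well_formed(data: bytes) -> bool:
    """True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def ill_formed_rate(feeds):
    """Map each generator name to its fraction of ill-formed feeds."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for data in feeds:
        name = generator_of(data)
        totals[name] += 1
        if not is_well_formed(data):
            bad[name] += 1
    return {name: bad[name] / totals[name] for name in totals}
```

Feeds with no recognizable generator land in an "unknown" bucket, which will probably be large, and feeds whose generator text includes a version number would need normalizing before the buckets mean much.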