TODO List

= KEY ====================
    # Flagship
    - Regular
    ? Maybe I'll Do It
==========================

If no interest is expressed for a feature that may require a considerable
amount of effort to implement, it may get endlessly delayed. Do not be
afraid to cast your vote for the next feature to be implemented!

Things to do as soon as possible:

 - http://htmlpurifier.org/phorum/read.php?3,5560,6307#msg-6307
 - Think about allowing explicit order of operations hooks for transforms
 - Fix "<.<" bug (trailing < is removed if not EOD)
 - Build in better internal state dumps and debugging tools for remote
   debugging
 - Allowed/Allowed* have strange interactions when both set
 ? Transform lone embeds into object tags
 - Deprecated config options that emit warnings when you set them (with
   a way of muting the warning if you really want to)
 - Make HTML.Trusted work with Output.FlashCompat
 - HTML.Trusted and HTML.SafeObject have a funny interaction; general
   problem is what to do when a module "supersedes" another
   (see also tables and basic tables.)  This is a little dicier
   because HTML.SafeObject has some extra functionality that
   HTML.Trusted might find useful.  See http://htmlpurifier.org/phorum/read.php?3,5762,6100

FUTURE VERSIONS
---------------

4.8 release [OMG CONFIG PONIES]
 ! Fix Printer. It's from the old days when we didn't have decent XML classes
 ! Factor demo.php into a set of Printer classes, and then create a stub
   file for users here (inside the actual HTML Purifier library)
 - Fix error handling with form construction
 - Do encoding validation in Printers, or at least, where user data comes in
 - Config: Add examples to everything (make built-in which also automatically
   gives output)
 - Add "register" field to config schemas to eliminate dependence on
   naming conventions (try to remember why we ultimately decided on this)

5.0 release [HTML 5]
 # Swap out code to use html5lib tokenizer and tree-builder
 ! Allow turning off of FixNesting and required attribute insertion

5.1 release [It's All About Trust] (floating)
 # Implement untrusted, dangerous elements/attributes
 # Implement IDREF support (harder than it seems, since you cannot have
   IDREFs to non-existent IDs)
     - Implement <area> (client and server side image maps are blocking
       on IDREF support)
 # Frameset XHTML 1.0 and HTML 4.01 doctypes
 - Figure out how to simultaneously set %CSS.Trusted and %HTML.Trusted (?)

5.2 release [Error'ed]
 # Error logging for filtering/cleanup procedures
 # Additional support for poorly written HTML
    - Microsoft Word HTML cleaning (e.g. MsoNormal, but research essential!)
    - Friendly strict handling of <address> (block -> <br>)
 - XSS-attempt detection--certain errors are flagged XSS-like
 - Append something to duplicate IDs so they're still usable (impl. note: the
   dupe detector would also need to detect the suffix)
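
A minimal sketch of the duplicate-ID idea above, written in Python rather than
the library's own PHP. The function name dedupe_ids and the "-2", "-3" suffix
scheme are assumptions made for illustration, not anything HTML Purifier
implements. It mainly shows the implementation note: a generated suffix has to
be checked against every ID already seen, because a later literal ID can
collide with one that was generated earlier.

```python
def dedupe_ids(ids, sep="-"):
    """Return a list where repeated IDs get a numeric suffix (hypothetical scheme)."""
    seen = set()
    result = []
    for original in ids:
        candidate = original
        counter = 2
        # Keep bumping the suffix until the candidate collides with nothing
        # seen so far, including IDs this function generated itself.
        while candidate in seen:
            candidate = f"{original}{sep}{counter}"
            counter += 1
        seen.add(candidate)
        result.append(candidate)
    return result


if __name__ == "__main__":
    print(dedupe_ids(["a", "a", "a-2", "b"]))
```

In the example call, the literal "a-2" arrives after "a-2" has already been
generated for the second "a", so it is renamed as well.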

6.0 release [Beyond HTML]
 # Legit token based CSS parsing (will require revamping almost every
   AttrDef class). Probably will use CSSTidy
 # More control over allowed CSS properties using a modularization
 # IRI support (this includes IDN)
 - Standardize token armor for all areas of processing

7.0 release [To XML and Beyond]
 - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
    - Hooks for adding custom processors to custom namespaced tags and
      attributes, offer default implementation
    - Lots of documentation and samples

Ongoing
 - More refactoring to take advantage of PHP5's facilities
 - Refactor unit tests into lots of test methods
 - Plugins for major CMSes (COMPLEX)
    - phpBB
    - Also, a FAQ for extension writers with HTML Purifier

AutoFormat
 - Smileys
 - Syntax highlighting (with GeSHi) with <pre> and possibly <?php
 - Look at http://drupal.org/project/Modules/category/63 for ideas

Neat feature related
 ! Support exporting configuration, so users can easily tweak settings
   in the demo, and then copy-paste into their own setup
 - Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
 - Allow scoped="scoped" attribute in <style> tags; may be troublesome
   because regular CSS has no way of uniquely identifying nodes, so we'd
   have to generate IDs
 - Explain how to use HTML Purifier in non-PHP languages / create
   a simple command line stub (or complicated?)
 - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
 - Automatically add non-breaking spaces to empty table cells when
   empty-cells:show is applied to have compatibility with Internet Explorer
 - Table of Contents generation (XHTML Compiler might be reusable). May also
   be out-of-band information.
 - Full set of color keywords. Also, a way to add onto them without
   finalizing the configuration object.
 - Write a var_export and memcached DefinitionCache - Denis
 - Built-in support for target="_blank" on all external links
 - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
   Also, enable disabling of directionality
 ? Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg
 ? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
    1. Analyze which tags to remove duplicates of
    2. Ensure attributes are merged into the parent tag
    3. Extend the tag exclusion system to specify whether or not the
       contents should be dropped (currently, there's code that could do
       something like this if it didn't drop the inner text too.)
 ? Make AutoParagraph also support paragraph-izing double <br> tags, and not
   just double newlines.  This is kind of tough to do in the current framework,
   though, and might be reasonably approximated by search replacing double <br>s
   with newlines before running it through HTML Purifier.
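
The double <br> workaround in the last item could be approximated with a
pre-processing step along these lines. This is a rough sketch in Python, not
part of HTML Purifier: the helper name br_pairs_to_newlines and the regular
expression are assumptions, it only recognizes attribute-less <br> tags, and
it does not parse the HTML, so it would also rewrite <br>s inside <pre>.

```python
import re

# Two or more consecutive <br>, <br/> or <br /> tags, optionally separated
# by whitespace. Tags carrying attributes are deliberately not matched.
_DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def br_pairs_to_newlines(html):
    """Replace each run of two or more <br> tags with a blank line."""
    return _DOUBLE_BR.sub("\n\n", html)


if __name__ == "__main__":
    print(repr(br_pairs_to_newlines("one<br><br />two<br>three")))
```

The result would then be handed to HTML Purifier with AutoParagraph enabled,
which already splits paragraphs on double newlines. Single <br> tags are left
alone.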

Maintenance related (slightly boring)
 # CHMOD install script for PEAR installs
 ! Factor out command line parser into its own class, and unit test it
 - Reduce size of internal data-structures (esp. HTMLDefinition)
 - Allow merging configurations.  Thus,
        a -> b -> default
        c -> d -> default
   becomes
        a -> b -> c -> d -> default
   Maybe allow more fine-grained tuning of this behavior. Alternatively,
   encourage people to use short plist depths before building them up.
 - Time PHPT tests
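
The configuration-merging item above can be sketched as follows, assuming a
plist is a chain of property lists in which each node falls back to its parent
for missing keys. The PropertyList class and merge_chains function here are
hypothetical Python stand-ins, not the library's PHP API. The sketch copies
the nodes of the first chain instead of re-parenting them, so the original
chains stay usable.

```python
class PropertyList:
    """Hypothetical property list: local values plus a parent to fall back to."""

    def __init__(self, name, data=None, parent=None):
        self.name = name
        self.data = dict(data or {})
        self.parent = parent

    def get(self, key):
        # Walk up the chain until some node defines the key.
        node = self
        while node is not None:
            if key in node.data:
                return node.data[key]
            node = node.parent
        raise KeyError(key)

    def chain(self):
        # Names from this node up to the root, for inspection.
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


def merge_chains(first, second, default):
    """Return a copy of first's chain (minus default) stacked on top of second."""
    own_nodes = []
    node = first
    while node is not None and node is not default:
        own_nodes.append(node)
        node = node.parent
    merged = second
    for node in reversed(own_nodes):
        merged = PropertyList(node.name, node.data, merged)
    return merged


default = PropertyList("default", {"x": 0, "y": 0, "z": 0})
b = PropertyList("b", {"y": 2}, default)
a = PropertyList("a", {"x": 1}, b)
d = PropertyList("d", {"y": 4}, default)
c = PropertyList("c", {"z": 3}, d)
merged = merge_chains(a, c, default)

if __name__ == "__main__":
    print(" -> ".join(merged.chain()))
```

Values set in a or b take precedence over those in c or d, which is one
possible reading of the ordering in the item. The finer-grained tuning it
mentions is not covered here.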

ChildDef related (very boring)
 - Abstract ChildDef_BlockQuote to work with all elements that only
   allow blocks in them, required or optional
 - Implement lenient <ruby> child validation

Wontfix
 - Non-lossy smart alternate character encoding transformations (unless
   patch provided)
 - Pretty-printing HTML: users can run Tidy on the output of the entire page
 - Native content compression, whitespace stripping: use gzip if this is
   really important

    vim: et sw=4 sts=4