HtmlCleaner is an open-source HTML parser written in Java. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. For any serious consumption of such documents, it is necessary to first clean up the mess and bring the order t
A tool that strips proprietary Microsoft tags and other cruft from Word HTML documents, leaving basic formatting intact. File sizes are greatly reduced, and the returned HTML is easier to read, revise and employ.
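
To make the idea of stripping Word cruft concrete, here is a minimal Python sketch. It is not the tool's actual implementation, which is not described here. The function name `clean_word_html` and the regular expressions are assumptions chosen for illustration, and they cover only a few common patterns in Word-exported HTML: conditional comments, namespaced tags such as `<o:p>`, `Mso*` class attributes, and `style` attributes containing `mso-` properties.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list of Word markup).
# Hidden conditional comments: <!--[if gte mso 9]> ... <![endif]-->
_HIDDEN_CONDITIONAL = re.compile(r"<!--\[if.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE)
# Revealed conditional markers: <![if !supportLists]> and <![endif]> (content is kept)
_REVEALED_MARKER = re.compile(r"<!\[(?:if\b[^\]]*|endif)\]>", re.IGNORECASE)
# Namespaced Office tags such as <o:p>, </o:p>, <w:WordDocument>
_NAMESPACED_TAG = re.compile(r"</?[a-z]+:[^>]*>", re.IGNORECASE)
# class="MsoNormal" and similar Word-specific class attributes
_MSO_CLASS = re.compile(r'\s+class\s*=\s*"Mso[^"]*"', re.IGNORECASE)
# style attributes that contain at least one mso-* property
_MSO_STYLE = re.compile(r'\s+style\s*=\s*"[^"]*mso-[^"]*"', re.IGNORECASE)


def clean_word_html(html: str) -> str:
    """Remove common Word-specific markup while leaving basic tags in place."""
    html = _HIDDEN_CONDITIONAL.sub("", html)  # drop hidden blocks entirely
    html = _REVEALED_MARKER.sub("", html)     # drop markers, keep what they wrap
    html = _NAMESPACED_TAG.sub("", html)      # drop <o:p>-style tags, keep their text
    html = _MSO_CLASS.sub("", html)
    html = _MSO_STYLE.sub("", html)
    return html


if __name__ == "__main__":
    sample = (
        '<p class="MsoNormal" style="mso-margin-top-alt:auto">'
        "<b>Hello</b><o:p></o:p></p>"
        "<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->"
    )
    print(clean_word_html(sample))
```

Regular expressions are used here only to keep the sketch short. A production cleaner would more likely parse the document into a tree first, since regex-based rewriting cannot reliably handle nested or malformed markup.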