Currently, web content is based largely on documents written in HTML, a hypertext markup language coding a body of text that's interspersed with other media (like images, or forms). HTML has limited ability to classify sections of text on a web page so th