The CLEVER search engine incorporates several algorithms that use the Web's hyperlink structure to discover high-quality information. Locating resources on the World Wide Web that are both high-quality and relevant to a user's informational needs can be exceedingly difficult. Traditional automated search methods are easily overwhelmed by low-quality and unrelated content. Second-generation search engines need effective methods for focusing on the most authoritative documents. The rich structure implicit in the hyperlinks among Web documents offers a simple and effective means of dealing with many of these problems.
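To make the idea of finding authoritative documents from hyperlinks concrete, here is a minimal sketch of hub and authority scoring in the style of Kleinberg's HITS, a well-known link-analysis method. The text above does not say which algorithms CLEVER uses, so this is only an illustration and not CLEVER's implementation; the function names `hits` and `normalize`, the iteration count, and the tiny `example_links` graph are all assumptions made for the example.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Return (hubs, authorities) for a link graph.

    links maps each page to the list of pages it links to.
    A fixed iteration count is used here instead of a convergence test.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        # A page's hub score is the sum of the authority scores of pages it links to.
        new_hubs = {
            p: sum(new_auths[dst] for dst in links.get(p, [])) for p in pages
        }
        auths = normalize(new_auths)
        hubs = normalize(new_hubs)

    return hubs, auths


# Hypothetical five-page graph: "a", "b", and "d" all link to "c".
example_links = {"a": ["c"], "b": ["c"], "c": [], "d": ["c", "e"], "e": []}
hubs, auths = hits(example_links)
best_authority = max(auths, key=auths.get)
best_hub = max(hubs, key=hubs.get)
```

In this toy graph the page with the most incoming links from good hubs gets the top authority score, and the page that points at the most authoritative pages gets the top hub score. A real system would start from a query-specific set of pages and would likely weight links by their surrounding text.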
Web content mining is related to, but different from, data mining and text mining. It is related to data mining because many data mining techniques can be applied in Web content mining. It is related to text mining because much Web content is text. H
Using RhNav (Rhizome Navigation), I wrote a data aggregator for Technorati's API. The first result is a video that visualizes blog domains by analysing Technorati's Cosmos (the blogs that link to a particular URL). The video is a screencast of RhNav fetc