Several semantic sources can be found on the Web, either explicit, e.g. Wikipedia, or implicit, e.g. derived from Web usage data. Most of them are related to user-generated content (UGC), or what is today called Web 2.0. In this talk we show several applications of mining the wisdom of crowds behind UGC to improve search. We will show live demos that find relations in Wikipedia or improve image search, as well as our current research on the topic. Our final goal is to produce a virtuous data feedback circuit to leverage the Web itself.
by Andrew Moore (CMU), including tutorials on decision trees, information gain, cross-validation, naive Bayesian classifiers, hidden Markov models, support vector machines, k-means and hierarchical clustering
mendation service which can be called via HTTP by BibSonomy's recommender when a user posts a bookmark or publication. All participating recommenders are called on each posting process; one of them is chosen to actually deliver the results to the user. We can then measure
The Web can be represented as a directed graph with special regions: SCC (the strongly connected core), IN, OUT and TENDRILS.
Regions are defined by which websites can reach which others along link paths.
The number of links to and from a website (in- and out-degree) seems to follow a power law, which is also mentioned in this document.
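The region definitions can be made concrete with a small reachability sketch. This is an illustration only, not the method of the document discussed here: it assumes the caller supplies a `seed` page known to lie in the core, and the names `reach` and `bow_tie` as well as the extra `DISCONNECTED` bucket are hypothetical. The core is taken as the pages that both reach the seed and are reached from it; IN pages reach the core, OUT pages are reached from it, and TENDRILS hang off IN or lead into OUT without touching the core.

```python
from collections import defaultdict, deque


def reach(adj, sources):
    """Return every node reachable from `sources` by following edges in `adj`."""
    seen = set(sources)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges, seed):
    """Split a directed link graph into bow-tie regions around `seed`.

    Assumption: `seed` lies in the strongly connected core.
    """
    fwd, bwd = defaultdict(set), defaultdict(set)
    nodes = set()
    for src, dst in edges:
        fwd[src].add(dst)   # outgoing links
        bwd[dst].add(src)   # incoming links
        nodes.update((src, dst))

    ahead = reach(fwd, [seed])    # pages the seed can reach
    behind = reach(bwd, [seed])   # pages that can reach the seed
    scc = ahead & behind
    in_region = behind - scc
    out_region = ahead - scc
    rest = nodes - scc - in_region - out_region
    # Tendrils: leftover pages reachable from IN or able to reach OUT.
    tendrils = rest & (reach(fwd, in_region) | reach(bwd, out_region))
    return {
        "SCC": scc,
        "IN": in_region,
        "OUT": out_region,
        "TENDRILS": tendrils,
        "DISCONNECTED": rest - tendrils,
    }


# Toy link graph; the page names are made up for illustration.
edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),  # core cycle
    ("i", "a"),                          # i links into the core
    ("c", "o"),                          # the core links out to o
    ("i", "t"),                          # t hangs off IN
    ("u", "o"),                          # u leads into OUT
    ("x", "y"),                          # unrelated pair
]
regions = bow_tie(edges, seed="a")
```

On real crawl data the core would normally be found first with a strongly connected components algorithm rather than a hand-picked seed; the seed shortcut just keeps the sketch short.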
?. Proceedings of the Workshop on Third Generation Data Mining: Towards Service-oriented Knowledge Discovery at ECML/PKDD 2008, (2008). Published online.
A. Abraham and V. Ramos. Proceedings of the 2003 Congress on Evolutionary Computation CEC2003, pages 1384--1391. Canberra, IEEE Press, (8-12 December 2003)