Libtextcat is a library with functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization" [1]. It was primarily developed for language guessing, a task on which it is known to perform with near-pe
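As a rough illustration of the Cavnar & Trenkle technique mentioned above, here is a minimal Python sketch of rank-ordered character n-gram profiles compared with an "out-of-place" distance. It is not libtextcat's API: the function names (`ngram_profile`, `out_of_place`, `classify`), the defaults `max_n=5` and `top_k=300`, the underscore padding, and the tiny training samples are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first."""
    counts = Counter()
    # Split into runs of letters; digits and punctuation are discarded.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        padded = f"_{word}_"  # mark word boundaries (assumed padding scheme)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                gram = padded[i:i + n]
                if gram.strip("_"):  # skip n-grams that are only padding
                    counts[gram] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Return the category whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc, category_profiles[name]),
    )


# Tiny, made-up training samples; useful profiles need far more text.
samples = {
    "en": ("the quick brown fox jumps over the lazy dog and then the dog runs "
           "through the forest with the other animals while the sun is shining "
           "over the hills"),
    "fr": ("le renard brun rapide saute par dessus le chien paresseux et ensuite "
           "le chien court dans la forêt avec les autres animaux pendant que le "
           "soleil brille sur les collines"),
}
profiles = {name: ngram_profile(text) for name, text in samples.items()}
print(classify("the dog runs through the forest", profiles))
```

Sorting ties alphabetically keeps the profiles reproducible across runs. With samples this small the ranks of rare n-grams are mostly arbitrary, so the sketch only shows the mechanics, not the accuracy the description refers to.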
The UDC Summary of around 2,000 classes has been online since October 2009 and can now be browsed in 10 languages here.
The UDC summary is fully aligned with the UDC MRF 2009, which is going to be released in the following months. This set is made available for free use under the Creative Commons Attribution Share Alike 3.0 license (CC-BY-SA).
The origins of the Mundaneum date back to the end of the 19th century. Created on the initiative of two Belgian jurists, Paul Otlet and Henri La Fontaine, the project aimed to gather all the world's knowledge and to classify it according to the system of Classi
KB's overview document on the Dewey system. Resolution: 1000 categories. However, this is not the complete reference, which appears to be less accessible, probably for copyright reasons.
The data here is useful for testing classification and clustering, and the accuracy of indexing techniques. However, the datasets are too small to support claims about the efficiency of indexing.
The context dependence of classification systems is differentiated into cognitive, social, cultural, and historical aspects, and an anthropological basic understanding within library and information science is suggested. The initial question posed by Emile Durkheim and Marcel Mauss about a developmental-logical connection between historical forms of order is taken up again, and, in engagement with culturally relativist positions, a post-classical approach to the structural genesis of classificatory thinking is presented. As a methodological contribution to the history of information, it is shown from which reference point cross-cultural comparative research on knowledge organization can proceed.
Information on the names, taxonomic relationships, continent-wide distributions, and morphological characteristics of all native and naturalized plants found in North America north of Mexico.
Visual identification keys for endangered species such as Birds, Crocodilians, Turtles and Tortoises, Butterflies, Sturgeons and Paddlefish, Tropical Wood.
In this post, I want to show how I use NLTK for preprocessing and tokenization, but then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) using Scikit-Learn.
The recent explosion in the popularity of large language models like ChatGPT has led to their increased usage in classical NLP tasks like language classification. This involves providing a context…
The British Classification Society exists to encourage the co-operation and exchange of views and information among those interested in principles and practice of classification in any discipline where they are used. Its membership includes anthropologists, archaeologists, astronomers, biologists, chemists, computer scientists, forensic scientists, geologists, information specialists, librarians, psychologists, soil scientists and statisticians. The Society organises meetings, some by itself, but often jointly with societies representing application areas for classification.