The new, completed version of this Data Science Cheat Sheet can be found here. We are now at 20, up from 17. I hope I find the time to write a one-page surviva…
When you pay attention to something (and when you ignore something), data is created. This “attention data” is a valuable resource that reflects your interests, your activities and your values, and it serves as a proxy for your attention. To capture t
AideRSS is an intelligent assistant that saves time and keeps you on top of the latest news. We research every story and filter out the noise, allowing you to focus on what matters most
Free package for analyzing data from complex samples, especially large-scale assessments, as well as non-assessment survey data. Has sophisticated statistics, an easy drag-and-drop interface, and an integrated help system that explains the statistics as well as how to
Using RhNav - Rhizome Navigation I wrote a data aggregator for Technorati's API. The first result is a video which visualizes blog domains by analysing Technorati's Cosmos (the blogs which link to a particular URL). The video is a screencast of RhNav fetc
APRIORI algorithm was originally proposed by Agrawal in "Fast Algorithms for Mining Association Rules" in 1994 to find frequent itemsets and association rules in a transaction database. Here you can download a fast, trie-based, command-line implementation
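As a rough illustration of the level-wise idea behind Apriori (not the trie-based command-line implementation mentioned above), here is a minimal Python sketch. The function name `apriori`, the `min_support` parameter (an absolute count), and the shopping-basket data are assumptions made for this example only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count, not a fraction (an assumption of
    this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Made-up example data.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
result = apriori(baskets, min_support=3)
```

With these baskets and a minimum support of 3, only the four single items and the pair {bread, milk} survive. A trie, as used by the implementation referenced above, mainly speeds up the count step; this sketch uses a plain scan instead.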
The Bioinformatics Links Directory features curated links to molecular resources, tools and databases. The links listed in this directory are selected on the basis of recommendations from bioinformatics experts in the field. We also rely on input from our community of bioinformatics users for suggestions.
Revealed: The NSA's powerful tool for cataloguing global surveillance data – including figures on US collection • Boundless Informant: mission outlined in four slides • Read the NSA's frequently asked questions document
This project contains naive Bayes and Fisher classifiers, as described in Toby Segaran's book "Programming Collective Intelligence." The book has Python implementations; this is a Java implementation.
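For readers unfamiliar with the technique, a naive Bayes text classifier can be sketched in a few lines of Python. This is not code from the book or from the Java project: the class name `NaiveBayes`, its methods, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen for illustration, and the Fisher method is not shown.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> documents seen
        self.vocab = set()

    def train(self, text, category):
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat in self.doc_counts:
            # Log prior plus the sum of log likelihoods (the "naive"
            # assumption: words are independent given the category).
            score = math.log(self.doc_counts[cat] / total_docs)
            total_words = sum(self.word_counts[cat].values())
            for w in words:
                # Add-one (Laplace) smoothing so unseen words do not
                # drive the probability to zero.
                p = (self.word_counts[cat][w] + 1) / (total_words + len(self.vocab))
                score += math.log(p)
            if score > best_score:
                best, best_score = cat, score
        return best


# Made-up training data.
clf = NaiveBayes()
clf.train("buy cheap pills now", "spam")
clf.train("cheap pills online", "spam")
clf.train("meeting agenda for monday", "ham")
clf.train("lunch meeting tomorrow", "ham")
label = clf.classify("cheap pills")
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.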
CiteSeerX - Document Details (Isaac Councill, Lee Giles): The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data.
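The soft-margin idea in the abstract above (tolerating training points that cannot be separated) can be illustrated with a small Python sketch. Note what is swapped in: this example minimizes the regularized hinge loss of a linear classifier by full-batch subgradient descent, which is not the optimization procedure of the paper, and it omits the non-linear mapping to a feature space. The function names, the hyperparameters `lam`, `lr`, and `epochs`, and the toy data are assumptions.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b))).

    Labels must be +1 or -1. Returns the weight vector w and bias b.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Only points inside the margin (or misclassified) contribute
            # to the hinge-loss subgradient.
            if margin < 1:
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Made-up two-group data.
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

The `lam` parameter trades margin width against training errors: a larger value favors a wider margin at the cost of more violations.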
The semantic web must "explain the meaning of words" to computers. Some semantic technologies use a "bottom up" approach, embedding semantic annotations (metadata) into web content. "Top down" technologies analyze information without metadata using some form of
Yesterday, I had dinner with two people from yet another startup that uses tagging and collaborative filtering in the same sentence. So are tags and collaborative filtering a marriage made in heaven? It's a promising approach, but there are challenges in
Hyperlinking is the foundation of the web. As users add new content, and new sites, it is bound into the structure of the web by other users discovering the content and linking to it. Much as synapses form in the brain, with associations becoming stronge
Concept mining is a discipline at the nexus of data mining, text mining, and linguistics, drawing on artificial intelligence and statistics. It aims to extract concepts from documents.
Screen scraper and parser service. Cost: "In the future, non-commercial and small uses will remain free. Pricing structure for bigger applications and for commercial uses will be announced in the future."
This document describes the implementation of a DAQ model. It provides a number of tools to develop a data acquisition system.
To facilitate communication between different objects, DAQ++ implements a very simple Observer model, in which some of the DAQ++ objects are defined as DAQpp::Observables and some as DAQpp::Observers. Observers subscribe to the messages defined in the Observables and are notified whenever a change occurs.
The basis of the system is the DAQpp::Module object. It represents a detector or DAQ unit. As such, it implements the basic DAQ commands to get ready, start or stop the DAQ, retrieve the data, etc.
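The general shape of such an Observer model can be sketched as follows. This is a generic Python illustration of the pattern, not the DAQ++ API (which is C++): the classes `Observable`, `Observer`, `RunState`, and `Logger`, and every method name, are hypothetical.

```python
class Observable:
    """Keeps a list of subscribers and notifies them of changes."""

    def __init__(self):
        self._observers = []

    def subscribe(self, observer):
        if observer not in self._observers:
            self._observers.append(observer)

    def unsubscribe(self, observer):
        self._observers.remove(observer)

    def notify(self, message):
        # Iterate over a copy so observers may unsubscribe while notified.
        for observer in list(self._observers):
            observer.update(self, message)


class Observer:
    """Interface: subclasses override update() to react to messages."""

    def update(self, source, message):
        raise NotImplementedError


class RunState(Observable):
    """Hypothetical observable holding a run state."""

    def __init__(self):
        super().__init__()
        self.state = "stopped"

    def set_state(self, new_state):
        # Notify only when the state actually changes.
        if new_state != self.state:
            self.state = new_state
            self.notify(new_state)


class Logger(Observer):
    """Hypothetical observer that records every message it receives."""

    def __init__(self):
        self.seen = []

    def update(self, source, message):
        self.seen.append(message)


run = RunState()
log = Logger()
run.subscribe(log)
run.set_state("running")
run.set_state("running")  # no change, so no notification
run.set_state("stopped")
```

The observable never needs to know what its observers do with a message, which is what keeps the two sides loosely coupled.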
It used to be you had to get a warrant to monitor a person or a group of people. Today, it is increasingly easy to monitor ideas. And then track them back to people. Most of us don't have access to the databases, software, or computing power of the NSA, F