Build Real-time Big Data Applications on Apache HBase. Open Source. Apache 2.0 Licensed. Gives a natural, intuitive toolkit for predictive modeling with machine learning library.
jWebSocket is a pure Java/JavaScript high speed bidirectional communication solution for the Web - secure, reliable and fast. Provides easy integration into existing Tomcat web applications.
Tangle is a JavaScript library for creating reactive documents. Your readers can interactively explore possibilities, play with parameters, and see the document update immediately. Tangle is super-simple and easy to learn.
sigma.js is an open-source lightweight JavaScript library to draw graphs, using the HTML canvas element. It has been especially designed to display interactively static graphs exported from a graph visualization software like Gephi and to display dynamically graphs that are generated on the fly.
An extendible and configurable PDF manipulation layer. It is a ready to use java library to perform PDF document manipulation without having to deal with the low level API.
The Mozenda Scraper provides web data extraction software, Web Screen Scraping tools that makes it easy to capture nearly any content from the web. See how you can start getting data from the web in minutes.
An example of a toy spelling corrector that achieves 80 or 90% accuracy at a processing speed of at least 10 words per second in less than a page of python code.
We are a community of motherfucking programmers who have been humiliated by software development methodologies for years. We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and anything else getting in the way of...Programming, Motherfucker.
JADE (Java Agent DEvelopment Framework) is a software Framework fully implemented in Java language. It simplifies the implementation of multi-agent systems through a middle-ware that complies with the FIPA specifications and through a set of graphical tools that supports the debugging and deployment phases
RequireJS is a JavaScript file and module loader. It is optimized for in-browser use, but it can be used in other JavaScript environments, like Rhino and Node. Using a modular script loader like RequireJS will improve the speed and quality of your code.
Despite the many JavaScript libraries that are available today, I cannot find one that makes it easy to add keyboard shortcuts(or accelerators) to your javascript app. This is because keyboard shortcuts where only used in JavaScript games - no serious web application used keyboard shortcuts to navigate around its interface. But Google apps like Google Reader and Gmail changed that. So, I have created a function to make adding shortcuts to your application much easier.
Source code to repeat the paper evaluation: We present the first unsupervised approach to the problem of learning a semantic parser, using Markov logic. Our USP system transforms dependency trees into quasi-logical forms, recursively induces lambda forms from these, and clusters them to abstract away syntactic variations of the same meaning. The MAP semantic parse of a sentence is obtained by recursively assigning its parts to lambda-form clusters and composing them. We evaluate our approach by using it to extract a knowledge base from biomedical abstracts and answer questions. USP substantially outperforms TextRunner, DIRT and an informed baseline on both precision and recall on this task.
Lately I’ve been working on evaluating and comparing algorithms, capable of extracting useful content from arbitrary html documents. I have made a feature wise comparison of related software and APIs.
Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop.
AWS Elastic Beanstalk is an even easier way for developers to quickly deploy and manage applications in the AWS cloud without having to worry about the physical infrastructure or the resource configuration that accompanies setting up that infrastructure. You simply upload your application and AWS Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring, while allowing you to change configuration settings and deploy new versions.
Trying to find a name for a company, project, algorithm, product? Acronym Creator helps you generate a name that is an acronym or abbreviation. With this acronym builder, abbreviation maker, name generator, label finder - whatever you call it - you can make your own acronyms and have fun!
The professional, open source development tool for the open web. Develop and test your entire web application using a single environment. With support for the latest browser technology specs such as HTML5, CSS3 and JavaScript; and Ruby, Rails, PHP & Python on the server side. We've got you covered!
Node.js provides a its own assert module with some really useful functions for creating basic tests. However, the reporting and running of these assertions can become complicated, especially with asynchronous code. How can you be sure that all assertions ran? Or that they ran in the correct order? This is where nodeunit comes in, a tool for defining and running unit tests in the simplest way possible.
The Javatools are a collection of Java classes for a variety of small tasks, such as parsing, database interaction or file handling. They were developed by Fabian M. Suchanek for the YAGO-NAGA project. The Javatools are licensed under a Creative Commons Attribution 3.0 License by the YAGO-NAGA team.
A simple Web server with only 200 lines of C source code. In this article, Nigel Griffiths provides a copy of this Web server and includes the source code as well. You can see exactly what it can and can't do.
T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
With proper mark-up/logic separation, a POJO data model, and a refreshing lack of XML, Apache Wicket makes developing web-apps simple and enjoyable again.
Markup Language for Temporal and Event Expressions - TimeML is a robust specification language for events and temporal expressions in natural language.
Protégé is a free, open source ontology editor and knowledge-base framework.
The Protégé platform supports two main ways of modeling ontologies via the Protégé-Frames and Protégé-OWL editors. Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and XML Schema.
Protégé is based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development.
The OntoLT approach aims at a more direct connection between ontology engineering and linguistic analysis. OntoLT is a Protégé plug-in, with which concepts (Protégé classes) and relations (Protégé slots) can be extracted automatically from linguistically annotated text collections. It provides mapping rules, defined by use of a precondition language that allow for a mapping between linguistic entities in text and class/slot candidates in Protégé.
This workshop will gather researchers in a variety of fields that contribute to the automated construction of knowledge bases. It will be held at Xerox Research Centre Europe, near Grenoble (France), May 17-19, 2010.
F. Reichartz, H. Korte, und G. Paass. KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, Seite 773--782. New York, NY, USA, ACM, (2010)
F. Suchanek, G. Ifrim, und G. Weikum. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), Seite 712--717. New York, NY, USA, ACM, (2006)
P. Pantel, D. Ravichandran, und E. Hovy. Proceedings of the 20th international conference on Computational Linguistics (COLING-04), Seite 771--777. Geneva, Switzerland, Association for Computational Linguistics, (2004)
A. Carlson, J. Betteridge, R. Wang, E. Jr., und T. Mitchell. WSDM '10: Proceedings of the third ACM international conference on Web search and data mining, Seite 101--110. New York, NY, USA, ACM, (2010)