Quantitative Fonds arbeiten mit ausgeklügelten Rechenmodellen und verzichten auf die subjektive Titelauswahl durch menschliche Manager. Doch die Produkte haben ihre Tücken, wie die jüngste Krise beweist.
Die Personen-Datenbank des Munzinger-Archivs umfasst mehr als 20.000 prominente Lebensläufe und wird kontinuierlich aktualisiert. Sie finden dort Porträts von Politikern, Wirtschaftsgrößen, aber auch von Künstlern und Wissenschaftlern.
Emacs is the extensible, customizable, self-documenting real-time display editor. This Info file describes how to edit with Emacs and some of how to customize it; it corresponds to GNU Emacs version 23.1.
SmartGWT is a GWT based framework that allows you to not only utilize its comprehensive widget library for your application UI, but also tie these widgets in with your server-side for data management. SmartGWT is based on the powerful and mature SmartClient library.
Despite the many JavaScript libraries that are available today, I cannot find one that makes it easy to add keyboard shortcuts(or accelerators) to your javascript app. This is because keyboard shortcuts where only used in JavaScript games - no serious web application used keyboard shortcuts to navigate around its interface. But Google apps like Google Reader and Gmail changed that. So, I have created a function to make adding shortcuts to your application much easier.
An example of a toy spelling corrector that achieves 80 or 90% accuracy at a processing speed of at least 10 words per second in less than a page of python code.
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy to use JavaBeans. It is a fast, robust and well tested package.
It is a fast real-time parser for real-world HTML. What has attracted most developers to HTMLParser has been its simplicity in design, speed and ability to handle streaming real-world html.
Semantic MediaWiki (SMW) is a free extension of MediaWiki that helps to search, organise, tag, browse, evaluate, and share the wiki's content. While traditional wikis contain only texts which computers can neither understand nor evaluate, SMW adds semantic annotations that bring the power of the Semantic Web to the wiki.
JADE (Java Agent DEvelopment Framework) is a software Framework fully implemented in Java language. It simplifies the implementation of multi-agent systems through a middle-ware that complies with the FIPA specifications and through a set of graphical tools that supports the debugging and deployment phases
This is an overview of the open source NLP and machine learning tools for text mining, information extraction, text classification, clustering, approximate string matching, language parsing and tagging, and more.
PojoCache is an in-memory, transactional, and replicated POJO (plain old Java object) cache system that allows users to operate on a POJO transparently without active user management of either replication or persistency aspects. This tutorial focuses on the usage of the PojoCache API.
Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
This software is an extension of the SVMlight software. It provides an interface to kernel functions that are implemented in Java by means of the Java Native Interface (JNI) Invocation API.
Joda-Time provides a quality replacement for the Java date and time classes. The design allows for multiple calendar systems, while still providing a simple API. The 'default' calendar is the ISO8601 standard which is used by XML. The Gregorian, Julian, Buddhist, Coptic, Ethiopic and Islamic systems are also included, and we welcome further additions. Supporting classes include time zone, duration, format and parsing.
It contains a Web Crawler, HTML Parser and ("in the near future") NER and REX.
Additionally, including JWikiDocs, a Java tool for crawling and downloading Wikipedia documents.
jWebSocket is a pure Java/JavaScript high speed bidirectional communication solution for the Web - secure, reliable and fast. Provides easy integration into existing Tomcat web applications.
This web page provides information, errata, as well as about a third of the chapters of the book Learning with Kernels, written by Bernhard Schölkopf and Alex Smola (MIT Press, Cambridge, MA, 2002).
LIBSVM is an integrated software for support vector classification, (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM ). It supports multi-class classification.
Alle Programme und Resourcen auf der Liste sind frei, d.h. kostenlos (für Forschungszwecke) verfügbar, auf deutschsprachige Texte anwendbar und sofort startklar, d.h. sie müssen nicht erst mit Hilfe von z.B. annotierten Korpora trainiert werden. Die Liste ist natürlich unvollständig (Stand 22.5.2007).
MegaMap is a Java implementation of a map (or hashtable) that can store an unbounded amount of data, limited only by the amount of disk space available. Objects stored in the map are persisted to disk. Good performance is achieved by an in-memory cache. The MegaMap can, for all practical reasons, be thought of as a map implementation with unlimited storage space.
MSTParser is a non-projective dependency parser that searches for maximum spanning trees over directed graphs. Models of dependency structure are based on large-margin discriminative training methods. Projective parsing is also supported.
MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta).
MuNPEx requires a part-of-speech (POS) tagger to work and can additionally use detected named entities (NEs) to improve chunking performance. Please read the documentation (or source code) for more details.
Das NEGRA Korpus Version 2 besteht aus 355.096 Tokens (20.602 Sätzen) deutschen Zeitungstextes aus der Frankfurter Rundschau. Die Texte sind der CD "Multilingual Corpus 1" der European Corpus Initiative entnommen. Es basiert auf ca. 60.000 Tokens, die am Institut für maschinelle Sprachverarbeitung, Stuttgart, mit Parts-of-Speech annotiert wurden. Dieses Korpus wurde erweitert, ebenfalls mit Parts-of-Speech versehen und vollständig mit syntaktischen Strukturen annotiert. Der Aufbau des Korpus wurde in den Projekten NEGRA (DFG Sonderforschungsbereich 378, Projekt C3) und LINC (Universität des Saarlandes) in Saarbrücken durchgeführt.
NestedVM provides binary translation for Java Bytecode. This is done by having GCC compile to a MIPS binary which is then translated to a Java class file. Hence any application written in C, C++, Fortran, or any other language supported by GCC can be run in 100% pure Java with no source changes.
Node.js provides a its own assert module with some really useful functions for creating basic tests. However, the reporting and running of these assertions can become complicated, especially with asynchronous code. How can you be sure that all assertions ran? Or that they ran in the correct order? This is where nodeunit comes in, a tool for defining and running unit tests in the simplest way possible.
A simple Web server with only 200 lines of C source code. In this article, Nigel Griffiths provides a copy of this Web server and includes the source code as well. You can see exactly what it can and can't do.
The OntoLT approach aims at a more direct connection between ontology engineering and linguistic analysis. OntoLT is a Protégé plug-in, with which concepts (Protégé classes) and relations (Protégé slots) can be extracted automatically from linguistically annotated text collections. It provides mapping rules, defined by use of a precondition language that allow for a mapping between linguistic entities in text and class/slot candidates in Protégé.
OpenCyc is the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine.
OpenLaszlo programs are written in XML and JavaScript and transparently compiled to Flash and, with OpenLaszlo 4, DHTML. The OpenLaszlo APIs provide animation, layout, data binding, server communication, and declarative UI. An OpenLaszlo application can be as short as a single source file, or factored into multiple files that define reusable classes and libraries.
OpenLaszlo is "write once, run everywhere." An OpenLaszlo application developed on one machine will run on all leading Web browsers on all leading desktop operating systems.
PDFjam is a small collection of shell scripts which provide a simple interface to some of the functionality of the excellent pdfpages package (by Andreas Matthias) for pdfLaTeX.
We are a community of motherfucking programmers who have been humiliated by software development methodologies for years. We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and anything else getting in the way of...Programming, Motherfucker.
qooxdoo is a comprehensive and innovative framework for creating rich internet applications (RIAs). Leveraging object-oriented JavaScript allows developers to build impressive cross-browser applications. No HTML, CSS nor DOM knowledge is needed.
Our goal is to develop a probabilistic knowledge base that mirrors the content of the web. We are developing a system that uses semi-supervised learning methods to learn to extract symbolic knowledge from unstructured text and HTML. We are exploring methods of continous learning, where our system runs 24x7, continuously learning to read better, and continuously extracting facts from the web.
RelEx, a narrow-AI component of OpenCog, is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser. It can identify subject, object, indirect object and many other dependency relationships between words in a sentence; it generates dependency trees, resembling those of dependency grammars.
RequireJS is a JavaScript file and module loader. It is optimized for in-browser use, but it can be used in other JavaScript environments, like Rhino and Node. Using a modular script loader like RequireJS will improve the speed and quality of your code.
This is the project page for SecondString, an open-source Java-based package of approximate string-matching techniques. This code was developed by researchers at Carnegie Mellon University from the Center for Automated Learning and Discovery, the Department of Statistics, and the Center for Computer and Communications Security.
An extendible and configurable PDF manipulation layer. It is a ready to use java library to perform PDF document manipulation without having to deal with the low level API.
SentiWordNet is a lexical resource for opinion mining. SentiWordNet assigns to each synset of WordNet three sentiment scores: positivity, negativity, objectivity.
Shalmaneser is a supervised learning toolbox for shallow semantic parsing, i.e. the automatic assignment of semantic classes and roles to text. The system was developed for Frame Semantics; thus we use Frame Semantics terminology and call the classes frames and the roles frame elements. However, the architecture is reasonably general, and with a certain amount of adaption, Shalmaneser should be usable for other paradigms (e.g., PropBank roles) as well. Shalmaneser caters both for end users, and for researchers.
sigma.js is an open-source lightweight JavaScript library to draw graphs, using the HTML canvas element. It has been especially designed to display interactively static graphs exported from a graph visualization software like Gephi and to display dynamically graphs that are generated on the fly.
Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop.
SVM-JAVA, developed for research and educational purpose, is a Java implementation of John C. Platt's sequential minimal optimization (SMO) for training a support vector machine (SVM). This program is based on the pseudocode in "Fast Training of Support Vector Machines using Sequential Minimal Optimization" by John C. Platt and in "Sequential Minimal Optimization for SVM" by Xianping Ge. It currently supports linear and RBF kernels.
Tangle is a JavaScript library for creating reactive documents. Your readers can interactively explore possibilities, play with parameters, and see the document update immediately. Tangle is super-simple and easy to learn.
M. Zhang, J. Zhang, J. Su, and G. Zhou. ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, page 825--832. Morristown, NJ, USA, Association for Computational Linguistics, (2006)
R. Kate. Proceedings of the conference on Empirical Methods in Natural Language Processing (EMNLP-2008), page 400--409. Waikiki, Honolulu, Hawaii, (2008)
D. Roth, M. Sammons, and V. Vydiswaran. Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, page 57--60. Suntec, Singapore, Association for Computational Linguistics, (August 2009)
D. Roth, and W. tau Yih. HLT-NAACL 2004 Workshop: Eighth Conference on Computational Natural Language Learning (CoNLL-2004), page 1--8. Boston, Massachusetts, USA, Association for Computational Linguistics, (May 2004)
M. Wang. Proceedings of the Third International Joint Conference on Natural Language Processing, 2, page 841--846. Hyderabad, India, Asian Federation of Natural Language Processing, Association for Computational Linguistics, (January 2008)
R. Bunescu, and R. Mooney. Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05), October 6-8, 2005, Vancouver, British Columbia, Canada, page 724--731. Association for Computational Linguistics Morristown, NJ, USA, (2005)
J. Jiang, and C. Zhai. Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics, page 113--120. Rochester, New York, Association for Computational Linguistics, (April 2007)
F. Zanzotto, and A. Moschitti. ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, page 401--408. Morristown, NJ, USA, Association for Computational Linguistics, (2006)
P. Pantel, and M. Pennacchiotti. Ontology Learning and Population: Bridging the Gap between Text and Knowledge, volume 167 of Frontiers in Artificial Intelligence and Applications, IOS Press, (2008)
J. Kleinberg. KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, page 91--101. New York, NY, USA, ACM, (2002)
X. Wan, and J. Yang. SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, page 143--150. New York, NY, USA, ACM, (2007)
F. Suchanek, G. Ifrim, and G. Weikum. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), page 712--717. New York, NY, USA, ACM, (2006)
T. Zesch, I. Gurevych, and M. Mühlhäuser. Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), page 205--208. (2007)
F. Reichartz, H. Korte, and G. Paass. Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, page 365--368. Suntec, Singapore, Association for Computational Linguistics, (August 2009)