The goal of the Penn Discourse Treebank (PDTB) project is to develop a large-scale corpus annotated with information related to discourse structure. While many aspects of discourse are crucial to a complete understanding of natural language, the PDTB focuses on encoding coherence relations associated with discourse connectives. The annotations include the argument structure of the connectives, exposing a clearly defined level of discourse structure that supports the extraction of a range of inferences associated with those connectives. Other annotated features include sense distinctions for discourse connectives and attribution-related features for both connectives and their arguments.