The Data Fundamental modules provide a solid overview of the data workflow, guiding you from what data is to how to make your data tell a story. The courses listed below should be seen as a whole: a quick overview of the elements involved in working with data.
Ever needed custom formatted sample / test data, like, bad? Well, that's the idea of this script. It's a free, open source tool written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software, populating databases, and... so on and so forth. Free to try: download the software for free, or register for a fee.
OONI is the Open Observatory of Network Interference. Its aim is to collect high-quality data using open methodologies and Free and Open Source Software (FL/OSS), and to share observations and data about the kind, methods and amount of surveillance and censorship in the world.
This is a human rights observation project for the Internet. OONI seeks to observe levels of surveillance, censorship, and networked discrimination by networked authoritarian power structures.
CKAN is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available.
Government-funded and approved agencies such as the Ordnance Survey, the UK Hydrographic Office and the Highways Agency, which collect data using our funds, should make that data available for free.
A Selection of Sources for Professional Reading in Social Science Data. IASSIST. Joanne Juhnke, Special Librarian, Data & Program Library Service (DPLS), University of Wisconsin-Madison.
Experimental innovators work by trial and error, and arrive at their major contributions gradually, late in life. In contrast, conceptual innovators make sudden breakthroughs by formulating new ideas, usually at an early age.
Share knowledge on geoinformation in Africa with as wide an audience as possible. This will include the sourcing and dissemination of research articles on different themes and focus areas, highlighting news items in the geoinformation industry and keeping
a data API. an environment in which the data is distributed, discoverable, described and linked just as the text data of documents is now. One would have a data browser and a service like Swivel would be more of an aggregator/search engine rather than a d
The portal provides access to an unprecedented quantity of social sciences quantitative datasets using an easy to use Web interface. It harvests statistical datasets and variables published on the Semantic Web from all the largest European social sciences. CESSDA
a non-profit venture for development and provision of free software that visualises human development. Making sense of the world by having fun with statistics! Gapminder promotes sustainable global development by increased use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels.
The European Parliament could entrench a policy of charging citizens for information they have already paid to collect, enforced by state copyright over geographic information.
Four Hundred Guru--Exporting DB2/400 Dates to Excel. Here is a link to an article explaining what they label a well-known bug in Excel: it does not do leap year math correctly.
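The article itself is not reproduced here, so the following is only a sketch under an assumption: that the bug meant is the widely documented one in which Excel's 1900 date system treats 1900 as a leap year and so counts a nonexistent 29 February 1900 as serial number 60. The function name excel_serial_to_date is hypothetical and used only for illustration.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial: int) -> date:
    """Convert a serial number from Excel's 1900 date system to a real date.

    Assumption: serial 1 is 1900-01-01 and serial 60 is the phantom
    1900-02-29 that Excel counts but the calendar does not contain.
    """
    if serial < 1:
        raise ValueError("serial must be 1 or greater")
    if serial == 60:
        raise ValueError("serial 60 is Excel's nonexistent 1900-02-29")
    # Serials after the phantom day are one day ahead of the real calendar,
    # so one extra day is subtracted for them.
    offset = serial - 1 if serial < 60 else serial - 2
    return date(1900, 1, 1) + timedelta(days=offset)


print(excel_serial_to_date(59))   # last real day before the phantom date
print(excel_serial_to_date(61))   # first real day after it
```

Dates exported from another system as day counts need the same one-day correction only for values after serial 60; earlier values map directly.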
consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words - or approximately 35 posts and 7250 words per person.
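The per-person figures can be checked with a quick calculation. This is a minimal sketch that uses only the numbers quoted above; because the word count is given as "over 140 million", 140,000,000 is treated as a lower bound.

```python
bloggers = 19_320
posts = 681_288
words_lower_bound = 140_000_000  # "over 140 million", so a lower bound

posts_per_blogger = posts / bloggers
words_per_blogger = words_lower_bound / bloggers

# Both values should land close to the quoted "35 posts and 7250 words".
print(f"posts per blogger: {posts_per_blogger:.1f}")
print(f"words per blogger: at least {words_per_blogger:.0f}")
```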
HUD USER provides interested researchers with access to the original electronic data sets generated by PD&R sponsored data collection efforts, including the American Housing Survey, HUD median family income limits, as well as microdata from research initiatives on topics such as housing discrimination, the HUD-insured multifamily housing stock, and the public housing population.
wiki devoted to strategies for developing statistical literacy and data services among and between libraries and archives and other service units in academic settings.
This new series looks at contemporary American culture through the austere lens of statistics. Each image portrays a specific quantity of something: fifteen million sheets of office paper (five minutes of paper use); 106,000 aluminum cans (thirty seconds
Enabling the repositories of published reports and papers to interact directly with the repositories of source data from which, in general, they are derived.
mashed up Google Maps with World Bank data to give a visual entry point to browse World Bank projects, news, statistics and public information centers by country.
an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources. includes a search across text-archives.
Wired Magazine issue 16.07. Data Deluge. Crop predictions. Quark. Data mining. tracking news. watching the skies, scanning skeletons. airfares. voting. epidemics. google events. terrorism. visualizing big data
This study is designed to contribute to community understanding of the attitudes and behaviors of United States college and university librarians. This 2006 study builds upon studies that targeted United States faculty members in 2000 and 2003
Stuart Basefsky. Institute for Workplace Studies. latest information related to Labor Relations. International. The service is unique in that it provides the original source documentation, via links, behind the news and research of the day.
listings emphasize the connection between data posted by governments and public institutions and the interfaces people are building to explore that data.
a registry of open knowledge packages and projects (and a few closed ones). CKAN is the place to search for open knowledge resources as well as register your own.
Researchers need to adapt their institutions and practices in response to torrents of new data and need to complement smart science with smart searching.
U.S. Department of Education's National Center for Education Statistics that annually collects fiscal and non-fiscal data about all public schools, public school districts and state education agencies in the United States.
Gapminder World is powered by Trendalyzer and Google Spreadsheet. There are video tutorials in the "help pages" available in the upper left corner of Gapminder World.
Fatality Analysis Reporting System data from the U.S. National Highway Traffic Safety Administration (NHTSA). FARS data can be obtained by downloading any of the published files from the Internet, at ftp://ftp.nhtsa.dot.gov/FARS. The files are available in SAS, DBF and sequential ASCII file formats.
data available from the National Bureau of Economic Research data archive. includes macro data such as business cycles, Industry Data such as Job Creation and Destruction Data, International Trade Data, "Individual Data," Hospital Data, Demographic and Vital Statistics, Patent Data, and more such as Data Appendixes from NBER Working Papers and Books, Segregation Data, etc.
SEDAC’s mission is to develop and operate applications that support the integration of socioeconomic and earth science data and to serve as an “information gateway” between the earth sciences and social sciences.
The Global Poverty Mapping Project seeks to enhance current understanding of the global distribution of poverty and the geographic and biophysical conditions of where the poor live.
The U.S. Census Grids provide raster data sets that include not only population and housing counts, but a wide variety of socioeconomic characteristics. These gridded data sets transform irregularly shaped census block and block group boundaries into a regular surface – a raster grid – for faster and easier analysis.
Datasets about libraries available from NCES. Electronic Catalog of NCES Products (National Center for Education Statistics). Publications and data products.
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.
The 2008 version of the Trans-Atlantic Slave Trade Database contains 8,374 voyages added since the CD-Rom was published in 1999 and additional information on 19,320 voyages. The expanded data set has 276 variables, compared with 99 in the Voyages Database available online.