Nature, 26 Oct 2021 -- Catalogue of billions of phrases from 107 million papers could ease computerized searching of the literature.
In a project that could unlock the world’s research papers for easier computerized analysis, an American technologist [Carl Malamud] has released online a gigantic index of the words and short phrases contained in more than 100 million journal articles, including many paywalled papers.
The catalogue, which was released on 7 October and is free to use, holds tables of more than 355 billion words and sentence fragments listed next to the articles in which they appear. It is an effort to help scientists use software to glean insights from published work even if they have no legal access to the underlying papers, says its creator, Carl Malamud. He released the files under the auspices of Public Resource, a non-profit corporation in Sebastopol, California that he founded.
Malamud says that because his index doesn’t contain the full text of articles, but only sentence snippets up to five words long, releasing it does not breach publishers' copyright restrictions on the re-use of paywalled articles. However, one legal expert says that publishers might question the legality of how Malamud created the index in the first place.
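As a rough illustration of the kind of table described above, the following is a minimal sketch of an n-gram index: every run of one to five consecutive words is listed next to the articles in which it appears. It is not Malamud's actual pipeline. The function names, the article IDs, the sample text, and the simple whitespace tokenization are all assumptions made for this example.

```python
from collections import defaultdict


def ngrams(text, max_n=5):
    """Yield every run of 1..max_n consecutive words in text (lowercased)."""
    words = text.lower().split()  # naive whitespace tokenization (assumption)
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_index(articles, max_n=5):
    """Map each n-gram to the set of article IDs that contain it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for gram in ngrams(text, max_n):
            index[gram].add(article_id)
    return index


# Hypothetical article IDs and text, for illustration only.
articles = {
    "paper-1": "open access to research papers",
    "paper-2": "access to research data",
}
index = build_index(articles)

# Look up which articles contain a given phrase.
print(sorted(index["access to research"]))
```

Because only fragments of at most five words are stored, a lookup tells you which articles contain a phrase without reproducing any article's full text.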
Nature, July 2019 -- A giant data store quietly being built in India could free vast swathes of science for computer analysis, but is it legal?
Over the past year, Malamud has — without asking publishers — teamed up with Indian researchers to build a gigantic store of text and images extracted from 73 million journal articles dating from 1847 up to the present day. The cache, which is still being created, will be kept on a 576-terabyte storage facility at Jawaharlal Nehru University (JNU) in New Delhi. “This is not every journal article ever written, but it’s a lot,” Malamud says. It’s comparable to the size of the core collection in the Web of Science database, for instance. Malamud and his JNU collaborator, bioinformatician Andrew Lynn, call their facility the JNU data depot.
Article about Sci-Hub and what ought to be done, in "Palladium - governance futurism", Sept 24, 2021, by Jason Parry. Supports David Wiley's "thought experiment" on using eminent domain (expropriation).
(Le Monde diplomatique, October 2021) While the production of scientific knowledge is paid for essentially by public spending, the revenue from its publication is captured by publishing giants, which impose their priorities and their prices. This dominant position held by a few private players hinders the spread of knowledge and brings with it many harmful effects.
EFF June 2021: Major publishers want to censor research-sharing resource Sci-Hub from the internet, but archivists quickly respond to make that impossible. More than half of academic publishing is controlled by major publishers using burdensome paywalls. One project in particular, Sci-Hub, has threatened to break down this barrier by sharing articles without restriction. As a result,
China is working on a master plan for the internationalisation of its domestic journals and plans to pursue an open science strategy at a national level.
2019 How librarians, pirates, and funders are liberating the world’s academic research from paywalls. Featuring Elaine Westworth, Aileen Fyfe, Theodora Bloom et al
"What’s standing in the way of a full-on revolution? The culture of science. "
"But there’s a big thing getting in the way of a revolution: prestige-obsessed scientists who continue to publish in closed-access journals. They’re like the road workers who keep paying fees to build infrastructure they can’t freely access. Until that changes, the walls will remain firmly intact."
The latest strategy for addressing the serials crisis that has fueled the crisis in scholarly publishing across the disciplines is the establishment of transformative open access agreements.
March 16, 2021
The agreement is the largest of its kind in North America to date, bringing together UC, which generates nearly 10 percent of all U.S. research output, and Elsevier, which disseminates about 17 percent of journal articles produced by UC faculty. The deal will double the number of articles made available through UC’s transformative open access agreements.
THE VERGE By Ian Graber-Stiehl Feb 8, 2018
Alexandra Elbakyan opened her email to a message from the world’s largest publisher: "YOU HAVE BEEN SUED." The student and programmer runs Sci-Hub, a website with over 64 million academic papers available for free to anybody in the world.