2017 February 23
by Daniel Lakeland

Look, the journal system and anonymous pre-publication peer review are a disaster for science. The reasons have been documented over and over in the last few years, and if you follow Andrew Gelman's blog you'll have seen hundreds of examples of serious problems. In part, this is a social issue, and I don't have a solution to the social issue (i.e., how we promote science and scientists and fund them). But, in part, this is a technological problem. If there were a really good technological solution to publication, we would be better off. So, here's what I think such a system should look like:

  1. A decentralized archive of papers, data sets, and public commentary on papers.
  2. Each submission is given a UUID and cryptographically signed by all authors. Revision history is allowed, and all revisions are stored.
  3. Propagation of new articles proceeds in a peer-to-peer fashion. Peers are cryptographically identified by signatures. Certain peers are marked as "trusted archives" by distributed vote (e.g., all the guys at University of Foo mark their official "University of Foo local archive run by the information technology department of Foo" as a trusted archive). A typical submission would go to your local instance, which immediately propagates it to your trusted archives. The system is not tied to universities: archives would also be kept by professional organizations, government libraries, state governments, nonprofits, even individuals.
  4. Metadata is propagated broadly; certainly, complete copies of metadata are sent between major archives. Local instances automatically replicate metadata to your local computer based on keywords, etc. Search proceeds by first searching your local metadata archive, then requesting a metadata transfer from your trusted archives on the basis of search keywords, authors, etc., as well as sending peer-to-peer queries to other registered peers (your collaborators at other universities, etc.). Queries have a time-to-live and are propagated at least several steps, but no more than 4 or 5, to avoid exponential explosion.
  5. Actual content (papers, datasets, etc.) is propagated based on the policy of the archive operator and/or the local operator. Archives would operate essentially the same way as individual instances, but obviously would be expected to house much larger data storage and generally accept many more submissions for full archival.
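
As a rough illustration of the time-to-live idea in item 4, here is a minimal Python sketch of a hop-limited search. Everything in it is hypothetical: the `Peer` class, the `search` function, and the in-memory `metadata` dictionary are stand-ins for whatever a real system would use, and real queries would of course travel over the network rather than through object references.

```python
from collections import deque


class Peer:
    """Hypothetical peer holding a small local metadata index."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []   # other Peer objects this peer forwards queries to
        self.metadata = {}    # submission UUID -> set of keywords

    def local_search(self, keyword):
        return {uid for uid, kws in self.metadata.items() if keyword in kws}


def search(origin, keyword, ttl=4):
    """Search locally, then flood the query outward at most `ttl` hops."""
    seen = {origin.name}                 # never query the same peer twice
    results = set(origin.local_search(keyword))
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if remaining == 0:
            continue                     # time-to-live exhausted on this path
        for neighbor in peer.neighbors:
            if neighbor.name in seen:
                continue
            seen.add(neighbor.name)
            results |= neighbor.local_search(keyword)
            queue.append((neighbor, remaining - 1))
    return results
```

The `seen` set and the hop limit together are what keep the number of forwarded queries bounded; without them, a densely connected network would re-forward the same query over and over.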

The biggest problem, as I see it, is control of spam/bots. How do you prevent the system from becoming bogged down by automated submission of enormous quantities of meaningless trash? Cryptographic signatures help a little. Using a web of trust would help a lot. University archives could obviously accept local submissions only from their own university employees, plus transfers from partner archives (say University of California at Davis accepts transfers from all other UC system schools, as well as say 20 or 40 other major universities globally such as UMich or WashU in St. Louis or Cambridge or University of São Paulo or U Tokyo or whatever).
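
A minimal sketch of that acceptance policy, in Python. The `Archive` class, its field names, and the key identifiers are all made up for illustration, and the sketch assumes signature verification has already happened elsewhere; it only shows the "own members or partner archives" decision.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Hypothetical archive with a simple allow-list policy."""

    name: str
    members: set = field(default_factory=set)    # key IDs of local authors
    partners: set = field(default_factory=set)   # names of partner archives

    def accepts(self, author_key, via=None):
        """Decide whether to store a submission.

        `via` is None for a direct local submission, or the name of the
        archive that is transferring the submission to us.
        Assumes the signatures on the submission were already verified.
        """
        if via is None:
            return author_key in self.members
        return via in self.partners
```

For example, an archive built with `members={"key-alice"}` and `partners={"UMich"}` would take a direct submission signed by `key-alice`, take anything transferred by `UMich`, and refuse the rest. A fuller web of trust would replace these flat sets with chains of endorsements.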

In the end, the way scientific communication would work is essentially that you'd write your stuff up, archive the data set, and submit it to your publication program on your computer. It would then assign a UUID, index all the metadata, and submit to each of your trusted archives. The trusted archives would propagate all the metadata throughout the network, so within a few hours anyone in the world could find your article by keywords, authors, etc. Anyone could download it by peer-to-peer requests that eventually find an archive with a copy. Commentary would transfer with the paper, and commentary on a paper could easily be submitted to the system, where it too would be archived in threaded conversations, along the lines of an email mailing-list archive.
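
Here is one way the "assign a UUID, index all the metadata" step might look, as a hedged Python sketch. The function name `new_submission` and the record fields are assumptions made for illustration, not a specification. Actual signing is left out because Python's standard library has no public-key signature support; the `digest` field is simply the value each author would sign with whatever signature scheme the system adopted.

```python
import hashlib
import json
import uuid


def new_submission(title, authors, keywords, content, previous=None):
    """Build a metadata record for a paper (content is bytes).

    Pass the prior record as `previous` to create a new revision: the UUID
    is kept, the revision number increases, and the new record points at
    the digest of the one it replaces, so the full history stays linked.
    """
    record = {
        "uuid": previous["uuid"] if previous else str(uuid.uuid4()),
        "revision": previous["revision"] + 1 if previous else 1,
        "parent_digest": previous["digest"] if previous else None,
        "title": title,
        "authors": sorted(authors),
        "keywords": sorted(keywords),
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    # Hash a canonical JSON form so every peer computes the same digest.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["digest"] = hashlib.sha256(canonical).hexdigest()
    return record
```

Calling `new_submission(...)` once and then again with `previous=` set to the first record gives two revisions sharing one UUID, which is the property item 2 of the list asks for. Only this small record needs to travel widely; the content itself can stay on whichever archives choose to hold it.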

It seems pretty obvious this would be way better from a "moving science forward" perspective than anything we have now.

Going back to the social aspects though, it certainly seems that this wouldn't have anything like the prestige production of "A publication in Nature" or whatnot.


