
A (non-viral) copyleft/sharealike license for open research data

by Felix Schönbrodt & Roland Ramthun

The open availability of scientific material (such as research data, code, or other material) has often been identified as one cornerstone of a trustworthy, reproducible, and verifiable science. At the same time, such material is still rarely available in practice (though availability is on the rise).

To increase the availability of open scientific material, we propose a license for scientific research data that, in turn, increases the availability of other open scientific material. It borrows a mechanism from open source software development: the application of copyleft (or, in the CC terminology, “sharealike”) licenses. These are so-called “sticky licenses”, because they require that every reuse of the licensed material carries the same license. This means that if you reuse material under this license, your own product/derivative must also (a) be freely reusable and (b) use that license, so that any derivative from your product is free as well, ad infinitum.

The promise of such a “viral” license is that it can introduce more and more freedom into a system. It is supposed to be a strategy to reform the environment: the more artifacts have a copyleft license, the more likely it is that future products have the same license, until, in the end, everything is free.

Picture of a viral license by Phoebus87 (https://de.wikipedia.org/wiki/Datei:Symian_virus.png)

One criticism of such licenses stems from the definition of “freedom”: according to this point of view, the highest degree of freedom is reached if you can do anything with the material. This includes commercial usage, which is usually closed for competitive reasons, and integrating the material into a larger dataset which itself cannot be open because other parts of the data have restrictive licenses. We are not lawyers, but in our understanding this could, for example, also include restrictions due to privacy rights.

For example, imagine the compilation of an integrative database that includes both material from a copyleft source and material from another source that relates to individuals and cannot be openly shared due to privacy rights (but could be shared as a restricted scientific use file). In our understanding, a strict copyleft license would preclude reuse in such a restricted way. Hence, the copyleft license, although claiming to ensure freedom, does preclude a lot of potential reuse scenarios. From this point of view, a so-called permissive license (such as CC0, MIT, or BSD) provides more freedom than a copyleft license (see, e.g., “The Whys and Hows of Licensing Scientific Code”).

We propose a system that addresses both points of view, with the goal of providing some stickiness of open scientific sharing while still making it possible to work with scientific material that requires restrictions, for example due to privacy rights.

The proposed copyleft license for open data: Open data requires open analysis code.

We suggest the following clause for the reuse of open research data:

Upon publication of any scientific work under a broad definition (including, but not limited to journal papers, books or book chapters, conference proceedings, blog posts) that is based in full or in part on this data set, all data analysis scripts involved in the creation of this work must be made openly available under a license that allows reuse (e.g., BSD or MIT).

(Of course, more topics must be addressed in the license, such as the obligation to properly cite the authors of the data set, not to try to reidentify research participants, etc. But we focus only on the copyleft aspect here.)

This system has some differences from traditional copyleft licenses.

  • First, usually the reuser has to share any derivative, which often is of the same category as the open material (typically: you reuse a piece of software, and have to share your own software product under an open license). In this proposal, you reuse open data, and have to share open analysis code. Hence, you support the openness of a community in another currency. Because there is no need to publish derived data sets, integration scenarios that combine otherwise incompatible open and closed data become possible.
  • Second, it restricts the copyleft property to a certain type of reuse, namely the creation of scientific work. This ensures, on the one hand, that open knowledge grows and scientific claims are verifiable to a larger extent than before. On the other hand, commercial reuse is enabled; furthermore there might be non-scientific reuse scenarios that do not involve analysis code, where the clause is not applicable anyway. Finally, even the most restrictive data set (where you have to go to a repository operator and analyze the data on dedicated computers in a secure room) can generate open derivatives.
  • Third, the license is not sticky: the published open analysis code itself does not carry a copyleft requirement when it is reused. Instead, it has a permissive license.
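The conditions of the proposed clause can be written down as a small decision rule. The following Python sketch is only an illustration of our reading of the clause, not legal text: the class `Reuse`, its field names, and both functions are hypothetical names introduced here for the example.

```python
from dataclasses import dataclass


# Hypothetical model of a reuse scenario. The field names are illustrative
# assumptions and do not come from any actual license text.
@dataclass(frozen=True)
class Reuse:
    uses_licensed_data: bool            # based in full or in part on the data set
    is_published_scientific_work: bool  # paper, book chapter, proceedings, blog post, ...
    has_analysis_scripts: bool          # data analysis scripts were involved


def must_open_analysis_code(reuse: Reuse) -> bool:
    """Return True if the proposed clause would require open analysis scripts."""
    return (
        reuse.uses_licensed_data
        and reuse.is_published_scientific_work
        and reuse.has_analysis_scripts
    )


def must_share_derived_data(reuse: Reuse) -> bool:
    """Under the proposal, derived data sets never have to be published."""
    return False


# A journal paper that analyzes the data triggers the clause.
paper = Reuse(True, True, True)
# A commercial product built on the data is not a published scientific work.
product = Reuse(True, False, True)

print(must_open_analysis_code(paper))    # True
print(must_open_analysis_code(product))  # False
print(must_share_derived_data(paper))    # False
```

Note that nothing in this sketch propagates to later users of the published code: the scripts are released under a permissive license such as BSD or MIT, so reusing them creates no further obligation.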

Against the “research parasite” argument

The proposed system offers some protection against the “research parasites” argument. The parasite discussion refers to the free-rider problem in social dilemmas: While some people invest resources to provide a public good, others (the parasites/free-riders) profit from the public good, without giving back to the community (see also Linek et al., 2017). This often creates a feeling of injustice, and impulses to punish the free-riders. (An entire scientific field is devoted to the structural, sociological, political, and psychological properties and consequences of such social dilemma structures.)

In the proposed licensing system, those who profit from openness by reusing open data must give something back to the community. This increases overall openness, reusability, and reproducibility of scientific outputs, and probably decreases feelings of exploitation and unfairness for the data providers.

Do you think such a license would work? Do you see any drawbacks we didn’t think of?

You can leave feedback here as a comment, on Twitter (@nicebread303) or via email to felix@nicebread.de.


Putting the ‘I’ in open science: How you can change the face of science

If we want to shift from a closed science to an open science, there has to be change at several levels. In this process, it’s easy to push the responsibility (and the power) for reform onto “the system”: “If only journals changed their policy …”, “It’s the responsibility of the granting agencies to change XYZ”, or “Unless university boards change their hiring practices, it is futile to …”.

Beyond doubt, change has to occur at the institutional level. In particular, many journals have already done a lot (see, for example, the TOP guidelines or the new registered reports article format). But journal policies aren’t enough, particularly since they are often not enforced.

In this blog post, I want to advocate for a complementary position of agency and empowerment: Let’s focus on steps each individual can take!

Here I want to show 9 steps that each individual can take, starting today, to foster open science:

What you can do today:

1) Join the community. Follow open science advocates on Twitter and blogs. While monitoring these tweets does not change anything per se, it can give you important updates about developments in open science, and useful hints about how to implement open science in practice. Here’s my personal, selective, and incomplete list of Twitter users that frequently tweet about open science: https://twitter.com/nicebread303/lists/openscience

2) Engage open values in peer review. I have come to realize that my work as a reviewer is very valuable. I review more than 6x the number of papers that I submit myself. I receive more requests than I can handle, so I have to decide anyway which requests to accept and which to decline. Where should I allocate my reviewing resources? I prefer not to allocate them to research that is closed and practically unverifiable. I’d rather allocate them to research that is transparent, verifiable, sustainable, and re-usable.

Exactly this is the goal of the PRO initiative (Peer Reviewer Openness initiative), which uses the reviewer’s role to foster open science practices. The vision of the initiative is to switch from an opt-in model to an opt-out model: Openness is the new default; if authors don’t want it, they have to explicitly opt out. Signatories of the initiative only provide a comprehensive review of a manuscript if (a) open data and open material are provided, or (b) a public justification is given in the manuscript why this is not possible. In the two weeks since the initiative was launched, more than 160 reviewers have signed it. I think this group can already have some impact, and I hope that more will sign.

[Read the paper — Sign the Initiative — More resources for open science]

Previous posts on the PRO initiative by Richard Morey, Candice Morey, Rolf Zwaan, and Daniel Lakens

What you can do this week:

3) Commit yourself to open science. In our “Voluntary Commitment to Research Transparency and Open Science” we explain which principles of research transparency we will follow from the day of signature on (see also my blog post). If you like it, sign it, and show the world that your research can be trusted and verified. Or use it as an inspiration to craft your own transparency commitment, on the openness level that you feel comfortable with.

4) Find local like-minded people. Find colleagues in your department who embrace the values of open science as you do. Found a local open science initiative where you can exchange ideas about challenges, help each other with practical problems (How did that pre-registration work?), and talk about ways open science can be implemented in your specific field. Use this “coalition of the willing” as the starting point for the next step …

What you can do this month:

5) Found a local Open-Science-Committee. Explore whether your local open science initiative could be installed as an official open science committee (OSC) at your department/institution. See our OSF project for information about our open science committee at the Department of Psychology at LMU Munich. Maybe you can reuse and adapt some of our material. Not all of our faculty members have the same opinion about this committee: some are enthusiastic, some are more skeptical. But still, the department’s board unanimously decided to establish this committee in order to keep the discussion going. Our OSC has 32 members from all chairs, and we meet two times each semester. Our OSC has 4 goals:

  • Monitor the international developments in the area of open science and communicate them to the department.
  • Organize workshops that teach skills for open science.
  • Develop concrete suggestions concerning tenure-track criteria, hiring criteria, PhD supervision and grading, teaching, curricula, etc.
  • Channel the discussion concerning standards of research quality and transparency in the department. Explore in what way a department-wide consensus can be established concerning certain points of open science.

6) Pre-register your next study. Pre-registration is a new skill we have to learn, so the first try does not have to be perfect. For example, I had to revise two of my registrations because I forgot important parts in the first version. In my experience, writing a few pre-registration documents gives you a better feeling for how long they take, what they should contain, what level of detail is appropriate, etc.

You can even win $1000 if you participate in the pre-reg challenge!

What you can do next semester and beyond:

7) Teach open science practices to students. You could plan your next Research Methods course as a pre-registered replication study. See also this OSF collection of syllabi, the “Good Science, Bad Science” course from EJ Wagenmakers, and the OSF Collaborative Replications and Education Project (CREP).

8) Submit a registered report. Think about submitting a registered report if there’s a journal in your field that supports this format. In this new article format, an introduction, methods section, and analysis plan are submitted before data is collected. This proposal is sent out for review, and in the positive case you get an in-principle acceptance and proceed to actual data collection. This means that the paper is published regardless of the results (unless you screw up your data collection or analysis).

9) Promote the values of open science in committees. As a member of a job committee, you can argue for open science criteria and evaluate candidates (among other criteria, of course) on whether they engage in open practices. For example, Betsy Levy Paluck wrote in her blog: “In a hiring capacity, I will appreciate applicants who, though they do not have a ton of publications, can link their projects to an online analysis registration, or have posted data and replication code. Why? I will infer that they were slowing down to do very careful work, that they are doing their best to build a cumulative science.”

These are 9 small and medium steps, which each researcher could implement to some extent. If enough researchers join us, we can change the face of research.
