Open Science Resources

CZI Open Science Program

open science for biomedical research

View the Project on GitHub chanzuckerberg/open-science

Code development

This topic page focuses on general programming practices, as well as developing programming skills to create reproducible code. The information is appropriate for people new to coding, programmers who are new to data-intensive research, and software developers interested in developing open source research software.

Fundamental concepts in coding

The following resources provide a general overview of best practices in computational research:

Learning to code

If you are a clinical or wet-lab researcher who is interested in learning to code, the article Ten simple rules for biologists learning to program explains the process of coding and compares some of the most common programming languages.

For tutorials that will teach you to code, please see the Resources section for organizations involved in training.

Version control with Git and GitHub

Version control is a foundational tool used in code development to track changes that occur to files over time. Git is version control software frequently used in research computing, and GitHub is a corresponding hosting platform on which projects tracked with Git can be viewed, shared, and used.

GitHub Learning Lab contains a large number of free tutorials for tasks relevant to writing and sharing research code, including:

Contributing to a project

Beginning work on your own project can be daunting, so it may help to look at how similar projects operate. Many software development projects for research tools are hosted on GitHub and welcome outside contributions. If you’re not familiar with GitHub, please see the version control section to get started.

Most projects outline their own contribution process. The following resources include some general guidelines for contributing to other people’s open source projects:

Code architecture and design

Under construction

Refactoring catalog

Strategies for debugging

Under construction

Programming with Python: Debugging from Software Carpentry

Testing and continuous integration

Under construction

Testing Software from Research Software Engineering with Python
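As a small illustration of what an automated test looks like, the sketch below checks a toy function with plain assertions. The function `gc_content` and its test cases are hypothetical examples, not taken from the resource above. The `test_` naming follows the convention used by test runners such as pytest, but the script also runs on its own without any extra tools.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence.

    Hypothetical helper used only to have something to test.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def test_gc_content_mixed():
    # Two of the four bases are G or C.
    assert gc_content("ATGC") == 0.5


def test_gc_content_is_case_insensitive():
    # Lowercase input should be treated the same as uppercase.
    assert gc_content("ggcc") == 1.0


def test_gc_content_rejects_empty_input():
    # An empty sequence has no defined GC fraction, so an error is expected.
    try:
        gc_content("")
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")


if __name__ == "__main__":
    # Run every test directly; a test runner would discover these by name.
    test_gc_content_mixed()
    test_gc_content_is_case_insensitive()
    test_gc_content_rejects_empty_input()
    print("all tests passed")
```

Keeping each test small and named after the behavior it checks makes a failure easier to trace, and the same file can later be run automatically by a continuous integration service.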

Writing software for others to use

While the section above highlights foundational concepts in programming, this section focuses on writing code that other researchers will be able to reuse (including your future self!).

For a comprehensive overview of tools and approaches for developing software, please see Research Software Engineering with Python. This book ties together the concepts highlighted in the section above and in the following subsections.

Documentation

Under construction

Please see Tutorials and other user support on the Collaborative research software engineering projects page for more information about developing more extensive and formally structured user documentation for software.

Licensing

Under construction

Include a license from Research Software Engineering with Python

Code citations

Clearly communicating how you’d like your software to be cited is essential for receiving credit for the work you do in software development. The following are separate (but complementary) approaches to ensuring you can track when other researchers publish research using your software.

For more information on these issues, please see products from the FORCE11 Software Citation Implementation Working Group.

Make it clear to users how your software should be cited in scholarly work. Some developers include citation instructions within the user interface, while in other cases the preferred citation appears in the GitHub repository’s README or on the software’s documentation website. GitHub also supports standard CITATION.cff files, which it automatically renders in common bibliographic formats.

Engage in peer review of your code. Peer review of software is a separate process from the standard peer review process for scientific research (see below), and focuses more on the standards of the code itself (which is often not formally reviewed with standard publication review). The following organizations feature peer review mechanisms for software packages:

Publish a manuscript about your software. Unlike peer review of software (see above), a publication describing your software is reviewed mainly on the narrative of the article rather than on the code itself. The following sources provide more information about how and where to publish a peer-reviewed manuscript on your software:
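Returning to the first approach above (stating a preferred citation), the sketch below shows roughly what a minimal CITATION.cff file contains by generating one from Python. The key names follow the Citation File Format 1.2.0 schema as we understand it. The function name `build_citation_cff` and every value passed to it (title, author, version, date, URL) are hypothetical placeholders, and the simple quoting used here assumes the values contain no double quotes.

```python
def build_citation_cff(title, authors, version, date_released, repository_url):
    """Return the text of a minimal CITATION.cff file.

    `authors` is a list of (family_name, given_name) tuples.
    All inputs are assumed to be plain strings without double quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for family, given in authors:
        # Each author is one YAML list item with two keys.
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    lines.append(f'version: "{version}"')
    lines.append(f"date-released: {date_released}")  # ISO format, YYYY-MM-DD
    lines.append(f'repository-code: "{repository_url}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder metadata for illustration only.
    text = build_citation_cff(
        title="example-tool",
        authors=[("Doe", "Jane")],
        version="0.1.0",
        date_released="2024-01-15",
        repository_url="https://example.org/example-tool",
    )
    print(text)
```

In practice the file is usually written by hand and saved as CITATION.cff in the root of the repository. Generating it from a script is mainly useful when the version and release date should stay in sync with the package metadata.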

User interfaces

Under construction

GUIs for research software: Why are they relevant?

Resources for accessibility:

Interoperability and shared infrastructure

Under construction

Joint Roadmap for Open Science Tools, now Invest in Open Infrastructure

Containerizing code

Under construction

Ten simple rules for writing Dockerfiles for reproducible data science