Coding practices¶
Welcome to the Coding Practices section! Here, you'll find essential guidance for organizing projects, writing clean code, setting up your environment, and collaborating using version control.
- Good coding practices: How we structure projects, name files, and write maintainable research code.
- Environment & editor setup: Set up Conda environments, configure Spyder, and learn to debug effectively.
- Version control with Git: Integrate Git and GitHub into your workflow for tracking changes and collaboration.
Why coding practices matter¶
When you code for your research project, remember that you're not just coding for yourself today. You're also coding for:
- Your future self: Six months from now, you might not remember the specifics of your current project.
- Other scientists: Your code might be used or reviewed by researchers with varying coding skills and backgrounds. Writing clean and well-documented code ensures that your work can be understood and built upon by others.
Keeping your code tidy, easy to understand, and maintainable is crucial for effective research collaboration and aligns with the principles of Open Science.
Recommended resources
Make sure to explore our suggested Coding tutorials. We especially recommend The good research code handbook, which provides valuable insights into writing robust research code. Key sections include Writing decoupled code and Keeping things tidy.
Tip
If you're new to coding and many of the terms on this page seem unfamiliar, start by exploring some of the essential tools you'll use. Check out tutorials on Python, Git, and the Unix Shell on the Student starter pack page.
What if I code in MATLAB?
While the information on this page focuses on Python, the principles of writing clean, maintainable code are universal. Debugging, structuring code, and organizing projects apply just as much to MATLAB as they do to Python. Be sure to apply these practices regardless of the language you're using!
Special note for fMRI projects¶
If you're working on fMRI projects, you'll find specific information on setting up your environment in the Set-up your environment page of the fMRI section. This guide includes additional tips for managing data and code in neuroimaging research.
Best practices for organizing code and projects¶
A well-structured project makes your code easier to read and easier to collaborate on. Here are some recommendations:
1. Folder structure¶
Use a logical structure for your project files:
my_project/
├── data/ # Raw data files
├── modules/ # Scripts to store your classes and functions
├── results/ # Output results and figures
├── environment.yml # Conda environment file
└── README.md # Project overview
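If you'd rather not create these folders by hand, the layout above can be scaffolded with a few lines of Python. This is a minimal sketch, not part of the lab's tooling: the function name `scaffold_project` is hypothetical, and the placeholder files it creates are empty.

```python
from pathlib import Path

# Folder and file names follow the layout shown above.
PROJECT_FOLDERS = ("data", "modules", "results")
PROJECT_FILES = ("environment.yml", "README.md")


def scaffold_project(root):
    """Create the recommended folders and empty placeholder files under `root`.

    `scaffold_project` is a hypothetical helper used for illustration only.
    """
    root = Path(root)
    for folder in PROJECT_FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    for filename in PROJECT_FILES:
        # touch() creates the file if missing and does not change existing content
        (root / filename).touch(exist_ok=True)
    return root
```

Calling `scaffold_project("my_project")` creates the tree in the current working directory. Running it again on an existing project is safe, since existing folders and file contents are left as they are.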
2. Naming conventions¶
- Files: Use lowercase letters with underscores (e.g., `data_processing.py`).
- Folders: Use meaningful names that reflect their contents.
- Variables: Use descriptive names (e.g., `participant_id` instead of `id`).
3. General coding tips¶
Tip
Write modular code by breaking down tasks into functions and classes. This approach enhances reusability and readability.
- Avoid "spaghetti code": Keep functions short and focused.
- Use docstrings to document functions and classes.
- Follow PEP 8: Use tools like `black` to ensure code style compliance.
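As an illustration of the tips above, here is a short, focused function with a descriptive name and a docstring. The function `mean_reaction_time` is a made-up example, and the NumPy-style docstring layout is one common convention rather than a requirement of this guide.

```python
def mean_reaction_time(reaction_times_ms):
    """Return the mean reaction time in milliseconds.

    Parameters
    ----------
    reaction_times_ms : list of float
        Reaction times for one participant, in milliseconds.

    Returns
    -------
    float
        The arithmetic mean of the reaction times.

    Raises
    ------
    ValueError
        If `reaction_times_ms` is empty.
    """
    if not reaction_times_ms:
        raise ValueError("reaction_times_ms must contain at least one value")
    return sum(reaction_times_ms) / len(reaction_times_ms)
```

The docstring states what goes in, what comes out, and what can go wrong, so a reader can use the function without reading its body.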
4. Saving results¶
Organizing your results properly is crucial for reproducibility, collaboration, and long-term maintainability of your research code. This section covers how to structure your results folders, save scripts and logs, and use utility functions to streamline these processes.
To keep your project organized, we've provided a set of utility functions that automate common tasks like setting random seeds, creating unique output directories, saving scripts, and configuring logging. These functions should be defined in a separate file called utils.py located in the modules/ directory of your project.
Utility functions in modules/utils.py
The following functions are defined in modules/utils.py (see the box below for the definitions):
- `set_random_seeds(seed=42)`: Sets random seeds for reproducibility.
- `create_run_id()`: Generates a unique identifier based on the current date and time.
- `create_output_directory(directory_path)`: Creates a directory for saving results.
- `save_script_to_file(output_directory)`: Saves the executing script to the output directory.
- `setup_logger(log_file_path, level=logging.INFO)`: Configures logging to log both to the console and a file.
[Code listing: modules/utils.py]
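To make the function list above concrete, here is a minimal sketch of what such a module could look like. It is not the original `modules/utils.py`: only the function names and signatures come from the list above, and every implementation detail is an assumption (for instance, seeding NumPy only when it is installed, using `sys.argv[0]` to locate the running script, and configuring the root logger).

```python
import logging
import os
import random
import shutil
import sys
from datetime import datetime


def set_random_seeds(seed=42):
    """Seed Python's `random` module and, if it is installed, NumPy."""
    random.seed(seed)
    try:
        import numpy as np  # optional dependency (assumption)
    except ImportError:
        return
    np.random.seed(seed)


def create_run_id():
    """Return a timestamp string such as '20241018-153045'."""
    return datetime.now().strftime("%Y%m%d-%H%M%S")


def create_output_directory(directory_path):
    """Create `directory_path` (and any missing parents) and return it."""
    os.makedirs(directory_path, exist_ok=True)
    return directory_path


def save_script_to_file(output_directory):
    """Copy the executing script into `output_directory`.

    Returns the path of the copy, or None when no script file can be
    located (for example in an interactive session).
    """
    script_path = os.path.abspath(sys.argv[0]) if sys.argv and sys.argv[0] else ""
    if not os.path.isfile(script_path):
        return None
    destination = os.path.join(output_directory, os.path.basename(script_path))
    shutil.copy(script_path, destination)
    return destination


def setup_logger(log_file_path, level=logging.INFO):
    """Configure the root logger to write to the console and to a file."""
    logger = logging.getLogger()
    logger.setLevel(level)
    # Remove handlers left over from earlier calls to avoid duplicate messages.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    for handler in (logging.StreamHandler(), logging.FileHandler(log_file_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

If your project uses other sources of randomness (a deep learning framework, for example), `set_random_seeds` would need to seed those as well.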
Using the utility functions in a script
To use the functions defined in utils.py, import them in your script and follow the example below. This will ensure reproducibility and proper organization of your experimental results.
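A self-contained sketch of such a script is shown here. So that it runs on its own, it uses standard-library stand-ins for each helper instead of importing from `modules/utils.py`; in a real project you would call the helpers. The names `build_run_name` and `run_experiment`, the parameter names, and the toy "analysis" are all hypothetical.

```python
import csv
import logging
import os
import random
from datetime import datetime


def build_run_name(run_id, params):
    """Join the run ID and 'name-value' pairs, e.g. '20241018-153045_tempb-5'."""
    parts = [run_id] + [f"{name}-{value}" for name, value in params.items()]
    return "_".join(parts)


def run_experiment(params, results_root="results", seed=42):
    """Run one toy experiment and save its outputs in a new run folder."""
    random.seed(seed)  # stand-in for set_random_seeds(seed)
    run_id = datetime.now().strftime("%Y%m%d-%H%M%S")  # stand-in for create_run_id()
    output_directory = os.path.join(results_root, build_run_name(run_id, params))
    os.makedirs(output_directory, exist_ok=True)  # stand-in for create_output_directory()

    # Stand-in for setup_logger(): log to a file inside the run folder.
    logger = logging.getLogger(f"run.{run_id}")
    logger.setLevel(logging.INFO)
    file_handler = logging.FileHandler(os.path.join(output_directory, "log_output.txt"))
    logger.addHandler(file_handler)
    logger.info("Parameters: %s", params)

    # Placeholder "analysis": draw a few random numbers and save them.
    rows = [(trial, random.random()) for trial in range(5)]
    with open(os.path.join(output_directory, "output_data.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["trial", "value"])
        writer.writerows(rows)

    logger.info("Saved %d rows", len(rows))
    logger.removeHandler(file_handler)
    file_handler.close()
    return output_directory
```

Calling `run_experiment({"tempa": 0.1, "tempb": 5})` creates a folder under `results/` whose name starts with the timestamp and ends with `_tempa-0.1_tempb-5`. Encoding the parameters in the folder name makes it possible to tell runs apart without opening any files.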
Example results folder structure
After running the script, your results might be structured as follows:
results/
├── 20241018-153045_train-pair-temp-ws-softmax_proba-0.2_probb-0.2_probtest-0.6_tempa-0.1_tempb-5_lr-1e-5/
│ ├── log_output.txt # Logs of the run
│ ├── main_script.py # Copy of the script that generated the results
│ ├── output_data.csv # Output data generated by the run
│ └── model_weights.pth # Saved model weights
Why create a results folder for each run?
- Reproducibility: Ensures that each set of results corresponds to a specific code version and parameters.
- Comparison: Makes it easier to compare results between different runs with varying parameters.
- Organization: Keeps your project clean by preventing files from different experiments from mixing together.
With these functions, you can ensure a well-organized, reproducible workflow, making it easier to manage long-term research projects and collaborate with others.
Next steps
Ready to set up your development environment? Head to the Environment & editor setup page to configure Conda and Spyder for your projects.