Package managers and environments#
Python packages#
Python can be described as charged with batteries, to say that it already comes packed with a number of modules (functions and classes) to get you started. Still in every day use of Python, you will need additional external functionality which in Python is commonly imported through packages. Packages are the equivalent to libraries in R and can be compared to toolboxes in Matlab.
A Python package contains one or more modules. A module is a .py file that contains reusable code, typically functions or models.
Packages can be seen as analoguous to libraries in R and very similar to toolboxes in Matlab.
Note
What are dependency issues?
Dependency issues occur when multiple packages depend on different version of the same package. Only one single version of package is allowed to be installed in a Python environment. FInding a solution to this issue can be complicated.
For example package A might have been written based on version 1.2 of package B. Now package B releases an update where a function that was called wave_number() is now kalled k(). If package A is not updated and you only install the new version of package B, package A won’t work properly anymore. If you also need package C that was written using the k() function form package B, you can’t simply downgrade package B, otherwise package C won’t work anymore.
Python environments#
What are environments and why should we use them?#
A Python environment can be seen as a separate workspace where you can run Python programs without interfering with other projects. It helps keeping things organized by managing different versions of Python and packages needed for each project. For example, if one project needs Python 3.8 and another needs Python 3.10, you can create separate environments for each, so they don’t mix and cause problems. Tools like virtualenv, venv, and conda or mamba help create and manage these environments easily.
Uisng environments has a number of benefits:
Avoiding Dependency Conflicts
Different projects may require different versions of libraries. A Python environment ensures that each project uses the correct versions without conflicts.Keeping Projects Organized Each environment acts as a separate workspace, so your global Python installation remains clean and uncluttered.
Working with Different Python Versions
You can create environments with different Python versions, which is useful if one project needs Python <3.8 while another needs Python >3.10.Easier Collaboration
When sharing projects, you can provide a list of dependencies (requirements.txt for pip or environment.yml for conda or mamba), allowing others to recreate the exact environment.Experimenting Without Risk
You can safely test new libraries or code changes in a separate environment without affecting your main setup.Better Reproducibility
If you revisit a project months later, having an isolated environment ensures it still runs as expected without issues caused by updated or missing dependencies.Security Benefits
Running code in a virtual environment can prevent accidental modifications to system-wide files and protect your system from potential risks.
Important
Here we will focus on the use of mamba. condaprobably remains the most popular package manager but there are several benefits that mamba has compared to conda:
Much faster than Conda (especially for package resolution)
Uses parallel downloads, making installations and updates quicker
Fully compatible with Conda environments and packages.
Hint
R
In R is is common practice to start new Projects for each project. R Projects are a great way to magae paths and file dependencies, but they lack package management.
Virtual environments like they are commonly used in Python exist in R as well, through renv. renv manages package versions and dependencies separately for each project. This ensures that the code will work even after a long period of time and enhance compatibility with the systems of colleagues.
Using environmnents#
If you decide to use conda instead of mamba, you can replace mamba by conda in all commands listed here.
Creating a new environment
To create a new environment called ´my_env´ from a yml called ´my_env.yml´ file we use:
´´´bash mamba env create -n my_env -f my_env.yml ´´´ or we can create an environment from scratch:
To create a new environment (called
my_env) with a specific Python version (here 3.12) we use:mamba create --name my_env python=3.12
If we don’t specifiy the python version the environment will use the default version (currently 3.9).
We can install specific packages while creating the environment (note more packages can be installed at a later stage):mamba create --name my_env python=3.12 numpy pandas xarray
Packages are simply added to the end of the command, separated by spaces.
mambawill make sure that all dependencies of the different packages are installed and are compatible. If for example a package in the list or a dependecy would not be compatible with the wanted Python verison, an error message would be the result and the environment would not be created.Activating an environment
To start using an environment we use
activate:mamba activate my_env
Deavtivating an environment
To stop using an environment, we use
deactivate(no need to add the environment name at the end):mamba deactivateListing all available environments
mamba env list
or
mamba info --envs
Installing packages in an environmnent
If the environment in which the package should be installed is activated:
mamba install echopype
alternatively, the specific environmnent can be defined:
mamba install --name my_env echopype
Removing a Package from an Environment
To remove a package we use the
removecommand (here we remove the numpy package):mamba remove numpy
Deleting an environment
To completely remove an environment and all its packages:
mamba env remove --name my_env
Exporting an environment (for Sharing)
To save all installed packages into a file:
mamba env export > environment.yml
This file can be shared with others to recreate the environment.
Recreating an environment from a File
To create an exact copy of an environment from an environment.yml file:
mamba env create --file environment.yml
Installing new packages and other useful mamba commands#
Installing, updating, removing packages#
mamba |
Description |
- |
|---|---|---|
mamba install package_name |
Installs a package |
|
mamba remove package_name |
Removes a package |
|
mamba update package_name |
Update a package |
|
mamba update –all |
update all packages in the active environment |
|
mamba list |
list all installed packages |
|
mamba list –name my_env |
list all packages in a specific environmnent |
Cleaning up packages#
mamba |
Description |
- |
|---|---|---|
mamba clean –all |
Remove unneccessary package files |
|
mamba clean –packages |
Remove unsused package versions |
|
mamba clean –tarballs |
Remove temporary package files |
Checking mamba configuration#
View mamba settings:
mamba config --show
add channel for package retrieval (example adds the conda-forge channel):
mamba config --add channels conda-forge
Loading packages in Python#
Once packages are installed in an environment, they can be imported in a Python script with the import command. We will use the math package in the examples below. Functions from within the package can be accesses with dot .notation:
import math
math.pi #output: 3.14159....
we can give packages an alias usign as (here m)
import math as m
m.pi #output: 3.14159....
Tip
It is common practice to import some package using an alias, popular examples include (these aliases are not required but consistent with industry standards and imporve readibility):
Package name |
alias |
|---|---|
numpy |
np |
pandas |
pd |
xarray |
xr |
matplotlib.pyplot |
plt |
seaborn |
sns |
scikit-learn |
sklearn |
echopype |
ep |
datetime |
dt |
statistics |
stats |
BeautifulSoup |
bs4 |
itertools |
it |
collections |
col |
We can import specific functions:
from math import sqrt, pi
sqrt(64) #output: 8.0
When importing specific functions, we do not use the dot notations anymore but can call the functions directly.
we can also load all funcitons from one packake using *, it is considered good practice to only import what you need though:
from math import *
sqrt(64) #output: 8.0
Hint
Comparing loading packages in R and Python
Python |
R |
|---|---|
import packages |
load libraries |
|
|
Specific examples: |
|
|
|