Virtual Environments for Data Science

What is a virtual environment?

A virtual environment is an isolated Python setup for a specific project.

It keeps the following separate from other projects:

Python version (always with conda; with venv, the environment uses whichever interpreter created it)
Installed libraries
Tooling (Jupyter, linters, etc.)

Why it’s essential in data analytics

Data analytics projects often depend on:

Specific versions of NumPy/Pandas/Matplotlib
Jupyter
Database drivers
Visualization libraries

If you install everything globally, you will eventually face:

Version conflicts
“It worked yesterday” problems
Broken notebooks after updates

Virtual environments prevent most of those issues.

Two popular choices

Option 1: `venv` (built-in)

Comes with Python
Lightweight
Uses pip for packages

Option 2: `conda` environments

Great for data science libraries
Handles compiled packages easily
Works with both conda install and pip install

Using `venv` (recommended for pure pip projects)

Create a new environment

From your project folder:

command

python -m venv .venv
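The same step can be scripted from Python with the standard library's venv module, which is useful in project setup scripts. The sketch below is a minimal illustration: the helper name create_project_env is hypothetical, and the demo uses a temporary directory with with_pip=False so that it runs quickly and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir, env_name=".venv", with_pip=True):
    """Create a virtual environment inside project_dir and return its path."""
    env_dir = Path(project_dir) / env_name
    # venv.create is the standard-library counterpart of `python -m venv`.
    # Its own default is with_pip=False, so pip is requested explicitly here.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


# Demo in a throwaway directory; with_pip=False keeps it fast and offline.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(tmp, with_pip=False)
    # Every venv contains a pyvenv.cfg file describing its base interpreter.
    print("pyvenv.cfg present:", (env_dir / "pyvenv.cfg").is_file())
```

In a real project you would pass the project folder and leave with_pip at its default so that pip is available inside the environment.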

Activate the environment

macOS/Linux:

command

source .venv/bin/activate

Windows (PowerShell):

command

.\.venv\Scripts\Activate.ps1
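To confirm that activation worked, you can ask the interpreter itself. This is a small sketch (the function name in_virtual_env is illustrative) that relies on the fact that sys.prefix and sys.base_prefix differ inside a venv. It does not detect conda environments, which are not built with venv.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_env())
print("interpreter:", sys.executable)
```

If the first line prints False after activation, the shell is probably still resolving python to the global interpreter.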

Install packages

command

pip install numpy pandas matplotlib seaborn jupyter
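After installing, it can help to check what actually landed in the environment. The sketch below uses importlib.metadata from the standard library; the helper name installed_versions is illustrative, and the package list simply mirrors the install command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(["numpy", "pandas", "matplotlib", "seaborn", "jupyter"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

Run it with the environment activated; running it with the global interpreter reports the global packages instead.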

Freeze requirements

This records the exact version of every package installed in the environment, so the same set can be reproduced later:

command

pip freeze > requirements.txt

Later someone can recreate the same installs:

command

pip install -r requirements.txt
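One way to check that an environment still matches its requirements file is to compare the pinned versions with what is installed. The following is a rough sketch, not a full requirements parser: it only understands name==version lines (as written by pip freeze) and skips everything else. The helper names parse_pins and find_mismatches, and the sample versions, are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Return {name: version} for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Illustrative sample; in practice read the text from requirements.txt.
sample = "numpy==1.26.4\npandas==2.2.2  # pinned for the report\n"
print(find_mismatches(parse_pins(sample)))
```

An empty dictionary means every pinned package is installed at exactly the pinned version.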

Using conda environments (recommended for analytics stacks)

Create and activate

command

conda create -n analytics python=3.12

command

conda activate analytics
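A notebook or script can check which conda environment it is running in by reading the variables that conda activate sets in the shell. This is a minimal sketch; the function name active_conda_env is illustrative, and the optional environ argument exists only to make the function easy to test.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or None."""
    env = os.environ if environ is None else environ
    # `conda activate` exports both of these variables.
    name = env.get("CONDA_DEFAULT_ENV")
    prefix = env.get("CONDA_PREFIX")
    if not name or not prefix:
        return None
    return name, prefix


print(active_conda_env())
```

After activating the environment created above, the name should be analytics; None means no conda environment is active in the process that launched Python.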

Install packages

command

conda install numpy pandas matplotlib seaborn jupyter

Mixing conda + pip safely

Sometimes a package is not available in conda.

Recommended approach:

Install as much as possible with conda
Then install remaining packages with pip

Example:

command

conda install numpy pandas
pip install yfinance
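In a mixed environment it is sometimes useful to see which tool installed a given package. Installers can record their name in an optional INSTALLER metadata file, which importlib.metadata can read. The sketch below assumes that file is present; pip writes it, and conda packages often do as well, but that is not guaranteed, so treat None as "unknown". The helper name installer_of is illustrative.

```python
from importlib.metadata import PackageNotFoundError, distribution


def installer_of(package):
    """Return the recorded installer name for a package, or None if unknown."""
    try:
        dist = distribution(package)
    except PackageNotFoundError:
        return None
    # INSTALLER is an optional metadata file; read_text returns None if absent.
    recorded = dist.read_text("INSTALLER")
    return recorded.strip() if recorded else None


for name in ["numpy", "pandas", "yfinance"]:
    print(name, "->", installer_of(name))
```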

Environment naming conventions

Good names:

analytics
titanic-eda
viz

Avoid generic names like test or newenv.

Best practices for data analytics projects

Create one environment per project
Pin versions for important packages (especially for long projects)
Keep a requirements.txt (pip) or environment.yml (conda)
Store notebooks inside a project folder
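The version-pinning advice can be checked mechanically. As a minimal sketch, the function below lists requirement lines that lack an exact == pin; the name unpinned_requirements and the sample text are illustrative, and the check ignores the finer points of requirement syntax.

```python
def unpinned_requirements(requirements_text):
    """Return requirement lines that do not pin an exact version with ==."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = "numpy==1.26.4\npandas>=2.0\nmatplotlib\n"
print(unpinned_requirements(sample))  # ['pandas>=2.0', 'matplotlib']
```

For a long-running project, an empty list is the goal for the packages your results depend on.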

Example conda environment file (`environment.yml`)

This is a common way to share environment configuration:

environment.yml

name: analytics
channels:
  - conda-forge
dependencies:
  - python=3.12
  - numpy
  - pandas
  - matplotlib
  - seaborn
  - jupyter
  - pip
  - pip:
      - yfinance

Then create it with:

command

conda env create -f environment.yml
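If you maintain several projects, an environment file like the one shown earlier can be generated from a short script instead of written by hand. The sketch below builds the text with plain string formatting, so it needs no YAML library; the function name render_environment_yml is hypothetical, and it only covers the simple layout used in this example.

```python
def render_environment_yml(name, channels, conda_deps, pip_deps=()):
    """Render a simple conda environment file as text."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # pip itself is listed as a conda dependency before the pip section.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


text = render_environment_yml(
    name="analytics",
    channels=["conda-forge"],
    conda_deps=["python=3.12", "numpy", "pandas", "matplotlib", "seaborn", "jupyter"],
    pip_deps=["yfinance"],
)
print(text)
```

Writing the returned text to environment.yml gives a file that conda env create -f environment.yml can consume.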

Continue to: Installing Data Science Libraries (pip & conda) to learn the best ways to install and verify common analytics libraries.
