Installing Data Science Libraries (pip & conda)

Data science libraries you’ll use often

A common starter stack for data analytics includes:

  • NumPy: numerical computing
  • Pandas: data manipulation
  • Matplotlib: plotting foundation
  • Seaborn: statistical visualization
  • Plotly: interactive charts
  • Jupyter: notebooks
  • SciPy (optional at first): scientific utilities
  • scikit-learn (add later): machine learning utilities

pip vs conda (how to choose)

Use conda when

  • You’re using Anaconda/Miniconda
  • You want fewer build/compile issues
  • You need compiled dependencies (common in data science)

Use pip when

  • You installed CPython from python.org
  • You’re inside a venv
  • A package isn’t available via conda
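
If you aren’t sure which of these cases applies, you can ask the running interpreter. The following is a minimal sketch rather than an official check: the function name detect_environment is hypothetical, and it assumes that conda environments contain a conda-meta directory under sys.prefix and that venv environments have a sys.prefix that differs from sys.base_prefix.

```python
import os
import sys


def detect_environment():
    """Guess the kind of environment this interpreter belongs to (heuristic)."""
    # Assumption: conda keeps its package records in <prefix>/conda-meta.
    if os.path.isdir(os.path.join(sys.prefix, "conda-meta")):
        return "conda"
    # Inside a venv, sys.prefix points at the venv and sys.base_prefix at the
    # Python installation the venv was created from.
    if sys.prefix != sys.base_prefix:
        return "venv"
    return "system"


if __name__ == "__main__":
    print("Environment type:", detect_environment())
    print("Prefix:", sys.prefix)
```

A result of "conda" suggests trying conda install first, while "venv" or "system" points to pip.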

Installing with conda

Step 1: Create and activate an environment

command
conda create -n analytics python=3.12
command
conda activate analytics

Step 2: Install the core stack

command
conda install numpy pandas matplotlib seaborn jupyter

Step 3: Install Plotly

Plotly is often available via conda, but some users prefer pip. Try conda first:

command
conda install plotly

If it isn’t available in your configured channels, install it with pip inside the activated environment:

command
pip install plotly

Installing with pip (venv)

Step 1: Create and activate a virtual environment

command
python -m venv .venv
command
source .venv/bin/activate

On Windows, activate with .venv\Scripts\activate instead; the source command above applies to macOS/Linux.

Step 2: Install packages

command
pip install numpy pandas matplotlib seaborn plotly jupyter

Verifying installs in Python

After installing packages, verify them in a Python session or notebook:

verify
import numpy as np
import pandas as pd
import matplotlib
import seaborn as sns
import plotly
 
print("NumPy:", np.__version__)
print("Pandas:", pd.__version__)
print("Matplotlib:", matplotlib.__version__)
print("Seaborn:", sns.__version__)
print("Plotly:", plotly.__version__)
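
The check above stops at the first import that fails. As an alternative, here is a sketch that reports every package in one pass by reading installed package metadata instead of importing. The helper name installed_versions and the PACKAGES list are illustrative assumptions, not part of any library.

```python
from importlib import metadata

# Distribution names as you would pass them to pip or conda.
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "plotly"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Because nothing is imported, this runs quickly and lists all missing packages at once. It does not prove that an import will succeed, so the import-based check is still worth running.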

Installing Jupyter kernel for your environment

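Sometimes Jupyter is installed globally, but you want notebooks to run with your environment’s interpreter and packages. Registering the environment as a kernel does that.

Install ipykernel inside the activated environment: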

command
pip install ipykernel

Register the kernel:

command
python -m ipykernel install --user --name analytics --display-name "Python (analytics)"

Your environment now appears in Jupyter’s kernel list as “Python (analytics)”.
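
To confirm the registration from Python, you can list the kernel specs Jupyter knows about. This is a sketch under assumptions: it relies on the jupyter_client package (normally installed alongside Jupyter) being importable, and the helper name list_kernels is hypothetical.

```python
def list_kernels():
    """Return {kernel_name: resource_dir}, or None if jupyter_client is missing."""
    try:
        from jupyter_client.kernelspec import KernelSpecManager
    except ImportError:
        return None
    return KernelSpecManager().find_kernel_specs()


if __name__ == "__main__":
    kernels = list_kernels()
    if kernels is None:
        print("jupyter_client is not installed in this environment")
    else:
        for name, path in sorted(kernels.items()):
            print(f"{name}: {path}")
```

If the registration command above succeeded, an entry named analytics should be in the list.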

Reproducibility: pinning versions

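For long projects, pin versions so your notebook still runs months later. Recreate the environment from the files below with pip install -r requirements.txt or conda env create -f environment.yml.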

pip: requirements.txt

numpy==2.1.0
pandas==2.2.3
matplotlib==3.9.2
seaborn==0.13.2
plotly==5.24.1
jupyter==1.1.1
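
Pins only help if the environment actually matches them. The sketch below compares name==version lines against what is installed. The helpers parse_pins and find_mismatches are hypothetical, and the parser deliberately handles only exact == pins, skipping any other requirement syntax.

```python
from importlib import metadata


def parse_pins(text):
    """Extract {name: version} from lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # blank lines and non-exact specifiers are skipped
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} where they differ (installed may be None)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        pins = parse_pins(handle.read())
    for name, (wanted, found) in find_mismatches(pins).items():
        print(f"{name}: pinned {wanted}, installed {found}")
```

An empty result means every pinned package is installed at exactly the pinned version.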

conda: environment.yml

name: analytics
channels:
  - conda-forge
dependencies:
  - python=3.12
  - numpy
  - pandas
  - matplotlib
  - seaborn
  - plotly
  - jupyter

Common errors and fixes

Error: ModuleNotFoundError: No module named 'pandas'

  • You installed in one environment but are running Python from another.
  • Solution: activate the correct environment and reinstall.
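
To see which interpreter is running and whether it can find a given module, a short diagnostic like the following can help. It is a sketch: the helper name locate is hypothetical, and pandas is used only as the example module from the error message.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # If this path is not inside your environment, the wrong Python is running.
    print("Interpreter:", sys.executable)
    print("pandas:", locate("pandas") or "not importable from this interpreter")
```

Run it both in the terminal and in the notebook. If the interpreter paths differ, the two are using different environments.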

Error: Jupyter doesn’t show the right kernel

  • Install and register ipykernel as shown above.

Error: pip install succeeds but import fails

  • Check you’re using the intended pip:
    • In a terminal inside the environment, run which pip (macOS/Linux)
    • Or where pip (Windows)
    • The path shown should point inside your environment; if it doesn’t, activate the environment and install again.
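
One way to sidestep a mismatched pip is to invoke pip through the interpreter you intend to use, which is what python -m pip does. The sketch below builds such a command from sys.executable. The helper name pip_command is hypothetical.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # Shows pip's version and the location it runs from for this interpreter.
    result = subprocess.run(pip_command("--version"), capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

The same pattern works for installs, for example pip_command("install", "pandas"), so the package lands in the environment of the interpreter that ran the script.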

Next

Phase 1 is complete. Next we’ll start Phase 2: Numerical Computing (NumPy) with an Introduction to NumPy.
