Skip to content

Python Packages for AI and Machine Learning

Python Packages for AI and Machine Learning

Section titled “Python Packages for AI and Machine Learning”

Python is the default language for many AI workflows largely because of its package ecosystem. This page explains the major package categories and when each one matters.

Do not memorize package names as a flat list. Think in layers.

CategoryWhat it is forCommon examples
Numerical computingArrays, vectorized math, linear algebraNumPy, SciPy
Data analysisTables, joins, cleaning, feature preppandas, Polars
Classical MLTraining and evaluating structured-data modelsscikit-learn, XGBoost
Deep learningNeural network training and inferencePyTorch, TensorFlow
Foundation model appsLLM and embedding-based workflowsopenai, google-genai, anthropic
OrchestrationChains, agents, retrieval pipelinesLangChain, LlamaIndex
Vector storage and retrievalEmbedding search and RAG storageChroma, FAISS, pgvector

Use NumPy when you need arrays, numerical operations, broadcasting, or linear algebra building blocks.

Use pandas when your work is table-oriented: loading CSV files, cleaning records, filtering data, grouping, and joining.

Use scikit-learn when you need classical machine learning workflows such as:

  • preprocessing
  • train/test splits
  • classification and regression
  • evaluation metrics
  • pipelines

Use PyTorch when you need deep learning, model training, tensor operations, or custom neural-network work.

Use provider SDKs when your application calls hosted AI models directly from Python. These packages usually cover:

  • text generation
  • embeddings
  • multimodal requests
  • structured integration into apps or services
Section titled “LangChain and related orchestration libraries”

Use orchestration frameworks when you need more than a single model call:

  • retrieval pipelines
  • agent workflows
  • stateful chains
  • multi-step tool use

Do not introduce these frameworks too early if a direct SDK call is enough.

Terminal window
python -m pip install numpy pandas matplotlib
Terminal window
python -m pip install numpy pandas scikit-learn
Terminal window
python -m pip install openai google-genai python-dotenv
Terminal window
python -m pip install langchain chromadb tiktoken
  • Choose the smallest stack that solves the problem.
  • Prefer direct SDKs before adding orchestration layers.
  • Keep foundational packages separate from product-specific integrations in your mental model.
  • Check compatibility and supported Python versions before standardizing on a stack.