Python Packages for AI and Machine Learning
Python Packages for AI and Machine Learning
Section titled “Python Packages for AI and Machine Learning”Python is the default language for many AI workflows largely because of its package ecosystem. This page explains the major package categories and when each one matters.
Start With the Categories
Section titled “Start With the Categories”Do not memorize package names as a flat list. Think in layers.
| Category | What it is for | Common examples |
|---|---|---|
| Numerical computing | Arrays, vectorized math, linear algebra | NumPy, SciPy |
| Data analysis | Tables, joins, cleaning, feature prep | pandas, Polars |
| Classical ML | Training and evaluating structured-data models | scikit-learn, XGBoost |
| Deep learning | Neural network training and inference | PyTorch, TensorFlow |
| Foundation model apps | LLM and embedding-based workflows | openai, google-genai, anthropic |
| Orchestration | Chains, agents, retrieval pipelines | LangChain, LlamaIndex |
| Vector storage and retrieval | Embedding search and RAG storage | Chroma, FAISS, pgvector |
Core Packages You Will Meet Often
Section titled “Core Packages You Will Meet Often”Use NumPy when you need arrays, numerical operations, broadcasting, or linear algebra building blocks.
pandas
Section titled “pandas”Use pandas when your work is table-oriented: loading CSV files, cleaning records, filtering data, grouping, and joining.
scikit-learn
Section titled “scikit-learn”Use scikit-learn when you need classical machine learning workflows such as:
- preprocessing
- train/test splits
- classification and regression
- evaluation metrics
- pipelines
PyTorch
Section titled “PyTorch”Use PyTorch when you need deep learning, model training, tensor operations, or custom neural-network work.
OpenAI and Google GenAI SDKs
Section titled “OpenAI and Google GenAI SDKs”Use provider SDKs when your application calls hosted AI models directly from Python. These packages usually cover:
- text generation
- embeddings
- multimodal requests
- structured integration into apps or services
LangChain and related orchestration libraries
Section titled “LangChain and related orchestration libraries”Use orchestration frameworks when you need more than a single model call:
- retrieval pipelines
- agent workflows
- stateful chains
- multi-step tool use
Do not introduce these frameworks too early if a direct SDK call is enough.
Example Installation Set by Goal
Section titled “Example Installation Set by Goal”Data and analysis
Section titled “Data and analysis”python -m pip install numpy pandas matplotlibClassical machine learning
Section titled “Classical machine learning”python -m pip install numpy pandas scikit-learnLLM application development
Section titled “LLM application development”python -m pip install openai google-genai python-dotenvRAG and orchestration
Section titled “RAG and orchestration”python -m pip install langchain chromadb tiktokenHow To Choose Packages Well
Section titled “How To Choose Packages Well”- Choose the smallest stack that solves the problem.
- Prefer direct SDKs before adding orchestration layers.
- Keep foundational packages separate from product-specific integrations in your mental model.
- Check compatibility and supported Python versions before standardizing on a stack.