AI & Machine Learning Basics

What is Artificial Intelligence?

Artificial Intelligence (AI) is the simulation of human intelligence in machines, enabling them to learn, reason, perceive, and interact. AI systems range from narrow, task-specific tools to increasingly general-purpose reasoning systems.

Three levels of AI:

Narrow AI (ANI) — Designed for specific tasks (ChatGPT, image classifiers, recommendation engines)
General AI (AGI) — Hypothetical: performs any intellectual task a human can
Super AI (ASI) — Hypothetical: surpasses human intelligence across all domains

What is Machine Learning?

Machine Learning (ML) is a subset of AI in which systems learn patterns from data rather than following explicitly programmed rules. A model's performance on a task typically improves as it is trained on more examples.

Types of Machine Learning

Supervised Learning — Learns from labeled data (input → output pairs)

Examples: spam detection, image classification, price prediction
Algorithms: Linear/Logistic Regression, Decision Trees, SVM, Neural Networks
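
As a minimal sketch of supervised learning, the snippet below fits a straight line to labeled (input, output) pairs with ordinary least squares. The data (floor area mapped to price) and the helper name fit_line are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: floor area (m^2) -> price (in thousands).
# The prices were constructed as exactly 3 * area.
areas = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(areas, prices)

# Predict the price of an unseen 70 m^2 home from the learned parameters.
predicted = slope * 70 + intercept
```

The labels (prices) are what make this supervised: the fit is driven by the gap between predictions and known outputs.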

Unsupervised Learning — Finds patterns in unlabeled data

Examples: customer segmentation, anomaly detection, topic modeling
Algorithms: K-Means, DBSCAN, PCA, Autoencoders
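
To illustrate how K-Means finds groups without labels, here is a small sketch of Lloyd's algorithm on one-dimensional data. The spend values, the starting centroids, and the function name kmeans_1d are assumptions made for the example; real use would normally rely on a library such as scikit-learn.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical monthly spend for six customers, with no labels attached.
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spend, centroids=[0, 50])
```

The algorithm separates low spenders from high spenders purely from the structure of the data, which is the essence of customer segmentation.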

Reinforcement Learning — Agent learns by interacting with an environment, receiving rewards/penalties

Examples: game-playing AI (AlphaGo), robotics, trading bots
Key concepts: Agent, Environment, State, Action, Reward, Policy
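
The key concepts above can be sketched with tabular Q-learning on a toy environment. Everything here is hypothetical: a five-cell corridor where the agent starts in cell 0 and receives a reward of 1 for reaching cell 4, plus illustrative values for the learning rate, discount factor, and random seed. The agent explores with a uniformly random behaviour policy, which is acceptable because Q-learning is off-policy.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action index 0 = move left, 1 = move right
GAMMA, ALPHA = 0.9, 0.5

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]

for episode in range(500):
    state = 0
    for t in range(1000):               # step cap per episode
        a = rng.randrange(2)            # explore uniformly at random
        next_state, reward, done = step(state, ACTIONS[a])
        # Q-learning target: reward plus discounted best value of the next state.
        target = reward + (0.0 if done else GAMMA * max(q[next_state]))
        q[state][a] += ALPHA * (target - q[state][a])
        state = next_state
        if done:
            break

# Policy: the greedy action index for each non-goal state.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Here the state is the cell index, the action is a step left or right, the reward arrives only at the goal, and the policy is read off the learned table by picking the highest-valued action in each state.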

Self-Supervised Learning — Model generates its own labels from input structure

Examples: GPT (predicts next token), BERT (predicts masked tokens), CLIP
Foundation of modern LLMs
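
A short sketch of how next-token prediction derives labels from the input itself: each position in a token sequence becomes a training target, with the preceding tokens as context. The whitespace tokenizer and the function name next_token_pairs are simplifying assumptions; production models use subword tokenizers and much longer contexts.

```python
def next_token_pairs(tokens, context_size):
    """Turn raw tokens into (context, target) pairs without manual labels."""
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        pairs.append((context, tokens[i]))
    return pairs

# Whitespace splitting stands in for a real tokenizer.
tokens = "the cat sat on the mat".split()
pairs = next_token_pairs(tokens, context_size=2)

for context, target in pairs:
    print(context, "->", target)
```

No human annotated anything: the "label" for each example is simply the token that follows in the raw text.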

Deep Learning

Deep Learning uses multi-layer neural networks to learn hierarchical representations:

Architecture	Use Case
CNN (Convolutional)	Images, video
RNN / LSTM / GRU	Sequences, time series
Transformer	Text, code, multimodal
Diffusion Models	Image/audio generation
GAN	Synthetic data generation
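
To show why stacking layers matters, the sketch below runs a forward pass through a tiny two-layer network that computes XOR, a function no single linear layer can represent. The weights are set by hand rather than learned, and the helper names (relu, dense, forward) are illustrative only.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of unit j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

# Hand-set (not learned) parameters for a 2-2-1 network.
W1, b1 = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
W2, b2 = [[1.0, -2.0]], [0.0]

def forward(x):
    hidden = relu(dense(x, W1, b1))      # first layer builds intermediate features
    return dense(hidden, W2, b2)[0]      # second layer combines them

xor_outputs = [forward([a, b]) for a in (0.0, 1.0) for b in (0.0, 1.0)]
```

The hidden layer computes intermediate features that the output layer combines. In practice these weights are learned by gradient descent rather than written by hand.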

The Transformer Revolution

Transformers (introduced in “Attention Is All You Need”, 2017) are the backbone of modern AI:

Self-attention mechanism allows the model to weigh relationships between all tokens simultaneously
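Processes all tokens in parallel during training, so it scales well with data and compute (although the cost of self-attention grows quadratically with sequence length)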
Powers GPT, BERT, T5, LLaMA, Gemini, Claude, and virtually all modern LLMs
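
The self-attention mechanism can be sketched in a few lines of plain Python. This is a simplified illustration, assuming identity projections (queries, keys, and values are all the raw input vectors) and a single head; real Transformers learn separate projection matrices and use several heads. The toy embeddings are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x."""
    d = len(x[0])
    outputs, all_weights = [], []
    for query in x:
        # Score this token against every token, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in x
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        output = [
            sum(w * value[j] for w, value in zip(weights, x))
            for j in range(d)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical token embeddings of dimension 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of weights sums to 1 and describes how strongly one token attends to every other token, which is what lets the model relate all positions to each other in a single step.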

Large Language Models (LLMs)

LLMs are transformer-based models trained on massive text corpora. Key concepts:

Pre-training — Learn language patterns from billions of tokens
Fine-tuning — Adapt to specific tasks with smaller labeled datasets
RLHF (Reinforcement Learning from Human Feedback) — Aligns the model with human preferences
Prompt engineering — Crafting inputs to guide model behavior
RAG (Retrieval-Augmented Generation) — Augments the LLM's prompt with external knowledge retrieved at inference time
Context window — Maximum number of tokens the model can process at once (from roughly 4K in earlier models to 1M+ in some models by 2026)
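
The retrieval step of RAG can be sketched as follows. This is a toy illustration, assuming bag-of-words cosine similarity in place of the learned embeddings and vector stores that real systems typically use. The documents, question, and function names are hypothetical, and the final call to an LLM is left out.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lower-cased word counts; a stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(
        documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True
    )
    return ranked[:k]

documents = [
    "The context window is the maximum number of tokens a model can process at once",
    "Fine-tuning adapts a pre-trained model to a specific task",
    "RLHF aligns a model to human preferences",
]
question = "What is the context window"

# Retrieved text is placed in the prompt so the model can ground its answer.
context = retrieve(question, documents, k=1)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

Because the retrieved passages are inserted into the prompt, they also count against the context window, so retrieval quality and passage length both matter.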

AI in 2026: Key Trends

Agentic AI — Models that plan, use tools, and execute multi-step tasks autonomously (see AI Agents)

Multimodal Models — Process combinations of text, images, audio, video, and code (GPT-4o, Gemini 2.0, Claude 3.5)

Reasoning Models — Spend extra inference-time computation on step-by-step “thinking” before answering (OpenAI o1/o3, DeepSeek R1)

Small Language Models (SLMs) — Efficient models for edge/on-device use (Phi-4, Gemma 3, Llama 3.2 1B)

AI Coding Assistants — Tools such as GitHub Copilot, Cursor, and Kiro are deeply integrated into developer workflows

Open Source AI — Open-weight models such as LLaMA 4, Mistral, Qwen 3, and DeepSeek V3 are competitive with closed models

AI vs ML vs DL

	AI	ML	DL
Scope	Broadest	Subset of AI	Subset of ML
Approach	Rules + learning	Statistical learning	Neural networks
Data needs	Varies	Moderate	Large
Interpretability	Varies	Often interpretable	Often black-box
Examples	Expert systems, LLMs	Random forests, SVM	GPT, ResNet, DALL-E

Core ML Workflow

1. Define problem & success metrics
2. Collect & label data
3. Exploratory Data Analysis (EDA)
4. Feature engineering & preprocessing
5. Model selection & training
6. Evaluation (accuracy, F1, AUC, etc.)
7. Hyperparameter tuning
8. Deployment & monitoring
9. Iterate
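
A compressed sketch of the middle of this workflow (data, training, evaluation) is shown below. It is deliberately simplistic: the dataset is synthetic, the "model" is a single decision threshold, and names such as best_threshold are hypothetical. The point is the separation between training data and held-out test data.

```python
import random

# Step 2: hypothetical labeled data, (feature, label) with label = 1 when feature >= 10.
data = [(x, int(x >= 10)) for x in range(20)]

# Hold out a test set before training so evaluation uses unseen examples.
rng = random.Random(42)
rng.shuffle(data)
split = int(0.75 * len(data))
train, test = data[:split], data[split:]

def accuracy(threshold, rows):
    """Fraction of rows where the rule 'feature >= threshold' matches the label."""
    return sum(int(x >= threshold) == y for x, y in rows) / len(rows)

# Steps 5 and 7: "training" is a search for the threshold that best fits the training set.
best_threshold = max(range(21), key=lambda t: accuracy(t, train))

# Step 6: evaluate on the held-out test set, never on the training data.
train_accuracy = accuracy(best_threshold, train)
test_accuracy = accuracy(best_threshold, test)
```

Comparing train_accuracy with test_accuracy is the simplest check for overfitting: a large gap suggests the model has memorized the training data rather than learned a general rule.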

Key Tools & Frameworks

Category	Tools
Languages	Python, R, Julia
ML Libraries	scikit-learn, XGBoost, LightGBM
Deep Learning	PyTorch, TensorFlow/Keras, JAX
LLM Frameworks	Hugging Face Transformers, LangChain, LlamaIndex
Data	Pandas, NumPy, Polars, DuckDB
Visualization	Matplotlib, Seaborn, Plotly
Experiment Tracking	MLflow, Weights & Biases, Neptune
Deployment	FastAPI, BentoML, Triton, vLLM
Cloud AI	AWS SageMaker, Azure ML, Google Vertex AI

Real-World Applications

Healthcare — Medical imaging, drug discovery, clinical decision support
Finance — Fraud detection, algorithmic trading, credit scoring
Retail — Recommendation systems, demand forecasting, dynamic pricing
Automotive — Autonomous driving, predictive maintenance
Software — Code generation, bug detection, automated testing
Science — Protein folding (AlphaFold), climate modeling, materials discovery

Evaluation Metrics

Task	Metrics
Classification	Accuracy, Precision, Recall, F1, AUC-ROC
Regression	MAE, MSE, RMSE, R²
Generation	BLEU, ROUGE, BERTScore, human eval
Ranking	NDCG, MRR
Clustering	Silhouette score, Davies-Bouldin
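
Several of the classification and regression metrics in the table can be computed directly from predictions. The sketch below uses small hypothetical label and prediction lists; in practice a library such as scikit-learn provides these functions.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def regression_metrics(y_true, y_pred):
    """MAE, MSE, and RMSE for numeric predictions."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Hypothetical ground truth and predictions.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```

Precision and recall diverge from accuracy when classes are imbalanced, which is why they are usually reported together for tasks such as fraud or spam detection.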
