
AI & Machine Learning Basics

Artificial Intelligence (AI) is the simulation of human intelligence in machines, enabling them to learn, reason, perceive, and interact. AI systems range from narrow task-specific tools to increasingly general-purpose reasoning systems.

Three levels of AI:

  • Narrow AI (ANI): Designed for specific tasks (ChatGPT, image classifiers, recommendation engines)
  • General AI (AGI): Hypothetical; performs any intellectual task a human can
  • Super AI (ASI): Hypothetical; surpasses human intelligence across all domains

Machine Learning (ML) is a subset of AI where systems learn patterns from data rather than following explicit rules. The model improves with experience.

Supervised Learning: Learns from labeled data (input → output pairs)

  • Examples: spam detection, image classification, price prediction
  • Algorithms: Linear/Logistic Regression, Decision Trees, SVM, Neural Networks
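
As a minimal sketch of supervised learning, the snippet below fits a one-variable linear regression by ordinary least squares on a handful of labeled pairs. The data (floor area mapped to price) and the function names `fit_line` and `predict` are made up for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled data: floor area (m^2) -> price (thousands).
areas = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

model = fit_line(areas, prices)
estimate = predict(model, 70)  # prediction for an unseen input
```

The labels (`prices`) are what make this supervised: the fit is driven entirely by the known input → output pairs.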

Unsupervised Learning: Finds patterns in unlabeled data

  • Examples: customer segmentation, anomaly detection, topic modeling
  • Algorithms: K-Means, DBSCAN, PCA, Autoencoders
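
To make the idea concrete, here is a small K-Means sketch in plain Python. It assumes 2-D points and uses a deterministic farthest-first initialization as a stand-in for the random or k-means++ initialization that libraries typically use. The data and the function names are illustrative only.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    # Farthest-first initialization: start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )

    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data: two visually separate groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 9), (9, 10)]
centroids, clusters = kmeans(points, k=2)
```

No labels are supplied anywhere; the grouping emerges from the distances between points alone.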

Reinforcement Learning: Agent learns by interacting with an environment, receiving rewards/penalties

  • Examples: game-playing AI (AlphaGo), robotics, trading bots
  • Key concepts: Agent, Environment, State, Action, Reward, Policy
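
The key concepts above can be sketched with tabular Q-learning on a toy environment. Everything here is an assumption made for illustration: a five-cell corridor where the agent starts in cell 0 and earns a reward of 1 on reaching cell 4, plus arbitrary values for the learning rate, discount, and exploration rate.

```python
import random

N_STATES = 5       # State: corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)   # Action: 0 = move left, 1 = move right


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Policy: pick the highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy: explore sometimes, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Q-learning update toward reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Learned greedy policy for the non-terminal cells 0..3.
policy = [greedy_action(row, random.Random(0)) for row in q_table[:-1]]
```

The agent is never told that "right" is correct; it infers this from the reward signal propagated backwards through the Q-table.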

Self-Supervised Learning: Model generates its own labels from input structure

  • Examples: GPT (predicts next token), BERT (predicts masked tokens), CLIP
  • Foundation of modern LLMs
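
A rough sketch of how labels can be derived from the input itself: the two helpers below build next-token training pairs (GPT-style) and a masked-token example (BERT-style) from a raw sentence. Whitespace tokenization and the helper names are simplifying assumptions; real models use subword tokenizers and mask positions at random.

```python
def next_token_pairs(tokens, context_size):
    """(context, target) pairs: the target is simply the token that follows."""
    return [
        (tokens[i - context_size:i], tokens[i])
        for i in range(context_size, len(tokens))
    ]


def mask_token(tokens, position, mask="[MASK]"):
    """Hide one token; the hidden token becomes the label."""
    masked = list(tokens)
    label = masked[position]
    masked[position] = mask
    return masked, label


tokens = "the cat sat on the mat".split()

pairs = next_token_pairs(tokens, context_size=2)
masked_input, masked_label = mask_token(tokens, position=1)
```

No human annotation is involved in either case: the "label" is a piece of the original text that was held back.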

Deep Learning uses multi-layer neural networks to learn hierarchical representations:

Architecture        | Use Case
CNN (Convolutional) | Images, video
RNN / LSTM / GRU    | Sequences, time series
Transformer         | Text, code, multimodal
Diffusion Models    | Image/audio generation
GAN                 | Synthetic data generation
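
To illustrate what "multi-layer" buys, the sketch below runs a forward pass through a two-layer network with a ReLU hidden layer. The weights are hand-picked (not learned) so that the network computes XOR, a function no single linear layer can represent. The layer layout and function names are assumptions for this example.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights has one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, layers):
    """Apply each layer in turn, with ReLU between layers but not after the last."""
    for index, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if index < len(layers) - 1:
            x = relu(x)
    return x


# Hand-picked weights (illustrative, not trained):
# hidden = relu([x1 + x2, x1 + x2 - 1]); output = hidden[0] - 2 * hidden[1]
layers = [
    ([[1, 1], [1, 1]], [0, -1]),
    ([[1, -2]], [0]),
]

xor_outputs = [mlp_forward([a, b], layers)[0] for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The hidden layer re-represents the inputs so that the final linear layer can separate them, which is the core idea behind learning hierarchical representations.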

Transformers (introduced in "Attention Is All You Need", 2017) are the backbone of modern AI:

  • Self-attention mechanism allows the model to weigh relationships between all tokens simultaneously
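  • Processes all tokens in parallel, so training scales efficiently with data and compute (though attention cost grows quadratically with sequence length)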
  • Powers GPT, BERT, T5, LLaMA, Gemini, Claude, and virtually all modern LLMs
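
The self-attention step can be sketched as scaled dot-product attention in plain Python. This is a single-head simplification that passes the raw token vectors as queries, keys, and values; real transformers first apply learned projection matrices and use multiple heads. The toy vectors are invented for the example.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this token's query to every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[col] for w, v in zip(row, values))
            for col in range(len(values[0]))
        ])
    return outputs, weights


# Three hypothetical token vectors of dimension 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and describes how strongly one token attends to every token in the sequence, including itself.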

LLMs are transformer-based models trained on massive text corpora. Key concepts:

  • Pre-training: Learn language patterns from billions of tokens
  • Fine-tuning: Adapt to specific tasks with smaller labeled datasets
  • RLHF (Reinforcement Learning from Human Feedback): Aligns the model to human preferences
  • Prompt engineering: Crafting inputs to guide model behavior
  • RAG (Retrieval-Augmented Generation): Augments the LLM with external knowledge at inference time
  • Context window: Maximum number of tokens the model can process at once (grown from roughly 4K to 1M+ tokens by 2026)
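
As one illustration of these concepts, the retrieval half of RAG can be sketched as follows: score each document against the query, keep the best match, and place it in the prompt. Bag-of-words cosine similarity stands in for a real embedding model, and the documents, prompt template, and function names are all hypothetical. The resulting prompt would then be sent to whichever LLM is in use.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (a real system would use a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Augment the question with retrieved context before generation."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


documents = [
    "Transformers use self-attention to relate tokens",
    "K-Means groups unlabeled points into clusters",
    "RLHF aligns a model to human preferences",
]
query = "how does self-attention relate tokens"

top_document = retrieve(query, documents)[0]
prompt = build_prompt(query, documents)
```

Because the retrieved text must fit alongside the question, the context window sets an upper bound on how much external knowledge can be supplied per request.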

Agentic AI: Models that plan, use tools, and execute multi-step tasks autonomously (see AI Agents)

Multimodal Models: Process text, images, audio, video, and code together (GPT-4o, Gemini 2.0, Claude 3.5)

Reasoning Models: Perform dedicated "thinking" before answering (OpenAI o1/o3, DeepSeek R1)

Small Language Models (SLMs): Efficient models for edge/on-device use (Phi-4, Gemma 3, Llama 3.2 1B)

AI Coding Assistants: GitHub Copilot, Cursor, and Kiro are deeply integrated into developer workflows

Open Source AI: LLaMA 4, Mistral, Qwen 3, and DeepSeek V3 are competitive with closed models

                 | AI                   | ML                   | DL
Scope            | Broadest             | Subset of AI         | Subset of ML
Approach         | Rules + learning     | Statistical learning | Neural networks
Data needs       | Varies               | Moderate             | Large
Interpretability | Varies               | Often interpretable  | Often black-box
Examples         | Expert systems, LLMs | Random forests, SVM  | GPT, ResNet, DALL-E

1. Define problem & success metrics
2. Collect & label data
3. Exploratory Data Analysis (EDA)
4. Feature engineering & preprocessing
5. Model selection & training
6. Evaluation (accuracy, F1, AUC, etc.)
7. Hyperparameter tuning
8. Deployment & monitoring
9. Iterate
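
Steps 5 to 7 of the workflow above can be sketched end to end on toy data: hold out a validation set, train a model, evaluate it, and tune one hyperparameter. The dataset, the 1-D k-nearest-neighbours model, and the candidate values of `k` are assumptions chosen to keep the example dependency-free.

```python
import random


def train_val_split(rows, val_fraction=0.25, seed=0):
    """Shuffle, then hold out a fraction of rows for validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - val_fraction))
    return rows[:cut], rows[cut:]


def knn_predict(train, x, k):
    """Majority vote among the k training rows closest to x."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if votes * 2 > len(neighbours) else 0


def accuracy(train, rows, k):
    return sum(knn_predict(train, x, k) == y for x, y in rows) / len(rows)


def tune_k(train, val, candidates=(1, 3, 5)):
    """Hyperparameter tuning: keep the k with the best validation accuracy."""
    return max(candidates, key=lambda k: accuracy(train, val, k))


# Hypothetical labeled data: label is 1 when the feature is at least 10.
rows = [(x, int(x >= 10)) for x in range(20)]

train_rows, val_rows = train_val_split(rows)
best_k = tune_k(train_rows, val_rows)
val_accuracy = accuracy(train_rows, val_rows, best_k)
```

Tuning against a held-out set rather than the training rows is what keeps the reported accuracy an honest estimate; a separate test set would be needed for a final, unbiased number.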

Category            | Tools
Languages           | Python, R, Julia
ML Libraries        | scikit-learn, XGBoost, LightGBM
Deep Learning       | PyTorch, TensorFlow/Keras, JAX
LLM Frameworks      | Hugging Face Transformers, LangChain, LlamaIndex
Data                | Pandas, NumPy, Polars, DuckDB
Visualization       | Matplotlib, Seaborn, Plotly
Experiment Tracking | MLflow, Weights & Biases, Neptune
Deployment          | FastAPI, BentoML, Triton, vLLM
Cloud AI            | AWS SageMaker, Azure ML, Google Vertex AI

  • Healthcare: Medical imaging, drug discovery, clinical decision support
  • Finance: Fraud detection, algorithmic trading, credit scoring
  • Retail: Recommendation systems, demand forecasting, dynamic pricing
  • Automotive: Autonomous driving, predictive maintenance
  • Software: Code generation, bug detection, automated testing
  • Science: Protein folding (AlphaFold), climate modeling, materials discovery

Task           | Metrics
Classification | Accuracy, Precision, Recall, F1, AUC-ROC
Regression     | MAE, MSE, RMSE, R²
Generation     | BLEU, ROUGE, BERTScore, human eval
Ranking        | NDCG, MRR
Clustering     | Silhouette score, Davies-Bouldin
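
For reference, several of the classification and regression metrics listed above can be computed by hand as in the sketch below. It assumes binary labels encoded as 0/1, and the sample predictions are invented; libraries such as scikit-learn provide tested implementations of the same quantities.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MAE, MSE, and RMSE for numeric predictions."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical predictions.
clf = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 0])
reg = regression_metrics([3, 5, 2], [2, 5, 4])
```

In this example there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75. The regression errors are -1, 0, and 2, giving an MAE of 1.0.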