AI & Machine Learning Basics
What is Artificial Intelligence?
Artificial Intelligence (AI) is the simulation of human intelligence in machines, enabling them to learn, reason, perceive, and interact. AI systems range from narrow task-specific tools to increasingly general-purpose reasoning systems.
Three levels of AI:
- Narrow AI (ANI): Designed for specific tasks (ChatGPT, image classifiers, recommendation engines)
- General AI (AGI): Hypothetical; performs any intellectual task a human can
- Super AI (ASI): Hypothetical; surpasses human intelligence across all domains
What is Machine Learning?
Machine Learning (ML) is a subset of AI where systems learn patterns from data rather than following explicit rules. The model improves with experience.
Types of Machine Learning
Supervised Learning: Learns from labeled data (input → output pairs)
- Examples: spam detection, image classification, price prediction
- Algorithms: Linear/Logistic Regression, Decision Trees, SVM, Neural Networks
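As a minimal sketch of supervised learning, the snippet below fits a one-variable linear regression by ordinary least squares using only the standard library. The `fit_line` helper and the house-size data are hypothetical and chosen for illustration; in practice a library such as scikit-learn would normally be used.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: house size (input) paired with price (output label).
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 70 + intercept  # prediction for an input not seen in training
```

The labeled pairs are what make this supervised: the model is fitted so that its outputs match the known prices, then applied to a new size.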
Unsupervised Learning: Finds patterns in unlabeled data
- Examples: customer segmentation, anomaly detection, topic modeling
- Algorithms: K-Means, DBSCAN, PCA, Autoencoders
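To illustrate how an unsupervised algorithm finds structure without labels, here is a small sketch of K-Means (Lloyd's algorithm) on one-dimensional data. The `kmeans_1d` function, the starting centroids, and the customer-spend values are assumptions made for this example.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters


# Hypothetical monthly spend for six customers; no labels are provided.
spend = [10, 12, 11, 50, 52, 51]
centroids, clusters = kmeans_1d(spend, centroids=[0, 100])
```

With this data the two centroids settle on the low-spend and high-spend groups, which is the kind of segmentation listed in the examples above.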
Reinforcement Learning: Agent learns by interacting with an environment, receiving rewards/penalties
- Examples: game-playing AI (AlphaGo), robotics, trading bots
- Key concepts: Agent, Environment, State, Action, Reward, Policy
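The key concepts above can be sketched with tabular Q-learning on a toy environment. Everything here is an illustrative assumption: a five-cell corridor in which the agent starts in cell 0 and receives a reward of 1.0 on reaching the last cell, plus arbitrary values for the learning rate, discount factor, and exploration rate.

```python
import random

N_STATES = 5          # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1   # reaching the last cell ends the episode
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Environment: move one cell; reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Policy: explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
                action = rng.randrange(2)
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal cells, read off the learned Q-table.
policy = [RIGHT if row[RIGHT] > row[LEFT] else LEFT for row in q_table[:GOAL]]
```

After training, the greedy policy moves right in every non-terminal cell, since that is the only way to collect the reward.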
Self-Supervised Learning: Model generates its own labels from input structure
- Examples: GPT (predicts next token), BERT (predicts masked tokens), CLIP
- Foundation of modern LLMs
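The idea of generating labels from the input itself can be sketched as follows. Both helper functions are hypothetical, and whitespace splitting stands in for a real tokenizer; production models use subword tokenizers and far larger contexts.

```python
def next_token_pairs(tokens, context_size=3):
    """GPT-style objective: each token is the label for the tokens before it."""
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


def masked_token_example(tokens, position, mask="[MASK]"):
    """BERT-style objective: hide one token and use it as the label."""
    corrupted = tokens[:position] + [mask] + tokens[position + 1:]
    return corrupted, tokens[position]


tokens = "the cat sat on the mat".split()  # unlabeled raw text
pairs = next_token_pairs(tokens)
masked_input, masked_label = masked_token_example(tokens, position=2)
```

No human annotation is involved: every training target comes from the text that was already there.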
Deep Learning
Deep Learning uses multi-layer neural networks to learn hierarchical representations:
| Architecture | Use Case |
|---|---|
| CNN (Convolutional) | Images, video |
| RNN / LSTM / GRU | Sequences, time series |
| Transformer | Text, code, multimodal |
| Diffusion Models | Image/audio generation |
| GAN | Synthetic data generation |
The Transformer Revolution
Transformers (introduced in “Attention Is All You Need”, 2017) are the backbone of modern AI:
- Self-attention mechanism allows the model to weigh relationships between all tokens simultaneously
- Processes tokens in parallel rather than sequentially, so training scales efficiently with data and compute
- Powers GPT, BERT, T5, LLaMA, Gemini, Claude, and virtually all modern LLMs
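The self-attention step can be sketched in plain Python. This is a simplified single-head version that assumes identity projections, meaning the queries, keys, and values are the token embeddings themselves; real Transformers learn separate projection matrices and use multiple heads. The toy embeddings are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(embeddings[0])
    weights = []
    for query in embeddings:
        # Score this token against every token, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in embeddings
        ]
        weights.append(softmax(scores))
    # Each output is a weighted average of all token vectors.
    outputs = [
        [sum(w * value[j] for w, value in zip(row, embeddings)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy 2-D token embeddings
outputs, weights = self_attention(tokens)
```

Each row of `weights` sums to 1 and states how much one token attends to every other token, all computed in a single pass over the sequence.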
Large Language Models (LLMs)
LLMs are transformer-based models trained on massive text corpora. Key concepts:
- Pre-training: Learn language patterns from billions of tokens
- Fine-tuning: Adapt to specific tasks with smaller labeled datasets
- RLHF (Reinforcement Learning from Human Feedback): Aligns the model with human preferences
- Prompt engineering: Crafting inputs to guide model behavior
- RAG (Retrieval-Augmented Generation): Augments the LLM with external knowledge at inference time
- Context window: Maximum number of tokens the model can process at once (4K to 1M+ tokens in 2026)
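As a rough sketch of the retrieval half of RAG, the snippet below ranks a few documents against a query and places the best match into the prompt. It assumes a bag-of-words cosine similarity in place of the learned embeddings and vector store a real system would use; the function names, documents, and prompt layout are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Very rough stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend retrieved text so the model can use it at inference time."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "the context window limits how many tokens a model can read",
    "fine-tuning adapts a model with labeled data",
    "rlhf aligns a model to human preferences",
]
query = "what is the context window"
prompt = build_prompt(query, docs)
```

The resulting prompt would then be sent to an LLM. Retrieved text still counts against the context window, which is why only the top `k` passages are included.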
AI in 2026: Key Trends
Agentic AI: Models that plan, use tools, and execute multi-step tasks autonomously (see AI Agents)
Multimodal Models: Process text, images, audio, video, and code together (GPT-4o, Gemini 2.0, Claude 3.5)
Reasoning Models: Dedicated “thinking” before answering (OpenAI o1/o3, DeepSeek R1)
Small Language Models (SLMs): Efficient models for edge/on-device use (Phi-4, Gemma 3, Llama 3.2 1B)
AI Coding Assistants: GitHub Copilot, Cursor, and Kiro, deeply integrated into developer workflows
Open Source AI: LLaMA 4, Mistral, Qwen 3, and DeepSeek V3, competitive with closed models
AI vs ML vs DL
| | AI | ML | DL |
|---|---|---|---|
| Scope | Broadest | Subset of AI | Subset of ML |
| Approach | Rules + learning | Statistical learning | Neural networks |
| Data needs | Varies | Moderate | Large |
| Interpretability | Varies | Often interpretable | Often black-box |
| Examples | Expert systems, LLMs | Random forests, SVM | GPT, ResNet, DALL-E |
Core ML Workflow
1. Define problem & success metrics
2. Collect & label data
3. Exploratory Data Analysis (EDA)
4. Feature engineering & preprocessing
5. Model selection & training
6. Evaluation (accuracy, F1, AUC, etc.)
7. Hyperparameter tuning
8. Deployment & monitoring
9. Iterate
Key Tools & Frameworks
| Category | Tools |
|---|---|
| Languages | Python, R, Julia |
| ML Libraries | scikit-learn, XGBoost, LightGBM |
| Deep Learning | PyTorch, TensorFlow/Keras, JAX |
| LLM Frameworks | Hugging Face Transformers, LangChain, LlamaIndex |
| Data | Pandas, NumPy, Polars, DuckDB |
| Visualization | Matplotlib, Seaborn, Plotly |
| Experiment Tracking | MLflow, Weights & Biases, Neptune |
| Deployment | FastAPI, BentoML, Triton, vLLM |
| Cloud AI | AWS SageMaker, Azure ML, Google Vertex AI |
Real-World Applications
- Healthcare: Medical imaging, drug discovery, clinical decision support
- Finance: Fraud detection, algorithmic trading, credit scoring
- Retail: Recommendation systems, demand forecasting, dynamic pricing
- Automotive: Autonomous driving, predictive maintenance
- Software: Code generation, bug detection, automated testing
- Science: Protein folding (AlphaFold), climate modeling, materials discovery
Evaluation Metrics
| Task | Metrics |
|---|---|
| Classification | Accuracy, Precision, Recall, F1, AUC-ROC |
| Regression | MAE, MSE, RMSE, R² |
| Generation | BLEU, ROUGE, BERTScore, human eval |
| Ranking | NDCG, MRR |
| Clustering | Silhouette score, Davies-Bouldin |
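For the classification and regression rows of the table, the following sketch shows how the listed metrics are computed from predictions. The helper names and sample values are hypothetical, and binary labels are assumed to be 0 or 1; libraries such as scikit-learn provide tested implementations of the same formulas.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, and R^2 for numeric predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 compares the model's squared error with that of always predicting the mean.
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 8.0])
```

Precision and recall separate two kinds of mistake (false alarms and misses), which is why F1 is often reported alongside accuracy when classes are imbalanced.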