Skip to content

AWS SageMaker โ€” Machine Learning Platform

Amazon SageMaker is AWSโ€™s fully managed end-to-end machine learning platform. It covers every step of the ML lifecycle โ€” from data preparation and model training to deployment and monitoring.

In Azure terms: SageMaker โ‰ˆ Azure Machine Learning (Azure ML)

ComponentDescription
StudioWeb-based IDE for ML (notebooks, pipelines, experiments)
NotebooksManaged Jupyter notebooks with pre-built ML environments
Training JobsRun model training on managed compute (CPU/GPU clusters)
EndpointsDeploy models as real-time HTTP endpoints
Batch TransformRun inference on large datasets (no live endpoint)
PipelinesMLOps workflow orchestration
Feature StoreCentralized storage and reuse of ML features
Model RegistryVersion, track, and approve models before deployment
Data WranglerVisual data preparation and transformation
ClarifyBias detection and model explainability
ExperimentsTrack model training runs and hyperparameter comparisons
CanvasNo-code ML for business analysts
BedrockSeparate service โ€” managed access to foundation models (LLMs)
Data Prep (Data Wrangler / S3)
โ†“
Feature Engineering (Feature Store)
โ†“
Model Training (Training Job โ€” EC2 GPU/CPU)
โ†“
Model Evaluation (Clarify, Experiments)
โ†“
Model Registry (versioning + approval)
โ†“
Deployment (Real-time Endpoint / Batch Transform)
โ†“
Monitoring (Model Monitor โ€” drift detection)

SageMaker provisions compute, trains your model, and terminates the instance โ€” you pay only for training time:

import sagemaker
from sagemaker.estimator import Estimator
estimator = Estimator(
image_uri='763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.1.0-gpu-py310',
role='arn:aws:iam::123456789:role/SageMakerRole',
instance_type='ml.g4dn.xlarge', # GPU instance
instance_count=1,
hyperparameters={
'epochs': 10,
'learning-rate': 0.001,
},
output_path='s3://my-bucket/model-output/'
)
estimator.fit({'train': 's3://my-bucket/training-data/'})

Deploy trained models as REST APIs:

predictor = estimator.deploy(
initial_instance_count=1,
instance_type='ml.m5.large'
)
# Invoke the endpoint
result = predictor.predict({'input': [1.0, 2.0, 3.0]})
Terminal window
# Invoke endpoint via CLI
aws sagemaker-runtime invoke-endpoint \
--endpoint-name my-endpoint \
--content-type application/json \
--body '{"input": [1.0, 2.0, 3.0]}' \
output.json

SageMaker includes many optimized built-in algorithms (no container needed):

AlgorithmTypeUse Case
XGBoostGradient boostingClassification, regression
Linear LearnerLinear modelsBinary/multiclass classification
K-MeansClusteringCustomer segmentation
BlazingTextNLPText classification, word2vec
Object DetectionComputer visionDetect objects in images
Semantic SegmentationComputer visionPixel-level image labeling
DeepARTime seriesDemand forecasting
Random Cut ForestAnomaly detectionFraud detection
FeatureSageMakerAzure ML
IDESageMaker StudioAzure ML Studio
NotebooksManaged JupyterManaged Jupyter + VS Code
TrainingTraining Jobs (managed clusters)Compute Clusters / Compute Instances
Model deploymentEndpoints (real-time/batch)Online / Batch Endpoints
MLOps pipelinesSageMaker PipelinesAzure ML Pipelines
Model registrySageMaker Model RegistryAzure ML Model Registry
Feature storeSageMaker Feature StoreAzure ML Feature Store
No-code MLSageMaker CanvasAzure ML Automated ML
LLM accessAmazon BedrockAzure OpenAI Service
Experiment trackingSageMaker ExperimentsAzure ML Experiments / MLflow

For Large Language Models (LLMs) and foundation models, AWS uses Amazon Bedrock (not SageMaker directly):

  • Access models from Anthropic (Claude), Meta (Llama), Amazon (Titan), Mistral, Cohere
  • Serverless โ€” no infrastructure to manage
  • Supports RAG (Retrieval-Augmented Generation) via Knowledge Bases
  • Fine-tuning and agents support

In Azure terms: Amazon Bedrock โ‰ˆ Azure OpenAI Service