AWS SageMaker Unified Studio is …

What is AWS Sage Maker

Alright folks, today we are diving into the next generation of Amazon SageMaker. Imagine having one center for everything related to data, analytics, and AI. That is basically what SageMaker Unified Studio is. It lets us access data from anywhere like data lakes, warehouses, even third-party sources and then lets us build, collaborate, train models, run SQL analytics, all in one single interface. Plus, it is got help from Amazon Q Developer, an AI assistant that helps us write code, generate queries, and find insights. Think of it as your smart dev buddy who never sleeps.

Press enter or click to view image in full size

The next-gen SageMaker is not just for hardcore data scientists anymore. It is built for everyone from uni research projects and startups to full-on enterprise teams. If we are looking for an all-in-one platform to scale our AI and data workflows, SageMaker the move.

1. Lake House (The lakehouse architecture of Amazon SageMaker)

Press enter or click to view image in full size

SageMaker is now built on an open lakehouse architecture, fully compatible with Apache Iceberg. So we can unify data from Amazon S3, Redshift, and other sources , working from one clean, versioned copy. We do not have to move data around. Just run queries in place using any Iceberg-compatible tool. Permissions? Fully customizable down to the column level. Need real-time data? Zero-ETL integrations got us covered. And yes, we can also run federated queries across external systems , so data stays where it is, and we still get full access.

2. Data & AI Governance

Press enter or click to view image in full size

It is not all about building cool stuff, we have also gotta keep it organized and safe. With SageMaker Catalog (powered by Amazon DataZone), we can find data and models using semantic search. Do not feel like browsing? Just ask Amazon Q Developer in plain English. We can set fine-grained access policies, manage everything centrally, and publish or subscribe to shared assets easily. Plus, there are Guardrails to protect model behavior, and monitoring tools for tracking data quality, security, and full ML lineage.

3. Unified Studio

Press enter or click to view image in full size

Unified Studio is an all-in-one interface within Amazon SageMaker that brings together the entire machine learning and AI development lifecycle data exploration, preprocessing, model training, deployment, and MLOps all in a single, seamless environment.

Why is it called Unified?

Because before this, you’d need to use multiple separate tools (Jupyter, Glue Studio, Airflow UI, Redshift console, etc.). Now, with Unified Studio, everything is integrated in one place — streamlined and efficient.

So, Unified Studio = a single environment to manage your entire AI/ML workflow.

3.1 SQL Analytics

You know how painful it can be to write SQL on messy datasets or slow systems? Now imagine doing all that on Amazon Redshift, which powers SageMaker is SQL engine. It is serverless, so we do not worry about infrastructure. We can connect streaming data, operational databases, or external apps, no complicated ETL needed. Just connect and query in (almost) real-time. And if we do not feel like writing SQL? Just talk in plain English to Amazon Q. It understands and translates it into clean SQL.

3.2 Data Processing (Amazon SageMaker Data Processing)

When it comes to processing data, SageMaker gives us tools like Athena, EMR, AWS Glue, and Apache Airflow (MWAA). These make it super easy to connect to hundreds of data sources. We can run open-source frameworks like Apache Spark and we do not need to manage servers. Just focus on the logic, and the rest is handled. On top of that, we have got tools for data quality checks, sensitive data detection, lineage tracking, and super-granular access control, all natively integrated into the SageMaker Lakehouse.

3.3 Model Deployment (Amazon SageMaker AI)

SageMaker AI is basically an IDE for ML. It bundles everything we need to go from idea to production, Jupyter notebooks, profilers, MLOps pipelines, debuggers, and more. Need to build a foundation model or fine-tune a huge pretrained one? It is all possible here. We can experiment, train, and deploy at scale, without jumping across tools. Also, SageMaker gives access to hundreds of pretrained models that we can deploy instantly. So yeah, very deadline-friendly.

3.4 Gen AI App development (Amazon Bedrock in SageMaker Unified Studio)

Inside SageMaker Unified Studio, we can build Gen AI apps using Amazon Bedrock. The UI is super intuitive, and we get access to top-tier foundation models from Bedrock. There are advanced features too: Knowledge Bases, Guardrails, Agents, and Flows. These help us build Gen AI that is not just functional, but also safe and aligned with responsible AI principles. Fast dev cycle, clean UI, and all secure.

Let’s break down SageMaker Unified Studio

Press enter or click to view image in full size

Let’s break down SageMaker Unified Studio, an all-in-one environment that brings together the entire ML and data workflow: from data exploration, ETL, coding, training (including large-scale clusters), building generative applications (like chatbots and prompt-based tools), to deployment and MLOps. All seamlessly integrated with an AI assistant (Amazon Q) to maximize productivity.

Get Lintang Gilang Pratama’s stories in your inbox

Join Medium for free to get updates from this writer.

🧩 IDE & Applications

JupyterLab this is the core environment. A familiar notebook setup for coding, data exploration, or training models. It already includes many ML libraries , just plug and play.
Spaces think of these as isolated “workspace containers.” Each Space can have its own configuration (compute, storage, and environment). Perfect for team collaboration to avoid experiment conflicts.
Partner AI Apps If you need third-party integrations like Bedrock, Hugging Face, or Snowflake , you can connect directly from here.

📊 Data Analysis & Integration

Query Editor for a native SQL editor for accessing data from S3, Redshift, or Glue. You can run queries and explore data directly within the Studio without switching tools.

Press enter or click to view image in full size

Visual ETL flows Ideal for those who prefer a no-code/low-code approach. Just drag and drop to build ETL workflows, and it automatically generates the Spark code in the background.

Press enter or click to view image in full size

🔄 Orchestration

Workflows is like Apache Airflow but natively integrated. Ideal for building end-to-end ML pipelines , from preprocessing, training, evaluation, all the way to deployment.
ML Pipelines is A more structured version of ML workflows. You can define each step in a modular way , making it easy to maintain and repeat.

🧠 Machine Learning & Generative AI (App Development)

Chat agent You can build a generative chatbot using models like Claude or Jurassic from Bedrock. Just set up the prompt, memory, and handler , then you can instantly test the conversation.

Press enter or click to view image in full size

Flow is A visual editor for building AI-powered logic or applications. No need to code everything , just drag and connect the components, giving off a Node-RED kind of vibe.

Press enter or click to view image in full size

Prompt If you want to experiment with prompts (like for LLMs or chatbots), this is the place. You can test prompt structures and get instant results from generative models.
My apps Dashboard is a place to view all the apps you have built. Easily manage, edit, or deploy them directly from here.

Press enter or click to view image in full size

🧠 Machine Learning & Generative AI (Model Development)

Jumpstart is models is A wide range of pre-trained models from open-source libraries (like HuggingFace, TensorFlow, etc.) are available. You can use them out of the box, retrain, or deploy as needed.

Press enter or click to view image in full size

Training jobs is this is where you can schedule and monitor your model training. Supports manual, automated, and even distributed training via HyperPod.
Inference endpoints Once your model is ready, you can deploy it to real-time or batch endpoints directly from here.
HyperPod Perfect for large models , like training LLMs. You can rent GPU clusters with high-performance architectures (Trainium, A100, H100). Extremely powerful for enterprise-scale use cases.

🧠 Model Development (AI Ops)

MLflow Tracking for Tracks all your experiments , from parameters and metrics to model artifacts. Fully integrated, so you just need to enable it.
Model Registry for Version control for your models. Manage staging vs production, approval flows, and roll back between model versions.
Model Evaluations for Compare model performance across different versions. Get metrics like accuracy, F1-score, and AUC , along with visualizations.

What’s Next?

Now that you have seen what SageMaker Unified Studio can do, the next step is simple, here are some ideas

Explore Your Data : Use JupyterLab to connect to S3 or Redshift, run simple queries, and visualize your data. This helps with creating reports, dashboards, or just understanding your data better.

Build a GenAI Chatbot : Create a chatbot or internal assistant using Amazon Bedrock models. Set up prompts, test the responses, and deploy it with very little code.

Automate Your ML Pipeline : Use SageMaker Pipelines to build a complete machine learning workflow, from data collection and cleaning to model training and deployment.

Train Large Models : For high performance, use HyperPod to train large language models with powerful GPU clusters. Perfect for large-scale AI projects.

Best Regards

Lintang Gilang Pratama

Content

AWS SageMaker Unified Studio is …

What is AWS Sage Maker

1. Lake House (The lakehouse architecture of Amazon SageMaker)

2. Data & AI Governance

3. Unified Studio

Let’s break down SageMaker Unified Studio

Get Lintang Gilang Pratama’s stories in your inbox

What’s Next?

Builder

Build & Tooling

Tags