Best Practices for Snowflake Notebooks in Workspace
DISCLAIMER: Snowflake Notebooks in Workspace was in PrPr (Private Preview) at the time of writing (December 10, 2025), so you can expect some changes as the service is developed further and moves to PuPr (Public Preview) and GA (General Availability).
A Snowflake Notebook is a fully-managed, Jupyter-powered notebook built for end-to-end DS and ML development on Snowflake data. This includes:
- Familiar Jupyter experience: get the full power of a Jupyter Python notebook environment, directly connected to governed Snowflake data.
- Full IDE features: easy editing and file management for maximum productivity.
- Powerful for AI/ML: runs in a pre-built container environment optimized for scalable AI/ML development, with fully-managed access to CPUs and GPUs, parallel data loading, and distributed training APIs for popular ML packages (e.g. XGBoost, PyTorch, LightGBM).
- Governed collaboration: enable multiple users to collaborate simultaneously with built-in governance and a complete history of changes via Git or shared workspaces.
In Snowflake, a notebook consumes compute resources through its configured virtual warehouses or compute pools. In this blog we focus on Snowflake Notebooks in Workspace, which run on Snowpark Container Services (SPCS) and therefore require a compute pool.
A compute pool is an account-level construct, analogous to a Snowflake virtual warehouse. The naming scope of the compute pool is your account. That is, you cannot have multiple compute pools with the same name in your account.
The minimum information required to create a compute pool includes the following (see the example after this list):
The machine type (referred to as the instance family) to provision for the compute pool nodes
The minimum number of nodes to launch the compute pool with
The maximum number of nodes the compute pool can scale to (Snowflake manages the scaling.)
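For example, a minimal compute pool for notebook workloads could be created like this. This is a sketch: the pool name and instance family below are placeholders, so pick an instance family that matches your workload.

CREATE COMPUTE POOL my_notebook_pool
  MIN_NODES = 1
  MAX_NODES = 2
  INSTANCE_FAMILY = CPU_X64_S;

Snowflake launches the pool with MIN_NODES and scales out up to MAX_NODES as needed.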
By default, all workloads can run on a compute pool, such as:
- User-deployed workloads: services and jobs
- Workloads managed by Snowflake: notebooks, model serving, and ML jobs
You can control which workloads run on a compute pool by using the account-level parameters ALLOWED_SPCS_WORKLOAD_TYPES and DISALLOWED_SPCS_WORKLOAD_TYPES.
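To see how these parameters are currently set for your account, you can inspect them with SHOW PARAMETERS (assuming you have the privileges to view account parameters):

SHOW PARAMETERS LIKE 'ALLOWED_SPCS_WORKLOAD_TYPES' IN ACCOUNT;
SHOW PARAMETERS LIKE 'DISALLOWED_SPCS_WORKLOAD_TYPES' IN ACCOUNT;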
Snowflake uses the placement group concept for fault isolation within a Snowflake region. Check Compute Pool Placement for more information, especially in cases where you want low latency between nodes for tightly coupled services.
Other important things to consider:
- Compute pool privileges (a grant sketch follows this list)
- Compute pool maintenance: in general, scheduled maintenance occurs every Saturday from 8 PM to Sunday at 8 AM, and every Sunday from 8 PM to Monday at 8 AM.
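As a sketch of the privileges side, granting a role the ability to use and monitor a pool might look like this (the pool and role names are placeholders):

GRANT USAGE, MONITOR ON COMPUTE POOL my_notebook_pool TO ROLE data_science_role;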
To get an overview of the compute pools in your account and their states, you can aggregate the output of SHOW COMPUTE POOLS:

SHOW COMPUTE POOLS;
SELECT "instance_family", "state", COUNT(*) AS number_of_pools
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
GROUP BY "state", "instance_family"
ORDER BY "state", number_of_pools DESC;
Snowpark Container Services lets you more easily deploy, manage, and scale containerized applications. After you create an application and upload the application image to a repository in your Snowflake account, you can run your application containers as a service.
A service represents Snowflake running your containerized application on a compute pool, which is a collection of virtual machine (VM) nodes.
There are two types of services:
- Long-running services: a long-running service is like a web service that does not end automatically. After you create a service, Snowflake manages the running service. For example, if a service container stops, for whatever reason, Snowflake restarts that container so the service runs uninterrupted (e.g. the CREATE SERVICE command).
- Job services: a job service terminates when your code exits, similar to a stored procedure. When all containers exit, the job service is done (e.g. the EXECUTE JOB SERVICE command).
See the picture below for an illustration, or Working with Services for detailed information; a minimal sketch of both commands follows.
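This sketch shows the two flavors side by side; the compute pool, database objects, and image paths are placeholders, not a definitive setup:

-- Long-running service: Snowflake keeps it running
CREATE SERVICE echo_service
  IN COMPUTE POOL my_notebook_pool
  FROM SPECIFICATION $$
    spec:
      containers:
      - name: echo
        image: /my_db/my_schema/my_repo/echo_image:latest
      endpoints:
      - name: api
        port: 8080
  $$
  MIN_INSTANCES = 1
  MAX_INSTANCES = 1;

-- Job service: terminates when all containers exit
EXECUTE JOB SERVICE
  IN COMPUTE POOL my_notebook_pool
  NAME = my_batch_job
  FROM SPECIFICATION $$
    spec:
      containers:
      - name: main
        image: /my_db/my_schema/my_repo/job_image:latest
  $$;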
Important things to note:
- While Snowflake might distribute instances of a service across multiple compute pool nodes, all containers within a single service instance always run on the same compute pool node.
- You can create services via SQL, the Snowflake Python APIs, the Snowflake REST APIs, and the Snowflake CLI.
- Make use of network policies for network ingress and external access integrations for network egress (see the sketch after this list).
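For the egress side, a minimal sketch of an external access integration might look like this; the rule name, integration name, and hosts are placeholders, here allowing outbound access to PyPI:

CREATE OR REPLACE NETWORK RULE pypi_egress_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('pypi.org', 'files.pythonhosted.org');

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION pypi_access_integration
  ALLOWED_NETWORK_RULES = (pypi_egress_rule)
  ENABLED = TRUE;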
Scenarios for using Snowpark Container Services
Common workloads are:
- Batch data processing jobs: run jobs, such as stored procedures, across multiple job instances, and use graphics processing units (GPUs) for computationally intensive tasks like AI and machine learning.
- APIs or web UIs over Snowflake data: deploy services that expose APIs or web interfaces with embedded business logic. Users interact with the service rather than the raw data.
Snowflake Notebooks in Workspace using Services: How it works
Once the first notebook connects to a service on the compute pool, other notebooks can attach to the same service instantly. Each service occupies one compute pool node, and notebooks on the same service share the compute resources of that node. Each notebook still maintains its own virtual environment.
Key things to consider:
Idle time: the idle time is set on the container service. For example, if it is set to 4 hours, the container service automatically shuts down once all notebooks connected to it have stopped running for 4 hours.
External access integration (EAI): EAIs are managed on the container service, so they apply to all notebooks in the same Workspace.
%lsmagic: %lsmagic is supported.
requirements.txt: specify package versions and ensure a consistent environment setup by using !pip install -r requirements.txt. Check the supported versions to make sure the package versions you specify are compatible with the supported version range.
You can upload your own wheel file and install it with: !pip install file_name.whl
Nice to know
You can import packages from stages, with:
from snowflake.snowpark import Session
import sys

session = Session.builder.getOrCreate()

# Download the module from the stage to a local path, then import it
session.file.get("@stage_name/math_tools.py", "/tmp/")
sys.path.append("/tmp/")

import math_tools
math_tools.add_one(3)
Limitations to consider:
plotly, altair, and other visualization packages that rely on HTML rendering are not yet supported.
Notebooks in different Workspaces cannot share the same service.
Artifact Repository and Custom Images are on the roadmap.
Managing Snowflake Notebooks in Workspace
Below are some considerations to keep in mind while using Snowflake Notebooks in Workspace, including cost and monitoring capabilities.
Cost Aspects
A notebook consumes compute resources through its configured virtual warehouses or compute pools. To manage costs and ensure efficient operations, it's important to monitor usage across individual notebooks, users, and the underlying compute infrastructure. This visibility supports cost accountability throughout your environment.
Snowflake provides access to detailed usage data through ACCOUNT_USAGE views and system tables. This data can help answer questions such as the following (a sample query follows the list):
What is the hourly credit consumption per notebook?
How frequently were notebooks run in the past week?
Which users ran notebooks in the past month?
Which compute pools or warehouses did notebooks use over the past week?
What is the total credit cost of notebooks using a specific compute resource?
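As a sketch, a query along these lines can answer the per-notebook credit question directly in SQL. The user_name and credits columns appear in the examples below; notebook_name and start_time are assumed column names here, so check the view's documentation before relying on them:

-- Hourly credit consumption per notebook over the past week
-- (notebook_name and start_time are assumed column names; verify against the view)
SELECT
  notebook_name,
  DATE_TRUNC('hour', start_time) AS usage_hour,
  SUM(credits) AS total_credits
FROM snowflake.account_usage.notebooks_container_runtime_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY notebook_name, usage_hour
ORDER BY usage_hour, total_credits DESC;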
From a notebook, you can also query the same view and visualize the result, for example charting total credits by user with Plotly Express:

from snowflake.snowpark.context import get_active_session
import plotly.express as px

session = get_active_session()

# Query the data
query = """
SELECT user_name, SUM(credits) AS total_credits
FROM snowflake.account_usage.notebooks_container_runtime_history
GROUP BY user_name
ORDER BY total_credits DESC
"""
df = session.sql(query).to_pandas()

# Create bar chart
fig = px.bar(
    df,
    x='USER_NAME',
    y='TOTAL_CREDITS',
    title='Total Credits by User',
    labels={'USER_NAME': 'User', 'TOTAL_CREDITS': 'Total Credits'},
)
fig.show()
The same chart with matplotlib, which does not rely on HTML rendering:

from snowflake.snowpark.context import get_active_session
import matplotlib.pyplot as plt

session = get_active_session()

# Query the data
query = """
SELECT user_name, SUM(credits) AS total_credits
FROM snowflake.account_usage.notebooks_container_runtime_history
GROUP BY user_name
ORDER BY total_credits DESC
"""
df = session.sql(query).to_pandas()

# Create bar chart
plt.figure(figsize=(10, 6))
plt.bar(df['USER_NAME'], df['TOTAL_CREDITS'])
plt.xlabel('User')
plt.ylabel('Total Credits')
plt.title('Total Credits by User')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()