A company's daily Snowflake workload consists of a huge number of concurrent queries triggered between 9pm and 11pm. At the individual level, these queries are smaller statements that get completed within a short time period.
What configuration can the company's Architect implement to enhance the performance of this workload? (Choose two.)
Correct Answer:
AB
Both options improve a workload made up of a very large number of small, short-running concurrent queries.
✑ Enabling a multi-cluster virtual warehouse in maximized mode (MIN_CLUSTER_COUNT set equal to MAX_CLUSTER_COUNT) starts all of its clusters as soon as the warehouse starts, instead of adding clusters one at a time as load builds up. Because full capacity is available for the whole 9pm to 11pm window, queuing is minimized or avoided, which improves concurrency and throughput. Maximized mode suits workloads that need high performance and low latency and are less sensitive to credit consumption [1].
✑ Setting MAX_CONCURRENCY_LEVEL higher than its default value of 8 at the virtual warehouse level allows each cluster to run more queries concurrently. This improves utilization of the warehouse resources, especially for small, fast queries that do not need much processing power. The MAX_CONCURRENCY_LEVEL parameter can be set when creating or modifying a warehouse, and it can be changed at any time [2].
References:
✑ Snowflake Documentation: Scaling Policy for Multi-cluster Warehouses
✑ Snowflake Documentation: MAX_CONCURRENCY_LEVEL
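The two settings can be applied in a single ALTER WAREHOUSE statement. The following is a minimal sketch that only builds the SQL text; the warehouse name `nightly_wh`, the cluster count of 4, and the concurrency level of 16 are illustrative assumptions, not values from the question.

```python
def maximized_warehouse_sql(name: str, clusters: int, max_concurrency: int) -> str:
    """Build an ALTER WAREHOUSE statement for a maximized multi-cluster warehouse."""
    if clusters < 2:
        raise ValueError("a multi-cluster warehouse needs at least 2 clusters")
    # Maximized mode means MIN_CLUSTER_COUNT equals MAX_CLUSTER_COUNT,
    # so every cluster is running whenever the warehouse is running.
    return (
        f"ALTER WAREHOUSE {name} SET "
        f"MIN_CLUSTER_COUNT = {clusters} "
        f"MAX_CLUSTER_COUNT = {clusters} "
        f"MAX_CONCURRENCY_LEVEL = {max_concurrency};"
    )


# Hypothetical warehouse name and sizing, chosen only for illustration.
statement = maximized_warehouse_sql("nightly_wh", clusters=4, max_concurrency=16)
print(statement)
```

Since maximized mode keeps every cluster running, one option is to run such a statement shortly before 9pm and lower the cluster counts again after 11pm to limit credit consumption.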
A company has an external vendor who puts data into Google Cloud Storage. The company's Snowflake account is set up in Azure.
What would be the MOST efficient way to load data from the vendor into Snowflake?
Correct Answer:
B
The most efficient way to load data from the vendor into Snowflake is to create an external stage on Google Cloud Storage and use an external table to load the data into Snowflake (Option B). Snowflake can read from an external stage in Google Cloud Storage even when the account itself is hosted in Azure, so no intermediate copy into Azure storage and no separate transfer tool is needed. The external table also lets you query the files in place in Google Cloud Storage before, or instead of, loading them into Snowflake tables, which can save storage space.
Option A is not efficient because it requires the vendor to create a Snowflake account and a data share, which can be complicated and costly.
Option C is not efficient because it involves copying the data from Google Cloud Storage to Azure Blob storage using external tools, which can be slow and expensive.
Option D is not efficient because it requires creating a Snowflake account in Google Cloud Platform (GCP), ingesting data into that account, and using data replication to move the data from GCP to Azure, which can be complex and time-consuming.
References: The answer can be verified from Snowflake's official documentation on external stages and external tables. Relevant pages:
✑ Using External Stages | Snowflake Documentation
✑ Using External Tables | Snowflake Documentation
✑ Loading Data from a Stage | Snowflake Documentation
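As a rough illustration of Option B, the sketch below builds the two statements involved: an external stage over the vendor's bucket and an external table on top of it. All object names (`vendor_stage`, `vendor-bucket/exports/`, `gcs_int`, `vendor_ext`) are hypothetical, the Parquet file format is an assumption, and the storage integration is assumed to exist already.

```python
def external_stage_sql(stage: str, bucket_path: str, integration: str) -> str:
    """Build a CREATE STAGE statement for a Google Cloud Storage location."""
    # GCS stage URLs use the gcs:// prefix; access goes through a storage integration.
    return (
        f"CREATE STAGE {stage} "
        f"URL = 'gcs://{bucket_path}' "
        f"STORAGE_INTEGRATION = {integration};"
    )


def external_table_sql(table: str, stage: str, file_type: str = "PARQUET") -> str:
    """Build a CREATE EXTERNAL TABLE statement over an existing external stage."""
    return (
        f"CREATE EXTERNAL TABLE {table} "
        f"WITH LOCATION = @{stage} "
        f"AUTO_REFRESH = FALSE "
        f"FILE_FORMAT = (TYPE = {file_type});"
    )


# Hypothetical names, for illustration only.
stage_stmt = external_stage_sql("vendor_stage", "vendor-bucket/exports/", "gcs_int")
table_stmt = external_table_sql("vendor_ext", "vendor_stage")
print(stage_stmt)
print(table_stmt)
```

With the external table in place, the data could then be copied into a regular table with an ordinary INSERT ... SELECT from the external table.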
What are characteristics of the use of transactions in Snowflake? (Select TWO).
Correct Answer:
AC
A. An explicit transaction block in Snowflake can contain DDL (Data Definition Language), DML (Data Manipulation Language), and query statements. Note, however, that according to Snowflake's documentation each DDL statement runs as its own transaction and implicitly commits any transaction that is active at that point, so only the DML and query statements between two commit points are committed or rolled back together.
C. Snowflake supports explicit transaction control through the BEGIN TRANSACTION (or simply BEGIN) and COMMIT statements. The BEGIN WORK and COMMIT WORK forms are also accepted for starting and ending a transaction, respectively.
Note: The END TRANSACTION statement is not used in Snowflake to end a transaction; the correct statement is COMMIT or COMMIT WORK.
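The explicit transaction syntax described in C can be sketched as a small helper that wraps DML statements between BEGIN WORK and COMMIT WORK. This only assembles SQL text; the `accounts` table and the statements themselves are hypothetical.

```python
def explicit_transaction(dml_statements, use_work=True):
    """Wrap DML statements in an explicit BEGIN ... COMMIT pair."""
    # BEGIN WORK / COMMIT WORK and plain BEGIN / COMMIT are both accepted forms.
    keyword = " WORK" if use_work else ""
    return [f"BEGIN{keyword};", *dml_statements, f"COMMIT{keyword};"]


# Hypothetical table and statements, for illustration only.
script = explicit_transaction([
    "UPDATE accounts SET balance = balance - 100 WHERE id = 1;",
    "UPDATE accounts SET balance = balance + 100 WHERE id = 2;",
])
print("\n".join(script))
```

Replacing the final COMMIT WORK with ROLLBACK would discard both updates instead of applying them.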
Why might a Snowflake Architect use a star schema model rather than a 3NF model when designing a data architecture to run in Snowflake? (Select TWO).
Correct Answer:
DE
A star schema model is a type of dimensional data model that consists of a central fact table surrounded by multiple dimension tables. A 3NF model is a type of relational data model that follows the third normal form, which minimizes data redundancy. A Snowflake Architect might use a star schema model rather than a 3NF model when designing a data architecture to run in Snowflake for the following reasons:
✑ A star schema model is more suitable for analytical queries that aggregate and slice data across different dimensions, or drill down from the summaries, such as those issued by a BI tool. A 3NF model is more suitable for transactional workloads that insert, update, and delete individual records.
✑ A star schema model is typically simpler and faster to query than a 3NF model, as it involves fewer joins and less complex SQL statements. The same question asked of a 3NF model usually needs more joins and more complex SQL.
✑ A star schema model can provide a simple flattened single view of the data to a particular group of end users, such as business analysts or data scientists, who need to explore and visualize the data. A 3NF model provides a more detailed, normalized view that suits end users such as application developers or data engineers, who need to maintain and update the data.
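To make the "summarize across dimensions, then drill down" idea concrete, here is a small in-memory sketch of a star schema: one fact table whose rows point at two dimension tables. All table contents are invented for illustration.

```python
from collections import defaultdict

# Dimension tables keyed by surrogate key (made-up data).
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}
dim_date = {10: {"month": "2024-01"}, 11: {"month": "2024-02"}}

# Fact table: each row holds one measure plus foreign keys into the dimensions.
fact_sales = [
    {"product_key": 1, "date_key": 10, "amount": 100},
    {"product_key": 2, "date_key": 10, "amount": 50},
    {"product_key": 3, "date_key": 11, "amount": 20},
    {"product_key": 1, "date_key": 11, "amount": 30},
]


def summarize(facts, dimension, key, attribute):
    """Sum the amount measure grouped by one attribute of one dimension."""
    totals = defaultdict(int)
    for row in facts:
        totals[dimension[row[key]][attribute]] += row["amount"]
    return dict(totals)


# Roll up: total sales per product category and per month.
by_category = summarize(fact_sales, dim_product, "product_key", "category")
by_month = summarize(fact_sales, dim_date, "date_key", "month")

# Drill down: break the Hardware total into individual products.
hardware_rows = [
    row for row in fact_sales
    if dim_product[row["product_key"]]["category"] == "Hardware"
]
hardware_by_product = summarize(hardware_rows, dim_product, "product_key", "name")

print(by_category)
print(by_month)
print(hardware_by_product)
```

Each summary needs only one lookup from the fact row into a dimension, which mirrors the single-join shape of typical star schema queries.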
The other options are not valid reasons for choosing a star schema model over a 3NF model in Snowflake:
✑ Snowflake can handle the joins implied in a 3NF data model, as it supports ANSI SQL and has a query engine that can optimize and execute complex queries efficiently.
✑ Removing data duplication is not a reason to prefer a star schema. A star schema denormalizes the data for faster query performance and therefore usually carries more redundancy than a 3NF model, which normalizes the data for easier maintenance.
✑ Designing a landing zone to receive raw data into Snowflake does not call for a star schema. The design of a landing zone depends on its purpose and scope, such as whether it is temporary or permanent storage, whether it is a staging area or a data lake, and whether it integrates a single source or multiple sources, rather than on dimensional modeling.
When using the Snowflake Connector for Kafka, what data formats are supported for the messages? (Choose two.)
Correct Answer:
CD
The data formats that are supported for the messages when using the Snowflake Connector for Kafka are Avro and JSON. These are the two formats that the connector can parse and convert into Snowflake table rows. The connector supports both schemaless and schematized JSON, as well as Avro with or without a schema registry [1].
The other options are incorrect because CSV, XML, and Parquet are not formats that the connector can parse and convert into Snowflake table rows. If the messages are in these formats, the connector will load them as VARIANT data type and store them as raw strings in the table [2].
References:
✑ Snowflake Connector for Kafka | Snowflake Documentation
✑ Loading Protobuf Data using the Snowflake Connector for Kafka | Snowflake Documentation
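In practice the message format is selected through the connector's `value.converter` property. The sketch below assembles a partial connector configuration for JSON or Avro messages; the topic name and registry URL are hypothetical, and a real configuration also needs account, credential, database, and schema properties that are omitted here.

```python
# Converter classes shipped with the Snowflake Connector for Kafka.
CONVERTER_PACKAGE = "com.snowflake.kafka.connector.records"


def connector_config(topic, message_format, schema_registry_url=None):
    """Return a partial connector config for JSON or Avro messages."""
    fmt = message_format.lower()
    if fmt not in ("json", "avro"):
        raise ValueError(f"unsupported message format: {message_format}")

    config = {
        "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
        "topics": topic,
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    }
    if fmt == "json":
        config["value.converter"] = f"{CONVERTER_PACKAGE}.SnowflakeJsonConverter"
    elif schema_registry_url:
        # Avro messages whose schemas live in a schema registry.
        config["value.converter"] = f"{CONVERTER_PACKAGE}.SnowflakeAvroConverter"
        config["value.converter.schema.registry.url"] = schema_registry_url
    else:
        # Avro messages that carry their schema with them.
        config["value.converter"] = (
            f"{CONVERTER_PACKAGE}.SnowflakeAvroConverterWithoutSchemaRegistry"
        )
    return config


# Hypothetical topic and registry URL, for illustration only.
json_config = connector_config("orders", "JSON")
avro_config = connector_config("orders", "avro", "http://localhost:8081")
print(json_config["value.converter"])
print(avro_config["value.converter"])
```

The helper rejects any other format name, reflecting that only JSON and Avro are handled in this sketch.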