ARA-C01 Actual Questions - Instant Download 163 Questions
Download Free Latest Exam ARA-C01 Certified Sample Questions
NEW QUESTION # 48
When activating Tri-Secret Secure in a hierarchical encryption model in a Snowflake account, at what level is the customer-managed key used?
- A. At the root level (HSM)
- B. At the table level (TMK)
- C. At the account level (AMK)
- D. At the micro-partition level
Answer: C
NEW QUESTION # 49
By default, the maximum file size that can be unloaded to a single file in Snowflake is:
- A. 5 GB for Azure, AWS and GCP
- B. 16 MB
- C. 10 MB
Answer: B
NEW QUESTION # 50
An Architect entered the following commands in sequence:
USER1 cannot find the table.
Which of the following commands does the Architect need to run for USER1 to find the tables using the Principle of Least Privilege? (Choose two.)
- A. GRANT ROLE PUBLIC TO ROLE INTERN;
- B. GRANT USAGE ON DATABASE SANDBOX TO ROLE INTERN;
- C. GRANT USAGE ON SCHEMA SANDBOX.PUBLIC TO ROLE INTERN;
- D. GRANT OWNERSHIP ON DATABASE SANDBOX TO USER INTERN;
- E. GRANT ALL PRIVILEGES ON DATABASE SANDBOX TO ROLE INTERN;
Answer: B,C
Explanation:
* According to the Principle of Least Privilege, the Architect should grant only the minimum privileges necessary for USER1 to find the tables in the SANDBOX database.
* USER1's role (INTERN in the answer options) needs the USAGE privilege on the SANDBOX database and on the SANDBOX.PUBLIC schema before it can access the tables in the PUBLIC schema. Therefore, commands B and C are the correct ones to run.
* Command A is not correct because the PUBLIC role is automatically granted to every user and role in the account, and it does not have any privileges on the SANDBOX database by default.
* Command D is not correct because it attempts to transfer ownership of the SANDBOX database away from the Architect's role, which is not necessary and violates the Principle of Least Privilege.
* Command E is not correct because it would grant every possible privilege on the SANDBOX database to the INTERN role, which is also not necessary and violates the Principle of Least Privilege.
References: Snowflake - Principle of Least Privilege; Snowflake - Access Control Privileges; Snowflake - Public Role; Snowflake - Ownership and Grants
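As a sketch, the two selected grants look like this when run together. The Architect's original commands are not reproduced in the question, so it is an assumption here that USER1 works through the INTERN role and that any table-level privilege was already granted.

```sql
-- USAGE on the database lets the role resolve the schemas inside it.
GRANT USAGE ON DATABASE SANDBOX TO ROLE INTERN;

-- USAGE on the schema lets the role resolve the objects inside it.
GRANT USAGE ON SCHEMA SANDBOX.PUBLIC TO ROLE INTERN;

-- Review what the role holds after the grants.
SHOW GRANTS TO ROLE INTERN;
```

Nothing broader than USAGE is granted at the database or schema level, which is what keeps the change within least privilege.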
NEW QUESTION # 51
USERADMIN and Security administrators (i.e. users with the SECURITYADMIN role) or higher can create roles.
- A. FALSE
- B. TRUE
Answer: B
NEW QUESTION # 52
Consider the following scenario where a masking policy is applied on the CREDITCARDNO column of the CREDITCARDINFO table. The masking policy definition is as follows:
Sample data for the CREDITCARDINFO table is as follows:
NAME EXPIRYDATE CREDITCARDNO
JOHN DOE 2022-07-23 4321 5678 9012 1234
If the Snowflake system roles have not been granted any additional roles, what will be the result?
- A. The sysadmin can see the CREDITCARDNO column data in clear text.
- B. Anyone with the PI_ANALYTICS role will see the CREDITCARDNO column as '***MASKED***'.
- C. The owner of the table will see the CREDITCARDNO column data in clear text.
- D. Anyone with the PI_ANALYTICS role will see the last 4 characters of the CREDITCARDNO column data in clear text.
Answer: D
Explanation:
The masking policy defined in the image indicates that if a user has the PI_ANALYTICS role, they will be able to see the last 4 characters of the CREDITCARDNO column data in clear text. Otherwise, they will see 'MASKED'. Since Snowflake system roles have not been granted any additional roles, they won't have the PI_ANALYTICS role and therefore cannot view the last 4 characters of credit card numbers.
To apply a masking policy on a column in Snowflake, you need to use the ALTER TABLE ... ALTER COLUMN command or the ALTER VIEW command and specify the policy name. For example, to apply the creditcardno_mask policy on the CREDITCARDNO column of the CREDITCARDINFO table, you can use the following command:
ALTER TABLE CREDITCARDINFO ALTER COLUMN CREDITCARDNO SET MASKING POLICY creditcardno_mask;
For more information on how to create and use masking policies in Snowflake, you can refer to the following resources:
CREATE MASKING POLICY: This document explains the syntax and usage of the CREATE MASKING POLICY command, which allows you to create a new masking policy or replace an existing one.
Using Dynamic Data Masking: This guide provides instructions on how to configure and use dynamic data masking in Snowflake, which is a feature that allows you to mask sensitive data based on the execution context of the user.
ALTER MASKING POLICY: This document explains the syntax and usage of the ALTER MASKING POLICY command, which allows you to modify the properties of an existing masking policy.
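The policy definition itself was an image and is not reproduced in the question. The sketch below is a hypothetical policy that matches the behavior described in the explanation (PI_ANALYTICS sees the last 4 characters, every other role sees a fixed mask); the exact mask string and the use of RIGHT are assumptions. It also assumes both roles can already select from the table.

```sql
-- Hypothetical reconstruction, not the policy from the original question.
CREATE OR REPLACE MASKING POLICY creditcardno_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('PI_ANALYTICS') THEN RIGHT(val, 4)  -- last 4 characters only
    ELSE '***MASKED***'                                         -- all other roles
  END;

ALTER TABLE CREDITCARDINFO ALTER COLUMN CREDITCARDNO SET MASKING POLICY creditcardno_mask;

-- With the sample row '4321 5678 9012 1234':
USE ROLE PI_ANALYTICS;
SELECT CREDITCARDNO FROM CREDITCARDINFO;  -- returns 1234

USE ROLE SYSADMIN;
SELECT CREDITCARDNO FROM CREDITCARDINFO;  -- returns ***MASKED***
```

Because the policy tests CURRENT_ROLE() directly, even the table owner and SYSADMIN receive the masked value unless they switch to PI_ANALYTICS.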
NEW QUESTION # 53
A Snowflake Architect is setting up database replication to support a disaster recovery plan. The primary database has external tables.
How should the database be replicated?
- A. Share the primary database with an account in the same region that the database will be replicated to.
- B. Move the external tables to a database that is not replicated, then replicate the primary database.
- C. Replicate the database ensuring the replicated database is in the same region as the external tables.
- D. Create a clone of the primary database then replicate the database.
Answer: B
Explanation:
Database replication is a feature that allows you to create a copy of a database in another account, region, or cloud platform for disaster recovery or business continuity purposes. However, not all database objects can be replicated. External tables are one of the exceptions, as they reference data files stored in an external stage that is not part of Snowflake. Therefore, to replicate a database that contains external tables, you need to move the external tables to a separate database that is not replicated, and then replicate the primary database that contains the other objects. This way, you can avoid replication errors and ensure consistency between the primary and secondary databases. The other options are incorrect because they either do not address the issue of external tables, or they use an alternative method that is not supported by Snowflake. You cannot create a clone of the primary database and then replicate it, as replication only works on the original database, not on its clones. You also cannot share the primary database with another account, as sharing is a different feature that does not create a copy of the database, but rather grants access to the shared objects. Finally, you do not need to ensure that the replicated database is in the same region as the external tables, as external tables can access data files stored in any region or cloud platform, as long as the stage URL is valid and accessible. References:
* Replication and Failover/Failback
* Introduction to External Tables
* Working with External Tables
* Replication: How to migrate an account from One Cloud Platform or Region to another in Snowflake
NEW QUESTION # 54
Dynamic data masking is supported in which editions of Snowflake?
- A. Enterprise
- B. VPS
- C. Standard
- D. Business Critical
Answer: A,B,D
NEW QUESTION # 55
A data platform team creates two multi-cluster virtual warehouses with the AUTO_SUSPEND value set to NULL on one and '0' on the other. What would be the execution behavior of these virtual warehouses?
- A. Setting a '0' or NULL value means the warehouses will suspend immediately.
- B. Setting a '0' value means the warehouses will suspend immediately, and NULL means the warehouses will never suspend.
- C. Setting a '0' or NULL value means the warehouses will never suspend.
- D. Setting a '0' or NULL value means the warehouses will suspend after the default of 600 seconds.
Answer: C
Explanation:
The AUTO_SUSPEND parameter specifies the number of seconds of inactivity after which a warehouse is automatically suspended. According to the Snowflake documentation for CREATE WAREHOUSE and ALTER WAREHOUSE, setting the parameter to 0 or NULL means the warehouse never suspends. Both warehouses therefore behave the same way: each keeps running, and consuming credits, until it is suspended manually or a resource monitor suspends it. Reference:
ALTER WAREHOUSE
Parameters
NEW QUESTION # 56
Which security, governance, and data protection features require, at a MINIMUM, the Business Critical edition of Snowflake? (Choose two.)
- A. Federated authentication and SSO
- B. AWS, Azure, or Google Cloud private connectivity to Snowflake
- C. Extended Time Travel (up to 90 days)
- D. Customer-managed encryption keys through Tri-Secret Secure
- E. Periodic rekeying of encrypted data
Answer: B,D
Explanation:
According to the SnowPro Advanced: Architect documents and learning resources, the security, governance, and data protection features that require, at a minimum, the Business Critical edition of Snowflake are:
* AWS, Azure, or Google Cloud private connectivity to Snowflake. Private connectivity keeps traffic between the customer's cloud network and Snowflake off the public internet and is offered only from the Business Critical edition upward.
* Customer-managed encryption keys through Tri-Secret Secure. A customer-managed key, held in the cloud provider's key management service, is combined with a Snowflake-maintained key to create a composite master key. If the customer revokes its key, Snowflake can no longer decrypt the data, which gives the customer an additional layer of control over encryption.
The other options are incorrect because they do not require the Business Critical edition of Snowflake. Option A is incorrect because federated authentication and SSO are available with the Standard edition. Option C is incorrect because extended Time Travel (up to 90 days) is available with the Enterprise edition. Option E is incorrect because periodic rekeying of encrypted data is available with the Enterprise edition. References: Tri-Secret Secure | Snowflake Documentation, Periodic Rekeying of Encrypted Data | Snowflake Documentation, Snowflake Editions | Snowflake Documentation, Snowflake Network Policies | Snowflake Documentation, Configuring Federated Authentication and SSO | Snowflake Documentation
NEW QUESTION # 57
You are creating a TASK to query table streams created on the raw table and insert subsets of rows into multiple tables. You are following the steps below, but when you reach the step to resume the task, you receive an error message.
Why is this error thrown and who can give you the required privilege?
Steps to be followed to get this error
-- Create a landing table to store raw JSON data.
-- Snowpipe could load data into this table.
create or replace table raw (var variant);

-- Create a stream to capture inserts to the landing table.
-- A task will consume a set of columns from this stream.
create or replace stream rawstream1 on table raw;

-- Create a second stream to capture inserts to the landing table.
-- A second task will consume another set of columns from this stream.
create or replace stream rawstream2 on table raw;

-- Create a table that stores the names of office visitors identified in the raw data.
create or replace table names (id int, first_name string, last_name string);

-- Create a table that stores the visitation dates of office visitors identified in the raw data.
create or replace table visits (id int, dt date);

-- Create a task that inserts new name records from the rawstream1 stream into the names table
-- every minute when the stream contains records.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_names
  warehouse = etl_wh
  schedule = '1 minute'
when
  system$stream_has_data('rawstream1')
as
  merge into names n
    using (select var:id id, var:fname fname, var:lname lname from rawstream1) r1
    on n.id = to_number(r1.id)
  when matched then update set n.first_name = r1.fname, n.last_name = r1.lname
  when not matched then insert (id, first_name, last_name) values (r1.id, r1.fname, r1.lname);

-- Create another task that merges visitation records from the rawstream2 stream into the visits table
-- every minute when the stream contains records.
-- Records with new IDs are inserted into the visits table;
-- Records with IDs that exist in the visits table update the DT column in the table.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_visits
  warehouse = etl_wh
  schedule = '1 minute'
when
  system$stream_has_data('rawstream2')
as
  merge into visits v
    using (select var:id id, var:visit_dt visit_dt from rawstream2) r2
    on v.id = to_number(r2.id)
  when matched then update set v.dt = r2.visit_dt
  when not matched then insert (id, dt) values (r2.id, r2.visit_dt);

-- Resume both tasks.
alter task raw_to_names resume;
- A. The role used to resume the task does not have EXECUTE TASK privilege. Only TASK OWNER can provide that privilege to the role.
- B. The role used to resume the task does not have EXECUTE TASK privilege. Only ACCOUNTADMIN can provide that privilege to the role.
- C. The role used to resume the task does not have EXECUTE TASK privilege. Both SECURITYADMIN and ACCOUNTADMIN can provide that privilege to the role.
Answer: B
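A minimal sketch of the fix follows. EXECUTE TASK is a global (account-level) privilege, so it is granted on the account rather than on an individual task. The role name etl_role is hypothetical and stands in for whichever role owns the tasks.

```sql
-- ACCOUNTADMIN holds the global EXECUTE TASK privilege by default and can grant it.
USE ROLE ACCOUNTADMIN;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE etl_role;  -- etl_role is a hypothetical task-owning role

-- The task owner can now resume the tasks.
USE ROLE etl_role;
ALTER TASK raw_to_names RESUME;
ALTER TASK raw_to_visits RESUME;
```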
NEW QUESTION # 58
You are running a large join on Snowflake. You ran it on a medium warehouse and it took almost an hour to run. You then tried to run the join on a large warehouse, but the performance still did not improve.
What is the most likely cause of this?
- A. Your data may be skewed: one value of the join column occurs significantly more often than the rest of the values in the column.
- B. Your warehouses do not have enough memory.
- C. Since you have configured a warehouse with a low auto-suspend value, your warehouse is going down frequently.
Answer: A
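One way to check for this kind of skew is to count rows per join key and look at the top of the distribution. The table name big_table and column name join_key below are hypothetical placeholders.

```sql
-- List the ten most frequent join-key values and their share of all rows.
SELECT
  join_key,
  COUNT(*)                                 AS row_count,
  RATIO_TO_REPORT(COUNT(*)) OVER ()        AS share_of_rows
FROM big_table
GROUP BY join_key
ORDER BY row_count DESC
LIMIT 10;
```

If a single value accounts for a large share of the rows, the work for that value lands on one part of the warehouse, so a larger warehouse does not shorten the join.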
NEW QUESTION # 59
A Snowflake Architect is designing an application and tenancy strategy for an organization where strong legal isolation rules as well as multi-tenancy are requirements.
Which approach will meet these requirements if Role-Based Access Control (RBAC) is a viable option for isolating tenants?
- A. Create a multi-tenant table strategy if row level security is not viable for isolating tenants.
- B. Create accounts for each tenant in the Snowflake organization.
- C. Create an object for each tenant strategy if row level security is not viable for isolating tenants.
- D. Create an object for each tenant strategy if row level security is viable for isolating tenants.
Answer: C
Explanation:
An object-per-tenant strategy gives each tenant its own database, schema, or set of tables inside a single Snowflake account, and tenants are isolated from one another by granting each tenant's role access only to that tenant's objects. This fits the scenario because RBAC is stated to be viable for isolation, while row-level security on shared multi-tenant tables is not, since strong legal isolation rules typically forbid commingling tenant data in the same table. Creating one account per tenant provides even stronger isolation, but it is the fallback for cases where RBAC is not sufficient.
Reference:
Snowflake Organizations Overview | Snowflake Documentation
Design Patterns for Building Multi-Tenant Applications on Snowflake
NEW QUESTION # 60
Data is being imported and stored as JSON in a VARIANT column. Query performance was fine, but most recently, poor query performance has been reported.
What could be causing this?
- A. The recent data imports contained fewer fields than usual.
- B. The order of the keys in the JSON was changed.
- C. There were variations in string lengths for the JSON values in the recent data imports.
- D. There were JSON nulls in the recent data imports.
Answer: D
NEW QUESTION # 61
An Architect has designed a data pipeline that is receiving small CSV files from multiple sources. All of the files are landing in one location. Specific files are filtered for loading into Snowflake tables using the copy command. The loading performance is poor.
What changes can be made to improve the data loading performance?
- A. Create a multi-cluster warehouse and merge smaller files to create bigger files.
- B. Increase the size of the virtual warehouse.
- C. Change the file format from CSV to JSON.
- D. Create a specific storage landing bucket to avoid file scanning.
Answer: A
Explanation:
According to the Snowflake documentation, the data loading performance can be improved by following some best practices and guidelines for preparing and staging the data files. One of the recommendations is to aim for data files that are roughly 100-250 MB (or larger) in size compressed, as this will optimize the number of parallel operations for a load. Smaller files should be aggregated and larger files should be split to achieve this size range. Another recommendation is to use a multi-cluster warehouse for loading, as this will allow for scaling up or out the compute resources depending on the load demand. A single-cluster warehouse may not be able to handle the load concurrency and throughput efficiently. Therefore, by creating a multi-cluster warehouse and merging smaller files to create bigger files, the data loading performance can be improved. References:
* Data Loading Considerations
* Preparing Your Data Files
* Planning a Data Load
NEW QUESTION # 62
What Snowflake features should be leveraged when modeling using Data Vault?
- A. Snowflake's support of multi-table inserts into the data model's Data Vault tables
- B. Snowflake's ability to hash keys so that hash key joins can run faster than integer joins
- C. Scaling up the virtual warehouses will support parallel processing of new source loads
- D. Data needs to be pre-partitioned to obtain a superior data access performance
Answer: A
Explanation:
These two features are relevant for modeling using Data Vault on Snowflake. Data Vault is a data modeling approach that organizes data into hubs, links, and satellites. Data Vault is designed to enable high scalability, flexibility, and performance for data integration and analytics. Snowflake is a cloud data platform that supports various data modeling techniques, including Data Vault. Snowflake provides some features that can enhance the Data Vault modeling, such as:
* Snowflake's support of multi-table inserts into the data model's Data Vault tables. Multi-table inserts (MTI) are a feature that allows inserting data from a single query into multiple tables in a single DML statement. MTI can improve the performance and efficiency of loading data into Data Vault tables, especially for real-time or near-real-time data integration. MTI can also reduce the complexity and maintenance of the loading code, as well as the data duplication and latency.
* Scaling up the virtual warehouses will support parallel processing of new source loads. Virtual warehouses are a feature that allows provisioning compute resources on demand for data processing. Virtual warehouses can be scaled up or down by changing the size of the warehouse, which determines the number of servers in the warehouse. Scaling up the virtual warehouses can improve the performance and concurrency of processing new source loads into Data Vault tables, especially for large or complex data sets. Scaling up the virtual warehouses can also leverage the parallelism and distribution of Snowflake's architecture, which can optimize the data loading and querying.
References:
* Snowflake Documentation: Multi-table Inserts
* Snowflake Blog: Tips for Optimizing the Data Vault Architecture on Snowflake
* Snowflake Documentation: Virtual Warehouses
* Snowflake Blog: Building a Real-Time Data Vault in Snowflake
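A minimal sketch of a multi-table insert that loads a hub and a satellite from one staging query is shown below. All table and column names (stg_customer, hub_customer, sat_customer, and their columns) are hypothetical, and the example leaves out the deduplication that a real hub load would need.

```sql
-- One pass over the staging data feeds two Data Vault tables.
INSERT ALL
  INTO hub_customer (customer_hk, customer_id, load_ts, record_source)
    VALUES (customer_hk, customer_id, load_ts, record_source)
  INTO sat_customer (customer_hk, load_ts, customer_name, email)
    VALUES (customer_hk, load_ts, customer_name, email)
SELECT
  MD5(TO_VARCHAR(customer_id)) AS customer_hk,   -- hash key derived from the business key
  customer_id,
  customer_name,
  email,
  CURRENT_TIMESTAMP()          AS load_ts,
  'crm'                        AS record_source  -- hypothetical source label
FROM stg_customer;
```

Because the staging query runs once, the hub and satellite rows share the same hash key and load timestamp without a second scan of the source.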
NEW QUESTION # 63
A company is storing large numbers of small JSON files (ranging from 1-4 bytes) that are received from IoT devices and sent to a cloud provider. In any given hour, 100,000 files are added to the cloud provider.
What is the MOST cost-effective way to bring this data into a Snowflake table?
- A. An external table
- B. A copy command at regular intervals
- C. A pipe
- D. A stream
Answer: C
Explanation:
A pipe is a Snowflake object that continuously loads data from files in a stage (internal or external) into a table. A pipe can be configured to use auto-ingest, which means that Snowflake automatically detects new or modified files in the stage and loads them into the table without any manual intervention.
A pipe is the most cost-effective way to bring large numbers of small JSON files into a Snowflake table, because it minimizes the number of COPY commands executed and the number of micro-partitions created. A pipe can use file aggregation, which means that it can combine multiple small files into a single larger file before loading them into the table. This reduces the load time and the storage cost of the data.
An external table is a Snowflake object that references data files stored in an external location, such as Amazon S3, Google Cloud Storage, or Microsoft Azure Blob Storage. An external table does not store the data in Snowflake, but only provides a view of the data for querying. An external table is not a cost-effective way to bring data into a Snowflake table, because it does not support file aggregation, and it requires additional network bandwidth and compute resources to query the external data.
A stream is a Snowflake object that records the history of changes (inserts, updates, and deletes) made to a table. A stream can be used to consume the changes from a table and apply them to another table or a task. A stream is not a way to bring data into a Snowflake table, but a way to process the data after it is loaded into a table.
A copy command is a Snowflake command that loads data from files in a stage into a table. A copy command can be executed manually or scheduled using a task. A copy command is not a cost-effective way to bring large numbers of small JSON files into a Snowflake table, because it does not support file aggregation, and it may create many micro-partitions that increase the storage cost of the data.
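A minimal sketch of an auto-ingest pipe follows. The names iot_raw, iot_stage, and iot_pipe are hypothetical, and the sketch assumes iot_stage is an external stage whose cloud storage location already has event notifications configured for Snowflake.

```sql
-- Landing table with a single VARIANT column for the JSON payloads.
CREATE OR REPLACE TABLE iot_raw (payload VARIANT);

-- The pipe wraps a COPY statement; AUTO_INGEST lets storage notifications trigger loads.
CREATE OR REPLACE PIPE iot_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO iot_raw
  FROM @iot_stage
  FILE_FORMAT = (TYPE = 'JSON');

-- Check whether the pipe is running and how many files are pending.
SELECT SYSTEM$PIPE_STATUS('iot_pipe');
```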
NEW QUESTION # 64
Is it possible for a data provider account with a Snowflake Business Critical edition to share data with an Enterprise edition data consumer account?
- A. If a user in the provider account with a share owning role sets share_restrictions to False when adding an Enterprise consumer account, it can import the share.
- B. A Business Critical account cannot be a data sharing provider to an Enterprise consumer. Any consumer accounts must also be Business Critical.
- C. If a user in the provider account with role authority to create or alter share adds an Enterprise account as a consumer, it can import the share.
- D. If a user in the provider account with a share owning role which also has override share restrictions privilege share_restrictions set to False when adding an Enterprise consumer account, it can import the share.
Answer: D
Explanation:
* Data sharing is a feature that allows Snowflake accounts to share data with each other without the need for data movement or copying1. Data sharing is enabled by creating shares, which are collections of database objects (tables, views, secure views, and secure UDFs) that can be accessed by other accounts, called consumers2.
* By default, Snowflake does not allow sharing data from a Business Critical edition account to a non-Business Critical edition account. This is because Business Critical edition offers higher levels of data protection and encryption than other editions, and sharing data with lower editions may compromise the security and compliance of the data3.
* However, Snowflake provides the OVERRIDE SHARE RESTRICTIONS global privilege, which allows a user to override the default restriction and share data from a Business Critical edition account to a non-Business Critical edition account. This privilege is granted to the ACCOUNTADMIN role by default, and can be granted to other roles as well4.
* To enable data sharing from a Business Critical edition account to an Enterprise edition account, the following steps are required34:
* A user in the provider account with the OVERRIDE SHARE RESTRICTIONS privilege must create or alter a share and add the Enterprise edition account as a consumer. The user must also set the share_restrictions parameter to False when adding the consumer. This parameter indicates whether the share is restricted to Business Critical edition accounts only. Setting it to False allows the share to be imported by lower edition accounts.
* A user in the consumer account with the IMPORT SHARE privilege must then create a database from the share and grant access on it to other roles in the account. No share_restrictions setting is required on the consumer side; the parameter is specified only by the provider when the consumer account is added to the share.
References:
* 1: Introduction to Secure Data Sharing | Snowflake Documentation
* 2: Creating Secure Data Shares | Snowflake Documentation
* 3: Enable Data Share:Business Critical Account to Lower Edition | Medium
* 4: Enabling sharing from a Business critical account to a non-business ... | Snowflake Documentation
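The provider-side steps can be sketched as follows. The role share_admin, the share sales_share, and the account identifier consumer_org.consumer_acct are hypothetical, and the sketch assumes share_admin already owns the share.

```sql
-- Give the share-owning role permission to bypass the edition restriction.
USE ROLE ACCOUNTADMIN;
GRANT OVERRIDE SHARE RESTRICTIONS ON ACCOUNT TO ROLE share_admin;

-- Add the Enterprise edition consumer and explicitly lift the restriction.
USE ROLE share_admin;
ALTER SHARE sales_share
  ADD ACCOUNTS = consumer_org.consumer_acct
  SHARE_RESTRICTIONS = FALSE;
```

Without the OVERRIDE SHARE RESTRICTIONS privilege, the ALTER SHARE statement with SHARE_RESTRICTIONS = FALSE is rejected, which is why option D names both conditions.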
NEW QUESTION # 65
An Architect needs to grant a group of ORDER_ADMIN users the ability to clean old data in an ORDERS table (deleting all records older than 5 years), without granting any privileges on the table. The group's manager (ORDER_MANAGER) has full DELETE privileges on the table.
How can the ORDER_ADMIN role be enabled to perform this data cleanup, without needing the DELETE privilege held by the ORDER_MANAGER role?
- A. Create a stored procedure that runs with caller's rights, including the appropriate "> 5 years" business logic, and grant USAGE on this procedure to ORDER_ADMIN. The ORDER_MANAGER role owns the procedure.
- B. Create a stored procedure that runs with owner's rights, including the appropriate "> 5 years" business logic, and grant USAGE on this procedure to ORDER_ADMIN. The ORDER_MANAGER role owns the procedure.
- C. This scenario would actually not be possible in Snowflake - any user performing a DELETE on a table requires the DELETE privilege to be granted to the role they are using.
- D. Create a stored procedure that can be run using both caller's and owner's rights (allowing the user to specify which rights are used during execution), and grant USAGE on this procedure to ORDER_ADMIN. The ORDER_MANAGER role owns the procedure.
Answer: B
Explanation:
This is the correct answer because it allows the ORDER_ADMIN role to perform the data cleanup without holding the DELETE privilege on the ORDERS table. A stored procedure packages SQL statements and procedural logic into a callable object, and it can run with either caller's rights or owner's rights. A caller's rights stored procedure runs with the privileges of the role that calls it, while an owner's rights stored procedure runs with the privileges of the role that owns it. By creating a stored procedure that runs with owner's rights, the ORDER_MANAGER role can delegate the specific task of deleting old data to the ORDER_ADMIN role, without granting the ORDER_ADMIN role more general privileges on the ORDERS table. The stored procedure must include the appropriate business logic to delete only the records older than 5 years, and the ORDER_MANAGER role must grant the USAGE privilege on the stored procedure to the ORDER_ADMIN role. The ORDER_ADMIN role can then execute the stored procedure to perform the data cleanup.
Reference:
Snowflake Documentation: Stored Procedures
Snowflake Documentation: Understanding Caller's Rights and Owner's Rights Stored Procedures
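A minimal sketch of such a procedure in Snowflake Scripting is shown below. The procedure name purge_old_orders and the column order_date are hypothetical, and the sketch assumes ORDER_ADMIN already has USAGE on the containing database and schema.

```sql
-- Created by ORDER_MANAGER, so the procedure is owned by the role that holds DELETE.
USE ROLE ORDER_MANAGER;

CREATE OR REPLACE PROCEDURE purge_old_orders()
  RETURNS STRING
  LANGUAGE SQL
  EXECUTE AS OWNER   -- run with ORDER_MANAGER's privileges, not the caller's
AS
$$
BEGIN
  -- Business logic is fixed inside the procedure: only rows older than 5 years.
  DELETE FROM orders
  WHERE order_date < DATEADD(year, -5, CURRENT_DATE());
  RETURN 'Old orders deleted';
END;
$$;

GRANT USAGE ON PROCEDURE purge_old_orders() TO ROLE ORDER_ADMIN;

-- ORDER_ADMIN can call the procedure but cannot run its own DELETE statements.
USE ROLE ORDER_ADMIN;
CALL purge_old_orders();
```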
NEW QUESTION # 66
You want to automatically delete the files from a stage after a successful load using the COPY INTO command.
What is the recommended approach for deletion?
- A. No need to do anything, Snowflake does it automatically
- B. Set REMOVE=TRUE in the COPY INTO Command
- C. Set PURGE=TRUE in the COPY INTO command
Answer: C
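As a short sketch, the option is set directly on the load statement. The table my_table, the stage my_stage, and the CSV file format are hypothetical.

```sql
-- Files that load successfully are removed from the stage afterwards.
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = 'CSV')
  PURGE = TRUE;
```

PURGE defaults to FALSE, so staged files stay in place unless the option is set or the files are removed separately with the REMOVE command.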
NEW QUESTION # 67
Data is being imported and stored as JSON in a VARIANT column. Query performance was fine, but most recently, poor query performance has been reported.
What could be causing this?
- A. The recent data imports contained fewer fields than usual.
- B. There were variations in string lengths for the JSON values in the recent data imports.
- C. There were JSON nulls in the recent data imports.
- D. The order of the keys in the JSON was changed.
Answer: C
Explanation:
When semi-structured data is loaded into a VARIANT column, Snowflake extracts as many elements as possible into an internal columnar form. A query on an extracted element scans only that column, while a query on an element that was not extracted must scan the whole JSON structure and traverse it for every row, which is slower.
* There were JSON nulls in the recent data imports. According to the Snowflake documentation, an element that contains even a single JSON null value is not extracted into a column. This applies to explicit null values only, not to missing elements. Recent imports that contain JSON nulls would therefore stop the affected elements from being extracted and degrade query performance.
The other options are not likely causes of poor query performance:
* The recent data imports contained fewer fields than usual. Missing elements are still represented in columnar form, so having fewer fields should not slow queries down.
* There were variations in string lengths for the JSON values in the recent data imports. The length of a string value does not determine whether an element is extracted into a column.
* The order of the keys in the JSON was changed. Elements are addressed by key name, so a change in key order should not affect how they are stored or queried.
References:
* Considerations for Semi-structured Data Stored in VARIANT
* Snowflake Architect Training
* Snowflake query performance on unique element in variant column
* Snowflake variant performance
NEW QUESTION # 68
What is a key consideration when setting up search optimization service for a table?
- A. Search optimization service can help to optimize storage usage by compressing the data into a GZIP format.
- B. Search optimization service can significantly improve query performance on partitioned external tables.
- C. Search optimization service works best with a column that has a minimum of 100 K distinct values.
- D. The table must be clustered with a key having multiple columns for effective search optimization.
Answer: C
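A quick way to apply this consideration is to estimate a column's cardinality before enabling the service on it. The table orders and column customer_id are hypothetical.

```sql
-- Estimate how many distinct values the candidate column holds.
SELECT APPROX_COUNT_DISTINCT(customer_id) AS distinct_customers
FROM orders;

-- If the count is high enough, enable search optimization for equality lookups on that column.
ALTER TABLE orders ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);
```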
NEW QUESTION # 69
What are some of the characteristics of result set caches? (Choose three.)
- A. The retention period can be reset for a maximum of 31 days.
- B. The result set cache is not shared between warehouses.
- C. The data stored in the result cache will contribute to storage costs.
- D. Each time persisted results for a query are used, a 24-hour retention period is reset.
- E. Snowflake persists the data results for 24 hours.
- F. Time Travel queries can be executed against the result set cache.
Answer: A,D,E
Explanation:
Comprehensive and Detailed Explanation: According to the SnowPro Advanced: Architect documents and learning resources, some of the characteristics of result set caches are:
* Snowflake persists the data results for 24 hours. This means that the result set cache holds the results of every query executed in the past 24 hours, and can be reused if the same query is submitted again and the underlying data has not changed [1].
* Each time persisted results for a query are used, a 24-hour retention period is reset. This means that the result set cache extends the lifetime of the results every time they are reused, up to a maximum of 31 days from the date and time that the query was first executed [1].
* The retention period can be reset for a maximum of 31 days. This means that the result set cache will purge the results after 31 days, regardless of whether they are reused or not. After 31 days, the next time the query is submitted, a new result is generated and persisted [1].
The other options are incorrect because they are not characteristics of result set caches. Option F is incorrect because Time Travel queries cannot be executed against the result set cache. Time Travel queries use the AT | BEFORE clause to access historical data that is stored in the storage layer, not the result set cache [2]. Option C is incorrect because the data stored in the result set cache does not contribute to storage costs. The result set cache is maintained by the service layer, and does not incur any additional charges [1]. Option B is incorrect because the result set cache is shared between warehouses. The result set cache is available across virtual warehouses, so query results returned to one user are available to any other user on the system who executes the same query, provided the underlying data has not changed [1].
References:
* 1: Using Persisted Query Results | Snowflake Documentation
* 2: Time Travel | Snowflake Documentation
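The reuse behavior described above can be observed in a session with a few statements. This is a minimal sketch; the table `sales` and its columns are hypothetical.

```sql
-- Hypothetical table name; any deterministic query works.
SELECT store_id, SUM(amount) AS total FROM sales GROUP BY store_id;

-- Re-running the identical query text while the underlying data is unchanged
-- can be answered from the persisted result instead of being recomputed.
SELECT store_id, SUM(amount) AS total FROM sales GROUP BY store_id;

-- Persisted results can also be post-processed directly.
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) WHERE total > 1000;

-- Disable result reuse for the session, for example when benchmarking.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```

The query profile of the second statement indicates whether the result was served from the cache.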
NEW QUESTION # 70
Files arrive in an external stage every 10 seconds from a proprietary system. The files range in size from 500 K to 3 MB. The data must be accessible by dashboards as soon as it arrives.
How can a Snowflake Architect meet this requirement with the LEAST amount of coding? (Choose two.)
- A. Use Snowpipe with auto-ingest.
- B. Use a COPY command with a task.
- C. Use a combination of a task and a stream.
- D. Use a materialized view on an external table.
- E. Use the COPY INTO command.
Answer: A,C
NEW QUESTION # 71
Consider the following COPY command which is loading data with CSV format into a Snowflake table from an internal stage through a data transformation query.
This command results in the following error:
SQL compilation error: invalid parameter 'validation_mode'
Assuming the syntax is correct, what is the cause of this error?
- A. The value return_all_errors of the option VALIDATION_MODE is causing a compilation error.
- B. The VALIDATION_MODE parameter supports COPY statements that load data from external stages only.
- C. The VALIDATION_MODE parameter does not support COPY statements with CSV file formats.
- D. The VALIDATION_MODE parameter does not support COPY statements that transform data during a load.
Answer: D
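The Snowflake documentation for COPY INTO states that VALIDATION_MODE does not support COPY statements that transform data during a load. The sketch below contrasts the two forms; the table, stage, and column names are hypothetical, since the original command is not reproduced in the question.

```sql
-- Hypothetical table, stage, and file format.

-- Plain load: VALIDATION_MODE is accepted and reports errors without loading.
COPY INTO my_table
FROM @my_stage
FILE_FORMAT = (TYPE = CSV)
VALIDATION_MODE = RETURN_ERRORS;

-- Load with a transformation query: VALIDATION_MODE must be omitted here,
-- because adding it to this form raises the compilation error.
COPY INTO my_table (id, name)
FROM (SELECT t.$1, UPPER(t.$2) FROM @my_stage t)
FILE_FORMAT = (TYPE = CSV);
```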
NEW QUESTION # 72
A retail company has over 3000 stores all using the same Point of Sale (POS) system. The company wants to deliver near real-time sales results to category managers. The stores operate in a variety of time zones and exhibit a dynamic range of transactions each minute, with some stores having higher sales volumes than others.
Sales results are provided in a uniform fashion using data engineered fields that will be calculated in a complex data pipeline. Calculations include exceptions, aggregations, and scoring using external functions interfaced to scoring algorithms. The source data for aggregations has over 100M rows.
Every minute, the POS sends all sales transactions files to a cloud storage location with a naming convention that includes store numbers and timestamps to identify the set of transactions contained in the files. The files are typically less than 10MB in size.
How can the near real-time results be provided to the category managers? (Select TWO).
- A. All files should be concatenated before ingestion into Snowflake to avoid micro-ingestion.
- B. A stream should be created to accumulate the near real-time data and a task should be created that runs at a frequency that matches the real-time analytics needs.
- C. A Snowpipe should be created and configured with AUTO_INGEST = true. A stream should be created to process INSERTS into a single target table using the stream metadata to inform the store number and timestamps.
- D. The copy into command with a task scheduled to run every second should be used to achieve the near-real time requirement.
- E. An external scheduler should examine the contents of the cloud storage location and issue SnowSQL commands to process the data at a frequency that matches the real-time analytics needs.
Answer: B,C
Explanation:
To provide near real-time sales results to category managers, the Architect can use the following steps:
* Create an external stage that references the cloud storage location where the POS sends the sales transactions files. The external stage should use the file format and encryption settings that match the source files [2].
* Create a Snowpipe that loads the files from the external stage into a target table in Snowflake. The Snowpipe should be configured with AUTO_INGEST = true, which means that it will automatically detect and ingest new files as they arrive in the external stage. Snowpipe keeps load metadata for each pipe, so a file that has already been loaded is not ingested again; the PURGE copy option is not supported in pipe definitions, so staged files have to be cleaned up separately [3].
* Create a stream on the target table that captures the INSERTS made by the Snowpipe. The stream should include the metadata columns that provide information about the file name, path, size, and last modified time. The stream should also have a retention period that matches the real-time analytics needs [4].
* Create a task that runs a query on the stream to process the near real-time data. The query should use the stream metadata to extract the store number and timestamps from the file name and path, and perform the calculations for exceptions, aggregations, and scoring using external functions. The query should also output the results to another table or view that can be accessed by the category managers. The task should be scheduled to run at a frequency that matches the real-time analytics needs, such as every minute or every 5 minutes.
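The four steps above can be sketched in Snowflake SQL as follows. Every object name, the bucket URL, the storage integration, and the column layout are assumptions made for illustration; the aggregation stands in for the more complex calculations the question describes.

```sql
-- All names, the URL, and the column layout are hypothetical.

-- 1. External stage over the cloud storage location written by the POS.
CREATE STAGE sales_stage
  URL = 's3://example-bucket/pos/'
  STORAGE_INTEGRATION = sales_int            -- assumed to exist already
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- 2. Landing table plus a pipe that loads new files as they arrive.
--    The file name is stored so store number and timestamp can be derived later.
CREATE TABLE raw_sales (
  file_name STRING,
  store_id  STRING,
  sold_at   TIMESTAMP_NTZ,
  amount    NUMBER(12, 2)
);

CREATE PIPE sales_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw_sales (file_name, store_id, sold_at, amount)
  FROM (
    SELECT METADATA$FILENAME, t.$1, t.$2, t.$3
    FROM @sales_stage t
  );

-- 3. Stream that records the rows inserted by the pipe.
CREATE STREAM raw_sales_stream ON TABLE raw_sales;

-- 4. Task that processes new rows each minute, only when the stream has data.
CREATE TABLE sales_results (
  store_id     STRING,
  sales_minute TIMESTAMP_NTZ,
  total_amount NUMBER(18, 2)
);

CREATE TASK process_sales_task
  WAREHOUSE = transform_wh                   -- assumed to exist already
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('RAW_SALES_STREAM')
AS
  INSERT INTO sales_results (store_id, sales_minute, total_amount)
  SELECT store_id, DATE_TRUNC('minute', sold_at), SUM(amount)
  FROM raw_sales_stream
  WHERE METADATA$ACTION = 'INSERT'
  GROUP BY store_id, DATE_TRUNC('minute', sold_at);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK process_sales_task RESUME;
```

Consuming the stream inside the task's DML advances the stream offset, so each run sees only rows loaded since the previous run.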
The other options are not optimal or feasible for providing near real-time results:
* All files should be concatenated before ingestion into Snowflake to avoid micro-ingestion. This option is not recommended because it would introduce additional latency and complexity in the data pipeline. Concatenating files would require an external process or service that monitors the cloud storage location and performs the file merging operation. This would delay the ingestion of new files into Snowflake and increase the risk of data loss or corruption. Moreover, concatenating files would not avoid micro-ingestion, as Snowpipe would still ingest each concatenated file as a separate load.
* An external scheduler should examine the contents of the cloud storage location and issue SnowSQL commands to process the data at a frequency that matches the real-time analytics needs. This option is not necessary because Snowpipe can automatically ingest new files from the external stage without requiring an external trigger or scheduler. Using an external scheduler would add more overhead and dependency to the data pipeline, and it would not guarantee near real-time ingestion, as it would depend on the polling interval and the availability of the external scheduler.
* The copy into command with a task scheduled to run every second should be used to achieve the near-real time requirement. This option is not feasible because tasks cannot be scheduled to run every second in Snowflake. The minimum interval for tasks is one minute, and even that is not guaranteed, as tasks are subject to scheduling delays and concurrency limits. Moreover, using the copy into command with a task would not leverage the benefits of Snowpipe, such as automatic file detection, load balancing, and micro-partition optimization.
References:
* 1: SnowPro Advanced: Architect | Study Guide
* 2: Snowflake Documentation | Creating Stages
* 3: Snowflake Documentation | Loading Data Using Snowpipe
* 4: Snowflake Documentation | Using Streams and Tasks for ELT
* Snowflake Documentation | Creating Tasks
* Snowflake Documentation | Best Practices for Loading Data
* Snowflake Documentation | Using the Snowpipe REST API
* Snowflake Documentation | Scheduling Tasks
NEW QUESTION # 73
......