Latest Discussions
MERGE on delta table with source structure change
Hi everybody, I'm working on a Lakehouse on Synapse and want to merge two Delta tables in a PySpark notebook. We are on Apache Spark version 3.3. The structure of the source table may change; some columns may be deleted, for instance. I set the configuration "spark.databricks.delta.schema.autoMerge.enabled" to true, but I keep getting error messages such as "cannot resolve column1 in INSERT clause given columns source.column2, source.column3" when I try to load new source data containing only column2 and column3. Thanks for your help.
Petepete441610 · Jul 12, 2023 · Copper Contributor
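A minimal sketch of the merge in question, assuming a Delta target keyed on an id column; all paths, table names, and the join key are placeholders. With schema auto-merge enabled, the whenMatchedUpdateAll / whenNotMatchedInsertAll forms avoid the explicit INSERT column list that breaks when the source drops a column:

```python
# Sketch only: paths and the join key are invented; `spark` is the
# session object a Synapse notebook provides.
from delta.tables import DeltaTable

# Let source schema changes flow into the target during MERGE.
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

source_df = spark.read.format("delta").load("/lake/source")   # placeholder path
target = DeltaTable.forPath(spark, "/lake/target")            # placeholder path

(target.alias("t")
       .merge(source_df.alias("s"), "t.id = s.id")            # placeholder key
       .whenMatchedUpdateAll()      # no hand-written column list to go stale
       .whenNotMatchedInsertAll()   # target-only columns are inserted as NULL
       .execute())
```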
Azure Synapse Analytics Dedicated SQL pool - IIF statement Support required
Trying to use IIF() in a SELECT statement and getting the error "Incorrect syntax near '>'." in an Azure Synapse Analytics dedicated SQL pool. Tried the following: SELECT IIF(1>2, 'YES', 'NO'); This error (attached pic as ref) leads me to believe that IIF does not work at all in the Synapse dedicated SQL pool, and this document confirms it as well: IIF (Transact-SQL) - SQL Server | Microsoft Docs. I know that CASE is the alternative, but I want to know: is there any specific reason why IIF does not work in (or was not implemented for) the Synapse dedicated SQL pool? Any know-how regarding this is much appreciated!
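For reference, CASE is the ANSI equivalent of IIF and does run in the dedicated pool. A quick way to verify from Python with pyodbc; the endpoint, database, and auth mode below are assumptions, not the poster's setup:

```python
import pyodbc

# Placeholder dedicated-pool endpoint, database, and authentication mode.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=mysqlpool;"
    "Authentication=ActiveDirectoryInteractive;"
)
# CASE expresses the same logic IIF would.
row = conn.cursor().execute(
    "SELECT CASE WHEN 1 > 2 THEN 'YES' ELSE 'NO' END AS answer;"
).fetchone()
print(row.answer)  # NO
```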
NOT IN or IN Clause gives unexpected result Azure Synapse/Azure SQL Database
I have seen SQL developers frequently using IN / NOT IN (https://docs.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql?view=sql-server-ver15) / NOT EXISTS conditions in Azure Synapse/Azure SQL Database to filter out the rows that are not common to two tables, using joins and sub-queries. I recently observed that a NOT IN clause did not return the expected number of rows for a query. For example, you expect an output of 3022 rows not in table T1, but the actual result is really unexpected: zero rows. Oops! How is it possible that you should have gotten 3022 rows, yet the results window shows zero? (Expected vs. actual output screenshots omitted.)

Question: why is there a difference?
Answer: this is a limitation of the operator when your table contains NULL values. Yes, you read that correctly; this is documented behavior (https://docs.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql?view=sql-server-ver15#result-value).

Demo: in the case where NOT IN gives the expected result, the columns contain no NULLs. I then introduced NULL values into one of the columns the NOT IN / IN condition joins on, and the output changes completely because of that single NULL value (screenshots omitted).

How to identify the issue: check whether any of the columns involved contain NULL values.

Mitigation:
1. Use NOT EXISTS instead, which works even with NULL-valued columns; see the syntax at https://docs.microsoft.com/en-us/sql/t-sql/language-elements/exists-transact-sql?view=sql-server-ver15#e-using-not-exists.
2. Eliminate the NULLs.

That's all folks, cheers! Don't forget to comment / ask questions!
Mukund_Bhashkar · May 21, 2020 · Former Employee
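The same three-valued-logic rule applies in Spark SQL, so the pitfall is easy to reproduce in a Synapse notebook. The tables below are invented stand-ins, and `spark` is the notebook session:

```python
# t1 holds 1, 2, 3; t2 holds 1 and a NULL.
spark.createDataFrame([(1,), (2,), (3,)], ["id"]).createOrReplaceTempView("t1")
spark.createDataFrame([(1,), (None,)], ["id"]).createOrReplaceTempView("t2")

# NOT IN: the NULL makes every comparison UNKNOWN, so zero rows come back.
spark.sql("SELECT id FROM t1 WHERE id NOT IN (SELECT id FROM t2)").show()

# NOT EXISTS ignores the NULL row and returns 2 and 3 as expected.
spark.sql("""
    SELECT id FROM t1 a
    WHERE NOT EXISTS (SELECT 1 FROM t2 b WHERE b.id = a.id)
""").show()
```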
Trigger publish error Insufficient permissions to call this API.
I have created a trigger to run a pipeline on a schedule, but after publishing it shows "Insufficient permissions to call this API. b709b3ff-7fd4-4d33-a61b-13ec9df83db0 does not have Microsoft.Synapse/workspaces/credentials/useSecret/action on scope workspaces/skldevsynapse/credentials/WorkspaceSystemIdentity". The trigger then shows status "Stopped" and does not run. Someone help me please.
artichats · Sep 26, 2021 · Copper Contributor
Trigger ADF from REST API with parameters
I need to call a Data Factory pipeline from a website (or an Azure Function) using a POST REST API (likely https://learn.microsoft.com/en-us/rest/api/datafactory/pipelines/create-run?tabs=HTTP). The POST simply sends a list of parameters to ADF, and ADF reads this list and writes it to a storage account file. E.g., the REST API POST sends parameters 1, 2, 3 to ADF, and ADF appends a new row to the file like so: parameter 1, parameter 2, parameter 3. So far I have not been able to work out whether I should use a copy activity or a mapping data flow. My issue is how to create the source dataset with column1 = parameter1, column2 = parameter2, etc., and then sink it to the file. Very much appreciate your input. Thank you,
drhorg4 · Jun 14, 2023 · Copper Contributor
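One way to sketch the calling side, per the Create Run REST API linked above: pipeline parameters travel as the JSON body of the POST, and the pipeline declares matching parameters that the activity can then write out. Subscription, resource group, factory, and pipeline names are placeholders, and azure-identity is assumed for the token:

```python
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default"
).token

# Placeholder resource identifiers throughout.
url = (
    "https://management.azure.com/subscriptions/<sub-id>"
    "/resourceGroups/<rg>/providers/Microsoft.DataFactory"
    "/factories/<factory>/pipelines/<pipeline>/createRun"
    "?api-version=2018-06-01"
)
body = {"parameter1": "1", "parameter2": "2", "parameter3": "3"}

resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json()["runId"])  # handle for monitoring the run afterwards
```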
Newbie, exporting full result from Azure Synapse notebook cell
Hi, I have had success querying data from Customer Insights in Azure Synapse. I have a notebook attached to a Spark pool with the language set to Spark SQL. Now I have come to the point where I want to export the full result of a cell's query as a CSV, either to a blob or FTP (or anywhere). It seems that the result "preview" in the cells is a "TOP 1000", which is too few rows. Regards
Solved · FrancisRomstad · Jan 25, 2022 · Copper Contributor
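The cell preview is capped, but the full result can be written out from a PySpark cell in the same notebook; the query and the abfss output path below are placeholders:

```python
# Run the full query through the session rather than relying on the cell preview.
df = spark.sql("SELECT * FROM my_table")   # placeholder query

(df.coalesce(1)                            # single output file; drop for big results
   .write.mode("overwrite")
   .option("header", "true")
   .csv("abfss://container@account.dfs.core.windows.net/exports/result"))
```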
I am unable to connect SQL serverless Built-in pool with SQL Server Management Studio?
Hi, I have a CSV file in Azure Data Lake Gen2. I created an external table over this CSV file in the serverless SQL pool, and I am able to read the table in an Azure Synapse notebook. However, I get an error when I try to connect to this external table from SQL Server Management Studio; I am unable to read the table. Can anyone advise what the issue is here? (Error screenshot omitted.)
Solved · JasonB35 · Apr 14, 2022 · Copper Contributor
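One thing worth double-checking first: the serverless (built-in) pool has its own endpoint, typically <workspace>-ondemand.sql.azuresynapse.net, shown on the workspace Overview blade, and that is what SSMS must target. A connection test from Python with pyodbc, all names being placeholders:

```python
import pyodbc

# Placeholder serverless endpoint, database, and auth mode.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"
    "Database=mydb;"
    "Authentication=ActiveDirectoryInteractive;"
)
print(conn.cursor().execute(
    "SELECT TOP 5 * FROM dbo.my_external_table"   # placeholder table
).fetchall())
```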
Spark pool taking more than 10 mins to start
I have created 3 notebooks with the language set to PySpark. The Spark pool takes more than 10 minutes to start when it begins executing a notebook. When these notebooks run sequentially, the Spark pool starts and stops each time, although the pool's auto-pause setting is 60 minutes. Please let me know whether there is any workaround to speed up the Spark pool start-up, and whether the same Spark pool can be used by all 3 notebooks without stopping in between.
jasleen13 · Nov 25, 2020 · Copper Contributor
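The cold start cannot be eliminated entirely, but one session can serve all three notebooks: chain them from a single driver notebook with mssparkutils so the pool starts once. A sketch, with notebook names invented:

```python
# Runs each child notebook inside this notebook's already-started session.
from notebookutils import mssparkutils

for nb in ["notebook1", "notebook2", "notebook3"]:   # placeholder names
    mssparkutils.notebook.run(nb, 1800)              # 1800 s timeout per notebook
```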
Getting error "ErrorCode=InvalidTemplate, ErrorMessage=The expression 'udf(returnType=DoubleType())
Dear all, I have developed a PySpark notebook in Azure Synapse and am trying to use it in a pipeline and run it from there. The notebook runs fine from the Develop section, but from the pipeline I get the error below: "code":"BadRequest","message":"ErrorCode=InvalidTemplate, ErrorMessage=The expression 'udf(returnType=DoubleType()) \n' is not valid: the string character '=' at position '14' is not expected.\"","target":"pipeline/PrepairedDataforML/runid/3762828c-c806-4619-b49d-b9c9772c5c2c","details":null,"error":null}. Can someone tell me whether any additional setting is required in the pipeline when calling a notebook?
rahul_azure · Jan 25, 2021 · Copper Contributor
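For what it's worth, this error usually comes from the pipeline's template parser rather than from Spark: in pipeline JSON, a string value beginning with '@' is treated as a pipeline expression, so a base parameter whose value starts with '@udf(...)' can fail exactly like this (the documented escape for a literal leading '@' is '@@'). The decorator itself belongs inside the notebook; a sketch with invented names:

```python
# The '@' below is Python decorator syntax, which only the notebook should see.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.getOrCreate()

@udf(returnType=DoubleType())
def to_double(s):
    # Invented helper: parse strings to doubles, passing NULLs through.
    return float(s) if s is not None else None

df = spark.createDataFrame([("1.5",), (None,)], ["raw"])
df.select(to_double("raw").alias("value")).show()
```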
I'm new to Synapse. I am using Azure Synapse and I have noticed that there is an option to import an 'integration dataset'. I'm not sure what exactly it means or how it differs from some of the other options, for instance, and I can't find anything in the Microsoft documentation. Can anyone please explain to me what it means?
HamidBee · Jul 05, 2022 · Brass Contributor