upsert
1 Topicadf upsert
Hi all, I'm trying to create sample Upsert data pipeline in ADF (insert missing records and update changed ones). I created sample source and target Postgres table with all data types available, the number of source and target columns slightly differ. CREATE TABLE "schema1".source_for_upsert_0001 ( col_serial SERIAL PRIMARY KEY, col_smallint SMALLINT, col_integer INTEGER, col_bigint BIGINT, col_decimal DECIMAL(10, 2), col_numeric NUMERIC(10, 2), col_real REAL, col_double DOUBLE PRECISION, col_smallserial SMALLSERIAL, col_serial_alias SERIAL, col_bigserial BIGSERIAL, col_money MONEY, col_char CHAR(5), col_varchar VARCHAR(50), col_text TEXT, col_bytea BYTEA, col_timestamp TIMESTAMP, col_timestamptz TIMESTAMPTZ, col_date DATE, col_time TIME, col_timetz TIMETZ, col_boolean BOOLEAN, col_uuid UUID, col_json JSON, col_jsonb JSONB, col_xml XML, col_inet INET, col_cidr CIDR, col_macaddr MACADDR, col_bit BIT(8), col_varbit VARBIT(16), col_interval INTERVAL, col_point POINT, col_line LINE, col_lseg LSEG, col_box BOX, col_path PATH, col_polygon POLYGON, col_circle CIRCLE, col_tsquery TSQUERY, col_tsvector TSVECTOR ); INSERT INTO "schema1".source_for_upsert_0001 ( col_smallint, col_integer, col_bigint, col_decimal, col_numeric, col_real, col_double, col_smallserial, col_serial_alias, col_bigserial, col_money, col_char, col_varchar, col_text, col_bytea, col_timestamp, col_timestamptz, col_date, col_time, col_timetz, col_boolean, col_uuid, col_json, col_jsonb, col_xml, col_inet, col_cidr, col_macaddr, col_bit, col_varbit, col_interval, col_point, col_line, col_lseg, col_box, col_path, col_polygon, col_circle, col_tsquery, col_tsvector ) VALUES ( 1, 100, 1000, 1234.56, 1234.56, 12.34, 12345.6789, 1, DEFAULT, DEFAULT, '$1234.56', 'A', 'Sample Text', 'This is a text field.', E'\x48656c6c6f', '2024-12-13 12:00:00', '2024-12-13 12:00:00+00', '2024-12-13', '12:00:00', '12:00:00+00', TRUE, '550e8400-e29b-41d4-a716-446655440000', '{"key": "value"}', '{"key": "value"}', '<note><to>User</to><message>Hello!</message></note>', '192.168.1.1', '192.168.0.0/24', '08:00:2b:01:02:03', B'10101010', B'1010101010101010', '1 year 2 months', '(1,1)', '((0,0),(1,1))', '((0,0),(1,1))', '((0,0),(2,2))', '((0,0),(1,1),(2,2),(2,0),(0,0))', '((0,0),(1,1),(2,2),(2,0),(0,0))', '<(1,1),1>', 'cat & dog', 'cat:3A dog:2A' ); CREATE TABLE "schema1".target_for_upsert_0001 ( col_2_serial SERIAL PRIMARY KEY, col_2_smallint SMALLINT, col_2_integer INTEGER, col_2a_integer INTEGER, col_2_bigint BIGINT, col_2_decimal DECIMAL(10, 2), col_2_numeric NUMERIC(10, 2), col_2_real REAL, col_2_double DOUBLE PRECISION, col_2_smallserial SMALLSERIAL, col_2_serial_alias SERIAL, col_2_bigserial BIGSERIAL, col_2_money MONEY, col_2_char CHAR(5), col_2_varchar VARCHAR(50), col_2_text TEXT, col_2_bytea BYTEA, col_2_timestamp TIMESTAMP, col_2_timestamptz TIMESTAMPTZ, col_2_date DATE, col_2_time TIME, col_2_timetz TIMETZ, col_2_boolean BOOLEAN, col_2_uuid UUID, col_2_json JSON, col_2_jsonb JSONB, col_2_xml XML, col_2_inet INET, col_2_cidr CIDR, col_2_macaddr MACADDR, col_2_bit BIT(8), col_2_varbit VARBIT(16), col_2_interval INTERVAL, col_2_point POINT, col_2_line LINE, col_2_lseg LSEG, col_2_box BOX, col_2_path PATH, col_2_polygon POLYGON, col_2_circle CIRCLE, col_2_tsquery TSQUERY, col_2_tsvector TSVECTOR ); I used "data flow". Upsert source: Upsert derived column - as we don't have "updated at" timestamp column in source / target, I plan to use md5 of all present row values to compare changes Upsert Alter Row - Upsert if: isNull(md5_columns)==false() Upsert Sink: Debugging: Could someone kindly look into it to advise what could be wrong? The last screenshot seems intended to clarify, but it doesn't really help to see the root cause.55Views0likes0Comments