Fix data compare by text key
Beginning with dbForge Studio of MySQL 7.3 (possibly earlier), and continuing into the recently released 7.4.201, I am seeing some incorrect results when using the data comparison feature.
Specifically, it seems to happen when I compare data based on a textual key column rather than the integer-based primary key. Here is an simple example that will produce incorrect results with 7.4.201:
-- ###########################################
-- Server 1
CREATE TABLE test (
id INT(10) UNSIGNED NOT NULL AUTOINCREMENT,
key
VARCHAR(255) NOT NULL,
value VARCHAR(80) NOT NULL,
PRIMARY KEY (id),
UNIQUE INDEX uniquekey (key
)
)
ENGINE = INNODB,
CHARACTER SET utf8,
COLLATE utf8generalci;
INSERT INTO test(id, key
, value) VALUES
(1, 'PROBLEM12', 'Not Only in Source, Not PROBLEM123456789'),
(2, 'PROBLEM123456789', 'Not Only in Target, Not Different'),
(3, 'NEW1', 'NEW1name'),
(4, 'NEW2', 'NEW2name');
-- ###########################################
-- Server 2
CREATE TABLE test (
id INT(10) UNSIGNED NOT NULL AUTOINCREMENT,
key
VARCHAR(255) NOT NULL,
value VARCHAR(80) NOT NULL,
PRIMARY KEY (id),
UNIQUE INDEX uniquekey (key
)
)
ENGINE = INNODB,
CHARACTER SET utf8,
COLLATE utf8generalci;
INSERT INTO test(id, key
, value) VALUES
(1, 'PROBLEM12', 'Not Only in Source, Not PROBLEM123456789'),
(2, 'PROBLEM123456789', 'Not Only in Target, Not Different');
After creating the table and data on two servers, run the "New Data Comparison" tool and select the Test table. Under the Column Mapping dialog:
id: (uncheck key, and compare, leaving only sync checked)
key: (check key, leaving compare and sync checked)
value: (no change, leave compare and sync checked)
Run the compare.
Under the "Only in Source" results, it will incorrectly include the record with key=PROBLEM12. Under Different Records, it will incorrectly display record id 1 as if it had the key belonging to record id 2. Under "Only in Target", it will incorrectly include the record with key=PROBLEM_123456789.
Naturally, this is very concerning, as it could lead to data corruption when deploying changes from one server to another. Also, the synchronization script contains results which are obviously incorrect and will produce and error (given the unique constraint on the key column).
--
-- Deleting data from table test
DELETE FROM test WHERE key
= 'PROBLEM_123456789';
--
-- Updating data of table test
UPDATE test SET id = 2, value = 'Not Only in Target, Not Different' WHERE key
= 'PROBLEM_123456789';
--
-- Inserting data into table test
INSERT INTO test(key
, id, value) VALUES
('NEW1', 3, 'NEW1name'),
('NEW2', 4, 'NEW2name'),
('PROBLEM12', 1, 'Not Only in Source, Not PROBLEM_123456789');
This will cause the UPDATE to not have any records to update, and then throw a unique key constraint error when attempting to insert a duplicate key for "PROBLEM12".