Bulk Update statement

Bug #692669 reported by Olaf van der Spek
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
New
Wishlist
Unassigned

Bug Description

Assume strict mode is enabled and I have to update 10000 rows in a table with 100000 rows. The updates are independent of each other (so update xbt_files set leechers = 1 where fid in (1,2) doesn't work).
Executing 10000 queries isn't efficient. There's a nice trick via insert ... on duplicate key update, but it breaks in strict mode (and strict mode is good, so I don't want to disable it).

Another way is to execute the updates via update ... case ... when ... then ... else ... end. However, this way can be 20x slower (for 10000 updates) and the syntax is worse.
So I'd like to request support for bulk updates. I think batch/bulk updates are a common operation and deserve proper and efficient support.

Possible syntax:
update2 xbt_files (fid, leechers, seeders, completed) values (3,1,2,3),(2,4,5,6) set
leechers = values(leechers), seeders = values(seeders), completed = values(completed);

The syntax is just a possibility, the semantics are the most important part.

How to repeat:
set sql_mode = '';

drop table if exists xbt_files;

create table xbt_files
(
 fid int not null auto_increment,
 info_hash blob not null,
 leechers int not null default 0,
 seeders int not null default 0,
 completed int not null default 0,
 flags int not null default 0,
 mtime int not null,
 ctime int not null,
 primary key (fid),
 unique key (info_hash(20))
);

insert into xbt_files (info_hash, mtime, ctime) values ('oRUImUWFyXvKItw3tVfB',
unix_timestamp(), unix_timestamp()), ('7XnB2eksvUgHvavG3YPJ', unix_timestamp(),
unix_timestamp()), ('9EsYv9y72FuHsL5yQYf8', unix_timestamp(), unix_timestamp());

insert into xbt_files (fid, leechers, seeders, completed) values (2,6268,4510,38453) on
duplicate key update leechers = values(leechers), seeders = values(seeders), completed =
values(completed);

update xbt_files set leechers = case fid when 2 then 4 else leechers end, seeders = case
fid when 2 then 5 else seeders end, completed = case fid when 2 then 6 else completed end
where fid in (2);

set sql_mode = 'strict_all_tables';

insert into xbt_files (fid, leechers, seeders, completed) values (3,6268,4510,38453) on
duplicate key update leechers = values(leechers), seeders = values(seeders), completed =
values(completed);
# ERROR 1364 (HY000): Field 'info_hash' doesn't have a default value

update xbt_files set leechers = case fid when 3 then 7 else leechers end, seeders = case
fid when 3 then 8 else seeders end, completed = case fid when 3 then 9 else completed end
where fid in (3);

select * from xbt_files;

http://bugs.mysql.com/bug.php?id=34228

Revision history for this message
Kristian Nielsen (knielsen) wrote :

Have you tried sending multiple UPDATE statements in a single request?

I think you would be able to achieve most of the needed efficiency this way.

Changed in maria:
importance: Undecided → Wishlist
Revision history for this message
Olaf van der Spek (olafvdspek) wrote :

TBH I haven't. Why do you think this would achieve comparable performance?

Revision history for this message
Kristian Nielsen (knielsen) wrote :

> TBH I haven't. Why do you think this would achieve comparable performance?

The main overhead in executing 10000 statements one after the other is in round-trips between the client and server.

By sending multiple statements in batches, this overhead is eliminated.

Revision history for this message
Olaf van der Spek (olafvdspek) wrote :

That way, the total query size is 86 byte per row vs 20 b per row. That's no good. Queries are already 1+ mb.
This has to be send, transfered, received and parsed. Then you need to lock 10000 times. Then you might need to commit 10000 times.

Hmm, I guess benchmarking settles this dispute faster.

Revision history for this message
Sergey Petrunia (sergefp) wrote :

How about using a temporary table and a multi table UPDATE statement:

-- create a temptable with update 'specifications':
-- fid=3 : change (leechers, seeders, completed) to (7,8,9)
-- fid=2 : change (leechers, seeders, completed) to (4,5,6)
create temporary table updates (
  fid int,
  new_leechers int,
  new_seeders int,
  new_completed int
);
insert into updates values (3, 7, 8, 9), (2, 4, 5, 6);

update xbt_files, updates
set
  xbt_files.leechers = updates.new_leechers,
  xbt_files.seeders = updates.new_seeders,
  xbt_files.completed = updates.new_completed
where
  xbt_files.fid=updates.fid;

Revision history for this message
Olaf van der Spek (olafvdspek) wrote :

Doesn't work because the same row might be updated multiple times. It's also slower and requires 3 queries instead of 1.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.