HBase split starvation due to transactions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Trafodion | In Progress | High | Oliver Bucaojit |
Bug Description
We ran a longevity test on a system, running OE with 512 drivers.
Our maximum hfile size was set to 10GB.
After a while, we noticed the following in some of the HBase regionserver logs:
2015-04-27 10:35:06,990 INFO [regionserver60
2015-04-27 10:35:13,926 INFO [regionserver60
Looking at the hdfs GUI:
Contents of directory /apps/hbase/
559232b70b5340d
6c253a61ee344b1
8837fc13d3a241d
901c5708daa0485
b9d4bb1179414f9
bbe2057994194a2
Notice that the 2nd entry is over 10GB. It can't split because we have active transactions, and because our 512 drivers are not letting up, the split is being starved.
Once we killed the drivers, stopping new transactions, the split happened almost instantly.
Hall, Gary winding down...
1:48 PM
2015-04-27 17:48:57,235 INFO [regionserver60
The log line above says 5hr 52 mins, but the split actually took less than a minute once the transactions stopped.
We understand that a split must be delayed until active transactions have stopped, but in high-transaction environments we need to make sure that a window is provided for splits to actually happen.
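To illustrate the kind of window being asked for, here is a minimal sketch in Python. It is not Trafodion or HBase code: the `SplitGate` class and its method names are hypothetical, and it assumes one region with at most one split running at a time. The idea is that once a split is pending, new transactions are held back briefly so the active count can drain to zero, instead of the split waiting indefinitely behind a steady stream of new work.

```python
import threading


class SplitGate:
    """Hypothetical gate that gives a pending split a window to run."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0              # transactions currently in flight
        self._split_pending = False   # True while a split waits or runs

    def begin_transaction(self):
        with self._cond:
            # Hold new transactions while a split is pending, so the
            # active count can actually reach zero.
            while self._split_pending:
                self._cond.wait()
            self._active += 1

    def end_transaction(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def run_split(self, split_fn):
        with self._cond:
            # Assumption: only one split at a time for this region.
            while self._split_pending:
                self._cond.wait()
            self._split_pending = True
            # Drain: wait for in-flight transactions to finish.
            while self._active > 0:
                self._cond.wait()
        try:
            # New transactions stay blocked until the split completes.
            return split_fn()
        finally:
            with self._cond:
                self._split_pending = False
                self._cond.notify_all()
```

The trade-off is that new transactions stall while the drain and the split run. In the test above the split completed in under a minute once transactions stopped, so the stall may be tolerable, but a real fix would likely also need a timeout or policy for long-running transactions.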
Changed in trafodion:
status: New → In Progress
Refer to Blueprint: https://blueprints.launchpad.net/trafodion/+spec/dtm-handling-rebalance-split