SQLite timeout in glance image_cache

Bug #1362048 reported by Jordan Pittier
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Glance
Confirmed
Medium
zwei

Bug Description

Hi,
Sometime I get the following stack trace in Glance-API :

GET /v1/images/42646b2b-cf0b-4b15-b011-19d0a6880ffb HTTP/1.1" 200 4970175 2.403391
    for chunk in image_iter:
  File "/opt/stack/new/glance/glance/api/middleware/cache.py", line 281, in get_from_cache
    yield chunk
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/opt/stack/new/glance/glance/image_cache/drivers/sqlite.py", line 373, in open_for_read
    with self.get_db() as db:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/opt/stack/new/glance/glance/image_cache/drivers/sqlite.py", line 391, in get_db
    conn.execute('PRAGMA synchronous = NORMAL')
  File "/opt/stack/new/glance/glance/image_cache/drivers/sqlite.py", line 77, in execute
    return self._timeout(lambda: sqlite3.Connection.execute(
  File "/opt/stack/new/glance/glance/image_cache/drivers/sqlite.py", line 74, in _timeout
    sleep(0.05)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 31, in sleep
    hub.switch()
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
Timeout: 2 seconds

It happens also from time to time in the Gate. See the following logstash request :

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwicmV0dXJuIHNlbGYuZ3JlZW5sZXQuc3dpdGNoKClcIiBBTkQgZmlsZW5hbWU6XCJsb2dzL3NjcmVlbi1nLWFwaS50eHRcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQwOTEyNjQ1NjU3NywibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ==

This caused the gate failure of : http://logs.openstack.org/22/116622/2/check/check-tempest-dsvm-postgres-full/f079ef9/logs/screen-g-api.txt.gz? (wait for a full load of this page then grep "Timeout: 2 seconds")

Sorry for not being able to investigate more.

Jordan

Revision history for this message
Matthew Treinish (treinish) wrote :

It looks like you found a glance issue with tempest, but I'm not seeing how this bug affects tempest. Can you elaborate on how the bug is in tempest?

Changed in tempest:
status: New → Incomplete
Revision history for this message
Jordan Pittier (jordan-pittier) wrote :

Yeah, I though for a minute that to trigger a reverify in the gate, a bug in Tempest had to be logged. But that's obviously wrong.

no longer affects: tempest
Revision history for this message
Mark Washenberger (markwash) wrote :

Jordan,

Any ideas about how to reproduce this issue? Is it related to load on the disk where the sqlite file is stored?

Changed in glance:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Mark Washenberger (markwash) wrote :

I'm marking this as confirmed because its clear that the sqlite code anticipates a scenario where load on the sqlite db results in lock contention which blocks access.

It's not clear to me how to fix this, however. It seems like SQLite lock contention is pretty firmly baked into the cache implementation.

Revision history for this message
Jordan Pittier (jordan-pittier) wrote :

I have no idea how to reproduce the thing :( I never saw this problem on my local machine, I saw it only from time to time in the gate... It could possibly be related to a over loaded disk but it's hard to tell.

zwei (suifeng20)
Changed in glance:
assignee: nobody → zwei (suifeng20)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.