Eucalyptus

Strange behavior in managing image cache

Bug #837241 reported by wat on 2011-08-30

Affects      Status   Importance   Assigned to   Milestone
Eucalyptus   New      Undecided    Neil Soman

Bug Description

In Walrus's image cache management, cached images with useCount=1 are harder to remove than those with useCount=0.
A newly cached image created after euca-register has useCount=0, so it can be marked as removable whenever every other cached image has useCount > 0.

After several "Maximum image cache size exceeded" errors occurred, I tried to run 3 instances from different images at the same time, and 2 of them were terminated after a long time in "pending" status.
It seemed that they were removing each other's caches (see the log below), because the useCount of every other cached image was 1.

-----------------------------------

12:45:59 INFO [WalrusImageManager:Thread-81] Attempting to flush cached image: chtes2/vmlinuz-2.6.18-194.17.1.el5.manifest.xml
12:46:00 INFO [WalrusImageManager:Thread-81] Attempting to flush cached image: chtes5/initrd-2.6.18-194.17.1.el5.img.manifest.xml
12:46:01 INFO [WalrusImageManager:Thread-81] Attempting to flush cached image: chtes5/vmlinuz-2.6.18-194.17.1.el5.manifest.xml
12:46:02 INFO [WalrusImageManager:Thread-81] Attempting to flush cached image: chtes2/initrd-2.6.18-194.17.1.el5.img.manifest.xml
12:46:03 INFO [WalrusImageManager:Thread-81] Attempting to flush cached image: chtes2/TempBase1.img.manifest.xml
12:46:03 INFO [WalrusImageManager:Thread-81] Cached image: imgtesB/imgtesB1.manifest.xml size: 2147483648

12:58:55 INFO [WalrusImageManager:Thread-134] Attempting to flush cached image: imgtesB/imgtesB1.manifest.xml
12:58:57 INFO [WalrusImageManager:Thread-134] Cached image: chtes2/initrd-2.6.18-194.17.1.el5.img.manifest.xml size: 2683713

Environment:
Eucalyptus : 2.0.2
Host : CentOS5.5
Installed : binary(eucalyptus-2.0.2-centos-x86_64.tar.gz)
Hypervisor : kvm
VNET_MODE : MANAGED

-----------------------------------

An image cache that is in use must not be marked as removable.
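
To make this expectation concrete, here is a minimal sketch of a cache that refuses to evict an image while an instance is still using it. It is not the Walrus implementation: the names ImageCache, acquire, release, and in_use are hypothetical and exist only for illustration, and the sizes are arbitrary.

```python
class CacheFullError(Exception):
    """Raised when room cannot be made without evicting an in-use image."""


class ImageCache:
    """Hypothetical model of an image cache; not the actual Walrus code."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = {}  # manifest key -> {"size": int, "in_use": int}

    def used_bytes(self):
        return sum(e["size"] for e in self.entries.values())

    def acquire(self, key, size):
        """Cache the image if needed and pin it for a starting instance."""
        if key not in self.entries:
            self._make_room(size)
            self.entries[key] = {"size": size, "in_use": 0}
        self.entries[key]["in_use"] += 1

    def release(self, key):
        """Unpin the image once the instance no longer needs the cached copy."""
        self.entries[key]["in_use"] -= 1

    def _make_room(self, size):
        # Only images that no instance is currently using are candidates.
        idle = [k for k, e in self.entries.items() if e["in_use"] == 0]
        reclaimable = sum(self.entries[k]["size"] for k in idle)
        if self.used_bytes() - reclaimable + size > self.max_bytes:
            # Fail without touching anything rather than evict a pinned image.
            raise CacheFullError("Maximum image cache size exceeded")
        while self.used_bytes() + size > self.max_bytes:
            del self.entries[idle.pop(0)]
```

With this model, two instances that start at the same time cannot flush each other's images: the second request either fits alongside the first or fails immediately, and the pinned image stays cached until release is called.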

Also, it seems that useCount does not increase during a normal run-instance process, but does increase when the "Maximum image cache size exceeded" error occurs.
This may lead to the unreasonable result that an old, unpopular cache with useCount=1 lives longer than a new, popular cache with useCount=0.
Is there some reason not to increase useCount in every run-instance process?
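
As an illustration of the alternative being asked about, the sketch below counts every run and flushes the least-used image first. The helpers record_run and eviction_order and the image keys are hypothetical; this is only a model of the proposed policy, not code from Walrus.

```python
def record_run(use_counts, key):
    """Increment the use count on every run-instance, not only on cache errors."""
    use_counts[key] = use_counts.get(key, 0) + 1


def eviction_order(use_counts):
    """Return cache keys with the least-used image first.

    sorted() is stable, so images with equal counts keep insertion order.
    """
    return sorted(use_counts, key=lambda k: use_counts[k])


# Hypothetical keys: an old image run once, and a new image run three times.
use_counts = {}
record_run(use_counts, "old/unpopular.manifest.xml")
for _ in range(3):
    record_run(use_counts, "new/popular.manifest.xml")

first_to_flush = eviction_order(use_counts)[0]
```

Under this policy the old image that was run once becomes the first flush candidate, instead of outliving the new image that is run more often.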

thanks.