Can create and delete volumes but euca-describe-volumes returns nothing

Bug #920408 reported by Jônata Tyska Carvalho
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Eucalyptus
New
Undecided
Unassigned

Bug Description

I'm using ubuntu 10.04 with euca 1.6.2 with 6 machines CLC/W and CC/SC and 4 NC's.

I can create and delete volumes, but euca-describe-volumes returns nothing. Below there is useful information:

When i run euca-describe-volumes, the following message appears in cloud-output.log

09:07:33 WARN VolumeManager | VolumeManager.java:258 Error getting volume information from the Storage Controller: com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1
09:07:33 DEBUG VolumeManager | VolumeManager.java:259 com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1
09:07:33 INFO ServiceSinkHandler | ServiceSinkHandler.java:167 :1327316853241:eucalyptus:uid:admin:c7a4d9d4-1cb3-4325-bbc2-d03926d02ac2:MSG_SERVICED:DescribeVolumesResponseType:25:

When i do euca-create-volume i get a positive message:

VOLUME vol-599C0627 1 creating 2012-01-23T11:08:19.146Z

But the time showed is 2 hours ahead of that time returned from date command.

sc-stats.log shows the volumes created correctly, lvdisplay shows the volumes too, and /var/lib/eucalyptus/volumes has the volumes.

sudo euca_conf --list-scs show the SC registered - 192.168.1.1

Volumes was working previously, but i had to change some network configuration (ip's and topology and the things stopped to work). I already deregister and register all things again, but it not solved the problem, i even reinstalled the eucalyptus-sc service and nothing. All other things seems to be working fine.

Should not have a sc.log on SC?

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

Here there is more useful information from cloud-debug.log, when euca-describe-volumes are executed:

11:48:07 WARN [VolumeManager:Volume.9] Error getting volume information from the Storage Controller: com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1
11:48:07 DEBUG [VolumeManager:Volume.9] com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1
com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1
 at com.eucalyptus.config.Configuration.getStorageControllerConfiguration(Configuration.java:309)
 at com.eucalyptus.images.util.StorageUtil.getVolumeReply(StorageUtil.java:121)
 at edu.ucsb.eucalyptus.cloud.ws.VolumeManager.DescribeVolumes(VolumeManager.java:255)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at org.mule.model.resolvers.AbstractEntryPointResolver.invokeMethod(AbstractEntryPointResolver.java:147)
 at org.mule.model.resolvers.ReflectionEntryPointResolver.invoke(ReflectionEntryPointResolver.java:127)
 at org.mule.model.resolvers.DefaultEntryPointResolverSet.invoke(DefaultEntryPointResolverSet.java:50)
 at org.mule.component.DefaultLifecycleAdapter.intercept(DefaultLifecycleAdapter.java:202)
 at org.mule.component.AbstractJavaComponent.invokeComponentInstance(AbstractJavaComponent.java:82)
 at org.mule.component.AbstractJavaComponent.doOnCall(AbstractJavaComponent.java:73)
 at org.mule.component.AbstractComponent.onCall(AbstractComponent.java:87)
 at org.mule.model.seda.SedaService$ComponentStageWorker.run(SedaService.java:533)
 at org.mule.work.WorkerContext.run(WorkerContext.java:310)
 at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1061)
 at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575)
 at java.lang.Thread.run(Thread.java:636)
11:48:07 INFO [ServiceSinkHandler:ReplyQueue.10] :1327326487578:eucalyptus:uid:admin:e72d4787-d8a8-42d6-ac58-aa6cdab5c5c9:MSG_SERVICED:DescribeVolumesResponseType:34:

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :
Download full text (6.6 KiB)

More useful information, when i try to attach a volume to an instance, it doesn't works and this messages appears on cloud-debug.log

--------------------

08:46:50 DEBUG [ClusterMessageQueue:Volume.10] :1327402010535:VolumeAttachCallback:uid:admin:554eddf9-a354-403e-ac14-dfb0076fad38:MSG_PENDING:cc-t300:<?xml version="1.0" encoding="UTF-8"?>
<euca:AttachVolume xmlns:euca="http://msgs.eucalyptus.ucsb.edu">
  <euca:BlockVolumeMessage>
  </euca:BlockVolumeMessage>
</euca:AttachVolume>:
08:46:50 DEBUG [ClusterMessageQueue:Volume.10] :1327402010536:VolumeAttachCallback:uid:admin:554eddf9-a354-403e-ac14-dfb0076fad38:MSG_PENDING:cc-t300:<?xml version="1.0" encoding="UTF-8"?>
<euca:AttachVolume xmlns:euca="http://msgs.eucalyptus.ucsb.edu">
  <euca:BlockVolumeMessage>
  </euca:BlockVolumeMessage>
</euca:AttachVolume>:
 at edu.ucsb.eucalyptus.cloud.ws.VolumeManager.AttachVolume(VolumeManager.java:321)
08:46:50 DEBUG [ClusterMessageQueue:Volume.10] Queued message of type VolumeAttachCallback for cluster cc-t300
08:46:50 DEBUG [ClusterMessageQueue:Thread-24] -> Dequeued message of type VolumeAttachCallback
08:46:50 INFO [ServiceSinkHandler:ReplyQueue.10] :1327402010540:eucalyptus:uid:admin:554eddf9-a354-403e-ac14-dfb0076fad38:MSG_SERVICED:AttachVolumeResponseType:31:
08:46:50 ERROR [OMSourcedElementImpl:New I/O client worker #2-18] Could not get parser from data source for element {http://eucalyptus.ucsb.edu/}AttachVolume
javax.xml.stream.XMLStreamException: Error in JiBX marshalling: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeType
Caused by: org.jibx.runtime.JiBXException: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeType
 at edu.ucsb.eucalyptus.msgs.AttachVolumeType.JiBX_eucalyptus_ucsb_edu_marshal_3_0(VmBlockDevice.groovy)
 at edu.ucsb.eucalyptus.msgs.JiBX_eucalyptus_ucsb_eduAttachVolumeType_access.marshal()
 at edu.ucsb.eucalyptus.msgs.AttachVolumeType.marshal(VmBlockDevice.groovy)
08:46:50 ERROR [WsSecHandler:New I/O client worker #2-18] org.apache.axiom.om.OMException: javax.xml.stream.XMLStreamException: java.lang.RuntimeException: Error obtaining parser from data source:Error in JiBX marshalling: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeType
org.apache.axiom.om.OMException: javax.xml.stream.XMLStreamException: java.lang.RuntimeException: Error obtaining parser from data source:Error in JiBX marshalling: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeType
Caused by: javax.xml.stream.XMLStreamException: java.lang.RuntimeException: Error obtaining parser from data source:Error in JiBX marshalling: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeType
Caused by: java.lang.RuntimeException: Error obtaining parser from data source:Error in JiBX marshalling: null value for element "{http://eucalyptus.ucsb.edu/}remoteDev" from object of type edu.ucsb.eucalyptus.msgs.AttachVolumeT...

Read more...

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

Another new, i deregistered SC (which has the IP 192.168.1.1 (eth0 of CC/SC machine)) and register a new SC, 10.1.1.254 (br0 of CC/SC machine). I can still create volumes, and delete them, but euca-describe-volumes put the following message on cloud-debug.log

DEBUG [VolumeManager:Volume.2] com.eucalyptus.config.NoSuchComponentException: StorageControllerConfiguration named 192.168.1.1

It means that for some reason, eucalyptus is getting older information (192.168.1.1 instead 10.1.1.254), maybe could this be the problem.

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

Found this lines on /var/lib/eucalyptus/db

INSERT INTO STORAGE_INFO VALUES(1,390,15,'cc-t300','eth0','//var/lib/eucalyptus/volumes',FALSE)
INSERT INTO STORAGE_INFO VALUES(2,50,10,'192.168.1.1','eth0','//var/lib/eucalyptus/volumes',FALSE)
INSERT INTO STORAGE_STATS_INFO VALUES(1,'cc-t300',11,11811160064)
INSERT INTO STORAGE_STATS_INFO VALUES(2,'192.168.1.1',3,32212254720)

But euca_conf --list-nodes returns:

registered storage controllers:
   cc-t300 10.1.1.254

It seems to be a database problem.

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

I'm doing progress here but i really need help to fix this. Now i think i found a way to easily reproduce this error.

My cluster (that is CC and SC) was named as 192.168.1.1 (name and IP), i saw that name mistake and deregister it and register it again with the name of cc-t300. The problems with euca-describe-volumes started at this point. Now i figured out why: i had some volumes created when the name of the cluster was still 192.168.1.1, and in the StorageUtil.java we have this:

public static edu.ucsb.eucalyptus.msgs.Volume getVolumeReply( Map<String, AttachedVolume> attachedVolumes, Volume v ) throws EucalyptusCloudException {
    StorageControllerConfiguration scConfig = Configuration.getStorageControllerConfiguration( v.getCluster( ) );

It means that the method gets the name of the Cluster looking to the volume, all they. But the volume has an old name of Cluster (192.168.1.1), and method getStorageControllerConfiguration can't communicate with it, throwing an exception and stopping the euca-describe-volumes.

So guys, if you wanna reproduce the error just register a cluster (and a storage) with the name clusterA, create some volumes, then deregister sc and cluster, and register again with the name clusterB, then do euca-describe-volumes and you will see the errors message on cloud-debug.log

I can get here on my own, but i really need help to get this fixed from here.

Thanks

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

Ok, my problem was solved but i think would be great if you dev guys could launch a bug fix to this.

I found the older records of volumes in eucalyptus_image.script and eucalyptus_storage.script in the directory /var/lib/eucalyptus/db of Cloud Controller, then i deleted that records and restarted the machine. The things are working now.

Revision history for this message
Lester Wade (lester-wade) wrote :

This was worked around by editing the storage files in /var/lib/eucalyptus/db/ to remove the bogus entries for the cluster controller, right?

Revision history for this message
Jônata Tyska Carvalho (jonatatyska) wrote :

That is it. eucalyptus_storage.script and eucaylptus_image.script in my case.

 The best thing to do is grep 'old content' /var/lib/eucalyptus/db/* and delete all records where the old content appears.

Revision history for this message
Andy Grimm (agrimm) wrote :

This issue is now being tracked upstream at http://eucalyptus.atlassian.net/browse/EUCA-2790

Please watch that issue for further updates.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.