The instructions to deploy microstack on a multipass vm fail

Bug #2024992 reported by John Lloyd Olsen
This bug affects 3 people
Affects          Status   Importance   Assigned to   Milestone
MicroStack       New      Undecided    Unassigned
OpenStack Snap   New      Undecided    Unassigned

Bug Description

On a fresh Ubuntu 22.04 host, I follow the instructions on https://microstack.run/. I create the required multipass vm and, as I have previously found some packages were missing (and I blame Canonical for this), I issue:

sudo apt update && sudo apt install -y openvswitch-switch-dpdk python3-neutronclient python3-oauth2client python3-openstackclient python3-pymysql python3-novaclient && sudo lxd init

and complete the initialisation of lxd as required. I continue to follow the instructions to end with:

ubuntu@microstack:~$ sunbeam cluster bootstrap --accept-defaults
Sunbeam Cluster not initialized
An unexpected error has occurred. Please run 'sunbeam inspect' to generate an inspection report.
Error: no close frame received or sent

Assuming this may mean simply a badly completed process which may actually have completed well enough to be workable, I do the following:

ubuntu@microstack:~$ sunbeam configure --accept-defaults --openrc demo-openrc
An unexpected error has occurred. Please run 'sunbeam inspect' to generate an inspection report.
Error: Leader for application 'keystone' is missing from model 'openstack'

If I omit to install the aforementioned packages I get a worse result.

So I run 'sunbeam inspect' only to find

ubuntu@microstack:~$ sunbeam inspect
⠋ Getting charm logs for openstack model ... ERROR model sunbeam-controller:microstack/0a3aaf7e-5724-4a2b-806a-bfcc5a397c51 not found
Error: Command '['/snap/openstack/182/juju/bin/juju', 'debug-log', '--model', '0a3aaf7e-5724-4a2b-806a-bfcc5a397c51', '--replay', '--no-tail']' returned non-zero exit status 2.

Why does Canonical publish instructions that persistently fail?

John Lloyd Olsen (johnnoe1958) wrote :

I'd like you people to know I also spent most of my weekend trying to install charmed kubernetes with nothing but repeated failure. You obviously do not bother to test your instructions against fresh machines that are likely to be similar to what developers will be using after a fresh install of the OS.

Why do you not test your own instructions properly?

Basically, the longer I live (I am 65) the more disgusted I am becoming with Canonical's Community offerings. You are not doing this section of your business adequately. Is this because accountants are running your software business for you?

John Olsen

John Lloyd Olsen (johnnoe1958) wrote :

Or is it because you 4 people: Olsen, Bryant, Mourereau and Matulis are simply doing a half-hearted incompetent job?

John Olsen

Billy Olsen (billy-olsen) wrote :

Hi John,

I'm sorry that you are having troubles installing the sunbeam Microstack. I'm not quite sure which challenges you have run into, but I'd like to try and recreate these and fix them so that you don't run into challenges like this. First of all, thank you for raising a bug letting us know of your challenges.

You indicate that you have found that some previous packages were missing and you have chosen to install the following packages:

openvswitch-switch-dpdk python3-neutronclient python3-oauth2client python3-openstackclient python3-pymysql python3-novaclient

How did you come about choosing those packages? When you said you have found these packages missing, how did you determine these packages were missing? The instructions do not mention these packages because they are not necessary.

The error message shown:
> Error: no close frame received or sent.

Indicates to me that there is some sort of networking issue in this particular configuration. It's not quite clear what the challenge is. Could you please re-run the sunbeam cluster bootstrap command as follows, so we can get some more verbose debugging output? The -v will add verbose output on the command line to give us a bit of a sense of what's going on.

sunbeam -v cluster bootstrap --accept-defaults

Your other error, which occurred when running the configure command: I'll raise a separate bug for that and spin it out from this one. The sunbeam configure command should certainly determine whether the bootstrap command has been run and, if it has not, provide a more meaningful message.

John Lloyd Olsen (johnnoe1958) wrote :

My computer is a 24 core, 128GB computer with INTEL 12TH GEN CORE I9-12900K 3.20GHZ DESKTOP PROCESSOR with a main 2TB HDD and a secondary 1TB HDD.

Billy Olsen (billy-olsen) wrote :

I have raised bug #2024996 for the configure step not checking that it's been bootstrapped.

I have also started an installation following the exact instructions from the microstack.run site and it appears to be working as the control plane installation is progressing along.

I have started a second installation with the additional packages and lxd initialization step to see if I can recreate the scenario with the error that you have run into.

Billy Olsen (billy-olsen) wrote :

Note, bug #1961192 looks to be related to the error that is reported around the "Error: no close frame received or sent." I am suspicious that the pylibjuju connection is not closing out the connection correctly as referenced in the bug - which then leads to this error that is seen.

John Lloyd Olsen (johnnoe1958) wrote :

Hello Billy,

Firstly thanks for being so prompt. Secondly I retract my remarks about the quality of you and your colleagues' work.

I am about to set about running the verbose-output instruction as you advised.

I found the packages were missing when trying to install Ussuri microstack by searching the web and discovering a document (which I apologise for not being able to find again.)

The thing was, with Ussuri, these packages did turn out (for me anyway) to be necessary to get that beta version to work. I also did have to initialise lxd it seemed. I did run the instructions on https://microstack.run/ just before the run I showed you, without any of those packages but the result seemed worse. I will run the instructions as you advised without those extraneous packages and let you know the (verbose) results asap ie now.

I am very sorry I lost my temper. I can only admit to having become your typical Angry Old Man.

Humbly

John Olsen

John Lloyd Olsen (johnnoe1958) wrote :

I am not sure whether it is important, but my "microstack" vm is configured as follows:

multipass launch --name microstack -c 20 -m 64G -d 200G

I am attaching the requested verbose output. The extraneous packages I mentioned previously were not installed for this run, which was performed on a fresh vm.

I ran "sunbeam -v cluster bootstrap --accept-defaults > shared/sunbeam-install-failing-john-olsen.txt"

As you can see at the end of the output the final point was reached at least 1/2 hour before I ctrl-C'd the process which appeared to be hanging with no changes. I hope I have not pre-empted the process.

John Olsen

PS: I am pasting in the last parts of responses I received post-bootstrapping attempt.

ubuntu@microstack:~$ sunbeam -v cluster bootstrap --accept-defaults > shared/sunbeam-install-failing-john-olsen.txt
^C
Aborted!
ubuntu@microstack:~$ sunbeam configure --accept-defaults --openrc demo-openrc
An unexpected error has occurred. Please run 'sunbeam inspect' to generate an inspection report.
Error: Leader for application 'keystone' is missing from model 'openstack'

ubuntu@microstack:~$ sunbeam inspect
DEBUG:root:Cannot access endpoint ('10.93.113.149:17070', '-----BEGIN
CERTIFICATE-----\nMIIEEzCCAnugAwIBAgIVAP6y5nKenxFpcI50e8tq39Z9ntSzMA0GCSqGSIb3DQEB\nCwUAMCExDTALBgNVBAoTBEp1anUxEDAOBgNVBAMTB2p1anUtY2EwHhcNMjMwNjI1\nMjIzMDE4WhcNMzMwNjI1MjIzNTE4WjAhMQ0wCwYDVQQ
KEwRKdWp1MRAwDgYDVQQD\nEwdqdWp1LWNhMIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAxyjNUm9I\nPsK2A0rQRs+mFgrBCkoRYza0L8fGngi+SYF1zt4QRwJ+HFtGh2M3amNLweqeXmnU\nuGvoQ+t7FNs0wIMjzTG5aFzBXIS/drREykFUrL
QarYX3kxvbvBZ98j7uC+Xw1mXB\nbHuVk9RbcnEIPWrgdFWC0jS9TDp4NN98nk5yBuH59L7R2kX66hyiZaRr6yJFJCnO\neRqS6+sMslOREAYkzzxNEykMn7dTamDBeNCm/tW0GcfTD9a24JKtiNhDHCAifaYm\nN4/KtOFy3vUzOKbx1jq7tJsxEEhJgpqYp
TIFpqzYWQNxIs4GQe5wwvZTqLDfnF2+\n/ePoDQ+Cv5YQ3HADZ7AUEB2F0j/OBe0wmBBPS5w4rvXIrDBJ+OHAw3tuVz6DAjNW\nmwAOh7a6xwMshEHXxdd4oEx+Po8JOITkSWyuSH4sWk1wesBUn1WlowqXMkKPUCdr\n3Zu1mfwbSZAiAvgXgyHxJcqDUKpI
Wi9PyiD4d4QMRjDXukWhR/yRqUWZAgMBAAGj\nQjBAMA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQn\naiexjgYRu65cA4PZyYMaLwgl6DANBgkqhkiG9w0BAQsFAAOCAYEADLFLPgxP/rLN\nnPI11GqASgSsfJamj6bzIk1
sUibf577FVpE2qPKEEqzDsrj8AXqF//0PUXpZRdyb\n70WdY3abvtYc26j9frAw9aPaxVNKaPz8Rz1DWIOPOksQMPaN2dwd9nCqlKOajnku\nuK9NWLppcjuoRN03LR9iTg0gxuhQPbSVfaOGWRAtoIZNinMAwqLbhR/MzBfrl/Gd\nCXX8EwJYrkmUzvhfdN
OZUjfFXW1xArX1OWpfby+wSRPbG90+J1XSuugOdWA0zt4V\nYhNYRkI5J3qaTekr2udSeepKShB/QeBwwJLv8S4VxUVil2Sj9jCiTHFlIhCDawdS\nlW/J6M8PpoEl9faExtcZ/vcPSkqjWzG9PJErxWOQrr919TFZY88Xr7eazBqEMHbR\nCTxswbtLNQYzp
FoOo8byCpNUFgWUHQPgiRA+Lc4BR7ykQC2GR2DHD5jT1UDVLIbM\n7i57yEqmJ9iCe0R038UVFD9kesVSk8PX+M1Ygyr9LvkmgbyvNcze\n-----END CERTIFICATE-----\n'): None
DEBUG:sunbeam.utils:Unable to connect to any endpoint: 10.93.113.149:17070
Traceback (most recent call last):
  File "/snap/openstack/182/lib/python3.10/site-packages/sunbeam/utils.py", line 140, in __call__
    return self.main(*args, **kwargs)
  File "/snap/openstack/182/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/snap/openstack/182/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.comm...


Billy Olsen (billy-olsen) wrote :

John,

> Secondly I retract my remarks about the quality of you and your colleagues' work.

Thank you for this, I took it that you were very frustrated from your weekend spent slogging through some things that met less than ideal results. That in and of itself is a problem that I'd like to see resolved as the goal of Microstack is supposed to help reduce the barrier to entry. The sunbeam-based Microstack is an endeavor to ensure that you can build something suitable for production (complete with high availability, etc). Thank you for your patience and your efforts to help find out what's going on.

An update on my installation scenarios. I ran two installation scenarios and here are the results:

1. I ran the instructions verbatim and the Microstack was installed successfully.
2. I ran the instructions with your slight modifications - installing these additional packages - and I did run into an error where the connections were getting closed - e.g. https://paste.ubuntu.com/p/X2qpSbCVDg/.

I won't have time to debug the second scenario at this moment, but I will endeavor to find some time in the near future to debug and diagnose this to see if I can understand what the cause is and look to patch it. I'll mark the bug as confirmed for the time being as there's definitely something going on here.

I'll leave an update in the next comment on the logs you have provided (thank you for those!)

Billy Olsen (billy-olsen) wrote :

Regarding the logs that you have provided: thank you for supplying these! In the future, it may be beneficial to use tee instead of the redirection so that the output is still shown on the terminal while it is written to the file. I think you would have seen a bit more progress, especially running in verbose mode.
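
The tee suggestion above can be sketched as follows. This is only a minimal illustration, not taken from the sunbeam documentation: noisy_step is a hypothetical stand-in for a long-running command such as the bootstrap, and the log location is an arbitrary temporary file.

```shell
#!/bin/sh
# Hypothetical stand-in for a long-running command that writes progress
# to stdout and diagnostics to stderr.
noisy_step() {
    echo "step 1: starting"
    echo "debug: still working" >&2
    echo "step 2: finished"
}

# Arbitrary log location chosen for this sketch.
logfile="$(mktemp)"

# A plain redirect ("> file") hides stdout from the terminal.
# Piping through tee prints each line as it arrives and also saves it;
# "2>&1" merges stderr into the same stream so the log is complete.
noisy_step 2>&1 | tee "$logfile"
```

With the command from this thread, the equivalent would presumably be "sunbeam -v cluster bootstrap --accept-defaults 2>&1 | tee sunbeam-bootstrap.log", where the log file name is just an example.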

I don't see any obvious errors in the logs. The bootstrap here is a bit more complicated than the Ussuri-based microstack as it's using Juju, Microk8s, Terraform and some other pieces under the covers. Unfortunately, the user feedback isn't terribly great during this bootstrap part as it just provides a spinning set of text next to deploying Control Plane. What tends to cause this to be slow is the access to pull the OCI images and place them into the local Microk8s cluster.

From what I can see in the logs, the following steps have occurred:

1. Juju has been installed on the machine
2. Juju has added the local machine as a target for the deployment
3. Terraform has been initialized
4. Microk8s has been deployed to the local machine
5. Juju has been bootstrapped into Microk8s
6. Terraform has been used to deploy the control plane into the Microk8s

At the point in time that you cancelled the bootstrap, the control plane was still being deployed. Here is where the OCI images are being pulled down into the local Microk8s cluster. We have found this part to be a bit slower than desired. In some cases, depending on the bandwidth available, etc - the image pull process is taking exceedingly long and the bootstrap step will time out :-(.

The way that the sunbeam version of Microstack works is that the commands attempt to be idempotent. You can simply run the bootstrap command again and it will fast-forward to the point it left off at.

We're currently in the process of addressing both the feedback to the user during this process and finding ways to speed up the image pull process. The user feedback will likely be solved first.

If you still have the deployment, you should hold off on running the configure step until the bootstrap step reports success. As previously mentioned, if it times out or if you cancel it, you can re-run the bootstrap command and the code will fast-forward to the place it left off at.
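
Because the bootstrap is described above as idempotent, a run that times out can simply be repeated. A minimal sketch of such a retry loop follows, assuming the command exits non-zero on failure (this thread does not confirm that). The retry helper, the attempt limit and the pause are hypothetical choices, and flaky is a stand-in command used only to demonstrate the loop.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS PAUSE_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Only sensible for idempotent commands that resume where they left off.
retry() {
    max_attempts="$1"
    pause="$2"
    shift 2
    attempt=1
    while :; do
        echo "attempt $attempt of $max_attempts: $*"
        if "$@"; then
            echo "succeeded on attempt $attempt"
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            break
        fi
        attempt=$((attempt + 1))
        sleep "$pause"
    done
    echo "giving up after $max_attempts attempts" >&2
    return 1
}

# Stand-in for demonstration only: fails twice, then succeeds.
counter_file="$(mktemp)"
echo 0 > "$counter_file"
flaky() {
    n=$(($(cat "$counter_file") + 1))
    echo "$n" > "$counter_file"
    [ "$n" -ge 3 ]
}

retry 5 0 flaky
retry_status=$?
```

Applied to this thread, the call might look like "retry 5 60 sunbeam -v cluster bootstrap --accept-defaults", with the configure step run only after the loop reports success.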

John Lloyd Olsen (johnnoe1958) wrote :

Hi Billy,

I am very glad at least now to know the command is idempotent so I am rerunning currently. I seem to be having some problems with Bandwidth even though my connection is via satellite. It is supposed to be "high speed" and it is certainly "high price" (partly why I love accountants so much!).

I have never realised the benefits of using tee and now do, thanks to you!

I can repeat the process and this time wait until completion and try to maintain my optimism!

Cheers, and thanks for your patience and understanding

John Olsen.

by the way, great surname!

John Lloyd Olsen (johnnoe1958) wrote :

As it turned out the command repeatedly failed (rerunning) at setting up mysql. I have decided to cut my losses and restart with a fresh vm :(.

John Lloyd Olsen (johnnoe1958) wrote :

The process has now run to the point of logging errors as shown in the attached file.

I am going to rerun the bootstrap command and log to a new file in the hope that this is just a timeout effectively.

John

John Lloyd Olsen (johnnoe1958) wrote :

The attached file shows the results of the aforementioned run. It appears to fail at "Running step Configure MySQL"

John

John Lloyd Olsen (johnnoe1958) wrote :

I am running (and logging) the bootstrap process again. I seem to be having more luck this time, although preliminary results show:

"DEBUG Finished running step Initialize Terraform. Result: ResultType.COMPLETED common.py:231
           DEBUG Starting step Deploying OpenStack Control Plane common.py:214
           DEBUG Skipping step Deploying OpenStack Control Plane common.py:225
           DEBUG /var/snap/openstack/common/state/control.socket service.py:109
           DEBUG Starting step Configure MySQL common.py:214
           DEBUG Running step Configure MySQL common.py:228
[20:22:54] DEBUG Configuring cinder-mysql mysql.py:85"

at the current stage. Stay tuned for the complete log file.

John Lloyd Olsen (johnnoe1958) wrote :

I'm a little concerned at this:

"DEBUG Skipping step Deploying OpenStack Control Plane common.py:225"

Process continuing (slowly)

John

John Lloyd Olsen (johnnoe1958) wrote :

Process ended with errors. See attachment. Intend to rerun.

John Lloyd Olsen (johnnoe1958) wrote :

See attachment.

John Lloyd Olsen (johnnoe1958) wrote :

I ran 'sunbeam inspect' as suggested on screen. See attachment.

John Lloyd Olsen (johnnoe1958) wrote :

Not sure if this is a copy? See attachment

John Lloyd Olsen (johnnoe1958) wrote :

Getting a little further

John Lloyd Olsen (johnnoe1958) wrote :

Failed again.

John Lloyd Olsen (johnnoe1958) wrote :

And again.

John Lloyd Olsen (johnnoe1958) wrote :

Maybe a little further.

John Lloyd Olsen (johnnoe1958) wrote :

Looks same as previous.

Billy Olsen (billy-olsen) wrote :

Hi John,

I think there was some sort of challenge in the mysql bits checked in late last week. I *think* a fix has been included in the candidate channel (rather than the stable channel). I need to investigate further and get back to you on this.

- Billy

John Lloyd Olsen (johnnoe1958) wrote :

OK Billy. I am getting nowhere fast currently so I might just give the candidate channel a whirl.

John Lloyd Olsen (johnnoe1958) wrote :

I am unsure whether anyone is still monitoring this channel. I was advised by James Page on the Community page relating to this project that the problem was related to my own personal Bandwidth problems. I was assured that this meant no fault could be attributed to Canonical here.

I have now been in communication with my satellite service provider, who tell me that my signal is not being "shaped" currently by our base authority the NBN (National Broadband Network) pursuant to their "Fair Use Policy". So I was advised to try downloading the suspect OCI images via my own mobile connection (phone as wifi hub). This is producing similar results in that I am unable to obtain more than a few hundred Bytes per second in download speed.

Is it possible that there are too many "network hops" to my house both via satellite and via mobile data? I am in rural Australia. Or is there some other explanation? I have been completely unable to perform my most important work now for well over a week. I am becoming suspicious of our secret police as I have made public statements that the authorities would find embarrassing. I am unsure on this.

Can you assist me? Please.

John Lloyd Olsen (johnnoe1958) wrote :

I need to add that everything goes well until the point of commencement of downloading the OCI images after MicroK8S is deployed. I am following the process with the -v option when attempting to bootstrap the cluster, as well as following the process of deployment of the individual kubernetes pods with the 'get pods' command.

John Lloyd Olsen (johnnoe1958) wrote :

I have been repeatedly running the bootstrap command after it fails (after an hour or so), often to find the number of containers actually running is being reduced.

John Lloyd Olsen (johnnoe1958) wrote :

I was more recently advised (by my new ISP) to ask you guys about a method of changing the mirror through which downloads are occurring. Can you advise further?

John Lloyd Olsen (johnnoe1958) wrote :

It appears that all the xxxx-mysql-0 services are always staying at 0/2 deployed containers. I have just under half the 31 services with 2/2 deployed containers, a few with 1/1, usually the xxxx-0 services stay at 0/2. It is a consistent pattern in terms of which containers remain totally undeployed and "pending". These are the xxxx-mysql-0 services. Usually and eventually the xxxx-mysql-router-0 services are showing 2/2, however it can take 3 hours or more to reach this stage. The successfully deployed containers continue to vary with some being lost and later being recovered.
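
The READY pattern described above (0/2, 1/1, 2/2) can be tallied rather than read by eye. This is only a sketch, assuming a pod listing in the usual "kubectl get pods" column layout (NAME, READY, STATUS, ...). summarize_ready is a hypothetical helper, and the sample rows are made-up illustrations, not output from this deployment.

```shell
#!/bin/sh
# Reads a pod listing on stdin, skips the header row, and reports every
# pod whose READY column (e.g. "0/2") has fewer ready containers than
# expected, followed by a one-line total.
summarize_ready() {
    awk 'NR > 1 {
        split($2, r, "/")
        if (r[1] == r[2]) {
            ready++
        } else {
            notready++
            print "not ready: " $1 " (" $2 ", " $3 ")"
        }
    }
    END { printf "%d ready, %d not ready\n", ready, notready }'
}

# Made-up sample input for demonstration only.
sample_listing() {
    cat <<'EOF'
NAME                    READY   STATUS    RESTARTS   AGE
example-mysql-0         0/2     Pending   0          3h
example-mysql-router-0  2/2     Running   0          3h
example-0               0/2     Pending   0          3h
example-operator-0      1/1     Running   0          3h
EOF
}

summary="$(sample_listing | summarize_ready)"
echo "$summary"
```

On a live system the input would come from the pod listing itself, for example "sudo microk8s.kubectl get pods -n openstack | summarize_ready". The namespace name is an assumption based on the 'openstack' model named in the error messages earlier in this report.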

John Lloyd Olsen (johnnoe1958) wrote :

I am on a satellite connection, and I am wondering whether uploads from the OCI image sources are going via satellite to me or via land routes to Australia, then to satellite then to me?

John Lloyd Olsen (johnnoe1958) wrote :

I am expecting the latter to be the case since the footprint of the satellite would not cover most of the globe.

John Lloyd Olsen (johnnoe1958) wrote :

Also I feel I should advise you that in the recent process of changing ISPs, which occurred in the middle of this problem I am having (ie in the last few days), my IP address was changed, so I would not expect that another change would help.
