FWIW some discussion around this and the approach we take, from irc:

15:59 < marios> jistr: tosky i think we need a launchpad bug for discussing what we will do here
16:00 < tosky> oh
16:00 < marios> jistr: tosky basically we can 'detect' if sahara is running during controller_pacemaker_1.sh because all the things are still running then. but by pacemaker_3.sh we've moved to next gen HA so we can't tell then if sahara was running earlier
16:01 < marios> jistr: tosky so another thought was writing to file/signal between those two steps but that is ugly :/
16:01 < tosky> I would kind of expect that you can read the old status before starting to upgrade it, and then you can't properly know the old status because it was upgraded :)
16:02 < marios> tosky: jistr: we'd ideally use 'enabled_services' from the newton templates, but default is for sahara to be off now so it wouldn't tell us if it was previously running (no enabled_services for mitaka)
16:02 < jistr> marios, tosky: but reading the old status might not help, right? It's not so much whether sahara was running or not, as we know it *was* running in Mitaka. It's more about whether the user wants to keep it or not...
16:02 < jistr> i.e. the gap between the Mitaka and Newton defaults
16:03 < marios> jistr: tosky so tosky was trying to answer the 'can we just detect' which was what was asked of us to do
16:03 < tosky> right, we know that it was on
16:03 < tosky> marios: but as jistr pointed out, we know that from the beginning
16:03 < tosky> if you stripped out sahara, you did some magic custom post-config manipulation
16:04 < marios> tosky: jistr yeah was thinking some more... like right, so it would be not trivial to just remove it then ^^
16:04 < marios> tosky: jistr so that assumption means we don't need to try and detect, and doing the env file/flag like the current review is ok
16:05 < marios> tosky: jistr unless they manually stopped sahara for some reason but then they'd know to explicitly disable it from the docs i guess
16:05 < jistr> marios, tosky: yea i think so. What we're asking is inherently undetectable AFAICT, it's a user decision.
16:06 < jistr> yea it might be detectable in cases when the users did something special with their deployment, as tosky wrote earlier, but that probably can't be considered the general case
16:06 < tosky> and if they did it once, they know they should explicitly remove it with the new switch
16:06 < tosky> yep: if they know about it and they don't want it anymore, they can force its removal now; if they don't care, default -> keep
16:07 < jistr> +1 on default = keep
16:08 < jistr> +2'd the patch
16:08 < marios> jistr: advantage of doing this https://review.openstack.org/#/c/375517/2/environments/major-upgrade-pacemaker.yaml is we should be able to reuse for converge (though may not help at that point, i mean we need to use the existing 'deploy sahara' environment file)
16:09 < marios> jistr: by 'this' i mean having a KeepSaharaServices in parameter_defaults from the controller step... will be persisted
16:10 < jistr> marios: yea we'll need some amount of docs around this whole issue simply b/c the Mitaka vs. Newton defaults differ, so one either needs to set the param on upgrade to false, or add the sahara env file on converge and beyond
16:11 < jistr> but i kinda like the explicitness here, e.g. in case user forgets about the whole issue, their sahara won't disappear just by itself, and they can still fix the issue later relatively easily (either stop it manually, or start passing the env file to start managing sahara configs properly)
16:12 < jistr> (depending on which way they want to go wrt sahara-ful vs. sahara-less deployment :) )
16:12 < marios> my what a saharaful deployment !
16:13 < jistr> :)
16:16 < tosky> jistr, marios: are you saying that the default configuration will still require some work to reach a proper configuration?
16:17 < marios> tosky: jistr on converge we will need to explicitly enable the sahara services yes
16:17 < tosky> I kind of understand the need for being explicit (as python tries to enforce), but I'd still argue that the default configuration would lead to a working setup
16:17 < tosky> which matches the environment available before
16:17 < marios> tosky: this review is about whether we keep/remove the sahara services during the controller upgrade
16:19 < jistr> tosky: it's sorta inevitable due to Newton being saharaless by default. So if we want to keep Sahara, we will need to start passing an env file on converge and beyond. And if we want to remove Sahara, we'd need to pass KeepSaharaOnUpgrade: false during the upgrade.
16:20 < tosky> jistr: even if it is sahara-less by default, we are talking about an upgrade here, and an upgrade is from mitaka, no other possibilities, so it should be possible to have the env file for convergence by default
16:20 < jistr> tosky, marios: the only way i can see this could be changed is to default to `KeepSaharaOnUpgrade: false` so that the "remove sahara" use case is without work (== the upgrade by default converges to the Newton defaults), which shifts the work to the "keep Sahara" case
16:20 < tosky> which is against the suggested direction of "whatever was available before"
16:23 < jistr> well i don't think it's possible (within reasonable implementation limits) to include env files automatically based on what we're upgrading from... i mean even if we baked Sahara env into the converge by default (which would make "remove Sahara" case a bit more complicated again perhaps), the user would still need to start passing the sahara env file *after* the upgrade, with any `overcloud deploy` commands
16:23 < marios> jistr: the requirement we currently have is 'whatever was there previously' ... going strictly on what you can deploy with the mitaka templates, then we can assume it was default on,
16:24 < marios> jistr: we could try to work out how to detect that if we really need to - detect at pacemaker_1 and signal to pacemaker_3
16:24 < tosky> there is only one path from where we're upgrading from, that's my point
16:25 < jistr> marios, tosky: regardless of what we do for upgrade, we'll still need to start passing the env file after the upgrade, unless we go and change Newton defaults to deploy Sahara by default
16:26 < jistr> so we can't make Newton behave the same way as Mitaka did without passing the additional env file
16:26 < tosky> jistr: uhm, and really no possibility of working around this?
16:26 < tosky> I guess no
16:26 < tosky> unfortunate
16:26 < marios> jistr: 'any stack update operation henceforth' right
16:26 < jistr> yea...
16:27 < marios> jistr: we need a 'resource_registry_defaults' :)
16:27 < jistr> tosky: not so unfortunate though, as this means that all Newton deployments behave the same, regardless what they were upgraded from, which IMO should be prioritized over "i don't want to change the set of env files i pass in"
16:27 < jistr> marios: ^
16:28 < tosky> jistr: all? So what happens to custom changes to the enabled services when you will migrate from N to O or later?
16:28 < marios> jistr: what do you mean? fresh newton deployment won't have sahara if they don't enable it explicitly during their deploy
16:28 < jistr> i mean "Does this Newton env have Sahara?" should be a question answerable by looking at what we pass into the deployment command, not by going over the history of that deployment :)
16:29 < jistr> tosky: as long as users keep passing their custom changes, they should persist
16:29 < marios> jistr: ah so if they have newton with sahara, regardless of from upgrade or from fresh, they would have to include the -e sahara henceforth forever and ever till evermore anyway
16:29 < jistr> marios: yea re "fresh newton deployment won't have sahara if they don't enable it explicitly during their deploy" -- that was actually my point. Upgraded deployments should behave the same.
16:30 < jistr> marios: exactly
16:30 < jistr> marios, tosky: if we don't stick to such approach, we'll go crazy just after a few releases
16:30 < jistr> essentially our compute vs. novacompute problem, scaled up :)
16:31 < tosky> so back to the initial point, with the difference that, if you don't specify -e sahara, after the upgrade process you should be able to recover it, while if you want to kill it you need to explicitly pass another env
16:31 < tosky> or configuration, or whatever it is relevant in this case
16:32 < marios> tosky: yeah another env if you like or just set the KeepSaharaOnUpgrade from the environments/major-upgrade-pacemaker.yaml for the controller upgrade
16:32 < jistr> tosky: yea. Basically for both options you need to do an explicit action. We could make the "remove Sahara" case non-explicit too, but it would make it slightly more dangerous perhaps, and the explicitness wouldn't be totally gone, it would just shift to the "keep Sahara" case.
16:33 < jistr> i.e. if we default to KeepSahara: false, we'd need to pass KeepSahara: true when we want to keep it
16:33 < tosky> jistr: explicit, but keeping the service (minus the final convergence) should be still the default, so that you don't lose your data if you forgot -e sahara and you redeploy with it
16:34 < tosky> which means, if I get it correctly: please set KeepSahara true
16:34 < tosky> by default
16:34 < jistr> tosky: yea +1, safer
16:35 < marios> jistr: tosky filing a launchpad bug to try capture some of this discussion and so we can track the patches (there may yet be something we need on converge too)
16:37 < jistr> marios, tosky: generally, anytime we want to change the default set of deployed services, we'll have this problem (especially painful if the new default means removing some service i think)
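
For reference, the "remove Sahara" case from the discussion could look roughly like the fragment below. This is only a sketch: it assumes the parameter ends up being called KeepSaharaOnUpgrade (the log also uses KeepSaharaServices and KeepSahara, so the name in the merged review may differ), and the file it lives in is a hypothetical user environment file passed during the controller upgrade step.

```yaml
# Hypothetical user environment file, passed with -e during the controller
# upgrade step. The parameter name is taken from the discussion and is an
# assumption; check the review for the name that actually merged.
parameter_defaults:
  # The default agreed in the discussion is true (keep Sahara).
  # Set to false only if the upgrade should remove the Sahara services.
  KeepSaharaOnUpgrade: false
```

For the "keep Sahara" case nothing needs to be set during the upgrade, but per the discussion the Sahara environment file has to be passed on converge and on every stack update after that.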