Avoid constructing the whole resource tree at connect()

Bug #1663541 reported by Lucas Alvares Gomes
This bug affects 2 people
Affects         Status   Importance  Assigned to  Milestone
python-redfish  Triaged  Undecided   Unassigned

Bug Description

This bug is related to [0], which points out that python-redfish is extremely verbose. This one perhaps describes the reason for that.

Basically, as soon as you invoke connect() [1], python-redfish follows pretty much all links to the resources in the Redfish tree and builds objects from them, even if you only need a single one (say one System, which represents a "Node" in Ironic).

One thing I've learned working on Ironic is that BMCs are fragile. I worry about the way python-redfish is architected because a simple connect() could hammer the BMC with many requests at once, and the number grows with the number of resources: with, say, 1000 resources (between Systems, Managers and Chassis), this is going to be 7000+ requests at startup. (Note: this is just an approximate number, keep reading.)

Back to [0]: the example I'm using here contains only 1 System, 1 Chassis and 1 Manager [2]. This is enough for python-redfish to send 21 requests [3] and produce 1000+ lines of debug output (mostly dumps of the representation of the objects it creates) at the very first connect().

With the simulator this is not a problem, but I'm afraid that with a real piece of hardware this is going to be problematic.

[0] https://bugs.launchpad.net/python-redfish/+bug/1662896
[1] For example:

  import redfish
  redfish.connect(...)

[2] http://paste.openstack.org/show/598363/
[3] http://paste.openstack.org/show/598366/

More info:
----------

The tests are running the Redfish simulator version 1.0.0 (DSP2043), direct link: https://www.dmtf.org/sites/default/files/standards/documents/DSP2043_1.0.0.zip

With the mockup server: https://github.com/DMTF/Redfish-Mockup-Server/blob/master/redfishMockupServer.py

description: updated
summary: - Avoid constructing the whole resources tree at connect()
+ Avoid constructing the whole resource tree at connect()
Revision history for this message
Uggla (rene-ribaud) wrote :

FYI, we are testing the library against real hardware:
- HPE BL460C G9 server using ILO4
- Moonshot server using ILOCM
- m510x cartridge using a "custom" ILO4.

So far, we haven't seen issues with this hardware.

The issue is more related to the time to gather the data. ILO4 are using fast processor, so it takes time to retrieve the data.
Maybe we need to add a "switch" to select the part we want to retrieve (system, chassis...). We will think about that.
Also, maybe we should implement a refresh method to update the data without needing to reconnect.

Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :

Hi Rene,

I don't think the problem is whether it works with XYZ hardware or not. It's a scalability problem: how many Systems/Chassis/Managers do you have in your environment?

My environment is very simple: I have 1 Chassis, 1 Manager and 1 System. With that setup, the connect() method issues 21 requests to the Redfish manager. Now, let's say that each resource you add costs roughly 7 more requests at connect() (roughly, because a resource will have more or fewer links depending on its type).

If you have 1000 resources we are talking about 7000 requests. On top of that, once the Ironic driver is ready, each node that uses that driver will issue a connect(). How many requests are we talking about here, just to be able to connect to the Redfish manager?
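
The rough estimate above can be written down as a quick calculation. This is only a sketch: the figure of 7 requests per resource is the approximation from this comment, and the function name and its `nodes` parameter are made up for illustration.

```python
# Approximate figure from the comment above; the real number depends on how
# many links each resource exposes.
REQUESTS_PER_RESOURCE = 7


def estimate_connect_requests(resources, nodes=1):
    """Estimate the requests sent when `nodes` callers each run connect().

    Hypothetical helper, not part of python-redfish.
    """
    return resources * REQUESTS_PER_RESOURCE * nodes


# 1 System + 1 Chassis + 1 Manager, one connect()
print(estimate_connect_requests(3))
# 1000 resources, one connect()
print(estimate_connect_requests(1000))
# 1000 resources, and 1000 nodes each issuing their own connect()
print(estimate_connect_requests(1000, nodes=1000))
```

With 3 resources the estimate gives 21, which matches the number of requests observed against the simulator.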

That's my point, I think this is an architectural problem with the current implementation that leads to this scalability issue.

re: refresh() idea, it would be handy indeed.

Revision history for this message
Uggla (rene-ribaud) wrote :

OK thanks, I understand your point better now.
Would it be OK with:

1- A "switch" parameter that allows you to retrieve only one part of the resource tree, e.g. system.
2- A toplevel "switch" parameter that just captures the top level of the resource and does not go down the tree.

So the default behavior of a call to connect() will be to capture as much info as possible from the BMC.
But by calling the method with the above parameters, e.g. connect(res='system', toplevel=True), we will be able to drastically limit the number of requests. So we could use that in the Ironic driver.

Would that be a good compromise?
Note: of course this is just an example, we could discuss naming, behavior, etc.
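
To make the proposal concrete, here is a minimal sketch of how the two switches could limit the crawl. It runs against a small in-memory stand-in for a Redfish service, and the `res` and `toplevel` parameters, the connect() signature and the mock tree are all assumptions for illustration, not the python-redfish API.

```python
# Tiny in-memory stand-in for a Redfish service: path -> linked paths.
MOCK_TREE = {
    "/redfish/v1/": [
        "/redfish/v1/Systems",
        "/redfish/v1/Chassis",
        "/redfish/v1/Managers",
    ],
    "/redfish/v1/Systems": ["/redfish/v1/Systems/1"],
    "/redfish/v1/Systems/1": ["/redfish/v1/Systems/1/Processors"],
    "/redfish/v1/Systems/1/Processors": [],
    "/redfish/v1/Chassis": ["/redfish/v1/Chassis/1"],
    "/redfish/v1/Chassis/1": [],
    "/redfish/v1/Managers": ["/redfish/v1/Managers/1"],
    "/redfish/v1/Managers/1": [],
}


def connect(res=None, toplevel=False):
    """Return the paths fetched, one entry per simulated GET (hypothetical)."""
    root = "/redfish/v1/"
    fetched = [root]
    branches = MOCK_TREE[root]
    if res is not None:
        # Keep only the branch whose last path segment starts with `res`,
        # so res='system' selects /redfish/v1/Systems.
        branches = [
            b for b in branches
            if b.rsplit("/", 1)[-1].lower().startswith(res.lower())
        ]
    stack = list(branches)
    while stack:
        path = stack.pop()
        fetched.append(path)
        if not toplevel:
            # Full mode: keep following links below this resource.
            stack.extend(MOCK_TREE[path])
    return fetched


print(len(connect()))                              # crawl everything
print(len(connect(res="system")))                  # only the Systems branch
print(len(connect(res="system", toplevel=True)))   # Systems, no descent
```

On this mock tree the default call fetches all 8 resources, while the restricted calls fetch 4 and 2 respectively.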

Changed in python-redfish:
status: New → Triaged
Revision history for this message
Miles Gould (mgould) wrote :

A better design would be to discover the resource graph lazily; for instance, only populate a connection object's Systems field when the user accesses it. "Autodiscover all available resources" is almost always going to make way more requests than necessary.
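
A minimal sketch of the lazy approach, assuming a hypothetical Connection class with a `systems` attribute and an injected `fetch` callable (none of these names come from python-redfish). Nothing is requested at construction time, and the Systems collection is fetched once, on first access.

```python
class Connection:
    """Hypothetical connection object; not the python-redfish API."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs one GET
        self._systems = None     # nothing is fetched at construction time

    @property
    def systems(self):
        # Fetch the Systems collection on first access, then reuse it.
        if self._systems is None:
            self._systems = self._fetch("/redfish/v1/Systems")
        return self._systems


requests_made = []


def fake_fetch(path):
    # Stand-in for an HTTP GET that records which paths were requested.
    requests_made.append(path)
    return {"path": path}


conn = Connection(fake_fetch)
print(len(requests_made))   # no request has been sent yet
conn.systems
conn.systems
print(len(requests_made))   # the two accesses triggered a single request
```

A consumer that never touches Chassis or Managers would never pay for those requests under this design.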

Revision history for this message
Bruno Cornec (bruno-cornec) wrote :

I do not disagree with that, but Ironic is not the only use case we have in mind for this. We have another consumer, a component interacting with a CMDB, for which we do want an exhaustive parsing of the graph.

So I still think that we need to address this through different modes of exploration, which would allow a consumer to choose how they want to parse, either fully or partially or lazily per branch.

Changed in python-redfish:
milestone: none → 0.6