Comment 30 for bug 882036

Revision history for this message
Fabien (Open ERP) (fp-tinyerp) wrote : Re: rounding error

My small contribution to the troll: Float vs Decimal for OpenERP
----------------------------

Some claims that decimal has no rounding issue and should be used instead of float that can have rounding issues. People should know that it's completly false; EVERY NUMERICAL REPRESENTATION IN PYTHON MAY HAVE ROUNDING ISSUES, so you have to address them.

Illustration of the problem
----------------------------------------

For the simplicity of the example, suppose that 1 EUR = 3 USD (or 1 "Pack of 3 Units" = 3 "PCE"). What do you think will be the result of 3 PCEs sold at one USD each converted to EUR in your accounting ?

  >>> from decimal import Decimal as d
  >>> d('1') / d('3') * d('3')
  Decimal('0.9999999999999999999999999999')
  >>> 1.0 / 3.0 * 3.0
  1.0

This examples shoiws that the computation of "1/3" is more accurate using the float representation than the Decimal one.

Of course, you also have the opposite example like:

  >>> from decimal import Decimal as d
  >>> d('0.145')
  Decimal('0.145')
  >>> 0.145
  0.14499999999999999

In this example, the representation of 0.145 is more accurate in decimal than in the float. So, whether you use float or decimal, you may face rounding issues in the computation of unit of measures, currencies, or taxes (like: tax included in the price)

It also can be a problem when comparing numbers with "==", as showned by the following Python code:

  >>> from decimal import Decimal as d
  >>> (d('1') / d('3') * d('3')) == d("1")
  False
  >>> (0.1+0.1*0.1) == 0.11
  False

Solution
--------

So, whether you use floats or decimal representation of numerical values, you will have to handle three things:
  - store all results as decimal values: to not propagate rounding issues
  - implement a comparison operator instead of using "=="
  - get the number represented by your float/decimal number

To illustrate the latest item:
  >>> 0.145
  0.14499999999999999

It's not a problem to have "0.14499999999999999" if you know that you have store it in 3 digits in your database: it's evident that this float number represent 0.145 which is good.

The same apply to decimal, it's not a problem to have "Decimal('0.9999999999999999999')" because if you know that this number represent a 3 digits number, you can store it as 1.000.

OpenERP implements and uses these three things in order to be correct in all numerical representations. We just discovered a bug in the last one (get the exact numerical number represented by a float value and the fix of rvalyi fixes this if applied on fields.float with digits).

The problem arrives if someone uses "==" in his code without using a proper comparison operator.

PS: we do not need to change the clients as they do not perform any computations.

PROs and CONs
-------------

In my opinion, the main reasons to implement float or decimal in the code are not related AT ALL to accuracy or the fact that you can use the '==' operator which is false. You need to implement '==' and conversions for BOTH decimal and floats.

Both options are good (or both are bad) and both will lead to the same difficulties (implement the = operator), here are the main reasons to choose one instead of the other.

PROS for Decimal:

- the main advantage of decimal is to have a range of representable numerical values which is higher than the float one, which is limited by your processor while decimal is limited by your configuration.
- the other advantage of decimal is his rounding operators facilities

CONS for Decimal:

- it's not supported by most web-services like XML-RPC or JSON so you have to pass numbers as strings which is not very clean
- it's very very slow to compute, try this in our python console, you will be surprised:

    import time
    from decimal import Decimal as d

    t = time.time()
    val = 0.12
    for i in xrange(2**20):
    val = val * 2.13
    print time.time() - t

    t = time.time()
    val = d('0.12')
    for i in xrange(2**20):
    val = val * d('2.13')
    print time.time() - t

- it's very easy to introduce development mistakes, and it requires everything to be converted to decimal from a string. I bet lots of developers/modules will do that kind of troubles:

    >>> d('2') < 3.0
    False

- Some/most libraries returns float numbers rather than decimal ones, example with the GTK client: http://www.pygtk.org/pygtk2tutorial/sec-SpinButtons.html So you have to handle the complexity of using float AND the complexity of using decimals. If you get a float number from a lib, you will be embarased to convert it to a decimal:
        >>> d(2.5)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/lib/python2.6/decimal.py", line 649, in __new__
            "First convert the float to a string")
        TypeError: Cannot convert float to Decimal. First convert the float to a string

        >>> d('%.2f' % 0.145)
        Decimal('0.14')

- I don't think you can use the context facilities in an ERP for the precision/rounding as it's global and not per number so the advantage of having a global context is not useful for our need. (suppose you multiple UoM, Qty and Price and Tax)

The good thing is that no company in the world would probably be impacted by float or decimal representation approximation of numerical values as you need about 100000 times the turnover of Microsoft in one single invoice in order to have a rounding issue of 0.01. (on a 64 bit machine, I suggest to buy & 128 bit machine if you are in this situation :)

The main problem of both is to not introduce development bugs due to computing exact representation or '==' operator. -> that's the reason of this bug which is easily solved by the solution of Rvalyi'solution to be put in fields.float (no need in others part in my opinion as it's just for storage; GTK client does not computed numbers)