Importing a CSV file can fail due to unexpected thousands separator

Bug #370571 reported by Steve Dodier-Lazaro
Affects: wxBanker
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Hi,

A CSV file can fail to parse if, for instance, it contains "+1 000" instead of "+1000". Spaces inside numbers should be stripped during parsing to avoid this.

Also, I didn't check, but it might be good to verify that currency symbols are stripped out of the amount's value as well (for instance €, $, DOL, EUR, etc.).

Finally, I suggest you change the layout of the import form and use something similar to:

1 | 2 | 3 | 4 | ...
For each column, a list of options (Date/Desc/Amount).

This should be much easier to understand than the current approach.

Cordially, SD.


Revision history for this message
Karel Kolman (kolmis) wrote :

I guess the right way to solve the first issue is to add a "thousands separator" setting next to the decimal one in the CSV import dialog.

As for the second one, removing non-digit characters from both ends of the column value might be an option; I'll think about that.

About the CSV import dialog design: I actually prefer the import dialog mapping wxBanker keys to CSV file columns rather than the other way around, as you're suggesting. Bank-exported CSV files will usually have 10+ columns, and you're mapping to only three wxBanker values (date, amount, description), so the current way is easier. The chaining of many CSV columns into the description field seems intuitive too. Moneydance (or was that some other Java personal banking app?) has a CSV import tool that works the way you'd like: you map column 1 to something, column 2 to something, etc. I absolutely hated it when trying it out.

Revision history for this message
Michael Rooney (mrooney) wrote :

Yes, we could easily just remove spaces in the amount field for now, but a more robust solution would probably be to remove every character that isn't a digit, a negative sign, or a thousands or decimal separator. As Karel said, this depends on having a user-specified thousands separator, which is probably a good idea anyway.

Changed in wxbanker:
importance: Undecided → Medium
milestone: none → 0.6
status: New → Confirmed
Revision history for this message
Michael Rooney (mrooney) wrote :

Ah, I see your branch now, let me merge that in!

Revision history for this message
Michael Rooney (mrooney) wrote :

I think just ignoring all characters that aren't digits, a negative sign, or the decimal separator means we don't actually need the thousands separator, since it gets replaced with nothing anyway. I made a new implementation of parseAmount (also pushed up):

    def parseAmount(self, val, settings):
        val = val.replace(settings['decimalSeparator'], '.')
        amountStr = ""
        for char in val:
            if char.isdigit() or char in "-.":
                amountStr += char

        return float(amountStr)

What do you think about using this and removing the thousands sep from the UI?
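Lifted out of its class so it runs standalone (a sketch; in wxBanker parseAmount is a method taking self), the version above already handles the space-separated amount from the original report:

```python
def parseAmount(val, settings):
    # Normalize the decimal separator to '.' up front, then keep
    # only digits, the minus sign, and '.'.
    val = val.replace(settings['decimalSeparator'], '.')
    amountStr = ""
    for char in val:
        if char.isdigit() or char in "-.":
            amountStr += char
    return float(amountStr)

# The "+1 000" case from the bug report, assuming a ',' decimal separator:
print(parseAmount("+1 000", {'decimalSeparator': ','}))  # prints 1000.0
```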

summary: - Importing a CSV file can fail due to "big numbers"
+ Importing a CSV file can fail due to unexpected thousands separator
Revision history for this message
Karel Kolman (kolmis) wrote :

I'm in favor of your solution!

Revision history for this message
Michael Rooney (mrooney) wrote :

Okay, I've reverted the changes to csvimportframe. Hopefully this should be fairly robust! I made one more change, which was to replace the decimal separator AFTER the loop. Otherwise, if your separator is a comma, for example, and the value contains periods elsewhere, those periods won't be removed. The new way allows for representations like "1.000,23":

    def parseAmount(self, val, settings):
        amountStr = ""
        for char in val:
            if char in "-1234567890"+settings['decimalSeparator']:
                amountStr += char

        amountStr = amountStr.replace(settings['decimalSeparator'], '.')
        return float(amountStr)
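Again lifted out of its class for illustration, the reordered version handles both problem inputs from this report (assuming a ',' decimal separator):

```python
def parseAmount(val, settings):
    # Filter first (dropping a '.' used as a thousands separator),
    # THEN normalize the decimal separator to '.'.
    amountStr = ""
    for char in val:
        if char in "-1234567890" + settings['decimalSeparator']:
            amountStr += char
    amountStr = amountStr.replace(settings['decimalSeparator'], '.')
    return float(amountStr)

print(parseAmount("1.000,23", {'decimalSeparator': ','}))  # prints 1000.23
print(parseAmount("+1 000", {'decimalSeparator': ','}))    # prints 1000.0
```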

Changed in wxbanker:
status: Confirmed → Fix Committed
Revision history for this message
Karel Kolman (kolmis) wrote :

Great. I suggest you change the settings parameter to pass only the decimalSeparator.

I also created a simple regression test for this in my csvimport branch. I pretty much hate those doctests and feel that unit tests should live in their own space (directory), so I placed the parseAmount test case in the newly created test directory. Could you consider using this approach?

Revision history for this message
Michael Rooney (mrooney) wrote : Re: [Bug 370571] Re: Importing a CSV file can fail due to unexpected thousands separator

I agree, the doctests are on their way out, and new tests are great as
unit tests. I'll check out your structure and perhaps separate out the
tests; it seems more sustainable. I didn't do that initially only
because I don't know how to tell unittest to run tests across
different files and integrate the results.

Revision history for this message
Karel Kolman (kolmis) wrote :

You might look into test/alltests.py in my latest commit to the csvimport branch.

It seems fairly easy: you specify a list of modules and tell the TestLoader to create a TestSuite from them. The TextTestRunner then runs the tests and creates the text output (used in unittest.main() also, I believe). The code is:

    import unittest

    modules = ['test1', 'othertests']
    suite = unittest.TestLoader().loadTestsFromNames(modules)
    unittest.TextTestRunner(verbosity=2).run(suite)
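A self-contained sketch of the same pattern, with an in-file TestCase standing in for the module names (which are hypothetical here), and a simplified parse_amount as the code under test:

```python
import unittest

def parse_amount(val, decimal_separator):
    # Same strategy as parseAmount above: keep digits, the minus sign,
    # and the decimal separator, then normalize the separator to '.'.
    kept = "".join(c for c in val if c in "-0123456789" + decimal_separator)
    return float(kept.replace(decimal_separator, "."))

class ParseAmountTest(unittest.TestCase):
    def test_spaces_stripped(self):
        self.assertEqual(parse_amount("+1 000", ","), 1000.0)

    def test_thousands_separator(self):
        self.assertEqual(parse_amount("1.000,23", ","), 1000.23)

# Build a suite and run it with text output, mirroring the
# TestLoader/TextTestRunner pattern above.
suite = unittest.TestLoader().loadTestsFromTestCase(ParseAmountTest)
unittest.TextTestRunner(verbosity=2).run(suite)
```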

Revision history for this message
Michael Rooney (mrooney) wrote :

On Thu, May 7, 2009 at 2:36 PM, Karel Kolman <email address hidden> wrote:
> You might look into test/alltests.py in my latest commit to the
> csvimport branch.

Thanks Karel, I've pushed up some commits which merged your branch and
separated out the unittests into separate files in the test directory.
It is a much nicer framework indeed.

Michael Rooney (mrooney)
Changed in wxbanker:
status: Fix Committed → Fix Released