validation-rules-NULL-operands

Bug #745473 reported by jason.p.pickering on 2011-03-30
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
DHIS
High
Unassigned

Bug Description

I defined a validation rule.. RDT positive cases <= RDT tested cases. When the user enters a zero or a blank for the RHS and a value for the LHS, the validation rule is passed, which in fact should not be the case. Setting the data elements to "Store significant zeroes" solves the problem when users enter a zero for the RHS, but it does not solve the problem when they leave the field blank.

I think one possible fix would be that if the value is NULL then it should be treated as a zero. This seems to be more or less the behaviour elsewhere in DHIS 2.

Shinichi Suzuki (shin461) wrote :

This makes large trouble when the multiple data element involved for comparison like as (A+B+C) > (D+E).
If one of A to E has zero value, this rule has skipped to evaluate on the current version (4015).
Please evaluate all element and skip validation check only when one side have not any non zero value operand.
So, please make this higher priority to solve.

tags: added: rule validation
Knut Staring (knutst) on 2011-07-01
Changed in dhis2:
importance: Undecided → High

This is not really a bug but yes its a tricky issue. DHIS by default does not store zero values, this has some disadvantages like the the one mentioned here, but also the big advantage that it reduces overall database size by approx 70 %. This however is completely configurable and if you want to save the zero values you can set all data elements to "store significant zeros".

Regarding validation rules: in some cases it will make sense to treat missing values as zero, like Suzuki points out above. If the user has actually entered an (ignored) zero value, the validation rule should report a violation. On the other hand, if the data is missing I don't think it makes sense to report a violation as we simply don't know the underlying situation.

So there are advantages about treating missing values as zeros, but there are also disadvantages. First, consider when doing a "bulk" validation of for instance a province for many months. In this case there are typically large gaps in the data - many facilities have not reported since they do not provide certain services, they haven't bothered to enter, they had technical problems or whatever. In this case the resulting list of violations would be "spammed" and rendered effectively unusable - there would be thousands of violations as a result of missing data since most validation rules are on the form A < B and if there is no data this would return a violation.

Second, when doing validation directly in data entry forms, for the usual, very large "integrated form" there are typically several sections within that form which don't apply to all facilities. The validation rules dealing with those irrelevant sections would then return violations as there would be no data and potentially cause confusion for users.

So there are benefits and drawbacks on both sides. This issue maybe qualifies for an option - when doing "bulk" validation the user can select whether to treat missing values as zeros or not (?).

Download full text (3.4 KiB)

Currently in validation result (dataentry) we are displaying 2 sections:
1. Voilation of validaiton rules
2. outliers

May be if we can add one more section for this issue say - partial
violations (either leftside or rightside expression may conation null) which
can be displayed by setting the option in system settings.

2011/7/4 Lars Helge Øverland <email address hidden>

> This is not really a bug but yes its a tricky issue. DHIS by default
> does not store zero values, this has some disadvantages like the the one
> mentioned here, but also the big advantage that it reduces overall
> database size by approx 70 %. This however is completely configurable
> and if you want to save the zero values you can set all data elements to
> "store significant zeros".
>
> Regarding validation rules: in some cases it will make sense to treat
> missing values as zero, like Suzuki points out above. If the user has
> actually entered an (ignored) zero value, the validation rule should
> report a violation. On the other hand, if the data is missing I don't
> think it makes sense to report a violation as we simply don't know the
> underlying situation.
>
> So there are advantages about treating missing values as zeros, but
> there are also disadvantages. First, consider when doing a "bulk"
> validation of for instance a province for many months. In this case
> there are typically large gaps in the data - many facilities have not
> reported since they do not provide certain services, they haven't
> bothered to enter, they had technical problems or whatever. In this case
> the resulting list of violations would be "spammed" and rendered
> effectively unusable - there would be thousands of violations as a
> result of missing data since most validation rules are on the form A < B
> and if there is no data this would return a violation.
>
> Second, when doing validation directly in data entry forms, for the
> usual, very large "integrated form" there are typically several sections
> within that form which don't apply to all facilities. The validation
> rules dealing with those irrelevant sections would then return
> violations as there would be no data and potentially cause confusion for
> users.
>
> So there are benefits and drawbacks on both sides. This issue maybe
> qualifies for an option - when doing "bulk" validation the user can
> select whether to treat missing values as zeros or not (?).
>
> --
> You received this bug notification because you are a member of DHIS 2
> India Developers, which is subscribed to DHIS.
> https://bugs.launchpad.net/bugs/745473
>
> Title:
> validation-rules-NULL-operands
>
> Status in DHIS 2 - District Health Information Software:
> New
>
> Bug description:
> I defined a validation rule.. RDT positive cases <= RDT tested cases.
> When the user enters a zero or a blank for the RHS and a value for the
> LHS, the validation rule is passed, which in fact should not be the
> case. Setting the data elements to "Store significant zeroes" solves
> the problem when users enter a zero for the RHS, but it does not solve
> the problem when they leave the field blank.
>
> I think one possible fix would be that if the value is NULL th...

Read more...

Download full text (3.9 KiB)

I agree it is a tricky issue, but I was just surprised that DHIS2 handled
NULLS/blank values in this case differently than aggregation. The situation
in this case was that data has been entered via the mobile client, and
therefore not subject to data entry form validation (through the web
interface). When trying to perform a data validation analysis, I was very
surprised that this rule did not work as I expected, as there were a number
of situations where people have entered something for positive cases, but
nothing for total cases tested (reminiscent of DHIS 1.4 compulsory pairs).

Anyway, it is of course not a big deal to deal with this type of violation
through a custom SQL query, but it does of course get rather time consuming
to make reports and custom SQL.

One possibility around this might be to apply the validation rules to
aggregated data instead of raw data?

Regards,
jason

2011/7/4 Lars Helge Øverland <email address hidden>

> This is not really a bug but yes its a tricky issue. DHIS by default
> does not store zero values, this has some disadvantages like the the one
> mentioned here, but also the big advantage that it reduces overall
> database size by approx 70 %. This however is completely configurable
> and if you want to save the zero values you can set all data elements to
> "store significant zeros".
>
> Regarding validation rules: in some cases it will make sense to treat
> missing values as zero, like Suzuki points out above. If the user has
> actually entered an (ignored) zero value, the validation rule should
> report a violation. On the other hand, if the data is missing I don't
> think it makes sense to report a violation as we simply don't know the
> underlying situation.
>
> So there are advantages about treating missing values as zeros, but
> there are also disadvantages. First, consider when doing a "bulk"
> validation of for instance a province for many months. In this case
> there are typically large gaps in the data - many facilities have not
> reported since they do not provide certain services, they haven't
> bothered to enter, they had technical problems or whatever. In this case
> the resulting list of violations would be "spammed" and rendered
> effectively unusable - there would be thousands of violations as a
> result of missing data since most validation rules are on the form A < B
> and if there is no data this would return a violation.
>
> Second, when doing validation directly in data entry forms, for the
> usual, very large "integrated form" there are typically several sections
> within that form which don't apply to all facilities. The validation
> rules dealing with those irrelevant sections would then return
> violations as there would be no data and potentially cause confusion for
> users.
>
> So there are benefits and drawbacks on both sides. This issue maybe
> qualifies for an option - when doing "bulk" validation the user can
> select whether to treat missing values as zeros or not (?).
>
> --
> You received this bug notification because you are a member of DHIS 2
> India Developers, which is subscribed to DHIS.
> https://bugs.launchpad.net/bugs/745473
>
> Title:
> validation-rules-NULL-ope...

Read more...

Download full text (5.7 KiB)

Does it possible to check no effective-elements in the one side of validation
rule?
If possible, when you found no effective-elements in the one side of the rule,
then you could skip evaluation of this validation rule.
If impossible, there no easy way found for me.
Best regards,
     Suzuki
-----Original Message-----
From: <email address hidden>
[mailto:<email address hidden>] On Behalf Of
jason.p.pickering
Sent: Monday, July 04, 2011 8:07 PM
To: <email address hidden>
Subject: Re: [Dhis2-devs] [Bug 745473] Re: validation-rules-NULL-operands

I agree it is a tricky issue, but I was just surprised that DHIS2 handled
NULLS/blank values in this case differently than aggregation. The situation in
this case was that data has been entered via the mobile client, and therefore
not subject to data entry form validation (through the web interface). When
trying to perform a data validation analysis, I was very surprised that this
rule did not work as I expected, as there were a number of situations where
people have entered something for positive cases, but nothing for total cases
tested (reminiscent of DHIS 1.4 compulsory pairs).

Anyway, it is of course not a big deal to deal with this type of violation
through a custom SQL query, but it does of course get rather time consuming to
make reports and custom SQL.

One possibility around this might be to apply the validation rules to
aggregated data instead of raw data?

Regards,
jason

2011/7/4 Lars Helge Overland <email address hidden>

> This is not really a bug but yes its a tricky issue. DHIS by default
> does not store zero values, this has some disadvantages like the the
> one mentioned here, but also the big advantage that it reduces overall
> database size by approx 70 %. This however is completely configurable
> and if you want to save the zero values you can set all data elements
> to "store significant zeros".
>
> Regarding validation rules: in some cases it will make sense to treat
> missing values as zero, like Suzuki points out above. If the user has
> actually entered an (ignored) zero value, the validation rule should
> report a violation. On the other hand, if the data is missing I don't
> think it makes sense to report a violation as we simply don't know the
> underlying situation.
>
> So there are advantages about treating missing values as zeros, but
> there are also disadvantages. First, consider when doing a "bulk"
> validation of for instance a province for many months. In this case
> there are typically large gaps in the data - many facilities have not
> reported since they do not provide certain services, they haven't
> bothered to enter, they had technical problems or whatever. In this
> case the resulting list of violations would be "spammed" and rendered
> effectively unusable - there would be thousands of violations as a
> result of missing data since most validation rules are on the form A <
> B and if there is no data this would return a violation.
>
> Second, when doing validation directly in data entry forms, for the
> usual, very large "integrated form" there are typically several
> ...

Read more...

Download full text (6.8 KiB)

I sort of understand what is being discussed now, I think. Basically, we
need a way whether to determine whether the validation rule is valid to
validate at all. If the right hand or left hand side are not valid (i.e. a
NULL operator) it may not make any sense to begin to validate the rule at
all. A similar condition could occur with a NaN left/right hand side
operator resulting from a pure zero denominator, as a hypothetical example.
This seems to be a different situation than evaluating the validation rule
when ALL of the operands are actually valid to begin to evaluate the rule.

On Mon, Jul 4, 2011 at 9:25 PM, Shinichi Suzuki <email address hidden> wrote:

> Does it possible to check no effective-elements in the one side of
> validation
> rule?
> If possible, when you found no effective-elements in the one side of the
> rule,
> then you could skip evaluation of this validation rule.
> If impossible, there no easy way found for me.
> Best regards,
> Suzuki
> -----Original Message-----
> From: <email address hidden>
> [mailto:<email address hidden>] On
> Behalf Of
> jason.p.pickering
> Sent: Monday, July 04, 2011 8:07 PM
> To: <email address hidden>
> Subject: Re: [Dhis2-devs] [Bug 745473] Re: validation-rules-NULL-operands
>
> I agree it is a tricky issue, but I was just surprised that DHIS2 handled
> NULLS/blank values in this case differently than aggregation. The situation
> in
> this case was that data has been entered via the mobile client, and
> therefore
> not subject to data entry form validation (through the web interface). When
> trying to perform a data validation analysis, I was very surprised that
> this
> rule did not work as I expected, as there were a number of situations where
> people have entered something for positive cases, but nothing for total
> cases
> tested (reminiscent of DHIS 1.4 compulsory pairs).
>
> Anyway, it is of course not a big deal to deal with this type of violation
> through a custom SQL query, but it does of course get rather time consuming
> to
> make reports and custom SQL.
>
> One possibility around this might be to apply the validation rules to
> aggregated data instead of raw data?
>
> Regards,
> jason
>
>
> 2011/7/4 Lars Helge Overland <email address hidden>
>
> > This is not really a bug but yes its a tricky issue. DHIS by default
> > does not store zero values, this has some disadvantages like the the
> > one mentioned here, but also the big advantage that it reduces overall
> > database size by approx 70 %. This however is completely configurable
> > and if you want to save the zero values you can set all data elements
> > to "store significant zeros".
> >
> > Regarding validation rules: in some cases it will make sense to treat
> > missing values as zero, like Suzuki points out above. If the user has
> > actually entered an (ignored) zero value, the validation rule should
> > report a violation. On the other hand, if the data is missing I don't
> > think it makes sense to report a violation as we simply don't know the
> > underlying situation.
> >
> > So there are advantages about treating missing values as zeros,...

Read more...

Tom Hiatt (hiattt-x) wrote :

The first place I looked for a way around this situation was an IFNULL or IFZERO button on the line below the box where ()*/+-D are shown. I don't know if this is just me (due to SQL exposure perhaps) or if other users would have the same inclination. The plus of having it here instead of a tick box for the whole expression would be the ability to specify for each element.

For example:
IFNULL(DataElement1,0) + IFNULL(DataElement2,0) + IFNULL(DataElement3,NULL)

Changed in dhis2:
status: New → Fix Committed
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers