Add option to select scope for doctests

Bug #1178807 reported by asmeurer
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Pyflakes
New
Wishlist
Unassigned

Bug Description

test.py:

import sys

import sys

def func():
    """
    A docstring
    >>> import sys
    >>> sys.path
    """

    return 1

The pyflakes output is

test.py:1: 'sys' imported but unused
test.py:6: redefinition of unused 'sys' from line 1

The line 6 "redefinition of unused 'sys' from line 1" is incorrect, because the doctest scope is separate from the module scope.

I can't figure out how to reproduce it with a small file, but in this file: https://github.com/sympy/sympy/blob/master/sympy/physics/mechanics/functions.py, the output is

sympy/physics/mechanics/functions.py:160: redefinition of unused 'diff' from line 27
sympy/physics/mechanics/functions.py:200: redefinition of unused 'dynamicsymbols' from line 19
sympy/physics/mechanics/functions.py:261: redefinition of unused 'dynamicsymbols' from line 19
sympy/physics/mechanics/functions.py:458: redefinition of unused 'Point' from line 26
sympy/physics/mechanics/functions.py:508: redefinition of unused 'Point' from line 26

That is, it complains about the imports in the doctests, but not the imports themselves (because they actually are used in the file).

Revision history for this message
asmeurer (asmeurer) wrote :

I think it's because it's pushing a function scope for doctests, but really, it should be using a completely new module scope. Doctests are run like they a from-scratch Python interpreter.

Revision history for this message
Florent (florent.x) wrote :

IIRC the imports at module level are available when running the doctests.
This could explain the warning.

I don't use doctests enough to confirm it, though.

Revision history for this message
asmeurer (asmeurer) wrote :

Ah OK. So apparently Python's doctest module does work this way, but SymPy's doctester does not. Would it be possible to add an option to make doctests operate in their own module scope? This is IMHO the correct way to write doctests, because then users can just copy and paste the output and it will work (and that's why SymPy's doctester works this way).

Florent (florent.x)
Changed in pyflakes:
importance: Undecided → Wishlist
Revision history for this message
John Vandenberg (jayvdb) wrote :

This was fixed on Mar 30, 2014 by ffe926c282 (https://github.com/pyflakes/pyflakes/commit/ffe926c282).

Reproduced and confirmed fixed using https://github.com/sympy/sympy/commit/96dcc6a635
Sympy master has the same underlying problem, but the names are different.

Phil Frost (bitglue)
Changed in pyflakes:
status: New → Fix Released
Revision history for this message
asmeurer (asmeurer) wrote :

The real issue here was a feature request to add an option to use module scopes for doctests (see comment #3).

Phil Frost (bitglue)
Changed in pyflakes:
status: Fix Released → New
summary: - Doctest scope is getting mixed in with regular scope
+ Add option to select scope for doctests
Revision history for this message
John Vandenberg (jayvdb) wrote :

I'd like to explore whether it is necessary to run doctests as a top level scope.

I've been working towards:
1. processing the doctest scope so it can use the module scope (sloppy use of doctest, but very common)
2. ensuring that the doctest scope does not report errors if it does not use the module scope (good use of doctest)
3. reporting errors when the module scope relies on the doctest scope (very very bad use of doctest)

The doctest scope currently can re-import names also imported in the module scope, without error. I believe that is very very close to the request in #3, that doctest have their own 'module scope'.
e.g. test_importInDoctestAndAfter : https://github.com/pyflakes/pyflakes/blob/master/pyflakes/test/test_doctests.py#L217

Note that `test_importBeforeAndInDoctest` above `test_importInDoctestAndAfter` is a bit misleading, being marked as 'todo'. The code sample in that test does not report any errors, as it is effectively equivalent to `test_importInDoctestAndAfter`.

IMO, broadly speaking, doctests shouldnt re-use names existing in the module scope, with imports being the obvious exception. If there is a case were re-importing in the doctest scope causes an error, IMO that needs to be fixed.
Are there other situations where doctest re-using names from the module scope is preferable or even necessary?

On the structural problems mentioned in #1, a recent checkin (https://github.com/pyflakes/pyflakes/commit/93aa3c435505b8541b151c3e4b24c0ec4333f0bb) changed it so that a doctest is a module scope instead of a function scope. This is still not perfect, as the doctest scope should reside in the module scope, but it is currently deeply nested and resides with the object's scope where the docstring exists. That means currently it could inherit variables from any scope where the docstring resides; clearly a source of bugs, and this fix has been included in two of my pull requests. Maybe I should pull it out into a separate pull request with dedicated tests.

The other outstanding problem I am aware of is that a module level import which is only used in doctests is not reported as an error. I have a fix for that (https://github.com/jayvdb/pyflakes/commit/a8ccb1cef8ec005b07ea347823c4987d995d29c8), but it is waiting on a few other patches to be merged.

The unclear case that I am aware of is 'from x import foo' in module scope being shadowed by a 'from y import foo' in doctest scope. I suspect we'll need to declare that problem as unsolvable by pyflakes, as doctests should show recommended importation for clients, whereas the module level imports of the same names are sometimes different (more complex) due to internal mess/structure being hidden from the end-user.

Revision history for this message
John Vandenberg (jayvdb) wrote :

doctest scope level pull request: https://github.com/pyflakes/pyflakes/pull/52

Revision history for this message
asmeurer (asmeurer) wrote :

For my use-case, even imports should be required for doctests. Doctests should be considered for all intents and purposes to be completely independent modules. The idea is that a user should be able to run the code in a doctest in their own Python session and completely reproduce the output.

Revision history for this message
Marius Gedminas (mgedmin) wrote :

Consider this:

    def parse_color(color):
        """Parse a color string and return (r, g, b) values from 0.0 to 1.0.

        Example:

            >>> parse_color('#fff')
            (1.0, 1.0, 1.0)

       """

Are you saying the name 'parse_color' shouldn't be automatically available in the doctest and instead there should be an explicit import? That feels to me like insisting on unnecessary boilerplate.

(BTW Launchpad's issue tracker is *terribly inconvenient* would you PLEASE migrate to GitHub issues?)

Revision history for this message
John Vandenberg (jayvdb) wrote :

A stricter scope for doctest is easy to construct. The problem will be that we may need to create a few pyflakes options in order to accommodate various opinions of 'strict' ;-) pyflakes doesnt seem to like providing end-users with options -- only PYFLAKES_BUILTINS and PYFLAKES_DOCTEST exist. :/

I expect most people would expect that a strict doctest scope would include only what is exported using __all__, if __all__ was defined (and sympy appears to do that systematically). That would typically eliminate imports, requiring re-imports in the doctest to access any other modules.

That would mean a doctest is not quite a completely independent module; instead each doctest essentially has an implicit "from <current module> import *' at the beginning.

@asmeurer, would that be 'close enough' for your needs?

If that still isnt good enough, then I think we need to design user selection of at least three alternatives:
1. cpython doctest mode (default)
2. __all__ exported names only
3. empty scope

The obvious approach is some command line argument.

If the module includes a footer of `if __name__ == "__main__": import doctest; doctest.testmod()` , we could detect the list of desired globals by identifying `doctest.testmod(globs=[(name, globals()[name]) for name in __all__])` and `doctest.testmod(globs=None)` as a means of enabling option 2 or 3 respectively. The detection logic could also support extracting a custom list of valid names using a literal dict, i.e. doctest.testmod(globs={'a': 1, 'b': b}).

A little more difficult approach, which probably needs assistance from pep8/flake8, is to export the "inside doctest" status to flake8 plugins, so that people wanting a stricter rules within doctests only can create flake8 plugins to achieve their own objectives.

Revision history for this message
John Vandenberg (jayvdb) wrote :

* would include only what is exported using __all__, if __all__ was defined, plus the name of the function/class the docstring is attached to if it isnt in __all__

(and +1 for moving issues to github ;-)

Revision history for this message
asmeurer (asmeurer) wrote :

@mgedmin yes, for my use-case yes, we do use that kind of boilerplate (here are some examples http://docs.sympy.org/latest/modules/simplify/simplify.html). I agree that not everyone wants this, though, which is why I suggested making it an option. The different levels of scoping that @jayvdb suggests may also be useful, although I would be hesitant to implement them unless an actual doctest runner uses them.

(and I'm also very +1 to moving to GitHub FWIW)

Revision history for this message
John Vandenberg (jayvdb) wrote :

I think all doctest runners supports custom globals. The following fails:

`
_m = 1

def doctest_stuff(self):
    '''
        >>> _m
        1
    '''
    return 1

if __name__ == "__main__":
    import doctest
    doctest.testmod(globs={})
`

The problem is that pyflakes and other tools are assuming doctests are run with the default `doctest.testmod` arguments.

I've raised a bug for the list of desired globals to be more easily specified in a way other tools can detect.

http://bugs.python.org/issue25699

Revision history for this message
John Vandenberg (jayvdb) wrote :

It looks like http://bugs.python.org/issue25699 wont be accepted, so pyflakes will likely need to build this in the module.
In which case, we need to decide how to allow users to configure pyflakes. Only using env vars is not very user friendly.

(and I'm also very +1 to moving to GitHub FWIW)

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.