Activity log for bug #243554

Date Who What changed Old value New value Message
2008-06-27 15:25:32 Diogo Matsubara bug added bug
2008-08-05 15:41:04 Joey Stanford launchpad: importance Undecided High
2008-08-05 15:41:04 Joey Stanford launchpad: statusexplanation
2008-08-07 20:47:55 Diogo Matsubara launchpad: bugtargetdisplayname Launchpad itself OOPS Tools
2008-08-07 20:47:55 Diogo Matsubara launchpad: bugtargetname launchpad oops-tools
2008-08-07 20:47:55 Diogo Matsubara launchpad: title Bug #243554 in Launchpad itself: "oops report should record information about the running process" Bug #243554 in OOPS Tools: "oops report should record information about the running process"
2008-08-16 21:10:32 Christian Reis bug assigned to launchpad
2008-08-16 21:11:45 Christian Reis oops-tools: status New Triaged
2008-08-20 20:54:33 Francis J. Lacoste launchpad: status New Triaged
2008-08-20 20:54:33 Francis J. Lacoste launchpad: importance Undecided High
2008-08-20 20:54:33 Francis J. Lacoste launchpad: statusexplanation
2010-09-09 03:40:20 Robert Collins summary oops report should record information about the running process oops report should record information about the running environment
2010-09-09 03:46:07 Robert Collins description
Old value: When an exception is raised in Launchpad, we record an OOPS for it. It'd be useful for debugging purposes to include information about the process, like memory usage, cpu load and things like that when the exception was raised. Francis suggested that we could use canonical.mem.resident() and canonical.mem.memory() to record that. For loadavg we would need to implement something new. What do you think?
New value: When timeouts occur, they can be caused by a) inefficient code or b) external influences. We should gather enough data that we don't spend time debugging the wrong things. Specifically we should gather:
 - system load average
 - number of cpucores (to normalise the load average)
 - process memory & physical memory (to guesstimate whether we're hitting swap)
 - *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.
The canonical.mem.resident() and canonical.mem.memory() will help in implementing this. os.loadavg will give us load averages.
We can grep /proc/cpu as bzr does for the cpu counts, and time.clock() will give us CPU usage.
We are hitting many questions we cannot answer today as a result of not knowing these things.
2010-09-09 03:53:25 Robert Collins description
Old value: When timeouts occur, they can be caused by a) inefficient code or b) external influences. We should gather enough data that we don't spend time debugging the wrong things. Specifically we should gather:
 - system load average
 - number of cpucores (to normalise the load average)
 - process memory & physical memory (to guesstimate whether we're hitting swap)
 - *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.
The canonical.mem.resident() and canonical.mem.memory() will help in implementing this. os.loadavg will give us load averages.
We can grep /proc/cpu as bzr does for the cpu counts, and time.clock() will give us CPU usage.
We are hitting many questions we cannot answer today as a result of not knowing these things.
New value: When timeouts occur, they can be caused by a) inefficient code or b) external influences. We should gather enough data that we don't spend time debugging the wrong things. Specifically we should gather:
 - system load average
 - number of cpucores (to normalise the load average)
 - process memory & physical memory (to guesstimate whether we're hitting swap)
 - *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.
The canonical.mem.resident() and canonical.mem.memory() will help in implementing this. os.loadavg will give us load averages.
We can grep /proc/cpu as bzr does for the cpu counts, and time.clock() will give us CPU usage ('per process', which may be equivalent to per-thread when we use it from a non main thread. Testing will be needed). If time.clock does not suffice, a small extension could call clock_gettime(CLOCK_THREAD_CPUTIME_ID, ....)
We are hitting many questions we cannot answer today as a result of not knowing these things.
2010-10-05 19:22:27 Gary Poster tags infrastructure oops-tools oops-infrastructure
2010-11-16 13:36:56 Robert Collins description
Old value: When timeouts occur, they can be caused by a) inefficient code or b) external influences. We should gather enough data that we don't spend time debugging the wrong things. Specifically we should gather:
 - system load average
 - number of cpucores (to normalise the load average)
 - process memory & physical memory (to guesstimate whether we're hitting swap)
 - *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.
The canonical.mem.resident() and canonical.mem.memory() will help in implementing this. os.loadavg will give us load averages.
We can grep /proc/cpu as bzr does for the cpu counts, and time.clock() will give us CPU usage ('per process', which may be equivalent to per-thread when we use it from a non main thread. Testing will be needed). If time.clock does not suffice, a small extension could call clock_gettime(CLOCK_THREAD_CPUTIME_ID, ....)
We are hitting many questions we cannot answer today as a result of not knowing these things.
New value: When timeouts occur, they can be caused by a) inefficient code or b) external influences. We should gather enough data that we don't spend time debugging the wrong things. Specifically we should gather:
 - system load average
 - number of cpucores (to normalise the load average)
 - process memory & physical memory (to guesstimate whether we're hitting swap)
 - *process* time since the request started. As each request is in a separate thread, the OS's system accounting can tell us whether 5 seconds of wall clock time was 5 seconds of CPU time, or 1 second of CPU time.
The canonical.mem.resident() and canonical.mem.memory() will help in implementing this. os.loadavg will give us load averages.
We can grep /proc/cpu as bzr does for the cpu counts, and time.clock() will give us CPU usage ('per process', which may be equivalent to per-thread when we use it from a non main thread. Testing will be needed). If time.clock does not suffice, a small extension could call clock_gettime(CLOCK_THREAD_CPUTIME_ID, ....)
We are hitting many questions we cannot answer today as a result of not knowing these things.
Alternatively: #RUSAGE_THREAD = 1 on my linux system - we'd want a C extension to get the right constant resource.getrusage(1)ru_utime should give us what we need.
2010-11-17 05:02:29 Robert Collins bug watch added http://bugs.python.org/issue10440
2011-03-10 05:39:46 Robert Collins summary oops report should record information about the running environment oopses do not gather environmental data(load, thread-cpu-time, ...)
2011-10-03 22:38:21 Robert Collins bug task added python-oops
2011-10-03 22:38:30 Robert Collins python-oops: status New Triaged
2011-10-03 22:38:33 Robert Collins python-oops: importance Undecided High
2011-10-13 20:39:29 Robert Collins affects oops-tools python-oops-tools
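
Illustrative note (not part of the recorded activity): the measurements listed in the bug description can be sketched with the current Python standard library. This is a minimal sketch under stated assumptions, not the python-oops implementation. The function names environment_snapshot and thread_cpu_seconds are hypothetical. It assumes Python 3.7 or later, where time.thread_time() reports per-thread CPU time and so takes the place of the time.clock() call mentioned in the description (time.clock() was removed in Python 3.8). It uses os.getloadavg() for the load averages and os.cpu_count() instead of parsing /proc. The memory figures are rough stand-ins for canonical.mem.resident() and canonical.mem.memory(), which are not reproduced here.

```python
import os
import time

try:
    import resource  # Unix only; absent on Windows
except ImportError:
    resource = None


def thread_cpu_seconds():
    """Return CPU time used so far by the calling thread, in seconds."""
    if hasattr(time, "thread_time"):
        return time.thread_time()
    # Fallback: whole-process CPU time, which over-counts in a threaded server.
    return time.process_time()


def environment_snapshot(request_start_cpu):
    """Gather the data points named in the bug description (hypothetical helper).

    request_start_cpu is the value of thread_cpu_seconds() taken when the
    request began, on the same thread.
    """
    try:
        loadavg = os.getloadavg()  # 1, 5 and 15 minute load averages
    except (AttributeError, OSError):
        loadavg = (None, None, None)
    cores = os.cpu_count()
    snapshot = {
        "loadavg": loadavg,
        "cpu_count": cores,
        # Load normalised by core count, as the description suggests.
        "load_per_core": loadavg[0] / cores if loadavg[0] is not None and cores else None,
        "thread_cpu_seconds": thread_cpu_seconds() - request_start_cpu,
        "peak_rss": None,
        "physical_memory_bytes": None,
    }
    if resource is not None:
        # Peak resident set size of the process: kilobytes on Linux, bytes on macOS.
        snapshot["peak_rss"] = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    try:
        snapshot["physical_memory_bytes"] = (
            os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
        )
    except (AttributeError, ValueError, OSError):
        pass  # sysconf or these names are not available on this platform
    return snapshot
```

A caller would record start = thread_cpu_seconds() when a request begins and pass it to environment_snapshot(start) when an OOPS is raised. Comparing thread_cpu_seconds in the result with the elapsed wall clock time then shows whether the request was busy on the CPU or waiting on something external. On Linux with Python 3.2 or later, resource.getrusage(resource.RUSAGE_THREAD).ru_utime is an alternative source for the per-thread figure, without the hard-coded constant mentioned in the description.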