Update stats failed at accessing "_PRIVMGR_MD_".ROLE_USAGE

Bug #1446310 reported by Weishiun Tsai
Affects: Trafodion
Status: Fix Released
Importance: High
Assigned to: Justin Du

Bug Description

We are seeing random UPDATE STATISTICS failures. A failure of this type usually returns several error messages at once, one of which generally indicates that "_PRIVMGR_MD_".ROLE_USAGE does not exist or is inaccessible. Presumably that is the root cause of these failures. Here is an example:

SQL>update statistics for table mkey2;
*** ERROR[9200] UPDATE STATISTICS for table TRAFODION.ARKCASE_ARKT0149.MKEY2 encountered an error (2235) from statement HSCursor::prepare(). [2015-04-18 17:48:39]
*** ERROR[2235] Compiler Internal Error: Cannot produce a plan in optimizer pass one, originated from file ../optimizer/opt.cpp at line 6907. [2015-04-18 17:48:39]
*** ERROR[8804] The provided input statement does not exist in the current context. [2015-04-18 17:48:39]
*** ERROR[4082] Object TRAFODION."_PRIVMGR_MD_".ROLE_USAGE does not exist or is inaccessible. [2015-04-18 17:48:39]
*** ERROR[8804] The provided input statement does not exist in the current context. [2015-04-18 17:48:39]
*** ERROR[9200] UPDATE STATISTICS for table TRAFODION.ARKCASE_ARKT0149.MKEY2 encountered an error (2235) from statement fetchNumColumn. [2015-04-18 17:48:39]
*** ERROR[2235] Compiler Internal Error: Cannot produce a plan in optimizer pass one, originated from file ../optimizer/opt.cpp at line 6907. [2015-04-18 17:48:39]
*** ERROR[8804] The provided input statement does not exist in the current context. [2015-04-18 17:48:39]
*** ERROR[4082] Object TRAFODION."_PRIVMGR_MD_".ROLE_USAGE does not exist or is inaccessible. [2015-04-18 17:48:39]
*** ERROR[8804] The provided input statement does not exist in the current context. [2015-04-18 17:48:39]

In a recent run on a 4-node cluster with the v1.1.0 rc1 (v0417) build, we saw a total of 61 occurrences of this failure after running through the entire SQL regression test suite.

grep 4082 */*.log | grep ROLE_USAGE | wc -l
61
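
The grep pipeline above gives only a grand total. As a minimal sketch of how the same count could be broken down per log file, the Python script below scans the same */*.log layout. The function names are hypothetical and not part of Trafodion or its test suite, and the pattern is slightly stricter than the two greps: it requires ERROR[4082] to appear before ROLE_USAGE on the same line.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the error line quoted in this report: ERROR[4082] naming ROLE_USAGE.
# Assumption: both tokens appear on one line, in this order.
PATTERN = re.compile(r"ERROR\[4082\].*ROLE_USAGE")


def count_role_usage_errors(lines):
    """Return how many of the given lines report error 4082 on ROLE_USAGE."""
    return sum(1 for line in lines if PATTERN.search(line))


def count_per_log(root="."):
    """Count matching lines in every */*.log under root, keyed by file path."""
    counts = Counter()
    for path in sorted(Path(root).glob("*/*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes.
        with path.open(errors="replace") as fh:
            n = count_role_usage_errors(fh)
        if n:
            counts[str(path)] = n
    return counts


if __name__ == "__main__":
    per_log = count_per_log()
    # Print the noisiest logs first, then the grand total.
    for name, n in per_log.most_common():
        print(f"{n:5d}  {name}")
    print("total:", sum(per_log.values()))
```

Run from the directory that holds the per-suite log directories, this lists the logs with the most occurrences first, which may help show whether the failures cluster in particular test suites.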

This high number is definitely a cause for concern. The more troublesome aspect of this issue is that once the error occurs, SB_HISTOGRAMS and SB_HISTOGRAM_INTERVALS end up in an inconsistent state and the schema can no longer be dropped. The user ends up having to manually clean up the tables and the schema.

Unfortunately, there is no certain way to reproduce this problem. Running on 2 different clusters would often yield different results. However, some analysis should be done to see why the security metadata has become such a bottleneck for our own compiler to cause all these failures. Measures should be taken on either the security side or the compiler side, or both, to avoid this.

Changed in trafodion:
assignee: nobody → Roberta Marton (roberta-marton)
Changed in trafodion:
assignee: Roberta Marton (roberta-marton) → Justin Du (justin-du-2)
milestone: r1.1 → r1.2
Justin Du (justin-du-2) wrote :

Wei-Shiun, can you re-try the related tests once you get the build with my check-in?

Weishiun Tsai (wei-shiun-tsai) wrote :

This is believed to be fixed by the same change that fixed https://bugs.launchpad.net/trafodion/+bug/1438372. Ran through the entire SQL regression test suite with the v0523 build installed on both a Cloudera cluster and a Hortonworks cluster. Did not see this particular error. This case will be closed.

-bash-4.1$ grep 4082 */*.log | grep ROLE_USAGE | wc -l
0

Changed in trafodion:
status: New → Fix Released