Loading...
Introduction

This post is about: “Adapting and adopting SQL Plan Management (SPM) to achieve execution plan stability for sub-second queries on a high-rate OLTP mission-critical application”. In our case, such an application is implemented on top of several Oracle 12c multi tenant databases, where a consistent average execution time is more valuable than flexible execution plans. We successfully achieved plan stability implementing a simple algorithm using PL/SQL calling DBMS_SPM public APIs.

Chart below depicts a typical case where the average performance of a large set of business-critical SQL statements suddenly degraded from sub-millisecond to 15 or 20ms, then beccome more stable around 3ms. Wide spikes are a typical trademark of an Execution Plan for one or more SQL statements flipping for some time. In order to produce a more consistent latency we needed to improve plan stability, and of course the preferred tool to achieve that on an Oracle database is SQL Plan Management.

Algorithm

We tested and ruled out adaptive SQL Plan Management, which is an excellent 12c new feature. But, due to the dynamics of this application, where transactional data shifts so fast, allowing this “adaptive SPM” feature to evaluate auto-captured plans using bind variable values captured a few hours earlier, rendered unfortunately false positives. These false positives “evolved” as execution plans that were numerically optimal for values captured (at the time the candidate plan was captured), but performed poorly when executed on “current” values a few hours later. Nevertheless, this 12c “adaptive SPM” new feature is worth exploring for other applications.

We adapted SPM so it would only generate SQL Plan Baselines on SQL that executes often, and that is critical for the business. The algorithm has some complexity such as candidate evaluation and SQL categorization; and besides SPB creation it also includes plan demotion and plan promotion. We have successfully implemented it in some PDBs and we are currently doing a rollout to entire CDBs. The algorithm is depicted on diagram on the left, and more details are included in corresponding presentation slides listed on the right-hand bar. I plan to talk about this topic on an international Oracle Users Group in 2018.

This algorithm is scripted into a sample PL/SQL package, which you can find on a subdirectory on my shared scripts. If you consider using this sample script for an application of your own, be sure you make it yours before attempting to use it. In other words: fully understand it first, then proceed to customize it accordingly and test it thoroughly.

Results

Chart below shows how average performance of business-critical SQL became more stable after implementing algorithm to adapt and adopt SPM on a pilot PDB. Not all went fine although: we had some outliers that required some tuning to the algorithm. Among challenges we faced: volatile data (creating a SPB when table was almost empty, then using it when table was larger); skewed values (create a SPB for non-popular value, then using it on a popular value); proper use of multiple optimal plans due to Adaptive Cursor Sharing (ACS); rejected candidates due to conservative initial restrictions on algorithm (performance per execution, number of executions, age of cursor, etc.)

Conclusion

If your OLTP application contains business critical SQL that executes at a high-rate, and where a spike on latency risks affecting SLAs, you may want to consider implementing SQL Plan Management. Consider then both: “adaptive SPM” if it satisfies your requirements, else build a PL/SQL library that can implement more complex logic for candidates evaluation and for SPBs maintenance. I do believe SPM works great, specially when you enhance its out-of-the-box functionality to satisfy your specific needs.

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 
Introduction

This post is about: “Adapting and adopting SQL Plan Management (SPM) to achieve execution plan stability for sub-second queries on a high-rate OLTP mission-critical application”. In our case, such an application is implemented on top of several Oracle 12c multi tenant databases, where a consistent average execution time is more valuable than flexible execution plans. We successfully achieved plan stability implementing a simple algorithm using PL/SQL calling DBMS_SPM public APIs.

Chart below depicts a typical case where the average performance of a large set of business-critical SQL statements suddenly degraded from sub-millisecond to 15 or 20ms, then beccome more stable around 3ms. Wide spikes are a typical trademark of an Execution Plan for one or more SQL statements flipping for some time. In order to produce a more consistent latency we needed to improve plan stability, and of course the preferred tool to achieve that on an Oracle database is SQL Plan Management.

Algorithm

We tested and ruled out adaptive SQL Plan Management, which is an excellent 12c new feature. But, due to the dynamics of this application, where transactional data shifts so fast, allowing this “adaptive SPM” feature to evaluate auto-captured plans using bind variable values captured a few hours earlier, rendered unfortunately false positives. These false positives “evolved” as execution plans that were numerically optimal for values captured (at the time the candidate plan was captured), but performed poorly when executed on “current” values a few hours later. Nevertheless, this 12c “adaptive SPM” new feature is worth exploring for other applications.

We adapted SPM so it would only generate SQL Plan Baselines on SQL that executes often, and that is critical for the business. The algorithm has some complexity such as candidate evaluation and SQL categorization; and besides SPB creation it also includes plan demotion and plan promotion. We have successfully implemented it in some PDBs and we are currently doing a rollout to entire CDBs. The algorithm is depicted on diagram on the left, and more details are included in corresponding presentation slides listed on the right-hand bar. I plan to talk about this topic on an international Oracle Users Group in 2018.

This algorithm is scripted into a sample PL/SQL package, which you can find on a subdirectory on my shared scripts. If you consider using this sample script for an application of your own, be sure you make it yours before attempting to use it. In other words: fully understand it first, then proceed to customize it accordingly and test it thoroughly.

Results

Chart below shows how average performance of business-critical SQL became more stable after implementing algorithm to adapt and adopt SPM on a pilot PDB. Not all went fine although: we had some outliers that required some tuning to the algorithm. Among challenges we faced: volatile data (creating a SPB when table was almost empty, then using it when table was larger); skewed values (create a SPB for non-popular value, then using it on a popular value); proper use of multiple optimal plans due to Adaptive Cursor Sharing (ACS); rejected candidates due to conservative initial restrictions on algorithm (performance per execution, number of executions, age of cursor, etc.)

Conclusion

If your OLTP application contains business critical SQL that executes at a high-rate, and where a spike on latency risks affecting SLAs, you may want to consider implementing SQL Plan Management. Consider then both: “adaptive SPM” if it satisfies your requirements, else build a PL/SQL library that can implement more complex logic for candidates evaluation and for SPBs maintenance. I do believe SPM works great, specially when you enhance its out-of-the-box functionality to satisfy your specific needs.


Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

A DBA deals with performance issues often, and having a SQL suddenly performing poorly is common. What do we do? We proceed to “pin” an execution plan, then investigate root cause (the latter is true if time to next fire permits).

DBMS_SPM provides some APIs to create a SQL Plan Baseline (SPB) from the Cursor Cache, or from a SQL Tuning Set (STS), but not from the Automatic Workload Repository (AWR). For the latter, you need a two-steps approach: create a STS from AWR, then load a SPB from the STS. Fine, except when your next fire is waiting for you, or when deciding which is the “best” plan is not trivial.

Take for example chart below, which depicts multiple execution plans with different performance for one SQL statement. The SQL statement is actually quite simple, and data is not significantly skewed. On this particular application, usually one-size-fits-all (meaning one-and-only-one plan) works well for most values passed on variable place holders. Then, which plan would you choose?

Looking at summary of known Execution Plans’ performance below (as reported by planx.sql), we can see the same 6 Execution Plans.

1st Plan on list shows an average execution time of 2.897ms according to AWR, and 0.896ms according to Cursor Cache; and number of recorded executions for this Plan are 2,502 and 2,178 respectively. We see this Plan contains one Nested Loop, and if we look at historical performance we notice this Plan takes less than 109ms 95% of the time, less than 115ms 97% of the time, and less then 134ms 99% of the time. We also see that worst recorded AWR period, had this SQL performing in under 150ms (on average for that one period).

We also notice that last plan on list performs one execution in 120.847ms on average (as per AWR) and 181.113ms according to Cursor Cache (on average as well). Then, “pinning” 1st plan on list seems like a good choice, but not too different than all but last plan, specially when we consider both: average performance and historical performance according to percentiles reported.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 4113179674       2.897       0.896       2.715       0.714           96            5        2,502        2,178          8        738   1   0   0     149.841     133.135     114.305     108.411     147.809     133.007     113.042     107.390
  578709260      29.576      32.704      28.865      31.685        1,583        1,436        6,150        1,843         67        875   1   0   0     154.560      84.264      65.409      57.311     148.648      75.209      62.957      56.305
 1990606009      74.399      79.054      73.163      77.186        1,117        1,192          172          214        905      1,108   0   1   0     208.648     208.648      95.877      95.351     205.768     205.768      94.117      93.814
 1242077371      77.961                  77.182                    1,772                     8,780                     949      1,040   0   1   0     102.966      98.206      91.163      89.272     100.147      97.239      90.165      88.412
 2214147219      79.650      82.413      78.242      80.817        1,999        2,143       42,360       24,862        906      1,242   0   1   0     122.535     101.293      98.442      95.737     119.240      99.118      95.266      93.156
 1214505235     120.847     181.113     105.485     162.783          506        1,355           48           12        114        718   1   0   0     285.950     285.950     285.950     285.950     193.954     193.954     193.954     193.954

Plans performance summary above is displayed in a matter of seconds by planx.sql, sqlperf.sql and by a new script create_sql_plan_baseline.sql. This output helps make a quick decision about which Execution Plan is better for “pinning”, meaning: to create a SPB on it.

Sometimes such decision is not that trivial, as we can see on sample below. Which plan is better? I would go with 2nd on list. Why? performance-wise this plan is more stable. It does a Hash Join, so I am expecting to see a Plan with full scans, but if I can get consistent executions under 0.4s (according to percentiles), I would be tempted to “pin” this 2nd Plan instead of 1st one. And I would stay away from 3rd and 5th. So maybe I would create a SPB with 3 plans instead of just one, and include on this SPB 1st, 2nd and 4th on the list.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 1917891576       0.467       0.334       0.330       0.172          119           33  554,914,504   57,748,249          6      1,188   2   0   0   6,732.017      10.592       1.628       1.572   1,420.864       1.557       1.482       1.261
   99953997       1.162       2.427       0.655       0.492           83           55   58,890,160    2,225,247         12      2,311   0   1   0     395.819     235.474     108.142      34.909      56.008      22.329      12.926       3.069
 3559532534       1.175   1,741.041       0.858      91.486          359           46   21,739,877          392          4         20   1   0   0  89,523.768   4,014.301     554.740     298.545  21,635.611     216.456      54.050      30.130
 3650324870       2.028      20.788       1.409       2.257          251          199   24,038,404      143,819         11      5,417   0   1   0     726.964     254.245      75.322      20.817     113.259      21.211      13.591       8.486
 3019880278      43.465                  43.029                   20,217                    13,349                   5,693      5,693   0   1   0      43.465      43.465      43.465      43.465      43.029      43.029      43.029      43.029
About new script create_sql_plan_baseline.sql

This new script is a life-saver for us, since our response time for an alert is usually measured in minutes, with a resolution (and sometimes a root cause analysis) expected in less than one hour from the time the incident is raised.

This script is quite simple:

  • it provides a list of known Execution Plans including current (Cursor Cache) and historical (AWR) performance as displayed in two samples above, then
  • asks on which Plan Hash Values (PHVs) you want to create a SPB on. It allows you to enter up to 3 PHVs; last
  • asks if you want these plans to be set as FIXED

After you respond to ACCEPT parameters, then a SPB for your SQL is created and displayed. It does not matter if the Plan exists on Cursor Cache and/or on AWR, it finds the Plan and creates the SPB for you. Then: finding known Execution Plans, deciding which one is a better choice (or maybe more than one), and creating a SPB, all can be done very rapidly.

If you still prefer to use SQL Profiles and not SPBs for whatever reason, script coe_xfr_sql_profile.sql is still around and updated. On these 12c days, and soon 18c and beyond, I’d much rather use SQL Plan Management and create SPBs although!

Anyways, enjoy these free scripts and become a faster hero “pinning” good plans. Then don’t forget to do diligent root cause analysis afterwards. I use SQLd360 by Mauro Pagano for deep understanding of what is going on with my SQL statements.

Soon, I will post about a cool free tool that automates the implementation of SQL Plan Management on a high-rate OLTP where stability is more important than flexibility (frequently changing Execution Plans). Stay tuned!


Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

A DBA deals with performance issues often, and having a SQL suddenly performing poorly is common. What do we do? We proceed to “pin” an execution plan, then investigate root cause (the latter is true if time to next fire permits).

DBMS_SPM provides some APIs to create a SQL Plan Baseline (SPB) from the Cursor Cache, or from a SQL Tuning Set (STS), but not from the Automatic Workload Repository (AWR). For the latter, you need a two-steps approach: create a STS from AWR, then load a SPB from the STS. Fine, except when your next fire is waiting for you, or when deciding which is the “best” plan is not trivial.

Take for example chart below, which depicts multiple execution plans with different performance for one SQL statement. The SQL statement is actually quite simple, and data is not significantly skewed. On this particular application, usually one-size-fits-all (meaning one-and-only-one plan) works well for most values passed on variable place holders. Then, which plan would you choose?

Looking at summary of known Execution Plans’ performance below (as reported by planx.sql), we can see the same 6 Execution Plans.

1st Plan on list shows an average execution time of 2.897ms according to AWR, and 0.896ms according to Cursor Cache; and number of recorded executions for this Plan are 2,502 and 2,178 respectively. We see this Plan contains one Nested Loop, and if we look at historical performance we notice this Plan takes less than 109ms 95% of the time, less than 115ms 97% of the time, and less then 134ms 99% of the time. We also see that worst recorded AWR period, had this SQL performing in under 150ms (on average for that one period).

We also notice that last plan on list performs one execution in 120.847ms on average (as per AWR) and 181.113ms according to Cursor Cache (on average as well). Then, “pinning” 1st plan on list seems like a good choice, but not too different than all but last plan, specially when we consider both: average performance and historical performance according to percentiles reported.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 4113179674       2.897       0.896       2.715       0.714           96            5        2,502        2,178          8        738   1   0   0     149.841     133.135     114.305     108.411     147.809     133.007     113.042     107.390
  578709260      29.576      32.704      28.865      31.685        1,583        1,436        6,150        1,843         67        875   1   0   0     154.560      84.264      65.409      57.311     148.648      75.209      62.957      56.305
 1990606009      74.399      79.054      73.163      77.186        1,117        1,192          172          214        905      1,108   0   1   0     208.648     208.648      95.877      95.351     205.768     205.768      94.117      93.814
 1242077371      77.961                  77.182                    1,772                     8,780                     949      1,040   0   1   0     102.966      98.206      91.163      89.272     100.147      97.239      90.165      88.412
 2214147219      79.650      82.413      78.242      80.817        1,999        2,143       42,360       24,862        906      1,242   0   1   0     122.535     101.293      98.442      95.737     119.240      99.118      95.266      93.156
 1214505235     120.847     181.113     105.485     162.783          506        1,355           48           12        114        718   1   0   0     285.950     285.950     285.950     285.950     193.954     193.954     193.954     193.954

Plans performance summary above is displayed in a matter of seconds by planx.sql, sqlperf.sql and by a new script create_sql_plan_baseline.sql. This output helps make a quick decision about which Execution Plan is better for “pinning”, meaning: to create a SPB on it.

Sometimes such decision is not that trivial, as we can see on sample below. Which plan is better? I would go with 2nd on list. Why? performance-wise this plan is more stable. It does a Hash Join, so I am expecting to see a Plan with full scans, but if I can get consistent executions under 0.4s (according to percentiles), I would be tempted to “pin” this 2nd Plan instead of 1st one. And I would stay away from 3rd and 5th. So maybe I would create a SPB with 3 plans instead of just one, and include on this SPB 1st, 2nd and 4th on the list.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 1917891576       0.467       0.334       0.330       0.172          119           33  554,914,504   57,748,249          6      1,188   2   0   0   6,732.017      10.592       1.628       1.572   1,420.864       1.557       1.482       1.261
   99953997       1.162       2.427       0.655       0.492           83           55   58,890,160    2,225,247         12      2,311   0   1   0     395.819     235.474     108.142      34.909      56.008      22.329      12.926       3.069
 3559532534       1.175   1,741.041       0.858      91.486          359           46   21,739,877          392          4         20   1   0   0  89,523.768   4,014.301     554.740     298.545  21,635.611     216.456      54.050      30.130
 3650324870       2.028      20.788       1.409       2.257          251          199   24,038,404      143,819         11      5,417   0   1   0     726.964     254.245      75.322      20.817     113.259      21.211      13.591       8.486
 3019880278      43.465                  43.029                   20,217                    13,349                   5,693      5,693   0   1   0      43.465      43.465      43.465      43.465      43.029      43.029      43.029      43.029
About new script create_sql_plan_baseline.sql

This new script is a life-saver for us, since our response time for an alert is usually measured in minutes, with a resolution (and sometimes a root cause analysis) expected in less than one hour from the time the incident is raised.

This script is quite simple:

  • it provides a list of known Execution Plans including current (Cursor Cache) and historical (AWR) performance as displayed in two samples above, then
  • asks on which Plan Hash Values (PHVs) you want to create a SPB on. It allows you to enter up to 3 PHVs; last
  • asks if you want these plans to be set as FIXED

After you respond to ACCEPT parameters, then a SPB for your SQL is created and displayed. It does not matter if the Plan exists on Cursor Cache and/or on AWR, it finds the Plan and creates the SPB for you. Then: finding known Execution Plans, deciding which one is a better choice (or maybe more than one), and creating a SPB, all can be done very rapidly.

If you still prefer to use SQL Profiles and not SPBs for whatever reason, script coe_xfr_sql_profile.sql is still around and updated. On these 12c days, and soon 18c and beyond, I’d much rather use SQL Plan Management and create SPBs although!

Anyways, enjoy these free scripts and become a faster hero “pinning” good plans. Then don’t forget to do diligent root cause analysis afterwards. I use SQLd360 by Mauro Pagano for deep understanding of what is going on with my SQL statements.

Soon, I will post about a cool free tool that automates the implementation of SQL Plan Management on a high-rate OLTP where stability is more important than flexibility (frequently changing Execution Plans). Stay tuned!


Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

A few years ago I created a post about “how to flush a cursor out the shared pool“, using DBMS_SHARED_POOL.PURGE. For the most part, this method has helped me to get rid of an entire parent cursor and all child cursors for a given SQL, but more often than not I have found than on 12c this method may not work, leaving active a set of cursors I want to flush.

Script below is an enhanced version, where besides using DBMS_SHARED_POOL.PURGE, we also create a dummy SQL patch, then drop it. This method seems to completely flush parent and child cursors. Why using this method instead?: We are implementing SQL Plan Management (SPM), and we have found that in some cases, some child cursors are still shared several hours after a SQL Plan Baseline (SPB) is created. We could argue a possible bug and pursue as such, but in the meantime my quick and dirty workaround is: whenever I want to flush an individual parent cursor for one SQL, and all of its child cursors, I just execute script below passing SQL_ID.

Anyways, just wanted to share and document this purge_cursor.sql script for those in similar need. I have developed it on 12.1.0.2, and haven’t tested it on lower or higher versions.

-- purge_cursor.sql
DECLARE
 l_name VARCHAR2(64);
 l_sql_text CLOB;
BEGIN
 -- get address, hash_value and sql text
 SELECT address||','||hash_value, sql_fulltext 
 INTO l_name, l_sql_text 
 FROM v$sqlarea 
 WHERE sql_id = '&&sql_id.';
 -- not always does the job
 SYS.DBMS_SHARED_POOL.PURGE (
 name => l_name,
 flag => 'C',
 heaps => 1
 );
 -- create fake sql patch
 SYS.DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH (
 sql_text => l_sql_text,
 hint_text => 'NULL',
 name => 'purge_&&sql_id.',
 description => 'PURGE CURSOR',
 category => 'DEFAULT',
 validate => TRUE
 );
 -- drop fake sql patch
 SYS.DBMS_SQLDIAG.DROP_SQL_PATCH (
 name => 'purge_&&sql_id.', 
 ignore => TRUE
 );
END;
/

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

A few years ago I created a post about “how to flush a cursor out the shared pool“, using DBMS_SHARED_POOL.PURGE. For the most part, this method has helped me to get rid of an entire parent cursor and all child cursors for a given SQL, but more often than not I have found than on 12c this method may not work, leaving active a set of cursors I want to flush.

Script below is an enhanced version, where besides using DBMS_SHARED_POOL.PURGE, we also create a dummy SQL patch, then drop it. This method seems to completely flush parent and child cursors. Why using this method instead?: We are implementing SQL Plan Management (SPM), and we have found that in some cases, some child cursors are still shared several hours after a SQL Plan Baseline (SPB) is created. We could argue a possible bug and pursue as such, but in the meantime my quick and dirty workaround is: whenever I want to flush an individual parent cursor for one SQL, and all of its child cursors, I just execute script below passing SQL_ID.

Anyways, just wanted to share and document this purge_cursor.sql script for those in similar need. I have developed it on 12.1.0.2, and haven’t tested it on lower or higher versions.

-- purge_cursor.sql
DECLARE
 l_name VARCHAR2(64);
 l_sql_text CLOB;
BEGIN
 -- get address, hash_value and sql text
 SELECT address||','||hash_value, sql_fulltext 
 INTO l_name, l_sql_text 
 FROM v$sqlarea 
 WHERE sql_id = '&&sql_id.';
 -- not always does the job
 SYS.DBMS_SHARED_POOL.PURGE (
 name => l_name,
 flag => 'C',
 heaps => 1
 );
 -- create fake sql patch
 SYS.DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH (
 sql_text => l_sql_text,
 hint_text => 'NULL',
 name => 'purge_&&sql_id.',
 description => 'PURGE CURSOR',
 category => 'DEFAULT',
 validate => TRUE
 );
 -- drop fake sql patch
 SYS.DBMS_SQLDIAG.DROP_SQL_PATCH (
 name => 'purge_&&sql_id.', 
 ignore => TRUE
 );
END;
/

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

First things first: Do not rebuild Oracle indexes! … Unless you have to.

If you are even considering rebuilding indexes on an autonomous manner, please stop now, and first spend some time reading some of the many things Richard Foote has to say on his well-recommended blog.

A little side story: Many, many years ago, as I was supporting EBS at Oracle, one day I got a call from a guy with an aussie accent (even today I think it was Richard). He called Support simply to set the record straight (for me): I should not be promoting dropping, recreating or rebuilding indexes with a couple of nice notes/scripts I wrote for the delight of my then EBS customers. This guy educated me back then: such practices were no longer needed, as Oracle automatically reused index blocks after massive deletions (I was being a bad hombre!)… And I agreed with him. This was back before Y2K, and I still do agree with him today. But (there is always a but), there are some corner cases where we still have to rebuild indexes ourselves. And that is the reason of my story today.

Why would you even want to automate index rebuild?

Since my background is Oracle performance,  and “if all you have is a hammer, everything looks like a nail”, I am a bit bias in that direction, and I would have to answer: because there are some valid performance reasons. There is also a lesser reason about disk space savings, which is not so relevant these days of cheap storage… Nevertheless, I will briefly justify below these two (performance and space) under my specific corner case. Maybe you have a similar case of your own to consider…

Performance

We have this key custom OLTP application installed in over 30+ servers and 700+ PDBs. It executes high-rate simple queries that due to current application constraints, often produce “optimal” executions plans with persistent fast full index scans operations on them, or some very long index range scans. The data is a time-based progressive window which contains anything between the last two hours and up the last few days of rapidly changing values (intense on inserts and deletes). All indexes are appended with a monolithically increasing numeric column seeded from a sequence.

If you are thinking “why doing fast full index scans on OLTP queries in the 1st place?” I would have to ask you to bare with me on that question for a few months (we do need to make some improvements on the application data model, but that is a story for a day in the far future yet to come). For now, we are stuck with large indexes growing on their far right (like many citizens today) of every value of their leading column(s), and we have to do something about them! One more thing worth mentioning: the number of reads exceeds by far the number of writes or deletes.

Take for example, the output of script indexes_2b_shrunk.sql (below). We can see that 4th index down WORKXXXXXXXTANCES_PK, has a current size of 114.9 MB (you may need to scroll to the right using a hidden bar at the bottom of the report), but if it were to rebuilt it, its size would go down to 43 MB, reducing then its space by 62.6%. This script basically lists indexes that consume more than 10 MB of disk space, and with a potential savings of at least 25% if they were to be rebuilt.

Output of similar script indexes_2b_shrunk_fast_full_scan.sql (which I executed but omitted from this post), includes the same index because in addition to the considerations of space and savings, such index is referenced by at least one execution plan as a full scan. Then, if I can reduce its size by 62.6%, at least some SQL performing full scans on it would benefit, performance wise. This second script is identical to the first one with one exception: it only includes indexes for which at least a recent execution plan references them with full scans.

These two scripts referenced above take no action on the database – they simply list indexes that could be rebuilt as per their current size and estimated space savings (and if they are referenced by full scans in the case of the latter). By the way, the estimated size if rebuilt is pretty accurate. Just saying.

REPORT_DATE
-------------------
2017-10-07T12-55-01
please wait...
PDB: SOMEPDB
---
TABLE_NAME               OWNER.INDEX_NAME                          SAVING %  CURRENT SIZE ESTIMATED SIZE  COMMAND
------------------------ --------------------------------------- ---------- ------------- --------------  --------------------------------------------------------------------
XXXXINSTANCES            XXXXXXXXXXXX.XXXXINSTANCES_PK              50.6 %       647.7 MB       320.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxinstances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_PK        95.9 %    18,391.6 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_pk REBUILD ONLINE;
HISTOXXXXXXXXIGNMENT     XXXXXXXXXXXX.HISTOXXXXXXXXIGNMENT_PK       55.1 %       783.8 MB       352.0 MB  ALTER INDEX xxxxxxxxxxxx.histxxxxxxxignment_pk REBUILD ONLINE;
WORKXXXXXXXTANCES        XXXXXXXXXXXX.WORKXXXXXXXTANCES_PK          62.6 %       114.9 MB        43.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_AK        96.9 %    23,896.1 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_ak REBUILD ONLINE;
XXXXGCWORKSPACE          XXXXXXXXXXXX.XXXXGCWORKSPACE_PK            31.3 %       745.1 MB       512.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxgcworkspace_pk REBUILD ONLINE;
WORKXXXXXXXTANCESINDEX   XXXXXXXXXXXX.WORKXXXXXXXTANCESINDEX_PK     51.8 %        70.5 MB        34.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtancesindex_pk REBUILD ONLINE;
Space

Take same output above, and look now at 5th index down XXXXTRANSACTIONKEYS_AK, currently consuming 23,896.1 MB of space. This is on an X5-2 with NVMe SSD, where space is kind of premium. If we can reduce the size of this index by a factor of 96.9% to only 752 MB, even if there are no execution plans referencing this index with a full scan, we would simply recover close to 23 GB of space! This is good news for operations DBAs who get paged when a Tablespace is running short (kind of annoying, specially at night!).

Of course there are two follow-up questions here, which I may just mention and leave for future blog posts: 1) do we really need to append every index with a monolithically increasing numeric column based on a sequence? (most probably a “no”); and b) considering that most leading columns have moderate cardinality and super wide values, would index compression help to reduce index size? (spoiler alert: definitely a “yes”).

Autonomous Online Index Rebuild for Oracle Multitenant

Script iod_indexes_rebuild_online.sql is what I use today to automate indexes rebuild, through a weekly Oracle Enterprise Manager (OEM) job executed in +30 12c Oracle multitenant CDBs (with +700 PDBs). This script rebuilds indexes that are referenced by full scans, are larger than 10 MB, and with an estimated space reduction of at least 25%. It excludes one particular table that for a very sticky reason needs to be avoided (an explicit lock that produces kind of an ugly concurrency contention)… The script also includes the indexes of a known big table, even if its indexes are not referenced by full scans. Both, the one exclusion and the one inclusion, are clearly marked on the script. You may need your own exclusions and inclusions (like I do on my ThanksGiving dinner), in case you decide to write your own script based on this.

This rebuild script is written in such a way that can be executed at the root container level either through OEM or SQL*plus, or at the PDB level, also from OEM or SQL*Plus.

Output of the autonomous online index rebuild script is recorded in 3 places: a spool file directed to the tmp directory, a trace file, and into the alert log. The content of each varies. The shortest is on the alert log, so we can see we ran something in the event we need to audit that timeframe. And the longest is the trace, which records accurate times and messages. Edited output below is from a sample trace file. On it we can see how an index with a size of 21,973.1 MB was reduced to 809.8 MB (96% reduction). Such index took 36 seconds to be rebuilt online. The estimated time is rubbish, but the estimated size is very accurate in my opinion.

+++ DBE IOD +++ iod_indexes_rebuild_online.sql begin
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
...
*** MODULE NAME:(IOD_INDEXES_REBUILD_ONLINE (15) XXXXX_XX) 2017-10-06 20:32:32.618
*** ACTION NAME:(SELECTING PDB INDEX CANDIDATES) 2017-10-06 20:32:32.618
*** CONTAINER ID:(15) 2017-10-06 20:32:32.618
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CON_NAME:XXXXX_XX CON_ID:15 DB_NAME:xxxx1xx2 DB_UNIQUE_NAME:XXXX1XX2_C SERVER_HOST:iod-db-xxxx-99999 TIME:2017-10-06T20:32:32
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
*** 2017-10-06 20:32:45.307
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {COST}) 2017-10-06 20:34:46.527
...
TABLE_NAME                     OWNER.INDEX_NAME                                         SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s
...
*** 2017-10-06 20:37:22.683
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {DONE}) 2017-10-06 20:37:22.683

XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s             809.8 MB      96.3 %       36.0s
*** ACTION NAME:(HISTORICALASSIGNMENT_PK {COST}) 2017-10-06 20:37:22.688
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                          PDB NAME                                                 SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
TOTAL                          XXXXX_XX                                                         53,101.1 MB           2,585.0 MB      95.1 %       28.0s           2,505.2 MB      95.3 %      101.0s
...
+++ DBE IOD +++ DB_INDEXES:63 DB_BEFORE:130,111.0MB(127.1GB) DB_AFTER:12,608.1MB(12.3GB) DB_SAVING:90.3%
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ iod_indexes_rebuild_online.sql end
+++ DBE IOD +++ duration: 8301 seconds
Companion scripts

Besides the two “change-nothing” indexes_2b_shrunk*.sql scripts, and the actual rebuild script, there are two more that are part of the set and can be handy.

The first one is iod_indexes_rebuild_online_mon.sql, and it helps to monitor progress when executed on a separe SQL*Plus sessions of its own. This monitoring script executes a query that if you simply keep entering the “/” command, it keeps telling you the PDB and index that is getting rebuilt (or just ended, or is next). This is kind of “are we there yet?”, but for some reason sometimes I just need to know which PDB and particular index I am working on, just in case I see something like a suspicious spike on my OEM window. I do recommend this at least during the first couple of cycles.

The seconds one is iod_indexes_rebuild_online_kill_job.sql, and before you freak out and close this page, let me state that the “kill” script does nothing but interrupt the execution of the rebuild script on a nice manner, meaning between the rebuild of two indexes, or in other words: after it completes the index getting rebuilt. This is kind of important se we do not end with half-baked indexes which may need to get recreated. I am planning on never have to use the “kill” script, but if I do, I already have a OEM job ready with it: This is like a “poisson pill”. A word of unsolicited advice here: prepare always for the worst, and hope for the best!

Disclaimer

Before you rush to implement these scripts, please review them extensively, justify their use, and fully digest them. Make them yours, customize them as needed, then test, test and test again. Only then consider implementing an “autonomous index rebuild job” on a production environment. Best wishes!


Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

First things first: Do not rebuild Oracle indexes! … Unless you have to.

If you are even considering rebuilding indexes on an autonomous manner, please stop now, and first spend some time reading some of the many things Richard Foote has to say on his well-recommended blog.

A little side story: Many, many years ago, as I was supporting EBS at Oracle, one day I got a call from a guy with an aussie accent (even today I think it was Richard). He called Support simply to set the record straight (for me): I should not be promoting dropping, recreating or rebuilding indexes with a couple of nice notes/scripts I wrote for the delight of my then EBS customers. This guy educated me back then: such practices were no longer needed, as Oracle automatically reused index blocks after massive deletions (I was being a bad hombre!)… And I agreed with him. This was back before Y2K, and I still do agree with him today. But (there is always a but), there are some corner cases where we still have to rebuild indexes ourselves. And that is the reason of my story today.

Why would you even want to automate index rebuild?

Since my background is Oracle performance,  and “if all you have is a hammer, everything looks like a nail”, I am a bit bias in that direction, and I would have to answer: because there are some valid performance reasons. There is also a lesser reason about disk space savings, which is not so relevant these days of cheap storage… Nevertheless, I will briefly justify below these two (performance and space) under my specific corner case. Maybe you have a similar case of your own to consider…

Performance

We have this key custom OLTP application installed in over 30+ servers and 700+ PDBs. It executes high-rate simple queries that due to current application constraints, often produce “optimal” executions plans with persistent fast full index scans operations on them, or some very long index range scans. The data is a time-based progressive window which contains anything between the last two hours and up the last few days of rapidly changing values (intense on inserts and deletes). All indexes are appended with a monolithically increasing numeric column seeded from a sequence.

If you are thinking “why doing fast full index scans on OLTP queries in the 1st place?” I would have to ask you to bare with me on that question for a few months (we do need to make some improvements on the application data model, but that is a story for a day in the far future yet to come). For now, we are stuck with large indexes growing on their far right (like many citizens today) of every value of their leading column(s), and we have to do something about them! One more thing worth mentioning: the number of reads exceeds by far the number of writes or deletes.

Take for example, the output of script indexes_2b_shrunk.sql (below). We can see that 4th index down WORKXXXXXXXTANCES_PK, has a current size of 114.9 MB (you may need to scroll to the right using a hidden bar at the bottom of the report), but if it were to rebuilt it, its size would go down to 43 MB, reducing then its space by 62.6%. This script basically lists indexes that consume more than 10 MB of disk space, and with a potential savings of at least 25% if they were to be rebuilt.

Output of similar script indexes_2b_shrunk_fast_full_scan.sql (which I executed but omitted from this post), includes the same index because in addition to the considerations of space and savings, such index is referenced by at least one execution plan as a full scan. Then, if I can reduce its size by 62.6%, at least some SQL performing full scans on it would benefit, performance wise. This second script is identical to the first one with one exception: it only includes indexes for which at least a recent execution plan references them with full scans.

These two scripts referenced above take no action on the database – they simply list indexes that could be rebuilt as per their current size and estimated space savings (and if they are referenced by full scans in the case of the latter). By the way, the estimated size if rebuilt is pretty accurate. Just saying.

REPORT_DATE
-------------------
2017-10-07T12-55-01
please wait...
PDB: SOMEPDB
---
TABLE_NAME               OWNER.INDEX_NAME                          SAVING %  CURRENT SIZE ESTIMATED SIZE  COMMAND
------------------------ --------------------------------------- ---------- ------------- --------------  --------------------------------------------------------------------
XXXXINSTANCES            XXXXXXXXXXXX.XXXXINSTANCES_PK              50.6 %       647.7 MB       320.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxinstances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_PK        95.9 %    18,391.6 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_pk REBUILD ONLINE;
HISTOXXXXXXXXIGNMENT     XXXXXXXXXXXX.HISTOXXXXXXXXIGNMENT_PK       55.1 %       783.8 MB       352.0 MB  ALTER INDEX xxxxxxxxxxxx.histxxxxxxxignment_pk REBUILD ONLINE;
WORKXXXXXXXTANCES        XXXXXXXXXXXX.WORKXXXXXXXTANCES_PK          62.6 %       114.9 MB        43.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_AK        96.9 %    23,896.1 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_ak REBUILD ONLINE;
XXXXGCWORKSPACE          XXXXXXXXXXXX.XXXXGCWORKSPACE_PK            31.3 %       745.1 MB       512.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxgcworkspace_pk REBUILD ONLINE;
WORKXXXXXXXTANCESINDEX   XXXXXXXXXXXX.WORKXXXXXXXTANCESINDEX_PK     51.8 %        70.5 MB        34.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtancesindex_pk REBUILD ONLINE;
Space

Take same output above, and look now at 5th index down XXXXTRANSACTIONKEYS_AK, currently consuming 23,896.1 MB of space. This is on an X5-2 with NVMe SSD, where space is kind of premium. If we can reduce the size of this index by a factor of 96.9% to only 752 MB, even if there are no execution plans referencing this index with a full scan, we would simply recover close to 23 GB of space! This is good news for operations DBAs who get paged when a Tablespace is running short (kind of annoying, specially at night!).

Of course there are two follow-up questions here, which I may just mention and leave for future blog posts: 1) do we really need to append every index with a monolithically increasing numeric column based on a sequence? (most probably a “no”); and b) considering that most leading columns have moderate cardinality and super wide values, would index compression help to reduce index size? (spoiler alert: definitely a “yes”).

Autonomous Online Index Rebuild for Oracle Multitenant

Script iod_indexes_rebuild_online.sql is what I use today to automate indexes rebuild, through a weekly Oracle Enterprise Manager (OEM) job executed in +30 12c Oracle multitenant CDBs (with +700 PDBs). This script rebuilds indexes that are referenced by full scans, are larger than 10 MB, and with an estimated space reduction of at least 25%. It excludes one particular table that for a very sticky reason needs to be avoided (an explicit lock that produces kind of an ugly concurrency contention)… The script also includes the indexes of a known big table, even if its indexes are not referenced by full scans. Both, the one exclusion and the one inclusion, are clearly marked on the script. You may need your own exclusions and inclusions (like I do on my ThanksGiving dinner), in case you decide to write your own script based on this.

This rebuild script is written in such a way that can be executed at the root container level either through OEM or SQL*plus, or at the PDB level, also from OEM or SQL*Plus.

Output of the autonomous online index rebuild script is recorded in 3 places: a spool file directed to the tmp directory, a trace file, and into the alert log. The content of each varies. The shortest is on the alert log, so we can see we ran something in the event we need to audit that timeframe. And the longest is the trace, which records accurate times and messages. Edited output below is from a sample trace file. On it we can see how an index with a size of 21,973.1 MB was reduced to 809.8 MB (96% reduction). Such index took 36 seconds to be rebuilt online. The estimated time is rubbish, but the estimated size is very accurate in my opinion.

+++ DBE IOD +++ iod_indexes_rebuild_online.sql begin
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
...
*** MODULE NAME:(IOD_INDEXES_REBUILD_ONLINE (15) XXXXX_XX) 2017-10-06 20:32:32.618
*** ACTION NAME:(SELECTING PDB INDEX CANDIDATES) 2017-10-06 20:32:32.618
*** CONTAINER ID:(15) 2017-10-06 20:32:32.618
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CON_NAME:XXXXX_XX CON_ID:15 DB_NAME:xxxx1xx2 DB_UNIQUE_NAME:XXXX1XX2_C SERVER_HOST:iod-db-xxxx-99999 TIME:2017-10-06T20:32:32
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
*** 2017-10-06 20:32:45.307
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {COST}) 2017-10-06 20:34:46.527
...
TABLE_NAME                     OWNER.INDEX_NAME                                         SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s
...
*** 2017-10-06 20:37:22.683
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {DONE}) 2017-10-06 20:37:22.683

XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s             809.8 MB      96.3 %       36.0s
*** ACTION NAME:(HISTORICALASSIGNMENT_PK {COST}) 2017-10-06 20:37:22.688
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                          PDB NAME                                                 SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
TOTAL                          XXXXX_XX                                                         53,101.1 MB           2,585.0 MB      95.1 %       28.0s           2,505.2 MB      95.3 %      101.0s
...
+++ DBE IOD +++ DB_INDEXES:63 DB_BEFORE:130,111.0MB(127.1GB) DB_AFTER:12,608.1MB(12.3GB) DB_SAVING:90.3%
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ iod_indexes_rebuild_online.sql end
+++ DBE IOD +++ duration: 8301 seconds
Companion scripts

Besides the two “change-nothing” indexes_2b_shrunk*.sql scripts, and the actual rebuild script, there are two more that are part of the set and can be handy.

The first one is iod_indexes_rebuild_online_mon.sql, and it helps to monitor progress when executed on a separe SQL*Plus sessions of its own. This monitoring script executes a query that if you simply keep entering the “/” command, it keeps telling you the PDB and index that is getting rebuilt (or just ended, or is next). This is kind of “are we there yet?”, but for some reason sometimes I just need to know which PDB and particular index I am working on, just in case I see something like a suspicious spike on my OEM window. I do recommend this at least during the first couple of cycles.

The seconds one is iod_indexes_rebuild_online_kill_job.sql, and before you freak out and close this page, let me state that the “kill” script does nothing but interrupt the execution of the rebuild script on a nice manner, meaning between the rebuild of two indexes, or in other words: after it completes the index getting rebuilt. This is kind of important se we do not end with half-baked indexes which may need to get recreated. I am planning on never have to use the “kill” script, but if I do, I already have a OEM job ready with it: This is like a “poisson pill”. A word of unsolicited advice here: prepare always for the worst, and hope for the best!

Disclaimer

Before you rush to implement these scripts, please review them extensively, justify their use, and fully digest them. Make them yours, customize them as needed, then test, test and test again. Only then consider implementing an “autonomous index rebuild job” on a production environment. Best wishes!


Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

Having a large number of child cursors can affect parsing performance as hinted by Abel Macias on his blog post about Diagnosis of a High Version Count (HVC). On his post, Abel also refers to a note on MOS which includes a script that dives into the reasons WHY our cursors are not getting shared. Then, for deep-dives in this area, I strongly suggest to read his post and use the referenced script provided at MOS.

Besides longer parse times, and potential library cache contention, manifested by some waits (such as on mutex), there is another side effect that may bite us: CBO may produce a different plan when a SQL statement is hard-parsed while creating a new child cursor. This latter side effect can be critical for transactional applications with SLA depending on very short latencies of some queries.

This post is about a poor’s man script, that with no installation whatsoever, it lists an aggregated summary of the reasons why our cursors are not shared, including child cursor counts and distinct SQL_IDs counts for each reason (see sample output below). I had to write such script since in our environments we cannot simply run diagnostics scripts that create objects in the database, such as the one provided by MOS.

   CURSORS    SQL_IDS REASON_NOT_SHARED
---------- ---------- -----------------------------
    226916       7826 ROLL_INVALID_MISMATCH
     29387        105 BIND_EQUIV_FAILURE
     21794       4027 HASH_MATCH_FAILED
     11588       2134 OPTIMIZER_MISMATCH
     11027        413 BIND_LENGTH_UPGRADEABLE
     11008        384 BIND_MISMATCH
     10125       2697 USE_FEEDBACK_STATS
      4540        109 OPTIMIZER_MODE_MISMATCH
      1652         72 PURGED_CURSOR
      1245         81 BIND_UACS_DIFF
      1062        316 LANGUAGE_MISMATCH
       771        103 LOAD_OPTIMIZER_STATS
       500         52 STATS_ROW_MISMATCH
       238         86 MV_QUERY_GEN_MISMATCH
        94          3 MULTI_PX_MISMATCH
        28          4 AUTH_CHECK_MISMATCH
        23          1 INSUFF_PRIVS

Once I get to see some reasons for not sharing, some responsible for a large number of child cursors (and distinct SQL_IDs), then I can search on MOS as Abel suggested. Ideally, if you are interested in plan stability, you may want to reduce the times the CBO is tasked to create a new child cursor (and potentially a new Execution Plan).

In output sample above, top in our list is ROLL_INVALID_MISMATCH, causing 226,916 child cursors in as many as 7,826 SQL statements. This particular reason for not sharing cursors is due to a persistent gathering of schema object statistics with the explicit request to invalidate cursors. Since we want to promote plan stability, we would need to suspend such aggressive gathering of CBO statistics and validate reason ROLL_INVALID_MISMATCH is reduced.

Anyways, free script used is below. Enjoy it!

-- sql_shared_cursor.sql
SET HEA OFF LIN 300 NEWP NONE PAGES 0 FEED OFF ECHO OFF VER OFF TRIMS ON TRIM ON TI OFF TIMI OFF SQLBL ON BLO . RECSEP OFF;
SPO union_all.sql
SELECT CASE WHEN ROWNUM > 1 THEN 'UNION ALL ' ELSE '          ' END||
       'SELECT '''||column_name||
       ''' reason_not_shared, sql_id, child_number FROM mv WHERE '||
       LOWER(column_name)||' = ''Y''' big_union_all
  FROM dba_tab_columns
 WHERE table_name = 'V_$SQL_SHARED_CURSOR'
   AND owner = 'SYS'
   AND data_length = 1
/
SPO OFF;
GET union_all.sql
A ) -
SELECT COUNT(*) cursors, -
COUNT(DISTINCT sql_id) sql_ids, -
reason_not_shared -
FROM union_query -
GROUP BY reason_not_shared -
ORDER BY cursors DESC, sql_ids DESC;
0 union_query AS (
0 mv AS (SELECT /*+ MATERIALIZE NO_MERGE */ * FROM v$sql_shared_cursor),
0 WITH  
SET HEA ON NEWP 1 PAGES 30
/
!rm union_all.sql

Read Full Article
Visit website
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

Having a large number of child cursors can affect parsing performance as hinted by Abel Macias on his blog post about Diagnosis of a High Version Count (HVC). On his post, Abel also refers to a note on MOS which includes a script that dives into the reasons WHY our cursors are not getting shared. Then, for deep-dives in this area, I strongly suggest to read his post and use the referenced script provided at MOS.

Besides longer parse times, and potential library cache contention, manifested by some waits (such as on mutex), there is another side effect that may bite us: CBO may produce a different plan when a SQL statement is hard-parsed while creating a new child cursor. This latter side effect can be critical for transactional applications with SLA depending on very short latencies of some queries.

This post is about a poor’s man script, that with no installation whatsoever, it lists an aggregated summary of the reasons why our cursors are not shared, including child cursor counts and distinct SQL_IDs counts for each reason (see sample output below). I had to write such script since in our environments we cannot simply run diagnostics scripts that create objects in the database, such as the one provided by MOS.

   CURSORS    SQL_IDS REASON_NOT_SHARED
---------- ---------- -----------------------------
    226916       7826 ROLL_INVALID_MISMATCH
     29387        105 BIND_EQUIV_FAILURE
     21794       4027 HASH_MATCH_FAILED
     11588       2134 OPTIMIZER_MISMATCH
     11027        413 BIND_LENGTH_UPGRADEABLE
     11008        384 BIND_MISMATCH
     10125       2697 USE_FEEDBACK_STATS
      4540        109 OPTIMIZER_MODE_MISMATCH
      1652         72 PURGED_CURSOR
      1245         81 BIND_UACS_DIFF
      1062        316 LANGUAGE_MISMATCH
       771        103 LOAD_OPTIMIZER_STATS
       500         52 STATS_ROW_MISMATCH
       238         86 MV_QUERY_GEN_MISMATCH
        94          3 MULTI_PX_MISMATCH
        28          4 AUTH_CHECK_MISMATCH
        23          1 INSUFF_PRIVS

Once I get to see some reasons for not sharing, some responsible for a large number of child cursors (and distinct SQL_IDs), then I can search on MOS as Abel suggested. Ideally, if you are interested in plan stability, you may want to reduce the times the CBO is tasked to create a new child cursor (and potentially a new Execution Plan).

In output sample above, top in our list is ROLL_INVALID_MISMATCH, causing 226,916 child cursors in as many as 7,826 SQL statements. This particular reason for not sharing cursors is due to a persistent gathering of schema object statistics with the explicit request to invalidate cursors. Since we want to promote plan stability, we would need to suspend such aggressive gathering of CBO statistics and validate reason ROLL_INVALID_MISMATCH is reduced.

Anyways, free script used is below. Enjoy it!

-- sql_shared_cursor.sql
SET HEA OFF LIN 300 NEWP NONE PAGES 0 FEED OFF ECHO OFF VER OFF TRIMS ON TRIM ON TI OFF TIMI OFF SQLBL ON BLO . RECSEP OFF;
SPO union_all.sql
SELECT CASE WHEN ROWNUM > 1 THEN 'UNION ALL ' ELSE '          ' END||
       'SELECT '''||column_name||
       ''' reason_not_shared, sql_id, child_number FROM mv WHERE '||
       LOWER(column_name)||' = ''Y''' big_union_all
  FROM dba_tab_columns
 WHERE table_name = 'V_$SQL_SHARED_CURSOR'
   AND owner = 'SYS'
   AND data_length = 1
/
SPO OFF;
GET union_all.sql
A ) -
SELECT COUNT(*) cursors, -
COUNT(DISTINCT sql_id) sql_ids, -
reason_not_shared -
FROM union_query -
GROUP BY reason_not_shared -
ORDER BY cursors DESC, sql_ids DESC;
0 union_query AS (
0 mv AS (SELECT /*+ MATERIALIZE NO_MERGE */ * FROM v$sql_shared_cursor),
0 WITH  
SET HEA ON NEWP 1 PAGES 30
/
!rm union_all.sql

Read Full Article
Visit website

Read for later

Articles marked as Favorite are saved for later viewing.
close
  • Show original
  • .
  • Share
  • .
  • Favorite
  • .
  • Email
  • .
  • Add Tags 

Separate tags by commas
To access this feature, please upgrade your account.
Start your free month
Free Preview