Quite recently I had to trace/profile an extremely intensive batch job to provide information to Oracle for further analysis.

At the very first moment it was obvious that this was the wrong approach and that whatever we get cannot serve as evidence of the performance issue we have, as TKPROF is designed to trace a single SQL statement or a short-running job. Nevertheless, it seems this is the standard procedure when dealing with Oracle Support.

The point of this article is not to elaborate on whether a unified approach is good for all kinds of performance issues, but rather to show what you can expect as a result and to provide technical evidence.

When you enable tracing, either by using an AFTER LOGON trigger (the old-fashioned way) or by using services, you first need to know what you are tracing and how long you are planning to trace it.
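
For illustration, here is a minimal sketch of service-based tracing with the dbms_monitor package (the service and module names below are hypothetical placeholders, not taken from the system described here):

exec dbms_monitor.serv_mod_act_trace_enable(service_name => 'BATCH_SRV', module_name => 'NIGHTLY_BATCH', waits => TRUE, binds => FALSE);

-- let the workload run only for a short, representative interval, then:

exec dbms_monitor.serv_mod_act_trace_disable(service_name => 'BATCH_SRV', module_name => 'NIGHTLY_BATCH');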

In this case, we are trying to trace a job that runs for several hours and runs in parallel.

This is an extreme example of what you should never trace.

What are the consequences of using wrong method to trace such job?

The most obvious one is trace file growth of 2 MB/sec.

The second one is heavy load on the database server, where the top processes are ksoftirqd.

You can easily find on the Web that ksoftirqd is an internal kernel process that queues and processes interrupts coming from the attached devices.

When you see ksoftirqd among the top processes, it is a sign that your system is under heavy interrupt load.

The following picture supports that (red represents system calls that are due to the ksoftirqd activity).

The next picture shows the same job after tracing has been disabled.

The difference is more than visible (green represents user processes coming from the database).

The last issue with misusing the trace mechanism for heavy, long-running jobs that run in parallel is the result itself.

In the AWR report you can expect to see, for example, an unusual order of the Top SQL queries.

When you are tracing, at the top you will find queries that come in through several levels of function calls (one procedure/package/function calling another, which calls a third one, and so on), not the queries that you would normally see at the top when tracing is disabled.

Hopefully this article will help you avoid similar mistakes and convince you to use the proper tool for the job.


In this article I’m going to disclose the Oracle Resource Manager internal plan and how it can influence database performance.

Many Oracle DBAs are convinced that Resource Manager is not active by default, and that RM has to be explicitly enabled to be active.

To confirm that theory, a DBA can execute the following statement:

select name, display_value, description from v$parameter where name like 'resource_manager_plan';

resource_manager_plan	Null	resource mgr top plan

But that is only partly correct, as one of my clients observed the following wait event in an AWR report:

resmgr:cpu quantum

By searching the net, it’s clear that this event is related to a session waiting to be allocated a quantum of CPU time.

This comes as a big surprise, as that event is connected with Resource Manager, which the DBA claims is disabled.

Although a user-defined RM plan is not enabled by default, RM has two additional plans that are enabled by default:

– DEFAULT_MAINTENANCE_PLAN
– INTERNAL_PLAN

While the former one is well known by now (it is responsible for statistics maintenance, the segment advisor etc., and is activated by the defined Scheduler maintenance windows), INTERNAL_PLAN is still a mystery.

First let’s check whether that plan is enabled or not, by executing the following SQL:

select name, is_top_plan, cpu_managed, instance_caging, parallel_servers_total from v$rsrc_plan;
INTERNAL_PLAN	TRUE	OFF	OFF	10

Even though instance caging (which might be covered in the future) is not enabled, we have a limit for PARALLEL_SERVERS_TOTAL, which is currently 10.

Now, if I reduce the PARALLEL_SERVERS_TOTAL value from 10 to 2:

alter system set parallel_servers_target = 2 scope = memory;

and execute the same command again:

select name, is_top_plan, cpu_managed, instance_caging, parallel_servers_total from v$rsrc_plan;
INTERNAL_PLAN	TRUE	OFF	OFF	2

PARALLEL_SERVERS_TOTAL is now 2.

As the custom RM plan is not enabled, and instance caging is also off, when can that event be seen?

As you can see, I can influence the PARALLEL_SERVERS_TOTAL column by changing the PARALLEL_SERVERS_TARGET parameter, which is the number of parallel server processes allowed to run before statement queuing kicks in, and which is supposed to be set to 0.75x PARALLEL_MAX_SERVERS (the maximum number of parallel server processes allowed on the system). The wait event will therefore be visible only when running SQL statements in parallel.
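
If you want to check whether this is happening right now, a simple query against v$session (a sketch) shows which sessions are currently waiting on the event:

select sid, username, sql_id, event, seconds_in_wait
from v$session
where event = 'resmgr:cpu quantum';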

If you see the following wait event among the top events in an AWR report:

resmgr:cpu quantum

I hope you’ll now know what it is, why you can see it and how to resolve it.


This time I’ll explain a challenge that I had with the optimization of RPM worksheet tables.

Even if you have all elements of the enterprise architecture tuned, there is still something you can do to speed up query or DML operations.

Here I’ll show what impact you can expect from the physical design.

All examples will use the RPM_WORKSHEET_ZONE_DATA table from the Oracle Retail Price Management module as a perfect example.

From that table I’ll create a custom XX_RPM_WRKSHT_ZONE_DATA table that will have the same number of records as RPM_WORKSHEET_ZONE_DATA, but a different number of columns.

While RPM_WORKSHEET_ZONE_DATA has 238 columns, our custom table XX_RPM_WRKSHT_ZONE_DATA will have only 8 columns.

create table XX_RPM_WRKSHT_ZONE_DATA
(	WORKSHEET_ITEM_DATA_ID,
	STATE,  
	ACTION_FLAG, 
	WORKSHEET_STATUS_ID,
	ZONE_ID, 
	DEPT, 
	CLASS, 
	SUBCLASS
   ) 
as
select 
WORKSHEET_ITEM_DATA_ID, 
STATE, 
ACTION_FLAG, 
WORKSHEET_STATUS_ID, 
ZONE_ID, 
DEPT, 
CLASS, 
SUBCLASS
from RPM_WORKSHEET_ZONE_DATA;   
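
As a quick sanity check (a sketch using the dictionary views), you can confirm that both tables contain the same rows and differ only in width:

select count(*) from RPM_WORKSHEET_ZONE_DATA;
select count(*) from XX_RPM_WRKSHT_ZONE_DATA;

select table_name, count(*) as column_count
from user_tab_columns
where table_name in ('RPM_WORKSHEET_ZONE_DATA', 'XX_RPM_WRKSHT_ZONE_DATA')
group by table_name;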

1.
In the first test, I’ll show the difference when updating the wide (238 columns) vanilla RPM_WORKSHEET_ZONE_DATA table compared with the same update against the custom, narrow (8 columns only) XX_RPM_WRKSHT_ZONE_DATA table that has the same number of records as the vanilla RPM table.

update RPM_WORKSHEET_ZONE_DATA set action_flag = 4 , state =1 where dept =999;

610792 rows updated.
Elapsed: 00:01:16.73

Executing the same DML on custom table:

update xx_rpm_wrksht_zone_data set action_flag = 4 , state =1 where dept =999;

610792 rows updated.
Elapsed: 00:00:08.59

From the results it’s visible that we updated exactly the same number of records (610,792 records in both cases).

But if you look at the time needed for the UPDATE to complete, in the first case Oracle needs 1 minute and 16 seconds, while in the case of our custom, narrow (8 columns wide) table, the same task completes in a little less than 9 seconds.

That is almost a 9x difference. If I repeat the same test, the difference jumps to almost 11x.

2.
The next test will compare the same two tables from the previous example, but this time I’ll execute an ordinary SELECT statement.

The first test is against the vanilla table.

select /*+ full(a) */ * from RPM_WORKSHEET_ZONE_DATA a where dept = 999;

610792 rows selected.

Elapsed: 00:01:56.00

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     539350  consistent gets
     539333  physical reads
          0  redo size
  142799428  bytes sent via SQL*Net to client
     448429  bytes received via SQL*Net from client
      40721  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     610792  rows processed

I’m repeating the same test with the custom table that has only 8 columns.

select /*+ full(a) */ * from xx_rpm_wrksht_zone_data a where dept = 999;

610792 rows selected.

Elapsed: 00:00:08.12

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      67490  consistent gets
          0  physical reads
          0  redo size
   15851696  bytes sent via SQL*Net to client
     448429  bytes received via SQL*Net from client
      40721  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     610792  rows processed

Again, we are selecting the same number of records in both cases (as in the previous UPDATE example).

The time elapsed in the first case is almost two minutes (1 minute 56 seconds).

In the second test against my custom table, the elapsed time is only 8 seconds (8.12 seconds to be precise).

This is 14.3x faster than in the vanilla case.

Although it may look artificial, this case comes from a real-life performance tuning task that was assigned to me.

Now that I have provided evidence of the impact you can expect in similar cases and of how important the physical design is, let’s take a look at the cause of such a difference.

Part of the answer is exposed here by the Oracle statistics presented above, where we can see the two most important points.

The number of consistent gets is almost 8x smaller in the second case, which means that Oracle reads significantly fewer blocks.

The number of bytes transferred to the client is also 9x smaller in the second case.
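
The underlying reason is row size: with only 8 columns the average row is much shorter, so many more rows fit into each block and far fewer blocks have to be read and sent. A sketch of how to verify this (after gathering statistics on both tables):

select table_name, num_rows, blocks, avg_row_len
from user_tables
where table_name in ('RPM_WORKSHEET_ZONE_DATA', 'XX_RPM_WRKSHT_ZONE_DATA');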

This article can trigger discussion in many directions, from RPM (Retail Price Management) design questions to how to find the optimal column order.

That can be covered sometime in the future.


In all books that I have read about performance tuning in Oracle (and other major vendors’) relational databases, you can find the advice that when optimizing SQL, you remove the SELECT part and concentrate on the part starting with the FROM clause, to inspect which tables are involved, along with the WHERE clause.

Recently I had an interesting case where I needed to join 4 large tables (several hundred million rows each), with poor (non-selective) join conditions among them.

That query took a week to finish, and it took me a while to figure out what was going on.

Here I’ll create a very simple version of the issue I had, to concentrate only on the important part of it, and to show that the statement from the optimization books, although valid in the majority of cases, is far from the truth in this particular case.

I’ll use the EMP table from the SCOTT sample schema.

grant execute on dbms_lock to scott;

The grant above has to be executed as a privileged user; all commands from here to the end of the blog can be executed as the SCOTT user.

create or replace function flock 
return number 
as
begin 
dbms_lock.sleep(5);
return trunc(dbms_random.value(1,10));
end;
/

On each call, the function flock will wait for 5 seconds and then return a random integer between 1 and 9.

Then I call the function from SQL:

select /*+ test1 */ flock, e.* from emp e where rownum < 3;
Elapsed: 10.112 sec

To return 2 rows, the query runs for 10.1 seconds.

Let’s see the execution plan (with gather hint):

SQL_ID  4fbxxh18mw6fv, child number 0
-------------------------------------
select /*+ test1 */ flock, e.* from emp e where rownum < 3
 
Plan hash value: 1973284518
 
-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      2 |00:00:00.01 |       3 |
|*  1 |  COUNT STOPKEY     |      |      1 |        |      2 |00:00:00.01 |       3 |
|   2 |   TABLE ACCESS FULL| EMP  |      1 |      2 |      2 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------

From the execution plan, as can be seen, even with all timings enabled, I can’t see where the time is spent.

The only option here is to trace the statement execution.
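
A minimal sketch of enabling tracing for the current session (the tracefile identifier is just a hypothetical label), followed by formatting the trace with TKPROF:

alter session set tracefile_identifier = 'flock_test';
exec dbms_session.session_trace_enable(waits => TRUE, binds => FALSE);
select /*+ test1 */ flock, e.* from emp e where rownum < 3;
exec dbms_session.session_trace_disable;
-- then format the resulting trace file, e.g.: tkprof <trace_file> flock_test.txt sys=no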

After tracing and formatting the trace file with TKPROF, I can see that the elapsed time for the statement is 10.01 seconds, but I still can’t see where the time is spent.

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00      10.01          0          3          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00      10.01          0          3          0           2

Fortunately, the following section of the TKPROF output provides the clue about what is really going on.

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  PL/SQL lock timer                               2        5.01         10.01
  SQL*Net message from client                     1        0.00          0.00



********************************************************************************

In the following picture (based on trace profiling), you can see the PL/SQL lock timer event (2 waits, one per returned row, where each wait takes 5 seconds).

Now it’s clear that almost all of the time was spent waiting in the PL/SQL dbms_lock call.

To see the exact function name, you can use the PL/SQL hierarchical profiler (DBMS_HPROF) and put the SQL inside a procedure.

Summary:

It is always necessary to take a deep look at the query you are trying to optimize, to get familiar with it.

Although it will postpone the real tuning work, it can save hours on more complex queries.

And it is important not to forget that calling a PL/SQL function from SQL is a row-by-row operation, not the set-based processing we expect from SQL.
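
As an illustration (not necessarily one of the ways that will be covered in the future post), one common way to reduce the number of function calls is scalar subquery caching; note that it is only appropriate when the function result depends solely on its inputs, which is not the case for the random flock above, where it would also change the returned values:

select /*+ test2 */ (select flock from dual) as flock, e.* from emp e where rownum < 3;

Wrapping the call in a scalar subquery lets Oracle cache the result of repeated identical invocations, so the function may be executed far fewer times than once per row.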

There are other ways to optimize that part as well, but they will be covered in one of the future blogs.


Although I already published one article about NPAPI support and Firefox, it is no longer valid, as all major web browsers have stopped supporting plugins like Java, Flash and Silverlight in recent versions.

Mozilla Firefox, in the next version 57, will have no such add-ons at all, while Microsoft Edge has had no plugin capability from its first release.

To summarize, with the new releases of the most popular web browsers, it becomes even more difficult to run applications that still require popular plugins like Adobe Flash and Java.

Oracle users are definitely affected, as there are many Oracle Forms applications around.

Although Oracle recommends using Oracle ADF for new apps, it will take years (or even a decade) until all Oracle Forms apps are replaced or upgraded to a newer technology.

Oracle Retail is one such example, where the central module, RMS (Retail Merchandising System), along with a couple of smaller modules (ReSA – Sales Audit and RTM), is, even in the latest releases, still built using Oracle Forms technology.

Oracle Retail 16 is the first release that fully replaces Oracle Forms.

As Oracle Retail is one of the most complex pieces of software around, it will take years until all clients upgrade it to use ADF instead of Oracle Forms.

Adobe Flash is in extensive use by Oracle, as Oracle EM (Enterprise Manager) and many monitoring packages are based on it.

To conclude, many users will still need the Java and Flash plugins for many years.

Some users will even need them to run online banking software, as the Java plugin is required for pairing a USB stick.

If you are running the Firefox web browser, instead of using the latest version (which at the time of writing this article is 56.0), you can install and use the latest Firefox ESR 52.8.0, where you can still use the Java plug-in, as that is the last version line where the popular plugins still run.

Unfortunately, official support for 52.8.0 ESR will expire by June 2018.

For that reason, you cannot rely on Firefox ESR as a long term solution.

The other option is to use Internet Explorer 11 if you need to have Java/Flash or Silverlight at your disposal.

There are two main disadvantages of using Internet Explorer:

1.

Microsoft has already released a new web browser (Edge, which is the default option from Windows 10 on), where plugin support is disabled (meaning there is no more Java/Flash/Silverlight inside the browser), and Edge will be the only browser Microsoft will continue to develop/support.

Internet Explorer, on the other hand, will continue to receive only critical security patches, but no new functionality will be developed for that browser.

2.

Internet Explorer is bound to the Windows OS; Linux and macOS users cannot use it.

While Windows users still have some time ahead of them to use the popular plugins, the situation for users of other operating systems is not so bright.

Thanks to the Pale Moon web browser, the situation is better than most people think.

What is Pale Moon web browser?

On Wikipedia you can find their motto: “Your browser, Your way.”

Pale Moon is a fork of the well-known Mozilla Firefox web browser, but with a few crucial differences.

One of them is the add-on capability, which Firefox will remove entirely in the next release (57).

You may also notice the old-fashioned, classic, adjustable user interface.

Although it has a very low market share, this browser has several advantages in case, for some reason, you need to use the Java or Flash add-ons.

The browser supports Linux, Windows and, unofficially, the macOS platform.

Why do I recommend the Pale Moon browser?

1.

It has its own profile directory.

If you click on Help —> Troubleshooting Information and then click on the Profile Directory button, your file manager will open at the following directory:

/home/USERNAME/.moonchild productions/pale moon/generic_alphanumeric.default/

This is good news, as you can run Firefox and Pale Moon side by side with no influence between them (separate profile directories under the user HOME).

2.

If you are running 64-bit Linux, your Firefox plugin libraries are probably in (for the Red Hat branch of Linux distributions):

/usr/lib64/mozilla/plugins

I assume you already have the Oracle JDK installed. If not, install the latest Oracle JDK.

As Pale Moon uses the 32-bit library directory (which is actually good news, as your plugins will be kept separate from the Mozilla libraries), you need to create the following directory (as the root user):

mkdir -p /usr/lib/mozilla/plugins

and to create the following soft links inside it (adjust the path and Java version to match the version that you have installed):

ln -s /usr/java/jdk1.8.0_144/jre/lib/amd64/libnpjp2.so libjavaplugin.so

ln -s /usr/java/jdk1.8.0_144/jre/lib/amd64/libnpjp2.so libjavaplugin.so.x86_64

ln -s /usr/lib64/flash-plugin/libflashplayer.so libflashplayer.so

When you list the directory, you should see the following files:

libflashplayer.so -> /usr/lib64/flash-plugin/libflashplayer.so

libjavaplugin.so -> /usr/java/jdk1.8.0_144/jre/lib/amd64/libnpjp2.so

libjavaplugin.so.x86_64 -> /usr/java/jdk1.8.0_144/jre/lib/amd64/libnpjp2.so

Now the Flash and Java plug-ins should be visible in your Pale Moon web browser.

To conclude, with Pale Moon you get both: an updated web browser that includes new features, and at the same time support for the technologies that you still need.

I expect that users will keep several browsers installed: one of the major ones (Firefox/Chrome/Internet Explorer/Edge) for general surfing, while Pale Moon (or Internet Explorer) will be used to run apps that require Java/Flash functionality.


The title of this article is from the Oracle Metalink note (Doc ID 444822.1) that describes how to collect performance-related data when opening a new SR for an Oracle Retail performance issue.

Inside that note you can find the following:

For additional help in enabling tracing for Oracle Retail Java-based applications, please see Document 568004.1 – How to Trace and Debug Retail ‘Java’ Applications.

The next logical step is to open document 568004.1, which describes how to debug Oracle Retail Java applications.

From the name of the document it is clear that the title is wrong. Here the term Oracle Retail Java applications references all Retail applications that use a connection pool to access database resources.

Up to version 16 of Oracle Retail, the only application that is not Java based is the central one: RMS (Retail Merchandising System). Besides it there are RTM (Retail Trade Management) and ReSA (Sales Audit), but RMS is the most important one, and it is an Oracle Forms based application.

With Oracle Forms, it is very simple to track RMS users, as in v$session you can see the end user name exposed.

A Java connection pool is much more efficient than Oracle Forms, but it makes it difficult to track end users and the application itself.

Document 568004.1 should provide a solution for how to trace RPM (Retail Price Management) or ReIM, for example.

This is what you’ll find as the suggested solution from Oracle Support for tracing “Java Apps”:

CREATE OR REPLACE TRIGGER SQLTRACE_TRIGGER AFTER LOGON ON DATABASE
DECLARE
...

   CURSOR c_session_info IS
      SELECT 'x'
        FROM sys.v_$session
       WHERE sid = (SELECT sid FROM sys.v_$mystat WHERE ROWNUM=1)
           AND program LIKE 'JDBC%'
             AND client_info = L_client_info_set;
...
      EXECUTE IMMEDIATE 'alter session set events ''10046 trace name context forever, level 8''';
	-- execute immediate 'alter TRIGGER SQLTRACE_TRIGGER disable';

...
   EXCEPTION

     WHEN OTHERS THEN
RAISE;
...

It’s really hard to believe that someone has written that script.

You should first ask yourself: should I follow it, since it is an official Oracle document, in which case I’ll be compliant with Oracle?

I believe it is always best to think in a rational and logical manner with your own head, and NOT to follow that document, as there is a much better solution that doesn’t require the creation of an AFTER LOGON trigger, which is one of the most dangerous constructs in the entire Oracle database.

The solution is based on the dbms_monitor package, which doesn’t require creating a trigger at all, and it has been at your disposal since the 10g version of the Oracle database (meaning 10+ years).
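
As a minimal sketch of how this could look for a connection-pool application (the client identifier value below is a hypothetical placeholder that the application, or a connection callback, would set):

-- in the application session (or a logon/connection callback):
exec dbms_session.set_identifier('REIM_BATCH_USER');

-- from a DBA session, enable tracing only for that client identifier:
exec dbms_monitor.client_id_trace_enable(client_id => 'REIM_BATCH_USER', waits => TRUE, binds => FALSE);

-- ... reproduce the issue, then switch tracing off:
exec dbms_monitor.client_id_trace_disable(client_id => 'REIM_BATCH_USER');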

Now that you (hopefully) understand why it is dangerous to trace Java apps in general, not only Oracle Retail apps, by using an AFTER LOGON trigger, look at the next section of the code:

AND program LIKE 'JDBC%'

It is clear that you can’t trace one particular user with such code.

The problem is that you can’t even isolate a single application.

For example, if you have both RPM and ReIM installed, sessions from both of those apps will be returned by such a WHERE clause filter condition.

Let’s move on to the EXECUTE IMMEDIATE part.

EXECUTE IMMEDIATE 'alter session set events ''10046 trace name context forever, level 8''';

Again, why use that form when you have a much more powerful way to do the same thing by using the dbms_monitor package?

The last point is exception handler.

...
   EXCEPTION

     WHEN OTHERS THEN
RAISE;
...

In the 11g version of the database, if you set the compiler warnings

ALTER SESSION SET PLSQL_WARNINGS='ENABLE:ALL'

you’ll get the following compiler warning:

PL/SQL Compiler warning "PLW-06009: procedure “string” OTHERS handler does not end in RAISE or RAISE_APPLICATION_ERROR"

Fortunately, there is a RAISE command here, but that part of the code is still poorly written.

There are really excellent documents on Oracle Support, and as a long-term user I can testify that there have been a lot of improvements in the MOS portal, but this particular document that I’ve described is not one of that kind.

I hope someone from Oracle will read this and correct the document.


The goal of this exercise is to clone an Oracle database that is using Grid / Clusterware components to a separate server.

Although you can find some articles on the Internet, the problem is that they only cover how to change the host name and/or IP address in the case of Oracle Restart.

This article will show you much more: how to clone a complete Oracle database that is using the Oracle Grid infrastructure to a new server.

First I want to explain why you would do that.

The first reason could be that you want to have an environment identical to production.
Identical means in terms of production data and production configuration, not the CPU (the number of CPU cores can be lower). Here I suggest leaving memory as is.

Having the same environment (apart from the number of CPU cores) allows you to test a database upgrade scenario.

You can also perform performance tests, as you have the complete set of data.

Or you may need to set up a new environment for a new customer, where you will clean some data while the majority of the other settings remain the same.

Whatever your reason, you have several options, which can be reduced to only two:
– install the Oracle software (Grid & Db) along with the new server settings, create the ASM disks and restore the database from a backup on the new server.

Other variants of this option affect the production system in terms of performance overhead, which is not acceptable for mission critical systems (for quiet and small databases you can use such variants).
For that reason I’ve eliminated that option.

– perform a VM clone (LPAR in the case of IBM AIX) and clone all VGs (volume groups) at some point in time.
The main advantage of this option is that you get a ready OS with all of the software the same as in production, and the disks mounted.

In the case of a regular database that is using a regular file system (no ASM/Grid/Clusterware), that is all you have to do.

Unfortunately, that is not the case for Oracle Grid/Clusterware based databases such as RAC, RAC One Node and even Oracle Restart, where you are still about 100+ commands away from the finish.

This is where my journey begins.

In this blog I’ll describe only the main points you need to go through when using storage-based cloning of a production database that is running all the time with no interruption.

This example is based on the IBM AIX OS and an Oracle Restart database.
Although Oracle Restart is the simplest case of an Oracle database that uses the Grid and Clusterware infrastructure, it is not much simpler than regular RAC or RAC One Node, which I also hope to cover in the future (the main difference is more Grid/Clusterware services along with virtual IPs and a private interconnect network).

When I logged into the cloned host, I first checked which services were up and running by executing the following command:

crsctl stat res -t

As the command returned an error message, I realized it would not be as easy as I thought: Grid and Clusterware are deeply integrated with the OS, and now I have a new IP address and a new hostname, which are the basis for many services provided by Grid & Clusterware.
As everything was dead, you need to follow the steps provided below:

1. 
The first step is to de-configure Oracle Restart. To do that, execute the following as the root user (from the crs/install directory of the Grid Infrastructure home):

./roothas.pl -deconfig -force


2. 
Configure the CSS & OHAS services, again as the root user:

./roothas.pl


3.
Perform the next step as the grid OS user (the Grid Infrastructure OS owner):

crsctl modify resource "ora.cssd" -init -attr "AUTO_START=1"


4.
Stop HAS:

crsctl stop has



5.
Restart the OHAS stack as the grid or oracle OS user (the Grid Infrastructure OS owner):

crsctl start has


6.
Check the CSS & OHAS state:

crsctl check has

crsctl check css

crsctl stat resource

crsctl stat res -t


cssd should be ONLINE (and evmd as well).

7.
Re-create the default listener (LISTENER) using port 1521 (or your desired port), through the NETCA GUI located in the new Grid Infrastructure Oracle Home (or manually if you do not have graphical access), as the grid or oracle OS user (the Grid Infrastructure OS owner):

srvctl add listener

srvctl start listener

CRS-2674: Start of 'ora.LISTENER.lsnr' on 'hostname' failed

As you can see, I got an error and the listener was not started.

By executing the following command, I can reveal what's behind the error.

lsnrctl start listener

TNSLSNR for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production
System parameter file is /network/admin/listener.ora
Log messages written to /xxx/xxx/oracle/diag/tnslsnr/hostname/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=old_hostname)(PORT=1521)))
TNS-12545: Connect failed because target host or object does not exist
 TNS-12560: TNS:protocol adapter error
  TNS-00515: Connect failed because target host or object does not exist
   IBM/AIX RISC System/6000 Error: 68: Can't assign requested address

Listener failed to start. See the error message(s) above...


To fix this, follow these steps:

cd to the directory containing listener.ora (network/admin under the Grid Infrastructure home), and then execute:

vi listener.ora 

(ADDRESS = (PROTOCOL = TCP)(HOST = OLD_SERVER_NAME)(PORT = 1521)) ----> (ADDRESS = (PROTOCOL = TCP)(HOST = NEW_SERVER_NAME)(PORT = 1521))


Save the file, and try to start the listener again.


srvctl start listener

This time you should have a running listener. The following command will confirm that:

srvctl status listener

Listener LISTENER is enabled
Listener LISTENER is running on node(s): new_hostname




8. 
Now it's time to create the init+ASM.ora file in the dbs directory of the Grid Infrastructure home, with the following parameters:

asm_diskgroups= 

asm_diskstring= '/dev/oracleasm/disks/*'  

instance_type='asm'

large_pool_size=12M




9.
Add the ASM instance as grid or oracle OS user (as the Grid Infrastructure OS owner) by executing the following:

srvctl add asm



10.
Enable ASM instance auto start, as the grid or oracle OS user (the Grid Infrastructure OS owner), as follows:

<GRID_HOME>/bin/crsctl modify resource "ora.asm" -attr "AUTO_START=1"  


11.
Make sure the disks are discovered by kfod, as the grid or oracle OS user (the Grid Infrastructure OS owner), as follows:

<GRID_HOME>/bin/kfod asm_diskstring='ORCL:*' disks=all
or 
<GRID_HOME>/bin/kfod asm_diskstring='/*' disks=all


[root@+ASM@hostname:/] lspv
hdisk0          00fa0d66205e7c3b                    rootvg          active      
hdisk32         00fa0d6625405efc                    logsVG       active      
hdisk33         00fa0d662535fd8b                    dataVG           active      
hdisk34         none                                None                        
hdisk35         none                                None                        
hdisk36         none                                None                        
hdisk37         none                                None                        
hdisk38         none                                None                        
hdisk39         none                                None                        
hdisk40         none                                None                        
hdisk41         none                                None                        
hdisk42         none                                None                        
hdisk43         none                                None                        
hdisk44         none                                None                        
hdisk45         none                                None                        
hdisk46         none                                None                        


Change the permissions for the ASM disks:

chown oracle:oracle /dev/rhdisk34
chmod 660 /dev/rhdisk34

chown oracle:oracle /dev/rhdisk35
chmod 660 /dev/rhdisk35
...

After that I got the expected result:

kfod asm_diskstring='ORCL:*' disks=all
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME                                                          
================================================================================
      +ASM                      



Now you should be able to start the ASM instance.


12.
Validate that the candidate disks are being discovered:

SYS@+ASM> select  path  from v$asm_disk;

PATH
----------------------------------------------------------------------------------------------------
/dev/rhdisk34
/dev/rhdisk35
/dev/rhdisk36
/dev/rhdisk37
/dev/rhdisk38


13.
And create the spfile (I assume every decent DBA is able to do this part).
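
A minimal sketch of that step, run against the ASM instance as SYSASM (the pfile path below is a hypothetical example and should be adjusted to your Grid home):

create spfile from pfile='/u01/app/11.2.0/grid/dbs/init+ASM.ora';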



14. 
Add the new ASM spfile and listener to the new ASM instance resource:

$> <GRID_HOME>/bin/srvctl modify asm -p <spfile_location>

$> <GRID_HOME>/bin/srvctl modify asm -l LISTENER



15.
At the end, you need to add the database as a resource under the control of the Grid infrastructure.

srvctl add database -d db_name -o <oracle_home> -p <spfile_location> -a "data,fra,temp"

srvctl add service -s service_name -d db_name -P BASIC

srvctl start database -d db_name

I suggest you also rename the database and change the DBID, but that won’t be described here due to the many steps involved.

Generally, it’s a standard procedure with the addition of deleting/renaming/adding Clusterware resources.


This time I’ll explain AWR statistics migration between databases.

This is not a new feature (it has been available since 10g), but many DBAs/architects don’t know how to use it properly, or even that this feature exists.

Basically, the migration itself is very simple, and Oracle has provided scripts (awrextr.sql & awrload.sql) to make the migration as simple as possible.

The first question should be: when do I need to use this feature?

The short answer is: to transfer/migrate execution statistics from production into a test or dev environment with minimal impact.

The second question would probably be something like: why should I migrate execution statistics into the test/dev environment?

The short answer is: to get a lot of valuable input data from production that can help your DBAs, database designers and developers better understand the behavior of the production system, as well as develop new code that is scalable and fast.

Besides that, it’s very useful for troubleshooting and tuning optimization issues on production, as well as in the dev environment where it is safe to try out your ideas for fixing the issue.

It is also important to mention that this method has very low impact and is straightforward and simple, compared with numerous alternative methods that require transferring the complete production database, or a portion of it, to dev/test environments.

Let’s first go through the migration process itself briefly. You can find a description of this very simple process in many places on the web.

After that I’ll show some interesting points that are not described well on other sites.

--source database (for example production)
cd $ORACLE_HOME/rdbms/admin

SYS@prod> @awrextr.sql


~~~~~~~~~~~~~
AWR EXTRACT
~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  This script will extract the AWR data for a range of snapshots  ~
~  into a dump file.  The script will prompt users for the         ~
~  following information:                                          ~
~     (1) database id                                              ~
~     (2) snapshot range to extract                                ~
~     (3) name of directory object                                 ~
~     (4) name of dump file                                        ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Databases in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

   DB Id     DB Name      Host
------------ ------------ ------------
* 1339390690 prod        xxx-rms-db-prod

  782550019  MOM2D        xxx-rms-db-dev


The default database id is the local one: '1339390690'.  To use this
database id, press <return> to continue, otherwise enter an alternative.


Using 1339390690 for Database ID


Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed.  Pressing <return> without
specifying a number lists all completed snapshots.


Enter value for num_days: 2

Listing the last 2 days of Completed Snapshots

DB Name        Snap Id    Snap Started
------------ --------- ------------------
prod           232850 05 Sep 2017 00:00
                232851 05 Sep 2017 00:02
                232852 05 Sep 2017 00:15
                232853 05 Sep 2017 00:30
                232854 05 Sep 2017 00:45
                232855 05 Sep 2017 01:00
...


Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 232850
Begin Snapshot Id specified: 232850

Enter value for end_snap: 233035


Specify the Directory Name
------------------------------ -------------------------------------------------
DUMP_DIR                       /u01/oradata/dump_dir

Choose a Directory Name from the above list (case-sensitive).

Enter value for directory_name: EXP_DIR


Specify the Name of the Extract Dump File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prefix for the default dump file name is awrdat_232850_233035.
To use this name, press <return> to continue, otherwise enter
an alternative.

Enter value for file_name: 

Using the dump file prefix: awrdat_232850_233035
|
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|  The AWR extract dump file will be located
|  in the following directory/file:
|   /archive/export
|   awrdat_232850_233035.dmp
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|  *** AWR Extract Started ...
|
|  This operation will take a few moments. The
|  progress of the AWR extract operation can be
|  monitored in the following directory/file:
|   /archive/export
|   awrdat_232850_233035.log

--TARGET SYSTEM 

scp dmp file to the target box. 

create or replace directory awr_migration as '/tmp/';

cd $ORACLE_HOME/rdbms/admin

SYS@test> @awrload.sql

~~~~~~~~~~
AWR LOAD
~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  This script will load the AWR data from a dump file. The   ~
~  script will prompt users for the following information:    ~
~     (1) name of directory object                            ~
~     (2) name of dump file                                   ~
~     (3) staging schema name to load AWR data into           ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specify the Directory Name
~~~~~~~~~~~~~~~~~~~~~~~~~~

Directory Name                 Directory Path
------------------------------ -------------------------------------------------
AWR_MIGRATION                  /dbarchive/expdp/


Choose a Directory Name from the list above (case-sensitive).


Enter value for directory_name: AWR_MIGRATION
Using the dump directory: AWR_MIGRATION

Specify the Name of the Dump File to Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please specify the prefix of the dump file (.dmp) to load:

Enter value for file_name: awrdat_232850_233035 

A couple of days ago, I was called in to solve an integration interface issue between SAP and ReIM (ReIM is an application within the Oracle Retail Suite).
The main challenge when solving such issues is to find out where to start your analysis and to distinguish the cause of the issue from its consequences.
To be able to do that, you really need to know a lot not only about the Oracle database (from the dev/DBA/performance tuning perspective, along with database internals), but also about the architecture of the system and how the integration works (Oracle Service Bus in this case).

The problem is best described by a picture of how it looks.

The picture points to the following SQL as the guilty one, as it consumes almost all of the enq: TX wait time.
But that SQL is only the consequence, not the real cause.

The first attempt was to gather stats and rebuild indexes. But that doesn’t help this time (it would be too easy, and I wouldn’t be describing it here).

After a while, I found the line in the PL/SQL package code that executes the DML suffering from the enq: TX wait.

...
      l_im_doc_head_row           IM_DOC_HEAD%ROWTYPE;
          UPDATE IM_DOC_HEAD SET ROW = l_im_doc_head_row
          where doc_id = l_im_doc_head_row.doc_id;
...

What is the problem with that code?
A better question is why the developer wrote such code in the first place.
There could be two reasons:
– he/she was not sure which columns are changed at the integration point between the two systems (SAP & Oracle Retail in this case)
– he/she found this form a convenient shortcut for a statement in which only a couple of columns actually need to be updated.

The parsed call that results from the UPDATE … SET ROW = l_im_doc_head_row statement is shown here:

UPDATE im_doc_head
    SET
        "DOC_ID" =:b1,
        "TYPE" =:b2,
        "STATUS" =:b3,
        "ORDER_NO" =:b4,
        "LOCATION" =:b5,
        "LOC_TYPE" =:b6,
        "TOTAL_DISCOUNT" =:b7,
        "GROUP_ID" =:b8,
        "PARENT_ID" =:b9,
        "DOC_DATE" =:b10,
        "CREATE_DATE" =:b11,
        "CREATE_ID" =:b12,
        "VENDOR_TYPE" =:b13,
        "VENDOR" =:b14,
        "EXT_DOC_ID" =:b15,
        "EDI_UPLOAD_IND" =:b16,
        "EDI_DOWNLOAD_IND" =:b17,
        "TERMS" =:b18,
        "TERMS_DSCNT_PCT" =:b19,
        "DUE_DATE" =:b20,
        "PAYMENT_METHOD" =:b21,
        "MATCH_ID" =:b22,
        "MATCH_DATE" =:b23,
        "APPROVAL_ID" =:b24,
        "APPROVAL_DATE" =:b25,
        "PRE_PAID_IND" =:b26,
        "PRE_PAID_ID" =:b27,
        "POST_DATE" =:b28,
        "CURRENCY_CODE" =:b29,
        "EXCHANGE_RATE" =:b30,
        "TOTAL_COST" =:b31,
        "TOTAL_QTY" =:b32,
        "MANUALLY_PAID_IND" =:b33,
        "CUSTOM_DOC_REF_1" =:b34,
        "CUSTOM_DOC_REF_2" =:b35,
        "CUSTOM_DOC_REF_3" =:b36,
        "CUSTOM_DOC_REF_4" =:b37,
        "LAST_UPDATE_ID" =:b38,
        "LAST_DATETIME" =:b39,
        "FREIGHT_TYPE" =:b40,
        "REF_DOC" =:b41,
        "REF_AUTH_NO" =:b42,
        "COST_PRE_MATCH" =:b43,
        "DETAIL_MATCHED" =:b44,
        "BEST_TERMS" =:b45,
        "BEST_TERMS_SOURCE" =:b46,
        "BEST_TERMS_DATE" =:b47,
        "BEST_TERMS_DATE_SOURCE" =:b48,
        "VARIANCE_WITHIN_TOLERANCE" =:b49,
        "RESOLUTION_ADJUSTED_TOTAL_COST" =:b50,
        "RESOLUTION_ADJUSTED_TOTAL_QTY" =:b51,
        "CONSIGNMENT_IND" =:b52,
        "DEAL_ID" =:b53,
        "RTV_IND" =:b54,
        "DISCOUNT_DATE" =:b55,
        "DEAL_TYPE" =:b56,
        "HOLD_STATUS" =:b57,
        "TOTAL_COST_INC_TAX" =:b58,
        "TAX_DISC_CREATE_DATE" =:b59,
        "DSD_IND" =:b60,
        "ERS_IND" =:b61,
        "SUPPLIER_SITE_ID" =:b62,
        "CONSOLIDATION_START_DATE" =:b63,
        "CONSOLIDATION_END_DATE" =:b64,
        "MANUALLY_CREATED_IND" =:b65,
        "INTERNAL_DOCUMENT" =:b66
WHERE
    doc_id =:b1;

To translate into plain English, this means the developer either lacked the business knowledge to solve the issue, or was too lazy to code an update statement that includes only a couple of columns.

After analyzing the IM_DOC_HEAD table, I found many chained rows, as shown in the following picture.

To solve the chained rows issue, it’s enough to execute ALTER TABLE ... MOVE (keeping in mind that the table’s indexes become unusable afterwards and have to be rebuilt).
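
For reference, a sketch of one way to confirm the chained/migrated rows (it assumes the CHAINED_ROWS table has been created with the standard $ORACLE_HOME/rdbms/admin/utlchain.sql script):

analyze table IM_DOC_HEAD list chained rows into CHAINED_ROWS;

select count(*) from CHAINED_ROWS where table_name = 'IM_DOC_HEAD';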

Unfortunately, this is not the only issue.

The second point is to rewrite the problematic DML as follows, so that only the necessary columns are updated:

          UPDATE IM_DOC_HEAD SET    TERMS                           = l_im_doc_head_row.TERMS,
                                    DOC_DATE                        = l_im_doc_head_row.DOC_DATE,
                                    DUE_DATE                        = l_im_doc_head_row.DUE_DATE,
                                    ORDER_NO                        = l_im_doc_head_row.ORDER_NO, 
                                    TYPE                            = l_im_doc_head_row.TYPE,
                                    CONSOLIDATION_START_DATE        = l_im_doc_head_row.CONSOLIDATION_START_DATE,
                                    CONSOLIDATION_END_DATE          = l_im_doc_head_row.CONSOLIDATION_END_DATE,
                                    CREATE_ID                       = l_im_doc_head_row.CREATE_ID,
                                    VENDOR                          = l_im_doc_head_row.VENDOR,
                                    EXT_DOC_ID                      = l_im_doc_head_row.EXT_DOC_ID,                                    
                                    CREATE_DATE                     = l_im_doc_head_row.CREATE_DATE, 
                                    PRE_PAID_IND                    = l_im_doc_head_row.PRE_PAID_IND,
                                    LOCATION                        = l_im_doc_head_row.LOCATION,
                                    LOC_TYPE                        = l_im_doc_head_row.LOC_TYPE,
                                    STATUS                          = l_im_doc_head_row.STATUS,
                                    RTV_IND                         = l_im_doc_head_row.RTV_IND,
                                    CURRENCY_CODE                   = l_im_doc_head_row.CURRENCY_CODE,
                                    EXCHANGE_RATE                   = l_im_doc_head_row.EXCHANGE_RATE,
                                    TOTAL_COST                      = l_im_doc_head_row.TOTAL_COST,
                                    CUSTOM_DOC_REF_1                = l_im_doc_head_row.CUSTOM_DOC_REF_1, 
                                    CUSTOM_DOC_REF_2                = l_im_doc_head_row.CUSTOM_DOC_REF_2,
                                    CUSTOM_DOC_REF_3                = l_im_doc_head_row.CUSTOM_DOC_REF_3,
                                    CUSTOM_DOC_REF_4                = l_im_doc_head_row.CUSTOM_DOC_REF_4,
                                    LAST_UPDATE_ID                  = l_im_doc_head_row.LAST_UPDATE_ID,
                                    LAST_DATETIME                   = l_im_doc_head_row.LAST_DATETIME,
                                    HOLD_STATUS                     = l_im_doc_head_row.HOLD_STATUS,
                                    TOTAL_COST_INC_TAX              = l_im_doc_head_row.TOTAL_COST_INC_TAX,
                                    RESOLUTION_ADJUSTED_TOTAL_COST  = l_im_doc_head_row.RESOLUTION_ADJUSTED_TOTAL_COST,
                                    RESOLUTION_ADJUSTED_TOTAL_QTY   = l_im_doc_head_row.RESOLUTION_ADJUSTED_TOTAL_QTY 
                             WHERE DOC_ID = l_im_doc_head_row.DOC_ID;

There are two more issues in this case, but they will be covered in some other blog.

The main point is to change as little data as you can, if you want the system to scale and run fast.


While helping with the migration of one Retail production system, I needed to export certain schemas by using the Data Pump utility.

As the database is pretty small (about 3 TB), I just executed the expdp command and expected it to finish within 2 to 3 hours.

When I checked in the evening, to my surprise, the Data Pump job was still running.

After monitoring the Data Pump session, I soon realized what was going on.
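
A sketch of how the Data Pump worker sessions can be monitored, joining the Data Pump session list to v$session to see the current wait event:

select s.sid, s.serial#, dp.job_name, s.event, s.seconds_in_wait
from dba_datapump_sessions dp
join v$session s on s.saddr = dp.saddr;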

Almost 94% of the time was spent on the “Streams AQ: enqueue blocked on low memory” wait event.

After some digging, the Oracle MOS portal helped me in this case.

In database version 11.2.0.4, Oracle implemented a new algorithm for resizing components of the SGA memory, which has an issue when changing the size of the streams pool.

You can see in the picture below how it looks.

To resolve the problem there are several workarounds, but this is what worked in my case.

select value from v$parameter where name like '%stream%';
0

select current_size/1024/1024 as MB from v$sga_dynamic_components where component = 'streams pool';
128 MB

I changed the minimum size of the streams pool to 256 MB by executing the following command:

alter system set streams_pool_size=256m scope=both sid = '*';

This value depends on the number of parallel processes (parameter parallel=4 in my case).
If your database is very busy, you’ll have to do the following:

alter system set streams_pool_size=256m scope=spfile sid = '*';

and restart the database.

If you still have the same wait event even after increasing the minimum size of the streams pool, you’ll have to execute the following command:

alter system set "_disable_streams_pool_auto_tuning"=TRUE scope=spfile sid = '*';

and bounce the database.

After increasing the value and disabling auto tuning, Data Pump performance was back to normal.

The same database export that I had to kill after 24 hours now takes only 16.5 minutes with a parallel degree of 4.
