Oracle – Install Clusterware – Test udp multicast on Private Interconnect (and get maximum MTU)

Test Multicast on Private Interconnect (and get maximum MTU)

There are several ways to test multicast communication over UDP. I was debugging an Oracle Clusterware installation in which the Clusterware on the second node would not start completely (specifically, ocssd and crsd would not start on node2).

All standard/published Oracle tests passed. See Oracle – Clusterware Installation and Configuration

Without going into too much detail, the Clusterware daemons would not start on the second node because the MTU on some of the switches in the network was not configured correctly.

The two Java programs below can be used to test whether multicast is working on the network between two servers. More specifically, they can be used to determine the largest MTU that works end to end between the two servers, taking into consideration all network elements (NICs, switches, routers).

Switches/routers may be configured to support jumbo frames: MTU 9000 (the value may be set to 9150 on the switches/routers).

The default MTU on switches may be 1400!

Normally, the NICs on the Linux server are set to 1500 MTU.

Private interconnect NICs/bonds can be set to MTU 9000 if all intermediate network elements support, and are configured for, jumbo frames.
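As a rough guide for choosing datagram sizes to test with, the largest UDP payload that fits in a single unfragmented packet can be derived from the MTU. The sketch below assumes IPv4 with a 20-byte header (no IP options) plus the 8-byte UDP header; the class and method names are made up for this illustration.

```java
// Sketch: largest UDP payload that fits in one IP packet for a given MTU.
// Assumption: IPv4 with a 20-byte header (no IP options) and an 8-byte UDP header.
final class MtuPayload {
    static final int IPV4_HEADER_BYTES = 20; // assumption: no IP options
    static final int UDP_HEADER_BYTES = 8;

    // Returns the number of payload bytes left after the IP and UDP headers.
    static int maxUdpPayload(int mtu) {
        int headers = IPV4_HEADER_BYTES + UDP_HEADER_BYTES;
        if (mtu <= headers) {
            throw new IllegalArgumentException("MTU too small: " + mtu);
        }
        return mtu - headers;
    }

    public static void main(String[] args) {
        int[] mtus = {1400, 1500, 9000};
        for (int mtu : mtus) {
            System.out.println("MTU " + mtu + " -> max UDP payload " + maxUdpPayload(mtu) + " bytes");
        }
    }
}
```

Under these assumptions, MTU 1500 leaves 1472 payload bytes and MTU 9000 leaves 8972. If the NICs are set to 9000 but a switch in the path is not, datagrams larger than the switch can carry are typically the ones that never arrive.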

MultiCastTestReceiveLoop.java

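This program receives UDP datagrams on a multicast socket. The receive buffer is 9500 bytes, which covers jumbo frames (MTU 9000, or 9150 on switches/routers).

The communication follows a path similar to that used between Oracle Clusterware processes.

Before compiling, replace <interconnect interface> in the source with the name of the private interconnect interface (or bond).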

To compile: /jdk/bin/javac MultiCastTestReceiveLoop.java

To execute: /jdk/bin/java -classpath . MultiCastTestReceiveLoop 230.0.1.0 42424

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;

public class MultiCastTestReceiveLoop {

    // Defaults used when the command line does not have exactly two arguments.
    public MultiCastTestReceiveLoop() {
        this(42424, "230.0.1.0");
    }

    public MultiCastTestReceiveLoop(int port, String group) {
        // Create the socket and bind it to port 'port'.
        // try-with-resources closes the socket if an exception ends the loop.
        try (MulticastSocket s = new MulticastSocket(port)) {

            // Receive on the interconnect interface and join the multicast group.
            // This is done once, outside the loop, so that no datagram is missed
            // while the socket is being re-created.
            s.setNetworkInterface(NetworkInterface.getByName("<interconnect interface>"));
            s.joinGroup(InetAddress.getByName(group));

            // 9500 bytes is large enough for a jumbo frame payload.
            byte[] buf = new byte[9500];

            // Receive until the program is interrupted (Ctrl-C).
            while (true) {
                DatagramPacket pack = new DatagramPacket(buf, buf.length);
                s.receive(pack);
                System.out.println("Received data from: " + pack.getAddress().toString()
                        + ":" + pack.getPort() + " with length: " + pack.getLength());
                System.out.write(pack.getData(), 0, pack.getLength());
                System.out.println();
            }
        } catch (Exception e) {
            // Catch-all: print the error and exit.
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Usage: MultiCastTestReceiveLoop <group> <port>
        if (args.length == 2) {
            new MultiCastTestReceiveLoop(Integer.parseInt(args[1]), args[0]);
        } else {
            new MultiCastTestReceiveLoop();
        }
    }
}
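The receiver needs the name of the interconnect interface in place of the <interconnect interface> placeholder. One way to find candidate names, along with the MTU configured on each NIC, is to list the interfaces from Java. This is only a sketch using the standard java.net.NetworkInterface API; the class name is made up for this illustration.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Sketch: list local interfaces with their state, multicast support and MTU.
final class ListMulticastInterfaces {

    // Returns one descriptive line per network interface on this host.
    static List<String> describeAll() throws SocketException {
        List<String> lines = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return lines; // older JDKs return null when no interface is found
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            lines.add(nic.getName()
                    + " up=" + nic.isUp()
                    + " multicast=" + nic.supportsMulticast()
                    + " mtu=" + nic.getMTU());
        }
        return lines;
    }

    public static void main(String[] args) throws SocketException {
        for (String line : describeAll()) {
            System.out.println(line);
        }
    }
}
```

The MTU printed here is only the value configured on the local NIC. It says nothing about the switches and routers in between, which is what the send/receive test is for.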

MultiCastTestSend.java

This program sends UDP datagrams on a multicast socket. The buffer is filled with the letter 'a'; its length is set with the nbytes argument on the command line.

The communication follows a path similar to that used between Oracle Clusterware processes.

To compile: /jdk/bin/javac MultiCastTestSend.java

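To execute: /jdk/bin/java -classpath . MultiCastTestSend 230.0.1.0 42424 <nbytes>

<nbytes> is the datagram payload size in bytes. If the command line does not have exactly three arguments, the program falls back to its defaults (group 230.0.1.0, port 42424, 10 bytes).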

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.util.Arrays;

public class MultiCastTestSend {

    // Defaults used when the command line does not have exactly three arguments.
    public MultiCastTestSend() {
        this(42424, "230.0.1.0", 10);
    }

    public MultiCastTestSend(int port, String group, int nbytes) {
        // TTL 1 keeps the datagram on the local subnet.
        int ttl = 1;

        // Create the socket; try-with-resources closes it when sending is done.
        try (MulticastSocket s = new MulticastSocket()) {

            // Set the LAN interface to send on.
            s.setNetworkInterface(NetworkInterface.getByName("<interconnect interface>"));

            // Fill the buffer with the letter 'a' so the receiver prints readable data.
            byte[] buf = new byte[nbytes];
            Arrays.fill(buf, (byte) 'a');

            // Create a DatagramPacket addressed to the multicast group and port.
            DatagramPacket pack = new DatagramPacket(buf, buf.length,
                                                     InetAddress.getByName(group), port);

            s.setTimeToLive(ttl);
            s.send(pack);
            System.out.println("Sent " + buf.length + " bytes to "
                    + pack.getAddress().toString() + ":" + port);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Usage: MultiCastTestSend <group> <port> <nbytes>
        if (args.length == 3) {
            new MultiCastTestSend(Integer.parseInt(args[1]), args[0], Integer.parseInt(args[2]));
        } else {
            new MultiCastTestSend();
        }
    }
}
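To find the largest datagram that still arrives, the sender can be run repeatedly with different nbytes values while watching the receiver. A binary search keeps the number of runs small. The sketch below only shows the search logic: the probe is a hypothetical stand-in (a lambda simulating a path that drops payloads above 1472 bytes), not a real network test. It also assumes that if a given size arrives, every smaller size arrives too.

```java
import java.util.function.IntPredicate;

// Sketch: binary search for the largest payload size that is delivered.
final class MaxPayloadSearch {

    // Returns the largest size in [low, high] for which arrives.test(size) is true,
    // or -1 if no size in the range is delivered.
    // Assumption: delivery is monotonic (if size n arrives, all smaller sizes arrive).
    static int largestDelivered(int low, int high, IntPredicate arrives) {
        int best = -1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (arrives.test(mid)) {
                best = mid;      // mid works, try larger sizes
                low = mid + 1;
            } else {
                high = mid - 1;  // mid is lost, try smaller sizes
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical probe: simulates a path that drops payloads above 1472 bytes.
        // In a real test this would be "send nbytes and check the receiver output".
        IntPredicate simulatedPath = size -> size <= 1472;
        System.out.println(largestDelivered(1, 9000, simulatedPath)); // prints 1472
    }
}
```

With a range of 1 to 9000 bytes, the search needs at most 14 probes, which is manageable even when each probe is a manual run of MultiCastTestSend.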

Oracle – listener.log file

Rename and compress listener.log file

[grid@server]$ lsnrctl
LSNRCTL> set current_listener listener
LSNRCTL> set log_status off
 
[grid@server]$ mv listener.log listener.log.temp
 
[grid@server]$ lsnrctl
LSNRCTL> set current_listener listener
LSNRCTL> set log_status on

-- compress (or delete)
[grid@server]$ gzip listener.log.temp

-- repeat for the ASM listener
[grid@server]$ lsnrctl
LSNRCTL> set current_listener ASMNET1LSNR_ASM
LSNRCTL> set log_status off
 
[grid@server]$ mv asmnet1lsnr_asm.log asmnet1lsnr_asm.log.temp
 
[grid@server]$ lsnrctl
LSNRCTL> set current_listener ASMNET1LSNR_ASM
LSNRCTL> set log_status on

-- compress (or delete) 
[grid@server]$ gzip asmnet1lsnr_asm.log.temp

Oracle – Clusterware Installation and Configuration

Installing and Configuring Oracle Clusterware

The installation procedure for Oracle Clusterware (Grid Infrastructure) is well documented in the Oracle documentation. This post lists the titles of MOS documents that are useful before, during, or after an Oracle Clusterware installation.

A My Oracle Support (MOS) account is required to access the documents listed in this post.

Network and Network Interfaces Validation

How to Validate Network and Name Resolution Setup for the Clusterware and RAC (Doc ID 1054902.1)

Configure/Reconfigure

Attention: in 12.2 the script is gridSetup.sh.

How to Configure or Re-configure Grid Infrastructure With config.sh/config.bat (Doc ID 1354258.1)

Multicast Test Script

Look for the mcasttest.pl tool in the MOS document:

Grid Infrastructure Startup During Patching, Install or Upgrade May Fail Due to Multicasting Requirement (Doc ID 1212703.1)

Create/Recreate GIMR database (MGMTDB)

MDBUtil: GI Management Repository configuration tool (Doc ID 2065175.1)

 

Install OS Packages

For convenience:

yum -y install xorg-x11-utils (provides xdpyinfo; avoids the DISPLAY issue with MobaXterm)

yum -y install psmisc (provides fuser, which OPatch uses; install if required)

Add First Node

Unzip the Oracle software or gold image in $GRID_HOME (owned by, and with permissions set for, the grid user)

(as grid user)

$GRID_HOME/gridSetup.sh

Add Second Node

After install of first node:
(as grid user)
$GRID_HOME/bin/cluvfy stage -pre nodeadd -n <node2> -fixup -fixupnoexec -verbose

 

Trace/Log Files

$GRID_BASE/diag/crs/<host>/crs/trace

alert.log
gipcd.trc
ocssd.trc
evmd.trc
gpnpd.trc
mdnsd.trc

$GRID_BASE/crsdata/<host>/crsconfig/

Useful Commands

/oracle/grid/12.2.0/bin/crsctl stat res -t -init -w "((TARGET != ONLINE) or (STATE != ONLINE))"

Oracle – Direct NFS with a Windows NFS Server

Scenario: Oracle Direct NFS, NFS on Windows Server

If you are using Oracle Direct NFS and the NFS server is running on Windows, you may run into problems caused by the permission and ownership changes that Oracle needs to make on its files.

Example

An example of the issue with Oracle Direct NFS: a Data Pump export with DATA_PUMP_DIR located on an NFS share served by NFS on Windows:

Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 2.625 MB
Processing object type SCHEMA_EXPORT/USER
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.CREATE_OBJECT_ROWS [USER:"XYZ"]
ORA-19505: failed to identify file "/u01/ORCL/ORCL_testexp.dp"
ORA-17503: ksfdopn:4 Failed to open file /u01/ORCL/ORCL_testexp.dp
ORA-17500: ODM err:File does not exist
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPW$WORKER", line 11014
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name

Relevant MOS Documents

RMAN Backup Fail to NFS Shares From a Windows Server When DNFS is Enabled (Doc ID 2171297.1)
Datapump Dump File Permission In DNFS Environment (Doc ID 2049012.1)

Solution

Change the value of the following registry key (on the Windows Server providing NFS) to 0 and restart Server for NFS:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ServerForNfs\CurrentVersion\Exports\0\RestrictChown = 0 (DWORD)

Workaround

You could also disable Direct NFS:

-- set ORACLE env first
cd $ORACLE_HOME/rdbms/lib/
make -f ins_rdbms.mk dnfs_off

-- restart the instance

Oracle – RMAN DROP DATABASE

-- set ORACLE env first

export ORACLE_SID=ORCL12
export ORACLE_HOME=/oracle/.....

oracle@oravm:~/ [ORCL12] rman target /

Recovery Manager: Release 12.1.0.1.0 - Production on Wed Dec 27 09:45:33 2017

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

connected to target database (not started)

RMAN> startup force mount;

Oracle instance started
database mounted

Total System Global Area    4275781632 bytes

Fixed Size                     5218048 bytes
Variable Size               2415919360 bytes
Database Buffers            1845493760 bytes
Redo Buffers                   9150464 bytes

RMAN> show db_unique_name;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name ORCL12 are:
RMAN configuration has no stored or default parameters

RMAN> SQL 'ALTER SYSTEM ENABLE RESTRICTED SESSION';

sql statement: ALTER SYSTEM ENABLE RESTRICTED SESSION

RMAN> DROP DATABASE INCLUDING BACKUPS NOPROMPT;

database name is "ORCL12" and DBID is 2242343723

allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=51 device type=DISK
specification does not match any backup in the repository

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=51 device type=DISK
specification does not match any datafile copy in the repository
specification does not match any control file copy in the repository
specification does not match any control file copy in the repository
List of Archived Log Copies for database with db_unique_name ORCL12
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - --------
1       1    1       A 30.11.17
        Name: +ORA_DATA/ORCL12/ARCHIVELOG/2017_11_30/thread_1_seq_1.407.961431639

2       1    2       A 30.11.17
        Name: +ORA_DATA/ORCL12/ARCHIVELOG/2017_11_30/thread_1_seq_2.414.961431685

3       1    3       A 30.11.17
        Name: +ORA_DATA/ORCL12/ARCHIVELOG/2017_12_01/thread_1_seq_3.316.961561207

deleted archived log
archived log file name=+ORA_DATA/ORCL12/ARCHIVELOG/2017_11_30/thread_1_seq_1.407.961431639 RECID=1 STAMP=961431639
deleted archived log
archived log file name=+ORA_DATA/ORCL12/ARCHIVELOG/2017_11_30/thread_1_seq_2.414.961431685 RECID=2 STAMP=961431684
deleted archived log
archived log file name=+ORA_DATA/ORCL12/ARCHIVELOG/2017_12_01/thread_1_seq_3.316.961561207 RECID=3 STAMP=961561206
Deleted 3 objects


database name is "ORCL12" and DBID is 2242343723
database dropped

RMAN>