Category Archives: Critical Issues

New Exadata Critical Issues – EX23 and EX24

Over the weekend, Oracle announced two new Critical Issues for Exadata Storage Server (EX23 and EX24), both impacting version 12.1.2.1.

Both new Critical Issues can be resolved either by applying the one-off patch 21251493 or by upgrading the Exadata Storage Server software to 12.1.2.1.2.

 

Critical Issue EX23
Affects Exadata Storage Server 12.1.2.1.0 and 12.1.2.1.1

Bug 21174310 – wrong results, ORA-1438 errors or other internal errors (ORA-00600 and ORA-07445) are possible from smart scan offloaded queries against HCC or OLTP compressed tables stored on Exadata storage if:

  • Exadata Storage Server is version 12.1.2.1.0 or 12.1.2.1.1 AND
  • Oracle Database was upgraded from 11.2 to 12.1 AND
  • A smart scan offloaded query is issued against an OLTP compressed table or an HCC table containing OLTP compressed blocks

The workaround is to recreate the table.

The recommended action is to upgrade to Exadata Storage Server software version 12.1.2.1.2 (or higher).

Alternatively, apply patch 21251493 to Exadata Storage Servers running version 12.1.2.1.1.  Note that patch 21251493 contains additional fixes required to resolve other critical issues.

MOS 2032464.1 has additional details.

 

Critical Issue EX24
Affects Exadata Storage Server 12.1.2.1.1

After replacing a failed system disk (disk 0 or disk 1), the new disk is not correctly configured, leaving the system vulnerable if the other system disk fails. The likelihood of occurrence is high whenever a failed system disk is replaced on Exadata version 12.1.2.1.1.

The workaround is to follow the instructions in MOS 2003674.1.

The recommended action is to upgrade to Exadata Storage Server software version 12.1.2.1.2 (or higher).

Alternatively, apply patch 21251493 to Exadata Storage Servers running version 12.1.2.1.1. Note that patch 21251493 contains additional fixes required to resolve other critical issues.

MOS 2032402.1 has additional details.
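
The version rules for EX23 and EX24 can be sketched as a small shell check. This is only an illustration: the function name `exadata_ci_exposure` is hypothetical, and it assumes you pass in the cell's image version yourself (for example the string printed by `imageinfo -ver` on the storage server, which normally carries a build suffix after the fifth field).

```shell
# Hypothetical helper: report which of EX23/EX24 a given Exadata Storage
# Server version is exposed to, based only on the version ranges above.
# It does not check the other EX23 conditions (11.2 to 12.1 upgrade,
# compressed tables), so "EX23" here only means "version is in range".
exadata_ci_exposure() {
    local ver
    # Keep the first five dot-separated fields,
    # e.g. 12.1.2.1.1.150316.2 -> 12.1.2.1.1
    ver=$(printf '%s\n' "$1" | cut -d. -f1-5)
    case "$ver" in
        12.1.2.1.0) echo "EX23" ;;
        12.1.2.1.1) echo "EX23 EX24" ;;
        *)          echo "none" ;;
    esac
}
```

On a cell this might be called as `exadata_ci_exposure "$(imageinfo -ver)"`; anything at 12.1.2.1.2 or later reports `none`.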

 

References

  • Bug 21174310 – Wrong results or ORA-1438 errors possible from smart scan offloaded queries against HCC or OLTP compressed tables stored on Exadata storage (Doc ID 2032464.1)
  • Important Fixes Required for System Disk Replacement on Exadata Storage Servers Running Version 12.1.2.1.1 (Doc ID 2032402.1)
  • Exadata Storage Software 12.1.2.1.0 and 12.1.2.1.1 System Disk Replacement Issues (Doc ID 2003674.1)
  • Exadata Critical Issues (Doc ID 1270094.1)
  • Patch 21251493

Oracle Critical Patch Update for July 2015

Oracle’s Critical Patch Update is out for July 2015:

http://www.oracle.com/technetwork/topics/security/cpujul2015-2367936.html

Affected are database versions 11.1.0.7, 11.2.0.3, 11.2.0.4, 12.1.0.1 and 12.1.0.2.

This is the final patch for both the 11.1.0.7 and 11.2.0.3 releases. The final patch for 12.1.0.1 will be released in January 2016.

The most prominent bug on the risk matrix is CVE-2015-2629 whereby a remote authenticated user can exploit a flaw in the Java VM component to gain elevated privileges.

For the 11.2.0.4 patches, you can apply one of the following:

11.2.0.4 SPU for UNIX: patch 20803583
11.2.0.4.7 PSU for UNIX: patch 20760982
11.2.0.4.17 Quarterly Database Patch for Exadata (July 2015): patch 21142006
July 2015 Quarterly Full-Stack Patch for Exadata: patch 21186703

Don’t forget your Grid Infrastructure patching:

11.2.0.4 PSU for UNIX: patch 20996923

And, of course, ever since those Java bugs were discovered, we should also patch the JVM:

Oracle JavaVM Component 11.2.0.4.4 Database PSU for UNIX: patch 21068539

Happy patching!


Exadata Critical Issue EX21

Oracle announced a new Exadata Critical Issue this morning (EX21) which applies to the ESS software versions 12.1.1.1.2 and 12.1.2.1.1.

“This issue is encountered only when a disk media error occurs while synchronous I/O is performed. Because the majority of I/O operations issued with Exadata storage are done asynchronously, and this problem is possible only when disk media errors are experienced while synchronous I/O is performed, the likelihood of experiencing this problem is low. However, the impact of hitting this problem can potentially be high.

This problem affects Exadata Storage Server software versions 12.1.2.1.1 and 12.1.1.1.2.

Disk corruption symptoms are varied. Some corruptions will be resolved automatically by Oracle Database, while other corruptions will lead to unexpected process shutdown due to internal errors.”

ESS 12.1.1.1.2 DOES have a patch available, but 12.1.2.1.1 does not at the moment (the patch is “pending”). I’m sure it will become available soon.

I have MOS email me whenever the Exadata Critical Issues document (1270094.1) is updated so I’m quickly aware of the latest important bugs. It’s pretty neat and I’d advise other Exadata types to make use of it as well.


Exadata Critical Issue EX20

A new Exadata Critical Issue (EX20) has been announced in MOS note 1270094.1 and applies to Exadata Storage Server versions 12.1.1.1.0 and 12.1.1.1.1.

The issue is caused by bug 19211091:

CELLSRV Internal Error ORA-600 [DiskIOSched::GetCatIndex:2]

Further details can be found in MOS 1967985.1

You might hit this bug if your database resource manager plan contains sub-plans and OTHER_GROUPS is present in a sub-plan instead of the top plan.

The CELLSRV trace file will contain one or more entries indicating CELLSRV process failure similar to the following:

ORA-00600: internal error code, arguments: [DiskIOSched::GetCatIndex:2], [4294967295], [], [], [], [], [], [], [], [], [], []

CELLSRV encountered a fatal signal 11. LWPID: 28000 userId: 80 kernelId: 80 pthreadID: 139785595115840
Ignoring fatal signal encountered during Cellsrv state dump LWPID: 28000 userId: 80 kernelId: 80 pthreadID: 139785595115840

If CELLSRV fails on multiple cells simultaneously, then the ASM disk groups may dismount or ASM instances may crash, potentially causing databases to crash.

Typically, the Restart Server (RS) process will restart CELLSRV after it fails.  However, too many CELLSRV failures will trigger “flood control” and prevent further CELLSRV restarts.  Flood control is indicated in the trace file with entries similar to the following:

[RS] monitoring process /opt/oracle/cell/cellsrv/bin/cellrsomt (pid: 26763) returned with error: 126
[RS] Monitoring process for service CELLSRV detected a flood of restarts. Disable monitoring process.
RS-7445 [CELLSRV monitor disabled] [Detected a flood of restarts] [] [] [] [] [] [] [] [] [] []
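
A quick way to spot flood control is to search a trace or alert file for the RS messages quoted above. The following is a minimal sketch: `rs_flood_control` is a hypothetical name, and the file path is taken as an argument because its location differs between installations.

```shell
# Hypothetical helper: exit 0 if the given cell trace/alert file contains
# either of the RS flood-control messages quoted above, non-zero otherwise.
rs_flood_control() {
    local file=$1
    # -q: no output, -F: treat patterns as fixed strings (so "[" is literal)
    grep -qF -e 'detected a flood of restarts' \
             -e 'RS-7445 [CELLSRV monitor disabled]' -- "$file"
}
```

It could be used as `rs_flood_control /path/to/alert.log && echo "CELLSRV restarts are disabled"`, where the path is whatever your cell actually uses.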

Workarounds
The recommended action is to upgrade to Exadata Storage Server software version 12.1.1.1.2 (or higher) or 12.1.2.1.1 (or higher).

Alternatively, you can apply patch 19211091.

As a temporary workaround, you can disable the Resource Manager on the affected databases, modify the appropriate plan so that the OTHER_GROUPS directive is in the top plan (and not any sub-plan) and re-enable the Resource Manager:

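ALTER SYSTEM SET resource_manager_plan='' SCOPE=both SID='*';

-- List plans seen in the plan history that have no OTHER_GROUPS directive
SELECT UNIQUE name
FROM v$rsrc_plan_history
WHERE name NOT IN (
  SELECT plan
  FROM dba_rsrc_plan_directives
  WHERE plan IN (
    SELECT UNIQUE name
    FROM v$rsrc_plan_history)
  AND group_or_subplan = 'OTHER_GROUPS');

-- Resource plan changes must be made inside a pending area
BEGIN
  SYS.DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
  SYS.DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan             => 'MY_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    mgmt_p2          => 80,
    switch_estimate  => FALSE,
    comment          => NULL);
  SYS.DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  SYS.DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

ALTER SYSTEM SET resource_manager_plan='MY_PLAN' SCOPE=both SID='*';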


Exadata Critical Issue DB27

Oracle announced a new Exadata Critical Issue yesterday (DB27) as per MOS 2004572.1.

11.2.0.4 databases running with Grid Infrastructure 12.1 (either 12.1.0.1 or 12.1.0.2) will crash whenever a health update is received (such as when a cell disk is marked “predictive failure”).

The database ASMB process terminates causing the database instance to crash.  The following errors are reported in the database alert.log:

ORA-15064: communication failure with ASM instance
ORA-03115: unsupported network datatype or representation
ASMB: terminating the instance due to error 15064
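
To confirm after the fact that an instance crash matches this issue, you can look for the error signature in the alert log. A minimal sketch, assuming the alert log path is supplied by the caller; `db27_signature` is a hypothetical name and it requires both the ORA-03115 line and the ASMB termination line to be present.

```shell
# Hypothetical helper: exit 0 only if the given database alert log contains
# both the ORA-03115 error and the ASMB termination message quoted above.
db27_signature() {
    local log=$1
    grep -qF 'ORA-03115: unsupported network datatype or representation' -- "$log" &&
        grep -qF 'ASMB: terminating the instance due to error 15064' -- "$log"
}
```

ORA-15064 on its own is deliberately not treated as a match here, since a communication failure with ASM can have other causes.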

Perform one of the following actions to prevent bug 20361671:

  1. Upgrade the Grid Infrastructure home to 12.1.0.2.7 (Database Patch for Engineered Systems and DB In-Memory 12.1.0.2.7) or later.
  2. Apply patch 20361671 to the Grid Infrastructure home.

At the time of writing, the patch README incorrectly omits the rootcrs.pl commands required to unlock and lock the Grid Infrastructure home before and after patching, respectively.

Prior to running the opatch command to apply the patch, run the following rootcrs.pl command as the root user to unlock the Grid Infrastructure home:

$GI_HOME/crs/install/rootcrs.pl -unlock

After applying the patch, run the following rootcrs.pl command as the root user to lock the Grid Infrastructure home:

$GI_HOME/crs/install/rootcrs.pl -patch
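
Put together, the order of operations on each node looks roughly like the dry-run sketch below. It only prints the commands and runs nothing. The function name `gi_patch_plan` is hypothetical, both paths are placeholders, and the exact opatch invocation shown is an assumption: follow the patch README for that step.

```shell
# Hypothetical dry-run helper: print the patching steps in order for a given
# Grid Infrastructure home and unzipped patch directory. Nothing is executed.
gi_patch_plan() {
    local gi_home=$1 patch_dir=$2
    echo "as root:     $gi_home/crs/install/rootcrs.pl -unlock"
    # Assumed opatch form; the patch README is authoritative for this step.
    echo "as GI owner: cd $patch_dir && $gi_home/OPatch/opatch apply -oh $gi_home -local"
    echo "as root:     $gi_home/crs/install/rootcrs.pl -patch"
}
```

For example, `gi_patch_plan "$GI_HOME" /tmp/20361671` prints the three steps so they can be reviewed before anything is run.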


Oracle OpenWorld 2015

Submitted my Oracle OpenWorld 2015 presentation earlier.  Today is the last day to submit proposals for presentations or tutorials.

Oracle have extended their deadline for proposals until May 6th!

 

 


Exadata Critical Issue EX19

Overnight, Oracle announced a new Exadata Critical Issue (EX19) which applies to storage cells running 12.1.1.1.1 or earlier of the ESS software.

The bug is 19695225 and more information can be found on MOS 1991445.1.

Cell disk metadata corruption and loss of cell disk content (i.e. grid disk, ASM disk) will occur if many CREATE GRIDDISK or ALTER GRIDDISK commands that modify cell disk space configuration are run over time for the same cell disk.

If CellCLI griddisk commands are run in parallel on all storage servers simultaneously, which is a common maintenance practice, the issue can occur on multiple storage servers at the same time. If all redundant disk extents for files in an ASM disk group are lost as a result, the disk group will dismount, the database will crash, and the files will have to be restored from backup.

Rolling cell maintenance commands that change grid disk state, such as ALTER GRIDDISK INACTIVE and ALTER GRIDDISK ACTIVE, do not contribute to this issue.

If you have recreated or reconfigured grid disks with the CellCLI commands CREATE GRIDDISK or ALTER GRIDDISK more than 31 times since initial system deployment, then the likelihood of occurrence is high.

 

Risk and Detection
The risk to test and development systems is expected to be higher than production systems due to the dynamic manner in which they may be reconfigured.

To determine if your system is exposed to this issue, and how close the system is to having cell disk metadata corruption, download the script attached to MOS 1991445.1 and run it on all storage servers as the root user.

Possible symptoms that cell disk metadata corruption has occurred as a result of this bug include the following:

  • ASM disk group(s) dismount and database crash following CREATE GRIDDISK or ALTER GRIDDISK.
  • ASM disk group(s) cannot be mounted following the disk group dismount.
  • Error ORA-600 [addNewSegmentsToGDisk_2] is reported in the cell alert.log.

 

The cell disk corruption cannot be repaired once it occurs.  Recovery requires recreating cell disks, grid disks, and ASM disk groups, then restoring affected databases from backup.

Perform one of the following actions to prevent bug 19695225:

  • Upgrade to Exadata Storage Server version 12.1.2.1.1 or later (Exadata 12.1.2.1.0 contains the fix to this issue, however 12.1.2.1.1 or later is the recommended version).
  • Upgrade to Exadata Storage Server version 12.1.1.1.2 or later 12.1.1.1.x.
  • Apply patch 19695225 to all Exadata Storage Servers. At the time of writing a patch is available for Exadata versions 12.1.1.1.1, 11.2.3.3.1, and 11.2.3.3.0.
  • Avoid running CellCLI commands CREATE GRIDDISK or ALTER GRIDDISK until the code fix is applied via upgrade or patch apply.

 

I think it’s a good idea to run the check script on your storage cells as root to determine whether there’s any immediate risk (probably unlikely). If necessary, consider applying the patch – but you should be planning your patching to the QFSDP April 2015 now, right? 🙂

 

Mark


DBA 3.0 – How to Become a Real-World Exadata DBA – IOUG Collaborate 2015

According to a Book of Lists survey, 41% of people’s biggest fear is “public speaking”.  To put that into perspective, “death” is the biggest fear for 19%, “flying” for 18% and “clowns” don’t even register (which does make me seriously doubt the survey’s credibility).

I gave my first public presentation at IOUG Collaborate 2015 last week in Las Vegas and I didn’t die.

Why did you make your presentation debut at the second largest Oracle event on the calendar?  Excellent question.


GHOST glibc exploit

Exadata’s compute nodes and storage cells may be exposed to the glibc “GHOST” vulnerability that’s currently in the tech news (an attacker can obtain full control of a remote system through gethostbyname()).

Remedial steps for Exadata can be found here:

glibc vulnerability (CVE-2015-0235) patch availability for Oracle Exadata Database Machine (Doc ID 1965525.1)

As it’s a vulnerability in glibc, other RHEL / OEL systems might also be affected.  Unpatched versions of glibc from 2.2 to 2.17 contain the vulnerability.

To check whether a system is vulnerable:

        rpm -q glibc
             glibc-2.12-1.132.el6_5.4.x86_64

If the version of glibc matches or is more recent than the versions below, the system is NOT vulnerable to the exploit.

• RHEL 5: 2.5-123
• RHEL 6: 2.12-1.149
• RHEL 7: 2.17-55

If the installed version is older than these versions, “yum update glibc” will install the latest version.  A server reboot is necessary.
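
The comparison against the fixed versions can be scripted. This is a minimal sketch, not an official check: `glibc_is_patched` is a hypothetical name, and it relies on GNU `sort -V` (version sort), which only approximates RPM's own version comparison.

```shell
# Hypothetical helper: exit 0 if the installed glibc version-release is at
# least the fixed one, non-zero if it is older.
# Usage: glibc_is_patched <installed version-release> <fixed version-release>
glibc_is_patched() {
    local installed=$1 fixed=$2
    # Under version sort, the fixed version must be the lower (or equal) one.
    [ "$(printf '%s\n%s\n' "$installed" "$fixed" | sort -V | head -n1)" = "$fixed" ]
}
```

On RHEL / OEL 6, for example, the installed value could be taken from `rpm -q --qf '%{VERSION}-%{RELEASE}\n' glibc | head -n1` and compared with `2.12-1.149`. With the sample output above (2.12-1.132.el6_5.4) the function returns non-zero, meaning the system still needs the update.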

 


Oracle’s Critical Patch Update for January 2015

Oracle announced their Critical Patch Update for January 2015 today.

The CPU includes a fix for this troubling exploit in E-Business Suite found by David Litchfield where EBS grants index privileges on the (SYS-owned) DUAL table to the public role by default.

The database exploit with the highest Homeland Security threat level is CVE-2014-6567 which could allow for pre-12c databases on Windows to be “entirely compromised”.  If you’re not running pre-12c databases on Windows, the threat score is noticeably reduced, but still a 6.5.

In other news, 12.1.0.2.3 is out, should you live your life on the bleeding edge of technology.  Quarterly Full Stack Download Patches for Exadata are referenced in the availability note but don’t yet link to public documents; no doubt they will soon.

SSL 3.0 is disabled by default in Java SE – thanks to POODLE (really), it’s now considered obsolete and SSL as a whole should be disabled as organizations “can no longer rely on SSL to ensure secure communications between systems”.

Quite a scary world out there, huh?

MOS reference notes: 1935468.1, 1942215.1
