Category Archives: Health Checks

Exadata Critical Issue DB27

Oracle announced a new Exadata Critical Issue (DB27) yesterday, as per MOS 2004572.1. Databases running with Grid Infrastructure 12.1 will crash whenever a health update is received (such as when a cell disk is marked “predictive failure”).

The database ASMB process terminates, causing the database instance to crash. The following errors are reported in the database alert.log:

ORA-15064: communication failure with ASM instance
ORA-03115: unsupported network datatype or representation
ASMB: terminating the instance due to error 15064

Perform one of the following actions to prevent bug 20361671:

  1. Upgrade the Grid Infrastructure home to (Database Patch for Engineered Systems and DB In-Memory or later.
  2. Apply patch 20361671 to the Grid Infrastructure home.

At the time of writing, the patch README incorrectly omits the commands required to unlock and lock the Grid Infrastructure home before and after patching, respectively.

Before running the opatch command to apply the patch, run the following command as the root user to unlock the Grid Infrastructure home:

$GI_HOME/crs/install/ -unlock

After applying the patch, run the following command as the root user to lock the Grid Infrastructure home:

$GI_HOME/crs/install/ -patch


Oracle OpenWorld 2015

Submitted my Oracle OpenWorld 2015 presentation earlier.  Today is the last day to submit proposals for presentations or tutorials.

Oracle have extended their deadline for proposals until May 6th!




Exadata System Statistics

Since August 2012, the DBMS_STATS.GATHER_SYSTEM_STATS procedure has offered an 'EXADATA' option to allow Exadata-specific system statistics to be gathered. The following versions / patchsets of the database include this option:

  • Any version of 12c

Gathering Exadata-specific system statistics ensures the optimizer is aware of the Exadata performance features and takes them into account when determining the execution plan – often resulting in SmartScans (full-table scans) instead of index access.

To determine the last time that system statistics were gathered on the database:

COL statistic FORMAT a55
COL value FORMAT a20
SELECT pname AS statistic,
pval2 AS value
FROM aux_stats$
ORDER BY pname;

DSTART    03-28-2011 13:59
DSTOP     03-28-2011 13:59
STATUS    COMPLETED

This indicates that the system statistics were last gathered before Exadata-specific statistics became available (August 2012). They also have not been regathered since this particular database migrated from a V2 machine to an X3-2 machine earlier in the year, so they are unlikely to be accurate.

This was the likely cause of a performance problem we encountered recently, which eventually required a FULL hint in the query for it to complete in an acceptable timeframe.

To determine the values of the system statistics (I used DECODE to format them nicely):

SELECT DECODE(pname,
'CPUSPEED','CPUSPEED: (Workload) CPU speed in millions of cycles/second',
'CPUSPEEDNW','CPUSPEEDNW: (No Workload) CPU speed in millions of cycles/second',
'IOSEEKTIM','IOSEEKTIM: Seek time + latency time + operating system overhead time in milliseconds',
'IOTFRSPEED','IOTFRSPEED: Rate of a single read request in bytes/millisecond',
'MAXTHR','MAXTHR: Maximum throughput that the I/O subsystem can deliver in bytes/second',
'MBRC','MBRC: Average multiblock read count sequentially in blocks',
'MREADTIM','MREADTIM: Average time for a multi-block read request in milliseconds',
'SLAVETHR','SLAVETHR: Average parallel slave I/O throughput in bytes/second',
'SREADTIM','SREADTIM: Average time for a single-block read request in milliseconds'
) AS statistic,
pval1 AS value
FROM aux_stats$
ORDER BY pname;

CPUSPEED: (Workload) CPU speed in millions of cycles/second
CPUSPEEDNW: (No Workload) CPU speed in millions of cycles/second                      2351.43
IOSEEKTIM: Seek time + latency time + operating system overhead time in milliseconds  10
IOTFRSPEED: Rate of a single read request in bytes/millisecond                        4096
MAXTHR: Maximum throughput that the I/O subsystem can deliver in bytes/second
MBRC: Average multiblock read count sequentially in blocks
MREADTIM: Average time for a multi-block read request in milliseconds
SLAVETHR: Average parallel slave I/O throughput in bytes/second
SREADTIM: Average time for a single-block read request in milliseconds

If the value for the MBRC system statistic is NULL, the optimizer uses the value for db_file_multiblock_read_count which, on this database, is 64.
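
Both values can be checked directly. This is a minimal sketch, assuming you are connected as a user who can read SYS.AUX_STATS$ and V$PARAMETER:

```sql
-- MBRC system statistic (a NULL value means it has not been set or gathered)
SELECT pname, pval1 AS value
FROM   sys.aux_stats$
WHERE  pname = 'MBRC';

-- Initialization parameter used instead when MBRC is NULL
SELECT name, value
FROM   v$parameter
WHERE  name = 'db_file_multiblock_read_count';
```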

Exadata benefits from higher values for MBRC as this increases the likelihood that the optimizer will choose a full table scan over an index. Gathering Exadata system statistics will set MBRC to 128 and will likely set a significantly higher value for IOTFRSPEED. These statistics are set based on your machine, not gathered, because:

  • the database won’t take the storage cells into account when calculating multi-block reads
  • direct path reads are not counted as multi-block reads for the MBRC system statistic

We should NOT gather system statistics with a workload on Exadata as the database will attempt to calculate the MBRC rate itself, likely resulting in a significantly lower (and inaccurate) MBRC value.

Gathering Exadata system statistics is pretty simple and should be done if the statistics are older than August 2012 or when you migrate a database to an Exadata machine. We can either back up the stats beforehand or keep a note of their values which we can use to manually set if required, then issue the following command:
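
The step comes down to a single call to the procedure named earlier. What follows is a minimal sketch rather than a tested runbook: it assumes the 'EXADATA' option is available on your version and that you have the privileges to gather system statistics. The backup table name SYSSTATS_BACKUP and the statid PRE_EXADATA are hypothetical names chosen for illustration.

```sql
-- Optional: save the current system statistics first
-- (SYSSTATS_BACKUP and PRE_EXADATA are hypothetical names)
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'SYSSTATS_BACKUP');
EXEC DBMS_STATS.EXPORT_SYSTEM_STATS(stattab => 'SYSSTATS_BACKUP', statid => 'PRE_EXADATA', statown => USER);

-- Set the Exadata-specific system statistics
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('EXADATA');
```

If the new values cause problems, DBMS_STATS.IMPORT_SYSTEM_STATS with the same stattab and statid should bring back the saved statistics, and individual values can be set by hand with DBMS_STATS.SET_SYSTEM_STATS.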



DBA 3.0 – How to Become a Real-World Exadata DBA – IOUG Collaborate 2015

According to a Book of Lists survey, 41% of people’s biggest fear is “public speaking”.  To put that into perspective, “death” is the biggest fear for 19%, “flying” for 18% and “clowns” don’t even register (which does make me seriously doubt the survey’s credibility).

I gave my first public presentation at IOUG Collaborate 2015 last week in Las Vegas and I didn’t die.

Why did you make your presentation debut at the second largest Oracle event on the calendar?  Excellent question.


How Do *YOU* Measure SmartScans?

With SmartScans being the most important (and most distinctive) performance feature of Exadata, it’s incredibly useful to measure how well you’re making use of them.

But how?

There are a number of ways you can measure this, but none of them seems to be the DEFINITIVE method.  Instead, it’s probably a good idea to use more than one formula, if not all of them, to get a good idea of your SmartScan usage.

Why are there multiple formulas?  Because the existing database metrics don’t quite capture what we’re looking to measure.  For instance:

  • 'physical read total bytes' – is all the data, including compressed data AND SmartScan-ineligible data.
  • 'cell physical IO interconnect bytes' – includes the writes (multiplied due to ASM mirroring) AND the reads.
  • 'cell IO uncompressed bytes' – is the data volume for predicate offloading AFTER the Storage Index filtering and any decompression.
  • 'cell physical IO interconnect bytes returned by smart scan' – includes uncompressed data.
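
As a starting point, the raw values of these four statistics can be read from V$SYSSTAT. This is only a sketch of how to pull the counters, not a formula for SmartScan efficiency. The values are cumulative since instance startup, and on RAC you would presumably query GV$SYSSTAT instead.

```sql
-- Cumulative values (in GB) of the four statistics listed above
SELECT name,
       ROUND(value / 1024 / 1024 / 1024, 2) AS gb
FROM   v$sysstat
WHERE  name IN ('physical read total bytes',
                'cell physical IO interconnect bytes',
                'cell IO uncompressed bytes',
                'cell physical IO interconnect bytes returned by smart scan')
ORDER  BY name;
```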


My Collaborate IOUG 2015 Abstract

I will be presenting DBA 3.0 or “How to Become a Real-World Exadata DBA” at Collaborate 2015 – IOUG’s annual user conference – from April 12th to 16th at the Mandalay Bay Resort and Casino in Las Vegas. I submitted this as my abstract:

“DBA resources are more scarce than ever before and it can be very difficult to allocate time on anything but keeping the lights on – even when an organization has made a (substantial) hardware investment in Exadata.

However, if Exadata is treated like any other Oracle database, the promised “extreme performance” will likely be very underwhelming to developers, users and managers and can become unwieldy for DBAs to support.

On the other hand, when an organization configures and supports Exadata properly, they can realize exponential performance improvements in key IT infrastructure, can facilitate better business decisions and may actually reduce infrastructure costs.

The customer has bought a sports car – but might not realize that they haven’t taken it out of second gear (yet).

I will talk about the evolution of Exadata and then get into the “nuts and bolts” of how to support a high-performance Exadata environment as a Production DBA.

I will discuss how to get performance improvements of up to 20x, what NOT to do as an Exadata DBA and how Exadata can become the foundation of your organization’s high-performance enterprise infrastructure.”

I hope to see you in Las Vegas!


Cross-Node Consistency Best Practice Checks

I noticed that Oracle either released or updated MOS 1662018.1 – Oracle Sun Database Machine Cross Node Consistency Best Practice Checks.

At first glance, I don’t see anything in there that isn’t covered by exachk, but in case you don’t want to/can’t easily run an exachk report on your machine, this could help for upgrades, etc.


UKOUG 2014 – Dan Norris – Exadata Security Best Practices

Dan Norris of the Maximum Availability Architecture team gave what sounded like a very interesting presentation at UKOUG 2014. There seemed to be a lot of really cool stuff at this year’s event, which is to be expected as I no longer reside in the UK!

I encourage you to take a look at the slides, and also at the interesting links he provided.

Naturally, he also quoted a plethora of My Oracle Support notes – some of the greatest hits and some which you might not have seen before:

  • Responses to common Exadata security scan findings (Doc ID 1405320.1)
  • Oracle Sun Database Machine X2-2/X2-8, X3-2/X3-8 and X4-2 Security Best Practices (Doc ID 1071314.1)
  • How to change OS user password for Cell Node, Database Node, ILOM, KVM, Infiniband Switch, GigaBit Ethernet Switch and PDU on Exadata Database Machine (Doc ID 1291766.1)
  • Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1)
  • Information Center: Oracle Exadata Database Machine (Doc ID 1306791.2)

Happy reading!


ORAchk and Collection Manager

Quite by chance, I noticed today that Oracle are now offering an exachk-like health check for non-Exadata systems: ORAchk.

This includes some of the exachk functionality and replaces the RACcheck tool for Oracle databases (both clustered and single-instance).

One of the components of ORAchk is the Collection Manager, an ApEx web app, which provides a unified dashboard view of collections (ORAchk, RACcheck and exachk) across your environment.

The Collection Manager uses ApEx 4.2 and can be run against all editions of the database (XE, SE1, SE, EE) or higher. It is supported as part of your support contract, with the exception of the XE edition – you’ll have to visit the OTN forums for help with that.

There are two features in particular which interest me: the ability to compare different health check runs and the creation of incidents for tracking of issues.

Oddly enough, I had just started to create my own system to do just this today – which would have only provided a fraction of what this does, of course. I just need to find somewhere to run ApEx, a spare database and to brush up on my ApEx knowledge.

As a slight tangent, I think it’s a little strange that Oracle are running this on ApEx. I presume this is because it’s an early version and I would imagine that this – along with the OCM/ASR functionality – might end up making its way into a future release of OEM pretty soon as Oracle continue to mature the management of Exadata and its cohorts.
