Tag Archives: Exadata

Exadata X6

Blink and you might have missed it, but the Exadata X6 was officially announced today.

As has become the norm, Oracle have doubled down on the specs compared to the X5:

  • 2x disk capacity
  • 2x Flash capacity
  • 2x faster Flash
  • 25% faster CPUs
  • 13% faster DRAM

With the X6-2 machine, you still have InfiniBand running at 40Gb/sec, but the compute nodes and the storage servers now have the following:

X6-2 Compute Node

  • 2x 22-core Broadwell CPUs
  • 256GB of DDR4 DRAM (expandable to 768GB)
  • 4x 600GB 10,000 RPM disks for local storage (expandable to 6)

High Capacity Storage Server

  • 2x 10-core Broadwell CPUs
  • 128GB of DDR4 DRAM
  • 12x 8TB 7,200 RPM Helium SAS3 disks
  • 4x 3.2TB NVMe PCIe 3.0 Flash cards

Extreme Flash Storage Server

  • 2x 10-core Broadwell CPUs
  • 128GB of DDR4 DRAM
  • 8x 3.2TB NVMe PCIe 3.0 Flash cards

What does all of that give you when it comes down to it?

Well, remember that the eighth-rack is the same as a quarter-rack, but you have access to half the cores and half the storage across the board (you still have two compute nodes and three storage servers):

High Capacity Eighth-Rack

  • 44 enabled compute cores across the two nodes (half of the 88 physically present)
  • 30 enabled storage server cores across the three cells (half of the 60 physically present)
  • 144TB of raw disk capacity (3 cells x 12 x 8TB, with half enabled)
  • 19.2TB of raw Flash (3 cells x 4 x 3.2TB, with half enabled)

Extreme Flash Eighth-Rack

  • 44 enabled compute cores across the two nodes
  • 30 enabled storage server cores across the three cells
  • 38.4TB of raw Flash (3 cells x 8 x 3.2TB, with half enabled)

The minimum licensing requirement is 16 cores for the eighth-rack and 28 cores for the quarter-rack.

I’m sure you can read through the sales stuff yourself, but aside from the UUUUGE increase in hardware, two new features of the X6 really pop out for me.

Exadata now has the ability to preserve storage indexes through a storage cell reboot. Anyone who has had to support an older Exadata machine will remember what a big deal that used to be: the wait for the storage indexes to be rebuilt would take hours, and it often required considerable understanding from the user population and management to get through the first day or so after maintenance.
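
If you want to see the effect from the database side, the cumulative storage index savings are exposed in v$sysstat. A minimal sketch (values are cumulative since instance startup):

SELECT name, value
FROM   v$sysstat
WHERE  name = 'cell physical IO bytes saved by storage index';

Before this feature, you’d watch this figure climb back only slowly after a cell reboot as the storage indexes were rebuilt by subsequent SmartScans.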

Probably the biggest thing is that Oracle have introduced high availability quorum disks for the quarter-rack and eighth-rack machines. I blogged about this before as I thought it had the potential to be a real “gotcha” if you were expecting to run high redundancy diskgroups on anything less than a half-rack.

No longer.

Now, a copy of the quorum disk is stored locally on each database node, allowing you to lose a storage cell and still be able to maintain your high redundancy.
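
If you’re curious where your quorum disks have ended up, a sketch like this against the ASM instance should show them – assuming 12.1 ASM, where the FAILGROUP_TYPE column (REGULAR vs QUORUM) and the VOTING_FILE flag are available:

SELECT g.name AS diskgroup,
       d.path,
       d.failgroup,
       d.failgroup_type,
       d.voting_file
FROM   v$asm_disk d
JOIN   v$asm_diskgroup g ON g.group_number = d.group_number
ORDER  BY g.name, d.failgroup;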

This is a particularly useful development when you remember that Oracle have doubled the size of the high-capacity disks from 4TB to 8TB. Why? Well, because rebalancing a bunch of 8TB disks is going to take longer than rebalancing the same number of 4TB disks.

I’ll be going to Collaborate IOUG 2016 next week and I’m looking forward to hearing more about the new kit there.

Mark


Today’s Nugget: Oracle OpenWorld 2015 … Or Not

Alas, my submission for this year’s Oracle OpenWorld was turned down by Oracle a little while ago.

Maybe I shouldn’t have installed this browser extension?

Tee-hee 🙂

I’m nothing if not persistent(ly annoying) – so I submitted a similar abstract to the 2016 RMOUG Training Days.


Dude … Where’s My GUI?

Are you running Exadata Storage Server 12.1.2.1.0?

Have you tried to get a graphical tool, such as DBCA, DBUA or even VNC to run on your Exadata lately?

If, like me, you had quite the struggle until a friendly sysadmin installed a bunch of packages for you, you might be interested in reading this MOS note:

1969308.1 – Unable to run graphical tools (gui) on Exadata 12.1.2.1.0 or later.

I understand that Oracle have been increasingly hardening Exadata since the birth of their X3-2 machines, but you’d think that you wouldn’t need to add extra packages to a system that’s meant to be “ready to use” once your choice of consultant has finished with the Deployment Assistant.

After all, aren’t DBCA / DBUA Oracle’s tools of choice? Do they really want DBAs to spend their time creating response files and running these tools from the command line?

Odd.


Exadata: why a half-rack is the “recommended minimum size”

Lots of shops dipped their toes in the Exadata water with a quarter-rack first of all.

(For those who are new to the Exadata party and don’t know of a world without elastic configurations, a quarter-rack is a machine with two compute nodes and three storage cells).

If you are / were one of those customers, you’ll probably have winced at the difference between the “raw” storage capacity and the “usable” storage capacity when you got to play with it for the first time.

While you could choose to configure your DATA and RECO diskgroups with HIGH redundancy in ASM, did you notice that you couldn’t do the same with the DBFS_DG / SYSTEM_DG?
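
A quick way to check how your diskgroups are actually configured is to query v$asm_diskgroup from the ASM instance – the TYPE column shows EXTERN, NORMAL or HIGH redundancy:

SELECT name, type, total_mb, free_mb
FROM   v$asm_diskgroup
ORDER  BY name;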

Check out page 5 in this document about best practices for consolidation on Exadata.

“A slight HA disadvantage of an Oracle Exadata Database Machine X3-2 quarter or eighth rack is that there are insufficient Exadata cells for the voting disks to reside in any high redundancy disk group which can be worked around by expanding with 2 more Exadata cells. Voting disks require 5 failure groups or 5 Exadata cells; this is one of the main reasons why an Exadata half rack is the recommended minimum size.”

Basically, you need at least 5 storage cells for each Exadata environment if you want to have true “high availability” with your Exadata machine.

While quarter-rack machines have 3 storage cells, half-rack machines have 7 or 8 storage cells, depending on the model.

Let’s say that you have the model with 8 storage cells: if you split a half-rack machine equally, you’ll have 2x quarter-rack machines with 4 storage cells each, so you would need one more storage cell per machine to provide HA for the DBFS_DG / SYSTEM_DG diskgroup.

For some reason, this nugget escaped my attention until recently. Even more reason to have a standby Exadata machine at your DR site …

Mark

 


Exadata Critical Issue EX19

Overnight, Oracle announced a new Exadata Critical Issue (EX19), which applies to storage cells running version 12.1.1.1.1 or earlier of the Exadata Storage Server (ESS) software.

The bug is 19695225 and more information can be found in MOS note 1991445.1.

Cell disk metadata corruption and loss of cell disk content (i.e. grid disk, ASM disk) will occur if many CREATE GRIDDISK or ALTER GRIDDISK commands that modify cell disk space configuration are run over time for the same cell disk.

If CellCLI grid disk commands are typically run in parallel on all storage servers simultaneously – a common maintenance practice – and the issue occurs on multiple storage servers at the same time, such that all redundant disk extents are lost for files in an ASM disk group, then the disk group will dismount, the database will crash, and the affected files will have to be restored from backup.

Rolling cell maintenance commands that change grid disk state, such as ALTER GRIDDISK INACTIVE and ALTER GRIDDISK ACTIVE, do not contribute to this issue.

If you have recreated or reconfigured grid disks using the CellCLI commands CREATE GRIDDISK or ALTER GRIDDISK more than 31 times since initial system deployment, then the likelihood of occurrence is high.

Risk and Detection
The risk to test and development systems is expected to be higher than production systems due to the dynamic manner in which they may be reconfigured.

To determine if your system is exposed to this issue, and how close the system is to having cell disk metadata corruption, download and run the script attached to the MOS note on all storage servers as the root user.

Possible symptoms that cell disk metadata corruption has occurred as a result of this bug include the following:

  • ASM disk group(s) dismount and database crash following CREATE GRIDDISK or ALTER GRIDDISK.
  • ASM disk group(s) cannot be mounted following the disk group dismount.
  • Error ORA-600 [addNewSegmentsToGDisk_2] is reported in the cell alert.log.

 

The cell disk corruption cannot be repaired once it occurs.  Recovery requires recreating cell disks, grid disks, and ASM disk groups, then restoring affected databases from backup.

Perform one of the following actions to prevent bug 19695225:

  • Upgrade to Exadata Storage Server version 12.1.2.1.1 or later (Exadata 12.1.2.1.0 contains the fix for this issue; however, 12.1.2.1.1 or later is the recommended version).
  • Upgrade to Exadata Storage Server version 12.1.1.1.2 or later 12.1.1.1.x.
  • Apply patch 19695225 to all Exadata Storage Servers. At the time of writing, a patch is available for Exadata versions 12.1.1.1.1, 11.2.3.3.1, and 11.2.3.3.0.
  • Avoid running CellCLI commands CREATE GRIDDISK or ALTER GRIDDISK until the code fix is applied via upgrade or patch apply.

 

I think it’s a good idea to run the check script on your storage cells as root to determine whether there’s any immediate risk (probably unlikely). If necessary, consider applying the patch – but you should be planning your patching to the April 2015 QFSDP now, right? 🙂

 

Mark


Exadata System Statistics

Since August 2012, the DBMS_STATS.GATHER_SYSTEM_STATS procedure has offered an 'EXADATA' option to allow Exadata-specific system statistics to be gathered. The following versions / patchsets of the database include this option:

  • 11.2.0.2.18
  • 11.2.0.3.8
  • 11.2.0.4
  • Any version of 12c

Gathering Exadata-specific system statistics ensures the optimizer is aware of the Exadata performance features and takes them into account when determining the execution plan – often resulting in SmartScans (offloaded full table scans) instead of index access.
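
As an aside, if you want to sanity-check whether a given statement is even a candidate for offloading, the offload-related columns in v$sql are a reasonable place to start – a sketch, substituting the SQL_ID you’re interested in:

SELECT sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes
FROM   v$sql
WHERE  sql_id = '&sql_id';

A value of zero in IO_CELL_OFFLOAD_ELIGIBLE_BYTES suggests the statement isn’t being offloaded at all.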

To determine the last time that system statistics were gathered on the database:

COL statistic FORMAT a55
COL value FORMAT a20

SELECT pname AS statistic,
       pval2 AS value
FROM   aux_stats$
WHERE  pname IN ('STATUS', 'DSTART', 'DSTOP')
ORDER  BY pname;


DSTART         03-28-2011 13:59
DSTOP          03-28-2011 13:59
STATUS         COMPLETED

This indicates that the system statistics have not been gathered since the ability to compile Exadata-specific statistics was made available (August 2012). They have also not been gathered since this particular database was migrated from a V2 machine to an X3-2 machine earlier in the year, so they are unlikely to be accurate.

This was the likely cause of a performance problem we encountered recently, which eventually required a FULL hint to be given to the optimizer to allow the query to complete in an acceptable timeframe.

To determine the values of the system statistics (I used DECODE to format them nicely):

SELECT DECODE(pname,
         'CPUSPEED',   'CPUSPEED: (Workload) CPU speed in millions of cycles/second',
         'CPUSPEEDNW', 'CPUSPEEDNW: (No Workload) CPU speed in millions of cycles/second',
         'IOSEEKTIM',  'IOSEEKTIM: Seek time + latency time + operating system overhead time in milliseconds',
         'IOTFRSPEED', 'IOTFRSPEED: Rate of a single read request in bytes/millisecond',
         'MAXTHR',     'MAXTHR: Maximum throughput that the I/O subsystem can deliver in bytes/second',
         'MBRC',       'MBRC: Average multiblock read count sequentially in blocks',
         'MREADTIM',   'MREADTIM: Average time for a multi-block read request in milliseconds',
         'SLAVETHR',   'SLAVETHR: Average parallel slave I/O throughput in bytes/second',
         'SREADTIM',   'SREADTIM: Average time for a single-block read request in milliseconds'
       ) AS statistic,
       pval1 AS value
FROM   aux_stats$
WHERE  pname IN ('CPUSPEEDNW',
                 'IOSEEKTIM', 'IOTFRSPEED',
                 'SREADTIM', 'MREADTIM',
                 'CPUSPEED', 'MBRC',
                 'MAXTHR', 'SLAVETHR')
AND    sname = 'SYSSTATS_MAIN'
ORDER  BY pname;

CPUSPEED: (Workload) CPU speed in millions of cycles/second
CPUSPEEDNW: (No Workload) CPU speed in millions of cycles/second                        2351.43
IOSEEKTIM: Seek time + latency time + operating system overhead time in milliseconds        10
IOTFRSPEED: Rate of a single read request in bytes/millisecond                             4096
MAXTHR: Maximum throughput that the I/O subsystem can deliver in bytes/second
MBRC: Average multiblock read count sequentially in blocks
MREADTIM: Average time for a multi-block read request in milliseconds
SLAVETHR: Average parallel slave I/O throughput in bytes/second
SREADTIM: Average time for a single-block read request in milliseconds

If the value for the MBRC system statistic is NULL, the optimizer uses the value for db_file_multiblock_read_count which, on this database, is 64.
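
To confirm what the optimizer would fall back on, you can check the parameter directly:

SELECT value
FROM   v$parameter
WHERE  name = 'db_file_multiblock_read_count';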

Exadata benefits from higher values of MBRC, as this increases the likelihood that the optimizer will choose to perform full table scans over index access. Gathering Exadata system statistics will set MBRC to 128 and will likely set a significantly higher value for IOTFRSPEED. These statistics are set based on your machine, not gathered, because:

  • the database won’t take the storage cells into account when calculating multi-block reads
  • direct path reads are not counted as multi-block reads for the MBRC system statistic

We should NOT gather system statistics with a workload on Exadata as the database will attempt to calculate the MBRC rate itself, likely resulting in a significantly lower (and inaccurate) MBRC value.

Gathering Exadata system statistics is pretty simple and should be done if the statistics are older than August 2012 or when you migrate a database to an Exadata machine. We can either back up the stats beforehand or keep a note of their values (which we can use to set them manually if required), then issue the following command:

EXEC DBMS_STATS.GATHER_SYSTEM_STATS('EXADATA');
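
If you’d rather have a safety net than a notepad, here’s a minimal sketch of backing up the current values first using the DBMS_STATS API (the MY_STAT_TAB table name and the PRE_EXADATA label are just illustrative):

-- Create a statistics table and export the current system stats into it
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'MY_STAT_TAB');
EXEC DBMS_STATS.EXPORT_SYSTEM_STATS(stattab => 'MY_STAT_TAB', statid => 'PRE_EXADATA', statown => USER);

-- If the new plans misbehave, put the old values back
EXEC DBMS_STATS.IMPORT_SYSTEM_STATS(stattab => 'MY_STAT_TAB', statid => 'PRE_EXADATA', statown => USER);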


How Do *YOU* Measure SmartScans?

With SmartScans being the most important (and most unique) performance feature of Exadata, it’s incredibly useful to measure how well you’re making use of them.

But how?

There are a number of ways you can measure this, but none of them seems to be the DEFINITIVE method. Instead, it’s probably a good idea to use more than one formula, if not all of them, to get a good picture of our SmartScan usage.

Why are there multiple formulas? Because the existing database metrics don’t quite capture what we’re looking to measure (a query to pull all four counters follows the list). For instance:

  • ‘physical read total bytes’ – all the data read, including compressed data AND SmartScan-ineligible data.
  • ‘cell physical IO interconnect bytes’ – includes the writes (multiplied due to ASM mirroring) AS WELL AS the reads.
  • ‘cell IO uncompressed bytes’ – the data volume for predicate offloading AFTER Storage Index filtering and any decompression.
  • ‘cell physical IO interconnect bytes returned by smart scan’ – includes uncompressed data.
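
Whichever formula you prefer, the raw inputs all come from the same place. A quick sketch to pull the four counters at the system level (v$mystat or v$sesstat would give you the per-session equivalents):

SELECT name, value
FROM   v$sysstat
WHERE  name IN ('physical read total bytes',
                'cell physical IO interconnect bytes',
                'cell IO uncompressed bytes',
                'cell physical IO interconnect bytes returned by smart scan')
ORDER  BY name;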


IOUG Exadata Virtual Conference – February 11th and 12th

The IOUG Exadata SIG are holding a virtual conference next week and they’ve got some great speakers:

Wednesday 11th

10:00 a.m. – 11:00 a.m. CST
Exadata X5: Working Smart with Oracle Exadata Database Machine
Speaker: Gurmeet Goindi, Exadata Product Management, Oracle

11:00 a.m. – 12:00 p.m. CST
Oracle Database In-Memory And Exadata: Do I Still Need Exadata?
Speaker: Matt Steinberg, Oracle

12:00 p.m. – 1:00 p.m. CST
Exadata Performance: Latest Improvements and Less Known Features
Speaker: Tanel Poder

Thursday 12th

10:00 a.m. – 11:00 a.m. CST
Smart Analytics and Capacity Management for DbaaS using R
Speaker: Chaitanya Geddam, Practice Director, Accenture Enkitec Group

11:00 a.m. – 12:00 p.m. CST
Exadata Best Practices
Speaker: Dan Norris, Oracle

12:00 p.m. – 1:00 p.m. CST
Exadata and Hadoop Integration Patterns
Speaker: Aaron Werman, First Data


Exadata and OVM

Exadata and OVM.

OVM and Exadata.

It’s not been the best-kept secret in the world, but it is now a reality with Oracle’s new X5 engineered systems.

I don’t like it either, though I admit that I might just be a purist snob. As far as I can see, this might be useful in two possible scenarios:

1) Saving on additional cost option licensing.
Picture this: you have four databases on your Exadata machine and only one of them needs the {INSERT EXPENSIVE COST OPTION HERE} option.

Instead of buying, for instance, an Advanced Security license for all 144 cores, you might consider dividing your X5-2 half-rack into four virtual machines – one for each database – and licensing Advanced Security only for the virtual machine on which that particular database resides.

Assuming each virtual machine is provisioned identically (with 36 cores each instead of the full 144), the cost of licensing ASO is 25% of what it would be if you licensed the entire machine.

Some of those cost options are expensive, definitely. But why not consider a smaller, dedicated Exadata machine for that database instead? Or an alternative altogether, such as the ODA?

2) Capacity on-demand licensing.
Let’s say that you KNOW you’re going to migrate more databases onto your Exadata machine in the future, but you’re not using its full capabilities to support the databases that are running there right now. Bear with me for argument’s sake…

With OVM, you’re able to license a minimum of 40% of the cores on your Exadata system. If you’re not getting close to fifth gear right now, but you know you will be at some point, you could use OVM to license in a “capacity on-demand” fashion and crank things up as your needs increase.

Of course, given the exponential improvements that come with each new version of Exadata, wouldn’t you try your best to wait until a couple of months before you DID need the extra horsepower so you could buy the latest and greatest Exadata then?

Let’s say you DO eventually get to 100% usage: you still have that extra virtualization layer in the stack, and whatever issues come with it, including having to maintain it. To remove it, one assumes that the machine would need to be rebuilt, which isn’t a particularly attractive option.

“Exadata is expensive”
I understand the “Exadata is expensive” argument, but I don’t really think this helps with that very much – you’re still laying down a big wad of cash when you buy the hardware, no matter how you slice the licensing up. Is it really going to be worth the hassle of that extra virtualization layer to save (and possibly only temporarily) on licenses?

Oddly, I think the new elastic configuration capability in the X5 makes the argument even harder to make: you could achieve the same thing by choosing a different hardware configuration and/or adding compute nodes or storage cells as your needs dictate.

I’m sure there’s a compelling reason out there for putting OVM on Exadata that I haven’t figured out yet; there usually is. Until then, I’m back to scratching my head…


Exadata X5 – Yet ANOTHER Level

A couple of weeks ago, I admitted my confusion and bemusement over Oracle’s cloud AND engineered systems strategy. Sometimes, IT workers can get very touchy over people thinking that they might not know EVERYTHING about EVERYTHING, but not I, apparently.

Not only did I scratch my head on my blog, but I did so very publicly on LinkedIn too. In all honesty, I really appreciated the input from some very smart people, and I do understand the logic a lot more now. Admitting that you don’t have the answer to every question is liberating sometimes and personally beneficial almost every time.

Basically, Oracle are going big on engineered systems. If customers really are serious about migrating to THE CLOUD(TM) and have made a strategic decision to never, ever buy any hardware again – I often find that the most reasoned decisions involve limiting your options on ideological grounds – Oracle will add these systems to their PaaS offering instead of selling them for on-site use. Win-win.

It still doesn’t really tessellate perfectly for me, but at least it makes more sense now. I’m sure you’ve all seen the data sheets by now, so here are a few pennies for my thoughts:

A full-rack can read and write 4m IOPS:  I presume this is four MILLION IOPS, which is a seriously impressive number. To put it into context, the X3-2 quarter-rack was rated for 6,000 IOPS!

The Oracle Database Appliance now comes with FlashCache and InfiniBand:  which should make the ODA worthy of very serious consideration for a lot of small-to-medium-sized enterprises.

Goodbye High Performance drives:  they’ve been replaced with a flash-only option. Not only is it Flash, but it’s “Extreme Flash”, no less.

Do I trust all-Flash storage?  No.
Since moving off V2 and leaving Bug Central, have I encountered any problems whatsoever with the FlashCache?  No.
Can I justify my distrust in Flash storage?  Without delving into personality defects, probably not.

There’s a “gotcha” with the Extreme Flash drives:  the license costs are DOUBLE those of the High Capacity drives. I don’t understand the reasoning behind this, unless Oracle are specifically targeting clients for whom money is no object (and they probably ARE, in a way).

Configuration elasticity is cool:  you can pick and choose how many compute nodes / storage cells you buy. I do remember the days of the V1 and V2, when you couldn’t even add more storage to an existing machine – the rationale being that you’d mess up all the scaling (offloading, etc).

It’s a really great move for Oracle to make this very flexible, and it will go some way to silencing those who claim that Exadata is monolithic (and, don’t forget, expensive).

You can now virtualize your Exadata machine with OVM:  I haven’t had the best of luck ever getting OVM to work properly, so I’ll defer my views on that for the time being, though the purist in me thinks they’re dumbing down the package by offering virtualization at all. Isn’t that what the Exalytics machine is for?

OK, fine, they want to bring Exadata to the masses and it’s an extension of the “consolidation” drive they’ve been on for a couple of years, but it’s a bit like buying a top-end Cadillac and not wanting to use high-grade gasoline because it’s too expensive.

Other cool-sounding new Exadata features that made my ears prick up:

  • faster pure columnar flash caching
  • database snapshots
  • flash cache resource management – via the ever-improving IORM
  • near-instant server death detection – this SOUNDS badass, but could be a bit of a sales gimmick; don’t they already do that?
  • I/O latency capping – if access to one copy of the data is “slow”, it’ll try the other copy/copies instead.
  • offload of JSON and XML analytics – cool, I presume this is offloaded to the cells.

I didn’t have the chance to listen to Oracle’s vision of the “data center of the future” – I think it had something to do with their Virtual Compute Appliance competing against Cisco’s offerings and “twice the price at half the cost“.

Oracle’s problem is still going to be persuading customers to consider VALUE instead of COST. “Exadata is outrageously expensive” is something I’m sure everyone hears all the time, and claiming it’s “cheap” isn’t going to work, because managers with sign-off approval can count.

Is it expensive?  Of course.  Is it worth it?  Yes, if you need it.

This is why I’m unconvinced that customers will buy an Exadata machine and then virtualize it. The customers who are seriously considering Exadata are likely doing so because they NEED that extreme performance. You can make a valid argument for taking advantage of in-house expertise once your DBA team has its foot in the door – best of breed, largest pool of talent and knowledge, etc.

However, so many companies are focusing solely on the short term, and some exclude their SMEs from strategic discussions altogether. Getting to a point where the DBA team can establish Exadata as the gold standard in an IT organization is going to be incredibly difficult without some sort of sea change across the entire industry and … well, the whole economy, really.

I’m not sure what caused it, but I came away with the feeling that these major leaps in performance were very distant from me. Maybe it’s because I don’t personally see much evidence of companies INVESTING in technology – just attempting to do “more with less” (see all THE CLOUD(TM) hype).

I’m really not convinced there is much appetite out there to maximize data as an asset, or to gain a competitive advantage through greatly enhanced business functionality, so much as there is to minimize IT expenditure as much as possible. Cost still seems to be the exclusive driver behind business decisions, which is a real shame, because it’s difficult to imagine a BETTER time to invest in enterprise data than right now.

Said the DBA, of course.
