
Azure host-caching with Oracle database workloads on Azure VMs

There are almost 2,000 Azure customers running Oracle database workloads on Azure VMs, and that number is growing rapidly.  Running Oracle on Azure is not the first idea to occur to many people, but it should be: there are several types of VMs appropriate for Oracle workloads within Azure, as well as multiple choices for storage, ranging from inexpensive to high-performance.  Overall, Oracle performs very well in Azure, and there are several great features in Azure that make running Oracle even easier, such as Azure VM Backup (a topic for another post).

One of the most common configurations is running one or more Oracle databases on an Azure VM with premium SSD managed disks.  When attaching a managed disk to a VM, one has the option of enabling ReadOnly or ReadWrite host-caching for the managed disk device, or of leaving host-caching disabled (None).
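Host-caching is specified when the disk is attached.  As a minimal sketch using the Azure CLI, attaching an existing managed disk with ReadOnly host-caching (the resource group, VM, and disk names here are hypothetical):

az vm disk attach \
    --resource-group myRG \
    --vm-name myOracleVM \
    --name myDataDisk01 \
    --caching ReadOnly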

What is host-caching?  Essentially it is locally-attached SSD on the physical host underlying the virtual machines, and it is documented HERE.  When the VM is allocated to a physical host and running, the locally-attached SSD is used with a simple LRU caching algorithm for I/O between the VM and the network-attached managed disk device.  ReadOnly host-caching caches only read I/O from the VM to the managed disk device, while ReadWrite host-caching caches both read and write I/O operations.

Please note that ReadWrite host-caching is intended either as the default setting for the VM OsDisk, or for data disks using filesystems with barrier functionality enabled, as cited HERE.  It is not a great idea to use ReadWrite host-caching with Oracle database files, because of possible data-integrity issues resulting from lost writes should the VM crash.

But the rest of this article pertains to the use of ReadOnly host-caching for managed disk for Oracle database workloads.

If the Oracle online redo log files are isolated from other database files and reside on their own ReadOnly host-cached disk, then…

Generally, each block in the online redo log files is written once and then read at most once.  ReadOnly host-caching ignores the write altogether and does the work to cache the single read, as it is designed to do.  However, if there is never a second, third, or subsequent read, then the maintenance of the cache chains and cache buffers performed at the first read is entirely wasted.  Worse still, this happens with every single block written to the online redo log files, over and over again;  the cache is constantly manipulated, consuming compute resources, to no benefit.  I don’t believe that this activity slows I/O operations for the online redo log files significantly, but it does increase the consumption of compute resources which could otherwise be used productively, which of course means that the overall system is capable of accomplishing less work.
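To see whether the online redo log files actually reside on their own device, simply list where the members live; a quick check from SQL*Plus (the paths returned will of course vary by installation):

sqlplus -s / as sysdba <<'EOF'
select group#, member from v$logfile order by group#, member;
EOF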

If the Oracle online redo log files are not isolated and reside on the same ReadOnly host-cached disk as other database files…

This is where actual performance decreases can result.  Oracle datafiles, tempfiles, controlfiles, BFILEs, files for external tables, block change tracking files, and Flashback log files do in fact use ReadOnly host-caching appropriately, with numerous reads of the same block after each write, not to mention the buffer caching within the RDBMS itself.  If online redo log file activity is mixed into the same workload as the Oracle datafiles, tempfiles, controlfiles, etc., then ReadOnly host-caching is forced to cache the online redo log files as well, even though there is zero benefit.  As a result, blocks stay in the cache for less time due to the constant rollover of buffers forced by the sequential online redo log workload, making the ReadOnly cache less effective for all files and causing more overhead, as pages needlessly flushed from the cache by the sequentially-written redo must be re-acquired.  This is the scenario where ReadOnly host-caching is noticeably less effective, and average I/O latency does not improve.

Also, if using the D_v3, E_v3, or M series of VMs…

ReadOnly host-caching actually provides slower I/O throughput on these VM instance types than uncached I/O, a problem which is fixed with the Dds_v4 and Eds_v4 series.  For example, the table below, drawn from the documentation pages, compares the IOPS and I/O throughput limits of the Standard_E32s_v3 and Standard_E32ds_v4 instance types…

                                                Standard_E32s_v3    Standard_E32ds_v4
vCPU                                            32                  32
Memory (GB)                                     256                 256
Max cached and temp storage IOPS                64000               308000
Max cached and temp storage throughput (MBps)   512                 1936
Max uncached disk IOPS                          51200               51200
Max uncached disk throughput (MBps)             768                 768

Comparison of cached throughput limits of the same VM instance size in v3 vs v4

As you can see, for v3 the I/O throughput limit for cached storage (512 MBps) is actually lower than the limit for uncached storage (768 MBps), while a more intuitive result is apparent for v4.

The same is true for other size instance types, of course; I just didn’t display them.

In short, if you wish to reap the expected benefits of ReadOnly host-caching, do not use D_v3 or E_v3 instance types or anything in the M-series.

So in summary, I think these are good rules for using Azure VM host-caching with Oracle database workloads…

  1. Don’t use ReadWrite host-caching with Oracle database files.
  2. Isolate the online redo log files from other database files onto their own managed disk devices, and disable host-caching on those devices (see the sketch after this list).
  3. Use ReadOnly host-caching for all other Oracle database files.
  4. Do not use the D_v3, E_v3, or M-series VMs, because host-caching and ephemeral SSD are slow on them compared to the Dds_v4 and Eds_v4 series.
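A minimal sketch of rules #2 and #3 using the Azure CLI, again with hypothetical resource group, VM, and disk names:

az vm disk attach --resource-group myRG --vm-name myOracleVM \
    --name redoDisk01 --caching None
az vm disk attach --resource-group myRG --vm-name myOracleVM \
    --name dataDisk01 --caching ReadOnly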

Oracle dNFS on NFS v4.1

If you search the Oracle Support website or the internet for information on Oracle’s direct NFS (dNFS), you’ll find a decent amount of useful information, including code examples. However, if you look a little more closely, all of those examples show NFS v3.0 in use; none use higher versions of NFS.

The Oracle 19c documentation states that the Direct NFS client supports the NFSv3, NFSv4, NFSv4.1, and pNFS protocols to access the NFS server, but you have to search more deeply to learn that Oracle v11.2 supports only NFS v3.0, while NFS v4.0, v4.1, and parallel NFS (pNFS) are supported from Oracle v12.1 onward.

The same 19c documentation page also states that the Direct NFS client can obtain NFS mount points either from the operating-system mount entries or from the oranfstab file, but it omits the important detail that obtaining mount-point information from the operating system works only with NFS v3.0, while the oranfstab configuration file is required for higher versions of NFS. This is useful information that ought to be documented, but is not.

Using dNFS with any version of NFS requires running the command make -f ins_rdbms.mk dnfs_on in the $ORACLE_HOME/rdbms/lib subdirectory, as well as ensuring that the oradism executable in the $ORACLE_HOME/bin subdirectory is owned by root and has the SETUID file permission set.
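In practice, that looks something like the following (the expected permission listing in the comment is illustrative):

cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk dnfs_on       # relink the Oracle binary with dNFS enabled
ls -l $ORACLE_HOME/bin/oradism     # verify root ownership and SETUID, e.g. -rwsr-x---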

But creating an oranfstab configuration file is optional only with NFS v3.0, where the dNFS client can obtain information about NFS mount points from the standard UNIX/Linux /etc/mtab file. For NFS v4.x, it is required to create the oranfstab file, either in the global /etc directory or in the installation-specific $ORACLE_HOME/dbs subdirectory.

Documentation for the contents of the oranfstab file is found HERE, but there is another important detail that is not included in the documentation: the ordering of the attributes, listed as name/value pairs, matters.

For example, if the contents of the oranfstab file look like this, with the attributes ordered as follows, then all works well…

server: anfv41
local:  10.0.0.4
path:   10.1.0.4
export: /ora-anfv41-vol01 mount: /u02
nfs_version: NFSv4.1
security_default: sys

However, if the path and local attributes are specified in reverse order, then dNFS does not work properly…

server: anfv41
path:   10.1.0.4
local:  10.0.0.4
export: /ora-anfv41-vol01 mount: /u02
nfs_version: NFSv4.1
security_default: sys

The server attribute must always be specified first as the “header” of the following attributes describing a specific mount-point. Of course, the oranfstab file can contain information about multiple mount points, each of which is initiated by the server attribute.

The local attribute must come next, followed by the path attribute.  The export and mount attributes must be on the same line, following path.  I have tried different orderings of the nfs_version and security_default attributes with no adverse effect, so their relative order doesn’t seem to matter, but it seems logical to specify them last.

As you can imagine, this behavior can easily lead to insidious problems. When dNFS is misconfigured, the result is not that dNFS is disabled, but rather that dNFS is enabled yet operating far from optimally. In general, performance is abysmal; in cases I’ve observed, a database startup which normally completes in seconds took upwards of 30 minutes, effectively rendering the database unusable.
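One way to confirm that dNFS is connected to the expected NFS servers and mount points is to query the v$dnfs_* views from a running instance; a minimal check (note that the nfsversion column appeared only in later releases, so the available columns may vary):

sqlplus -s / as sysdba <<'EOF'
select svrname, dirname, nfsversion from v$dnfs_servers;
EOF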

In summary, upgrading from NFS v3.0 to NFS v4.1 is a good thing to do today, especially for the improvements in security (e.g. encryption and authentication), performance (e.g. parallel NFS), stability (e.g. integrated locking with leases), and simplicity (e.g. more firewall-friendly, using fewer ports).

Best of all, even though Oracle is light on documentation for the higher versions of NFS, you don’t have to abandon Oracle’s direct NFS (dNFS) to take advantage of the improvements offered by NFS v4.1.

But it would be nice if Oracle documented all of this more coherently.

Presentation “Real-Time SQL Tuning” at Hotsos Sym2014 and UTOUG TD2014

This presentation was delivered at the Hotsos Symposium 2014 and at the Utah Oracle Users Group (UTOUG) Training Days 2014.

It describes a PL/SQL script that I put together called “ash_xplan.sql”, which can be used to display real-time information about a SQL statement currently executing.  The script combines information from the Active Session History (“ASH”, exposed through the V$ACTIVE_SESSION_HISTORY view) with information from the DBMS_XPLAN package.
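As a minimal sketch of the underlying idea (not the actual ash_xplan.sql source): each ASH sample represents roughly one second of active time, so counting samples per plan line shows where a still-running statement is spending its time.  Replace the placeholder SQL_ID with that of the running statement:

sqlplus -s / as sysdba <<'EOF'
select sql_plan_line_id, count(*) as approx_active_seconds
from   v$active_session_history
where  sql_id = 'abc123xyz0'
group  by sql_plan_line_id
order  by sql_plan_line_id;
EOF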

As of Oracle 12c (v12.1), nothing in Oracle displays elapsed time while the SQL statement is still executing, not even SQL Monitor.

The source code for the “ash_xplan.sql” script is on the Scripts page of this website, along with sample spooled output.

Presentation “Three Types Of Table Compression”

There are several different types of compression available in the Oracle database for compressing table data, and so the question must arise: which should be used, and when, for which purpose? This presentation will discuss all the different types of table data compression in the Oracle database (i.e. basic/advanced de-duplication compression and hybrid columnar compression), as well as some older, often-forgotten techniques for fitting more data into less space. The presentation will begin with a brief review of data compression in general, followed by a review of the formats used for storage in database blocks and rows, the latter supported by block dumps. All of this will be a prelude to a knowledgeable discussion of the advantages and disadvantages of each type of table compression in Oracle.
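As a taste of the syntax involved, here is an illustrative sketch of the compression families discussed, using 12c-style clauses; the table names are hypothetical, and hybrid columnar compression requires supporting storage such as Exadata:

sqlplus -s / as sysdba <<'EOF'
create table t_basic row store compress basic    as select * from src_t;
create table t_adv   row store compress advanced as select * from src_t;
create table t_hcc   column store compress for query high
  as select * from src_t;
EOF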

This is the presentation slidedeck to accompany the white paper of the same title.

Paper “Three Types Of Table Compression”

There are several different types of compression available in the Oracle database for compressing table data, and so the question must arise: which should be used, and when, for which purpose? This paper will discuss all the different types of table data compression in the Oracle database (i.e. basic/advanced de-duplication compression and hybrid columnar compression), as well as some older, often-forgotten techniques for fitting more data into less space. The paper will begin with a brief review of data compression in general, followed by a review of the formats used for storage in database blocks and rows, the latter supported by block dumps. All of this will be a prelude to a knowledgeable discussion of the advantages and disadvantages of each type of table compression in Oracle.

This is the white paper to accompany the presentation slidedeck of the same title.

Presentation “RDBMS Forensics – Troubleshooting Using ASH”

Arthur C. Clarke wrote that any sufficiently advanced technology is indistinguishable from magic, and sufficiently advanced technology can make troubleshooting seem like magic too, or perhaps art. But it is neither magic nor art; it is process, and this presentation uses a case study (resolving the ORA-03136 “inbound connection timed out” error) to demonstrate that process and to show how components such as ASH, typically thought of as performance-tuning tools for DBAs, are also useful diagnostic tools for everyone, particularly developers.  Attendees will learn a forensic, empirical approach to troubleshooting and see a demonstration of a complex troubleshooting task resulting in a non-intuitive but empirically sound resolution.  This presentation should make you think of Active Session History as something like queryable trace information captured and stored within the database.

This is the presentation slidedeck to accompany the white paper of the same title.