First Thoughts: BlueMedora’s Oracle Enterprise Manager Plugin for VMware

This post reviews the Oracle Enterprise Manager Plugin for VMware from BlueMedora.  This commercial (for-fee) plugin integrates into OEM 12c to provide visibility into your VMware environment for Cloud Control users.  I have only had the plugin installed for two days so this will serve as more of a “first thoughts” report than as a full review of the product and all of its capabilities.

Prerequisites

The plugin, license and support are available directly from BlueMedora.  They are also currently offering a fully-functional 30-day free evaluation copy.  You will need to meet the following prerequisites in your environment:

  1. An active, functional installation of Enterprise Manager 12c (I used 12.1.0.2.0 + PSU2, but any EM12c release should work)
  2. A functional emcli installation.  I run emcli out of the $OMS_HOME on my OMS server.
  3. An OEM user account with appropriate permissions to import and deploy plugins (I simply used SYSMAN)
  4. An Oracle Management Agent installed on some server other than your OMS server to host the plugin (per BlueMedora’s recommendation, which I followed)
  5. A login to your VMware environment with read-only access to vCenter, your cluster, datastores, ESX hosts, and virtual machines.  If you use an account with permission to start/stop VMs that functionality is available for use in EM12c via the plugin, but I am using a read-only account and have not tested that part of the product
  6. At install time you will also need the hostname of your vCenter server and the SDK port if it has been changed from the default 443.

Installation and Configuration

After downloading the plugin from BlueMedora, installation is as simple as any other plugin you may already be using.  Their support team will walk you through the install and configuration (over a webinar in my case), but the steps are just what you would expect: copy the plugin .opar file to a location accessible by your OMS and import it using the emcli import_plugin verb, then after the import completes, deploy the plugin to your OMS and finally deploy the plugin to a management agent.

Configuring the plugin (adding targets) is also simple.  For the first step you will run an  “Add Target Manually” step with the “VMware vSphere” target type to let the plugin know about your vSphere environment.  Here is where you will provide your VMware login credentials, hostname and port (if non-default) in order to begin monitoring.  After adding the vSphere target, access it through the All Targets view, and go to the “All Metrics” link to validate that your VMware login credentials worked and metrics are being collected.  As an aside, a particularly interesting feature here that I have not seen in other plugins is that there is a metric group called “Collection Trigger” that, when clicked, triggers a collection.  This is handy and I would like to see this implemented elsewhere; I find it much easier than going over to an agent and running emctl to force a metric collection.

Once that is done you will see a new auto-discovery module called “vSphere Discovery Module”, and after configuring that discovery module with the name of your vSphere target you will run auto-discovery on the management agent to which you have deployed the plugin.  Auto-discovery identified our VMware cluster, ESX hosts (hypervisors), datastores and all of our virtual machines.  From the auto-discovery results you then promote any targets you wish to monitor through Enterprise Manager.  You may not want to promote all of your VMs (for management reasons, or to comply with your license terms) but you should promote all clusters, hypervisors and datastores, for the best overall view of your environment.

As with the other plugins I’ve used, after promoting targets you’ll need to wait a little while before metrics are collected and screens populate with data.  In my case I had useful data appearing on screen within about 10 minutes after promoting the first VM.

What’s In It For Me?

For a good high-level overview of what the plugin can do, take a look at the white paper from BlueMedora, the product overview PDF and the product datasheet.  I’ll only cover some of the additional items that these sources do not go into in detail.

Visuals

The plugin provides several nice visualizations.  On top of the “CPU and memory usage over time” graphs you would expect to find, I particularly like the bar graphs that group your hypervisor CPU and memory load into quantiles, making it easy to see, for example, that one host has 75-100% CPU usage while the other hosts are all in the 0-25% bucket, indicating that you may want to allocate your VM load a bit more evenly.  The datastore visualization showing the fill percentage on each datastore is also nice. Samples of these can be found in the product overview PDF.

Another useful visual is the integrated view of an Oracle database along with the VM on which it runs, the datastore(s) assigned to that VM, and the hypervisor on which the VM is running.  You can quickly see if, for example, the hypervisor is under memory pressure even if the VM does not appear to be, along with the executions and IOPS per second and average active sessions metric for the database.

Centralized Data

I find it hard to click through the vSphere client to find information I’m looking for.  I’m always in the wrong inventory view, or seeing only a subset of the data since I have a host selected rather than the entire cluster, and so on.  This plugin provides an easy to use centralized view of information across the VMware environment.  By going to the “Virtual Machines” view from the main vSphere target I can see a data grid showing each VM’s power status, provisioned disk, consumed CPU and memory, guest memory usage percentage, Even better, the grid includes a column indicating whether or not VMware tools are installed and running. There’s also an uptime column but I’m not sure how to parse it.  I think it represents the VM’s uptime on the specific hypervisor currently running it, but I’ll be asking support to clarify that for me.

New Job Types

These will only be useful to you if you want to automate your VMware environment from within OEM, and if you grant permissions beyond read-only to your monitoring user.  I do not expect to make use of this feature. But if you choose to do so, the plugin adds several new job types you can use for OEM jobs:

  • vSphere Hypervisor Enter Maintenance
  • vSphere Hypervisor Exit Maintenance
  • vSphere VM Power Off
  • vSphere VM Power On
  • vSphere VM Reset
  • vSphere VM Restart Guest
  • vSphere VM Shutdown Guest
  • vSphere VM Suspend

Metrics/Alerts

What use is an OEM plugin without metrics and alerting?  Very little.  This plugin provides a ton of metrics for your VMware environment.  The list is too long to include here, but see this on pastebin for a quick view produced from the MGMT_METRICS table.  You can also set warning and critical thresholds for many (although not all) metrics, and those alerts will go through the normal EM12c event framework to create incidents and/or notifications if configured to do so through incident rules. You can also view the same metric-over-time graphs as you can with the out-of-the-box EM metrics.

Other Items

The overview states that the plugin includes some integration with BI Publisher for reports.  I do not have BI Publisher installed with my EM12c environment so I can’t speak to this feature.

Disclaimer

I am not employed by BlueMedora, VMware or Oracle and neither I nor my employer received any consideration or compensation from those vendors.

Why your EM12cR2 FMW stack probably needs patch 13490778 to avoid OHS down/up events

MOS note 1496775.1 describes a situation with EM12cR2 where OEM will falsely report the Oracle HTTP Server instance (ohs1) as down, even though it is up.  This is due to some changes in FMW 11.1.1.6.  If you don’t have any incident rules or notifications set up that would catch this event, it’s easy to miss it and not know that it is happening.  I had run into this note a couple times before but ignored it, since I had never seen any open events complaining about OHS being down so I figured I just wasn’t hitting the bug.

This morning I caught one of the events.  I found myself wondering how often this had been happening — was it an issue once every couple days, every few hours, or what?

SQL> col msg format a45
SQL> select msg, count(*) from sysman.mgmt$events
  2  where closed_date >= sysdate - 1 and msg like '%HTTP Server instance%'
  3  group by msg;

MSG                                             COUNT(*)
--------------------------------------------- ----------
CLEARED - The HTTP Server instance is up             430
The HTTP Server instance is down                     430

Turns out it had been happening a LOT.  If you’ve followed Oracle’s recommendations and set up target lifecycle status priorities (see my post on doing so) you’ve probably set your OEM targets up with “MissionCritical” priority.  That means your OMS has been burning a lot of CPU to process all these up/down events on a mission critical target with high priority, potentially delaying processing of other events elsewhere in your events.

Applying patch 13490778, with ORACLE_HOME set to $MW_HOME/oracle_common should resolve this issue.  For best results, stop all OEM components prior to patch application and restart them when complete.

To convince yourself that applying the patch helped, re-run that query about 15 minutes after applying the patch and you should see the count decrease.

SLOB.R v0.6: An R script for analyzing SLOB results from repeated runs

This post covers usage of my SLOB.R script, used to analyze SLOB results from repeated test runs.  The contents of the SLOB.R script are at the bottom of this post, but first I will show the contents of a sample SLOB.R session, followed by how you can produce similar results.

A Sample SLOB.R Session

The first step is to start R.  On Windows you’ll double-click an icon, on Linux you’ll just type R and hit enter.  Once R has started, you will be in the interactive R interface and it will display some introductory text standard in R.

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

Once you are in R, you need to load the SLOB.R script and then load your SLOB data.  SLOB data is produced by running SLOB repeatedly, saving the awr.txt output in between each run, then running the SLOB/misc/awr_info.sh script to summarize the collected AWR reports. I have saved the SLOB.R script in my Cygwin home directory, and saved the awr_info.sh output in a file called “awrinfo” in the same directory.

> source('C:/cygwin/home/brian/SLOB.R') 
> SLOB <- loadSLOB('C:/cygwin/home/brian/awrinfo')

Now you have the contents of your awr_info.sh output in an R variable called SLOB.  You can call this variable anything you wish, but I used SLOB.

To use SLOB.R you need to tell it a little bit about your AWR data: specifically which variables you are trying to test and how many sessions you used. In this example I am comparing performance of the XFS and EXT3 filesystems using the SLOB physical read, redo, and physical write models.  The variables to be tested (EXT3 and XFS) are embedded in the filenames I used when saving the awr.txt report between SLOB runs, and the session counts are the numbers I used when running SLOB’s runit.sh script to put load on the database.  We tell SLOB.R about this by setting a few R variables containing this data.  You can call them anything you wish, you just need to know their names.

> SLOB.sessions <- c(8, 16, 32) 
> SLOB.read.vars <- c('XFS', 'EXT3') 
> SLOB.redo.vars <- c('XFS', 'EXT3') 
> SLOB.readwrite.vars <- c('XFS', 'EXT3')

As you can see, variable assignment in R uses the <- operator.  I’m using the R built-in c() function (concatenate) to create vector variables that contain multiple values (you can think of them as like an array for now).  The SLOB.sessions variable contains three integer values: 8, 16, and 32; the other three variables each contain two string values: ‘XFS’ and ‘EXT3′.  For this demo I am only including two variables but it works fine with many more than that.  I have been using about 7 of them.  I am comparing filesystems, but you might be comparing storage vendor 1 vs storage vendor 2, or fibre channel vs dNFS, or Oracle 11.2.0.1 vs Oracle 11.2.0.3.  As long as the variables are identifiable from your AWR filenames, they will work.

You can view the contents of any R variable just by typing its name.

> SLOB.sessions
[1]  8 16 32
> SLOB.read.vars
[1] "XFS"  "EXT3"

With these variables set up you can now use the main SLOB.R driver functions to summarize the contents of your SLOB AWR reports.  First I’ll call the SLOBreads() function to summarize physical read performance.  This function summarizes the PREADS column from awr_info.sh output, by variable and session count.  To produce a better average it discards the lowest value and highest value from each combination of variable and session count.  Other SLOB.R driver functions are SLOBredo() and SLOBreadwrite().

> SLOBreads(SLOB, SLOB.reads.vars, SLOB.sessions)
            8       16       32  Overall
XFS  27223.89 46248.05 61667.21 44886.22
EXT3 30076.77 49302.59 59113.00 46094.39

So this indicates that for 8 reader sessions, the XFS filesystem gave me an average of 27,223.89 physical reads per second, while EXT3 gave me 30,076.77 physical reads per second.  The columns for 16 and 32 sessions have the same meaning as the 8 session column.  The ‘Overall’ column is an average of ALL the data points, regardless of session count.

You don’t have to use variables when calling the SLOB.R driver functions.  You can specify the variables or session counts directly in the call.  This is part of how R works.  Another example, showing how you can receive the same output by calling it with a variable or not:

> SLOBredo(SLOB, SLOB.redo.vars, SLOB.sessions)
             8        16        32   Overall 
XFS  326480426 336665385 321823426 326425272 
EXT3 304188997 325307026 326618609 317991362 
> SLOBredo(SLOB, c('XFS', 'EXT3'), c(8, 16, 32)) 
             8        16        32   Overall 
XFS  326480426 336665385 321823426 326425272 
EXT3 304188997 325307026 326618609 317991362

The numbers above would indicate that when storing my redo logs on XFS, SLOB could push 326,480,426 bytes of redo per second with 8 sessions.  On EXT3 with 8 sessions I saw 304,188,977 bytes of redo per second.  The 16, 32 and Overall columns have meanings similar to what I showed before.

The SLOBreadwrite() function reports the sum of physical reads and physical writes, with the columns all meaning the same as they do for the other functions.

> SLOBreadwrite(SLOB, SLOB.readwrite.vars, SLOB.sessions)
            8       16       32  Overall 
XFS  18520.44 20535.41 20728.07 19823.37 
EXT3 19568.04 21730.94 22641.14 21203.78

How To Create Output For SLOB.R

SLOB.R is smart enough to figure out which of your runs are testing reads, which are testing redo, and which are testing readwrite performance.  But for this to work you have to follow the naming convention defined in the SLOB/misc/README file when renaming your awr.txt files for processing by awr_info.sh:  [whatever].[number of SLOB writers].[number of SLOB readers] — SLOB.R expects your variables to be uniquely identifiable strings in the ‘whatever‘ part.

I recommend using scripts to run SLOB repeatedly and save the awr.txt output in between.  I provided some scripts in a prior post, but you can use your own scripts as long as your filenames match the required format.

Once you have your AWR output collected, run SLOB/misc/awr_info.sh on all the files, and save its output.  This is the file you will load into R.

SLOB.R Script (v0.6)

Save this as SLOB.R.  You may find it easier to use the pastebin version.

# SLOB.R version 0.6
#   BJP 2013 - Twitter @BrianPardy - http://pardydba.wordpress.com/
# See http://wp.me/p2Jp2m-4i for more information
#
#
# Assumes three possible SLOB test models: READS, REDO, WRITES
#   READS are readers-only
#   REDO and WRITES are writers only, differing in size of buffer_cache (small = REDO, large = WRITES)
#
#   Reports PREADS in SLOB.R READS model
#   Reports REDO in SLOB.R REDO model
#   Reports PWRITES in SLOB.R WRITES model
#   Use SLOB.R meta-model READWRITE to report PREADS+PWRITES in WRITES model
#
# Setup:
#   Run SLOB as usual, with at least three runs for each variable tested for each model
#   Save awr.txt between runs in filename matching [something]-variable.writers.readers
#     Example:  AWR-FCSAN-run1.8.0, AWR-FCSAN-run2.8.0 (...) AWR-FCSAN-run10.8.0
#               AWR-ISCSINAS-run1.8.0, AWR-ISCSINAS-run2.8.0 (...) AWR-ISCSINAS-run10.8.0
#       (In this case, the variables would be "FCSCAN" and "ISCSINAS", comparing fibrechannel SAN to NAS)
#   Run awr_info.sh from SLOB distribution against all AWR files at the end and save the output
#   Load awr_info.sh output into R with:  SLOB <- loadSLOB("filename")
#
# Hints:
#   Best results achieved with more SLOB runs - myavg() drops high and low values per set, averages remaining
# 
# Detailed example usage:
#   Testing SLOB read, redo and readwrite models 10 times each with 8, 16, and 32 sessions on EXT3 vs EXT4
#   Used a tablespace on EXT3 for EXT3 testing and a tablespace on EXT4 for EXT4 testing
#   Used redo logs on EXT3 for EXT3 REDO testing and redo logs on EXT4 for EXT4 REDO testing
#   Ran SLOB/misc/awr_info.sh on all awr.txt reports generated from these 60 SLOB runs
#   Saved awr_info.sh output as filename "awrinfo"
#
#  (Start R)
#  > source("SLOB.R")
#  > SLOB <- loadSLOB("awrinfo")
#  > SLOB.sesscounts < c(8, 16, 32)            # Specify the number of sessions used in tests
#  > SLOB.read.vars <- c('EXT3', 'EXT4')       # Specify the variables for READ testing: EXT3 vs EXT4
#  > SLOB.redo.vars <- SLOB.read.vars          # Same variables for REDO testing as for READ testing
#  > SLOB.readwrite.vars <- SLOB.read.vars     # Same variables for READWRITEtesting as for READ testing
#  > SLOB.reads <- SLOBreads(SLOB, SLOB.reads.vars, SLOB.sesscounts)
#  > SLOB.redo <- SLOBredo(SLOB, SLOB.redo.vars, SLOB.sesscounts)
#  > SLOB.readwrite <- SLOBreadwrite(SLOB, SLOB.readwrite.vars, SLOB.sesscounts)
### Previous three lines populate SLOB.reads, SLOB.redo and SLOB.readwrite variables with AWR results
### You can then interrogate those variables by typing their names
# > SLOB.reads
#             8       16       32  Overall
# XFS  27223.89 46248.05 61667.21 44886.22
# EXT3 30076.77 49302.59 59113.00 46094.39
#
#
# Usage variants for more detailed investigation.  Consider this advanced usage.
#   Most people should just use SLOBreads(), SLOBredo() and SLOBreadwrite()
#
#
#   Get average REDO bytes for variable 'foo' across all sessions: 
#     avg(SLOB, 'REDO', 'foo')
#     redoavg(SLOB, 'foo')
#
#   Get average REDO bytes for variable 'foo' with 8 sessions:
#     avg(SLOB, 'REDO', 'foo', 8)
#     redoavg(SLOB, 'foo', 8)
#
#   Get average PREADS (physical read) for variable 'foo' across all sessions:
#     avg(SLOB, 'READS', 'foo')
#     readavg(SLOB, 'foo')
#
#   Get average PWRITES (physical writes) for variable 'foo' with 16 sessions:
#     avg(SLOB, 'WRITES', 'foo', 16)
#     writeavg(SLOB, 'foo', 16)
#
#   Get sum of PREADS and PWRITES for variable 'foo' with 32 sessions:
#     avg(SLOB, 'READWRITE', 'foo', 32)
#     readwriteavg(SLOB, 'foo', 32)
#
#   Get average REDO bytes for multiple variables ('foo' and 'bar') across all sessions:
#     sapply(c('foo', 'bar'), redoavg, dat=SLOB)
#       or for 16 sessions:
#     sapply(c('foo', 'bar'), redoavg, dat=SLOB, sessioncount=16)
#     alternate:   sapply(c('foo', 'bar'), avg, dat=SLOB, sessioncount=16, model='READS')
#     (Note: This returns separate results for each variable, it does not combine and average them)
#
#   Get sum of PREADS and PWRITES for multiple variables ('XFS' and 'EXT3') across 16 sessions:
#     sapply(c('XFS', 'EXT3'), avg, dat=SLOB, sessioncount=16, model='READWRITE')
#
#   View the top waits in XFS READ model with 16 sessions:
#     waits(SLOB, str='XFS', model='READS', sessioncount=16)
#
#   View all data for a particular model with a specific session count:
#     SLOB[intersect(REDO(SLOB), sessions(16, SLOB)),]
#
#

getruns <- function(dat, model, str, sessioncount) {
  tmp <- intersect(grep(str, dat$FILE), eval(parse(text=paste(model, '(dat)', sep=''))))           
  if(missing(sessioncount)) { return(tmp)} 
  else { intersect(tmp, sessions(sessioncount, dat))}
}

loadSLOB <- function(filename) {
  read.table(file=filename, sep="|", header=TRUE)
}

# heuristically identify REDO model runs - if this sucks for you, comment it out
# and uncomment the alternate REDO function below.  It expects your filenames to include
# the string '-REDO-' when testing REDO performance.
REDO <- function(dat) {
  setdiff(which(dat$REDO/(dat$PWRITES*8192) > 2), READS(dat))
}

#REDO <- function(dat) {
#  grep('-REDO-', dat$FILE)
#}

WRITES <- function(dat) {
  setdiff(which(dat$WRITERS > 0), REDO(dat))
}

READS <- function(dat) {
  which(dat$READERS > 0)
}

READWRITE <- function(dat) {
  WRITES(dat)
}

sessions <- function(n, dat) {
  union(which(dat$WRITERS == n), which(dat$READERS == n))
}

myavg <- function(z) {
  mean(z[!z %in% c(min(z), max(z))])
}

getavg <- function(mode, dat) {
   myavg(eval(parse(text=paste('dat$', mode, sep=''))))
}

redoavg <- function(dat, ...) {
  getavg('REDO', dat[getruns(dat, 'REDO', ...),])
}

readavg <- function(dat, ...) {
  getavg('PREADS', dat[getruns(dat, 'READS', ...),])
}

writeavg <- function(dat, ...) {
  getavg('PWRITES', dat[getruns(dat, 'WRITES', ...),])
}

readwriteavg <- function(dat, ...) {
  getavg('PWRITES', dat[getruns(dat, 'WRITES', ...),]) + getavg('PREADS', dat[getruns(dat, 'WRITES', ...),])
}

avg <- function(dat, model, ...) {
  if(model=='REDO') {
    getavg('REDO', dat[getruns(dat, 'REDO', ...),])
  } else if(model=='READS') {
    getavg('PREADS', dat[getruns(dat, 'READS', ...),])
  } else if (model=='WRITES') {
    getavg('PWRITES', dat[getruns(dat, 'WRITES', ...),])
  } else if (model=='READWRITE') {
    getavg('PWRITES', dat[getruns(dat, 'WRITES', ...),]) + getavg('PREADS', dat[getruns(dat, 'WRITES', ...),])
  }
}

waits <- function(dat, ...) {
  as.character(dat[getruns(dat, ...), 'TOP.WAIT'])
}

testdata <- function(dat, model, str, ...) {
  if(model=='REDO') {
    sapply(str, avg, dat=dat, model='REDO', ...)
  } else if(model=='READS') {
    sapply(str, avg, dat=dat, model='READS', ...)
  } else if(model=='WRITES') {
    sapply(str, avg, dat=dat, model='WRITES', ...)
  } else if (model=='READWRITE') {
    sapply(str, avg, dat=dat, model='READWRITE', ...)
  }
}

readdata <- function(dat, str, ...) {
  sapply(str, avg, dat=dat, model='READS', ...)
}

redodata <- function(dat, str, ...) {
  sapply(str, avg, dat=dat, model='REDO', ...)
}

readwritedata <- function(dat, str, ...) {
  sapply(str, avg, dat=dat, model='READWRITE', ...)
}

SLOBreads <- function(dat, strs, sessioncounts) {
  z <- data.frame()
  for(i in sessioncounts) { 
    z <- rbind(z, readdata(dat, strs, i))
  }
  z <- rbind(z, readdata(dat, strs))
  z <- t(z)
  colnames(z) <- c(sessioncounts, 'Overall')
  rownames(z) <- strs
  return(z)
}

SLOBredo <- function(dat, strs, sessioncounts) {
  z <- data.frame()
  for(i in sessioncounts) { 
    z <- rbind(z, redodata(dat, strs, i))
  }
  z <- rbind(z, redodata(dat, strs))
  z <- t(z)
  colnames(z) <- c(sessioncounts, 'Overall')
  rownames(z) <- strs
  return(z)
}

SLOBreadwrite <- function(dat, strs, sessioncounts) {
  z <- data.frame()
  for(i in sessioncounts) { 
    z <- rbind(z, readwritedata(dat, strs, i))
  }
  z <- rbind(z, readwritedata(dat, strs))
  z <- t(z)
  colnames(z) <- c(sessioncounts, 'Overall')
  rownames(z) <- strs
  return(z)
}

That’s it for now.  I welcome comments and questions.

Collection of links, tips and tools for running SLOB

This post contains a collection of various tips and tools I have found useful for running SLOB. Where possible I’ve provided attribution for where I first learned of each tip, but some items are simply common sense or have already been posted by multiple people, so I can’t be sure that I have linked every possible source. If you have a write up out there that I’ve missed that you would like linked, please let me know in the comments. I have only focused on running SLOB to test physical read performance, redo generation performance, and mixed workload read/write performance so these tips will only cover those use cases, not other potential uses like testing logical I/O performance. I would be happy to include LIO testing tips if anyone shares some in a comment.

Getting Started – Things You Need

  1. First you’ll need to have an installation of the Oracle database licensed to run AWR reports.
  2. Next go to the OakTable site and download the SLOB distribution, SLOB includes a simple database creation script in the misc/create_database_kit/ directory along with a README describing how to use it, or you can use your existing database. I extract the kit into my oracle user’s home directory, but you can put it anywhere.
  3. You should also get the simple SLOB init.ora file for read IOPS testing and start your database with those parameters to get a quick start on testing physical read performance. If you’re using Oracle 11.2.0.3 you really should use the underscore parameters in this init.ora for accurate results.
  4. Read the SLOB readme if you haven’t already.

General SLOB Tips For Best Results

  1. Disable automatic generation of AWR snapshots. SLOB relies on the differences between an AWR snapshot taken at the start of the test run and another snapshot taken at the end of the test run, so if an automated AWR snapshot occurs it will throw off your results.
  2. Disable backups, autotasks, resource manager, and so on. You want the database to be as idle as possible, other than SLOB. See Yury’s SLOB index page which includes these suggestions.
  3. Save the awr.txt file that SLOB generates after each run so you can compare performance with previous runs. Use the awr_info.sh script included with SLOB to summarize your collected AWR reports for easy reading.
  4. Review your results. Check for unexpected differences from run to run, don’t simply run it once and expect meaningful results; you want to see consistent performance to have confidence your tests accurately reflect your setup.
  5. A throwaway SLOB run after creating the users with setup.sh (or bouncing the DB to change init parameters) will help with repeatability.
  6. Start small, with 4 or 8 sessions, and then try again with a few more sessions to find the sweet spot with the best performance for your hardware. Once you hit a session count where performance starts to degrade, don’t bother running more sessions than that. On high-powered hardware you might be able to run 128 sessions, but a smaller server might work better with 32.
  7. If you aren’t trying to test variances in your storage, keep your storage consistent from test to test. To put this another way, if (like me) you are using SLOB to test the performance of various filesystems, don’t create all your filesystems on separate LUNs. You probably don’t know how each LUN might be striped across the storage backend. Use the same LUN for every test, running mkfs as necessary to change filesystems or fdisk to reformat for raw usage. (Kevin Closson suggested this in email, I can’t find a public posting to link to.)

Physical Read Testing Tips

  1. Disable Oracle IO optimization that turns your desired db file sequential reads into db file parallel reads. Kevin Closson has also recommended use of these parameters in the simple init.ora file for read testing. I consider this a requirement, not a recommendation. I already mentioned this above but it’s worth repeating.

Redo Generation Testing Tips

  1. When testing M writers with N redo logs configured, preface your test with (N*2)+1 log switches followed by a throwaway SLOB run using M readers. See this tweet from Kevin Closson. This will produce more consistent results from run to run.

Read/Write Testing Tips

  1. If your buffer cache is small enough (and it should be), running SLOB with zero readers and some writers will produce enough random reads to represent a mixed workload. See this comment from Kevin Closson. Including dedicated read sessions is not necessary.

SLOB Tools

  1. The SLOB distribution contains a script called awr_info.sh in the misc/ directory that will summarize the most important parts of your AWR report for easy reading.
  2. My SLOB.R script to interrogate SLOB awr_info.sh output in R.
  3. Yury Velikanov’s SLOB On Steroids, which I haven’t yet tried but is worth a look.

Advanced Tips

  1. SLOB is simple, yet powerful, and people are always finding new uses for it. So read what others write about SLOB, like Yury’s SLOB index linked above, my other SLOB posts, and posts about SLOB from Karl Arao, flashdba, Martin Bach and of course Kevin Closson himself.
  2. Make mistakes and learn from them.

Further Reading

  1. Added 20130220: Check out the new SLOB mind map!  Thanks to Martin Berger for the great idea.
  2. Added 20130226: Lies, Damned Lies, and I/O Statistics by Matt Hayward.  While not directly related to SLOB, this is worthwhile reading for anyone involved in any sort of I/O benchmarking.
  3. Added 20130320: Using SLOB for PIO Testing by FlashDBA, including a configuration-file driven SLOB harness that you will probably like more than the simple ones I put out there.

Benchmarking filesystem performance for Oracle using SLOB (Part 2) – Results!

Update 20130212: Please consider this a work in progress. I am re-running most of these tests now using the same LUN for every filesystem to exclude storage-side variance. I am also using some additional init parameters to disable Oracle optimizations to focus the read tests more on db file sequential reads and avoid db file parallel reads. My posts so far document the setup I’ve used and I’m already noting flaws in it. Use this and others’ SLOB posts as a guide to run tests on YOUR system with YOUR setup, for best results.

Update 20130228: I have re-run my tests using the same storage LUN for each test.  I have updated the charts, text, and init parameters to reflect my latest results.  I have also used a 4k BLOCKSIZE for all redo log testing.  Further, instead of running each test three times and taking the average, I have run each test ten times, dropped the highest and lowest values, and averaged the remaining eight, in hopes of gaining a more accurate result.  Where I have changed my setup compared to the original post, I have left the old text in, struck out like this.

In Part 1 and Part 1.5 I covered the setup and scripts I’ve used to benchmark performance of various filesystems (and raw devices) using SLOB for Oracle datafiles and redo logs. In this post I will provide more details, including the results and database init parameters in use for my tests. This is an SAP shop so I have several event and _fix_control parameters in place that may have performance impacts. SAP says “use them” so I wanted my tests to reflect the performance I could hope to see through the SAP applications,

I do not intend these tests to demonstrate the maximum theoretical I/O that our hardware can push using each filesystem. I intend them to demonstrate what performance I can expect if I were to change the databases to use each listed filesystem while making minimal (preferably zero) other changes to our environment. I know, for example, that adding more network paths or faster ethernet or splitting the bonded paths will improve Oracle direct NFS performance, but I am testing with what we have available at the moment, not what we COULD have.

Now that 11g is certified on Oracle Linux 6, we are building out an OL6.3 box and I will be running identical tests on that server in a couple weeks. I’m expecting to see some significant improvement in BTRFS performance there.

With that said, some information on the test environment:

  • Server OS: SLES 11SP2 (kernel 3.0.51-0.7.9-default)
  • Oracle version: 11.2.0.3.2 (CPUApr2012 + SAP Bundle Patch May2012)
  • 2 CPU sockets, 8 cores per socket, with hyper-threading (Oracle sees CPU_COUNT = 32 but I have set CPU_COUNT to 2 to allow me to create a very small buffer cache)
  • All filesystems and raw devices created within a single volume in a single NetApp storage aggregate
  • I use only single instance databases so no effort was made to test cluster filesystems or RAC performance

Some notes about each test:

  1. All tests run in NOARCHIVELOG mode — obviously you should not do this in production
  2. Each filesystem was tested by creating SLOB users in a single tablespace containing a single 16GB 4GB datafile on the indicated filesystem
  3. All redo logs created with 4k blocksize
  4. All datafiles created with default parameters, I did not specify anything other than size and filename
  5. Read-only tests run with a 64M 32M db_cache_size to force PIO
  6. Redo generation tests run with a large log_buffer and two 32GB redo logs on the listed filesystem, with a 32G db_cache_size. I also set log_checkpoint_timeout to 999999999 to avert time-based checkpoints.
  7. XFS redo testing performed with filesystemio_options=none, as otherwise I could not create redo logs on XFS without using an underscore parameter and specifying the sector size when creating the logs (see http://flashdba.com/4k-sector-size/ for more information on this issue). All other tests used filesystemio_options=setall; only XFS redo required none.   All tests run with filesystemio_options=setall.
  8. Read-write tests run with three 64MB redo logs stored on raw devices and 128M db_cache_size
  9. Every test was run three ten times with 8 sessions, three ten times with 16 sessions, and three ten times with 32 sessions, and an overall average was taken after dropping high and low values. Prior to each batch of tests the database was bounced. Per Kevin Closson’s recommendation, I executed a throwaway SLOB run after each bounce, discarding the results.
  10. “NFS” refers to Oracle direct NFS  I dropped the NFS tests, as our ethernet setup is currently only 1Gb and does not perform as well as FC
  11. Automated tasks like the gather stats job, segment and tuning advisors were disabled
  12. The BTRFS volume was mounted with nodatacow to avoid copy-on-write

Results

Oracle’s standard license prohibits publication of customer publication of benchmark results without express written consent. To avoid any confusion that these results represent a benchmark, I am NOT publishing any of the absolute numbers. Instead I have scaled all results such that the performance of a RAW device is equal to 1, and all other results are reported relative to RAW. I make no claims that these results represent the best possible performance available from my software/hardware configuration. What they should accurately reflect, though, is what someone with a similar setup could expect to see if they were to create redo logs or datafiles on the indicated filesystem without performing any tuning to optimize the database, OS or filesystem.

Comments

I am disappointed with the performance of BTRFS. Our OS vendor is deprecating ext4 in favor of BTRFS so if we’re going to abandon ext3 due to fsck issues BTRFS is the path of least resistance. Ext4 appears to provide performance similar to ext3 and should reduce fsck runtimes so if we stay on cooked devices that looks like the way to go. Raw performance won overall (though storage write caching appears to have made ext3 look better than raw) but it has management issues such as the 255-device limit and inability to extend a datafile on a raw device. ASM should provide the same overall performance as raw without the limitations, but adds additional management overhead with the need to install Grid Infrastructure and that just gives one more target to keep up to date on patches. XFS had poor performance for redo logs in my environment but good performance elsewhere, and it should entirely avoid fsck-on-boot problems.

Direct NFS seemed to deliver the most consistent performance from test to test, with a smaller standard deviation than any of the others. This might be relevant for anyone who requires consistent performance more than maximal performance.

Read Performance

Scaled read performance

Scaled read performance

Redo Generation Performance

Scaled redo generation performance

Scaled redo generation performance

Read/Write Performance

Scaled read/write performance

Scaled read/write performance

Init Parameters

Below are the init parameters used for each test. Note that filesystemio_options had to be set to none for the XFS redo generation testing, but other than that these parameters are accurate for all the tests.

Read Performance

SLOB.__db_cache_size=34359738368
SLOB.__oracle_base='/oracle/SLOB'#ORACLE_BASE set from environment
SLOB.__shared_pool_size=704643072
*._db_block_prefetch_limit=0
*._db_block_prefetch_quota=0
*._db_file_noncontig_mblock_read_count=0
*._disk_sector_size_override=TRUE
*._fix_control='5099019:ON','5705630:ON','6055658:OFF','6399597:ON','6430500:ON','6440977:ON','6626018:ON','6972291:ON','8937971:ON','9196440:ON','9495669:ON','13077335:ON'#SAP_112031_201202 RECOMMENDED SETTINGS
*._mutex_wait_scheme=1#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._mutex_wait_time=10#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optim_peek_user_binds=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_adaptive_cursor_sharing=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_extended_cursor_sharing_rel='NONE'#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_use_feedback=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*.audit_file_dest='/oracle/SLOB/saptrace/audit'
*.compatible='11.2.0.2'
*.control_file_record_keep_time=30
*.control_files='/oracle/SLOB/cntrl/cntrlSLOB.dbf','/oracle/SLOB/cntrlm/cntrlSLOB.dbf'
*.cpu_count=1
*.db_block_size=8192
*.db_cache_size=33554432
*.db_files=500
*.db_name='SLOB'
*.db_recovery_file_dest_size=8589934592
*.db_recovery_file_dest='/oracle/SLOB/fra'
*.db_writer_processes=2
*.diagnostic_dest='/oracle/SLOB/saptrace'
*.event='10027','10028','10142','10183','10191','10995 level 2','38068 level 100','38085','38087','44951 level 1024'#SAP_112030_201112 RECOMMENDED SETTINGS
*.filesystemio_options='SETALL'
*.local_listener='LISTENER_SLOB'
*.log_archive_dest_1='LOCATION=/oracle/SLOB/arch/SLOBarch'
*.log_archive_format='%t_%s_%r.dbf'
*.log_buffer=14221312
*.log_checkpoints_to_alert=TRUE
*.max_dump_file_size='20000'
*.open_cursors=1600
*.optimizer_dynamic_sampling=6
*.parallel_execution_message_size=16384
*.parallel_max_servers=0
*.parallel_threads_per_cpu=1
*.pga_aggregate_target=10737418240
*.processes=800
*.query_rewrite_enabled='FALSE'
*.recyclebin='off'
*.remote_login_passwordfile='EXCLUSIVE'
*.remote_os_authent=TRUE#SAP note 1431798
*.replication_dependency_tracking=FALSE
*.resource_manager_plan=''
*.sessions=800
*.sga_max_size=10737418240
*.shared_pool_size=5242880000
*.star_transformation_enabled='TRUE'
*.undo_retention=432000
*.undo_tablespace='PSAPUNDO'

Redo Generation Performance

SLOB.__db_cache_size=34359738368
SLOB.__oracle_base='/oracle/SLOB'#ORACLE_BASE set from environment
SLOB.__shared_pool_size=704643072
*._db_block_prefetch_limit=0
*._db_block_prefetch_quota=0
*._db_file_noncontig_mblock_read_count=0
*._disk_sector_size_override=TRUE
*._fix_control='5099019:ON','5705630:ON','6055658:OFF','6399597:ON','6430500:ON','6440977:ON','6626018:ON','6972291:ON','8937971:ON','9196440:ON','9495669:ON','13077335:ON'#SAP_112031_201202 RECOMMENDED SETTINGS
*._mutex_wait_scheme=1#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._mutex_wait_time=10#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optim_peek_user_binds=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_adaptive_cursor_sharing=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_extended_cursor_sharing_rel='NONE'#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_use_feedback=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*.audit_file_dest='/oracle/SLOB/saptrace/audit'
*.compatible='11.2.0.2'
*.control_file_record_keep_time=30
*.control_files='/oracle/SLOB/cntrl/cntrlSLOB.dbf','/oracle/SLOB/cntrlm/cntrlSLOB.dbf'
*.cpu_count=1
*.db_block_size=8192
*.db_cache_size=34359738368
*.db_files=500
*.db_name='SLOB'
*.db_recovery_file_dest_size=8589934592
*.db_recovery_file_dest='/oracle/SLOB/fra'
*.db_writer_processes=2
*.diagnostic_dest='/oracle/SLOB/saptrace'
*.event='10027','10028','10142','10183','10191','10995 level 2','38068 level 100','38085','38087','44951 level 1024'#SAP_112030_201112 RECOMMENDED SETTINGS
*.filesystemio_options='SETALL'
*.local_listener='LISTENER_SLOB'
*.log_archive_dest_1='LOCATION=/oracle/SLOB/arch/SLOBarch'
*.log_archive_format='%t_%s_%r.dbf'
*.log_buffer=268427264
*.log_checkpoint_timeout=99999999
*.log_checkpoints_to_alert=TRUE
*.max_dump_file_size='20000'
*.open_cursors=1600
*.optimizer_dynamic_sampling=6
*.parallel_execution_message_size=16384
*.parallel_max_servers=0
*.parallel_threads_per_cpu=1
*.pga_aggregate_target=10737418240
*.processes=800
*.query_rewrite_enabled='FALSE'
*.recyclebin='off'
*.remote_login_passwordfile='EXCLUSIVE'
*.remote_os_authent=TRUE#SAP note 1431798
*.replication_dependency_tracking=FALSE
*.resource_manager_plan=''
*.sessions=800
*.sga_max_size=68719476736
*.shared_pool_size=5242880000
*.star_transformation_enabled='TRUE'
*.undo_retention=432000
*.undo_tablespace='PSAPUNDO'

Read/Write Performance

SLOB.__db_cache_size=34359738368
SLOB.__oracle_base='/oracle/SLOB'#ORACLE_BASE set from environment
SLOB.__shared_pool_size=704643072
*._db_block_prefetch_limit=0
*._db_block_prefetch_quota=0
*._db_file_noncontig_mblock_read_count=0
*._disk_sector_size_override=TRUE
*._fix_control='5099019:ON','5705630:ON','6055658:OFF','6399597:ON','6430500:ON','6440977:ON','6626018:ON','6972291:ON','8937971:ON','9196440:ON','9495669:ON','13077335:ON'#SAP_112031_201202 RECOMMENDED SETTINGS
*._mutex_wait_scheme=1#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._mutex_wait_time=10#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optim_peek_user_binds=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_adaptive_cursor_sharing=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_extended_cursor_sharing_rel='NONE'#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*._optimizer_use_feedback=FALSE#RECOMMENDED BY ORACLE/SAP FOR 11.2.0 - SAP note 1431798
*.audit_file_dest='/oracle/SLOB/saptrace/audit'
*.compatible='11.2.0.2'
*.control_file_record_keep_time=30
*.control_files='/oracle/SLOB/cntrl/cntrlSLOB.dbf','/oracle/SLOB/cntrlm/cntrlSLOB.dbf'
*.cpu_count=1
*.db_block_size=8192
*.db_cache_size=134217728
*.db_files=500
*.db_name='SLOB'
*.db_recovery_file_dest_size=8589934592
*.db_recovery_file_dest='/oracle/SLOB/fra'
*.db_writer_processes=2
*.diagnostic_dest='/oracle/SLOB/saptrace'
*.event='10027','10028','10142','10183','10191','10995 level 2','38068 level 100','38085','38087','44951 level 1024'#SAP_112030_201112 RECOMMENDED SETTINGS
*.filesystemio_options='SETALL'
*.local_listener='LISTENER_SLOB'
*.log_archive_dest_1='LOCATION=/oracle/SLOB/arch/SLOBarch'
*.log_archive_format='%t_%s_%r.dbf'
*.log_buffer=268427264
*.log_checkpoint_timeout=99999999
*.log_checkpoints_to_alert=TRUE
*.max_dump_file_size='20000'
*.open_cursors=1600
*.optimizer_dynamic_sampling=6
*.parallel_execution_message_size=16384
*.parallel_max_servers=0
*.parallel_threads_per_cpu=1
*.pga_aggregate_target=10737418240
*.processes=800
*.query_rewrite_enabled='FALSE'
*.recyclebin='off'
*.remote_login_passwordfile='EXCLUSIVE'
*.remote_os_authent=TRUE#SAP note 1431798
*.replication_dependency_tracking=FALSE
*.resource_manager_plan=''
*.sessions=800
*.sga_max_size=34359738368
*.shared_pool_size=5242880000
*.star_transformation_enabled='TRUE'
*.undo_retention=432000
*.undo_tablespace='PSAPUNDO'

Supplementary Data

I have removed the content of this section on 20130228 as it related to my previous results and not the updated results now contained in this posting.

Benchmarking filesystem performance for Oracle using SLOB (Part 1.5): Improved test scripts

Minor update 20130207: Adjusted read-write testing script for three redo logs, not two.

In Part 1 I described a starter methodology for using SLOB to benchmark filesystem performance for Oracle. After many SLOB runs and much valuable advice from Kevin Closson, I now have an improved method. My next post will contain the results of these tests, with all numbers scaled against RAW performance for the sake of comparison. Since my hardware and network don’t match yours, my specific numbers aren’t very relevant; but for a DBA or SA considering changing the filesystem they use for Oracle without a priori knowledge of how to best tune it, this will hopefully present a useful baseline.

Previously I was bouncing the database, dropping and recreating the SLOB users before every test run. Now I am initializing the test batches for each filesystem/volume by bouncing the database, creating the users and executing a throwaway SLOB run to prime the database.

Test Scripts

Below are the scripts I am using to test read-only, redo generation, and read-write performance. If you use any of them, edit the scripts to reflect your base directory (the oracle user’s home), SLOB install directory and desired location for SLOB AWR output. Before running any of the scripts, prepare your environment by creating a tablespace on each filesystem/device type that you wish to test. Name the tablespace something that will make it clear to you which filesystem you are testing. I used a single 16GB datafile for each tablespace. Each script runs SLOB setup.sh to create the users and then SLOB runit.sh to execute the tests. After an initial throwaway run to warm things up, they will run SLOB three times each with 8, 16, and 32 sessions. You can adjust the session count by changing the for loops. At the end of each script the awr.txt file generated by SLOB is renamed and moved to the $SLOBAWR directory, using the filename convention expected by the awr_info.sh script (found in the latest SLOB release.) All tests are performed in NOARCHIVELOG mode, don’t do that on a database that you need. The scripts assume that you are executing them from within the SLOB directory.

Read-Only Performance

Before running the read-only test, I created two 32G redo logs on raw devices. The read-only test should not hit redo at all so where you place them doesn’t really matter. For read-only testing you should use a very small db_cache_size, I used 64M. The small cache will make sure that reads are fulfilled through physical I/O rather than cached.

#!/bin/sh
#
# Usage: ./testread.sh FS RUNCOUNT
#
# Script assumes a tiny SGA to force physical reads
# Script will create SLOB schema users in tablespace named $FS, and run each test $RUNCOUNT times

TYPE=$1
NUMRUNS=$2
BASEDIR=/oracle/SLOB
SLOBDIR=$BASEDIR/SLOB
DROPUSERS=$SLOBDIR/drop_users.sql
AWRDIR=$SLOBDIR/SLOBAWR
COUNTER=1

if [ -z "$TYPE" -o -z "$NUMRUNS" ]; then
    echo "Usage: $0 FS RUNCOUNT"
    exit 1
fi

mkdir $AWRDIR >& /dev/null

echo "Starting SLOB read-only performance testing for $TYPE ($NUMRUNS runs)"

echo "Dropping existing users and bouncing database"

sqlplus -s / as sysdba<<EOF
@$DROPUSERS;
shutdown immediate;
startup;
exit;
EOF

echo "Setting up SLOB user schemata"
$SLOBDIR/setup.sh $TYPE 32

echo "Performing N+1 logswitch"
sqlplus -s / as sysdba<<EOF2
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
exit;
EOF2

echo "Throwaway SLOB run to prime database"
$SLOBDIR/runit.sh 0 8

for i in 8 16 32; do (
    while [ $COUNTER -le $NUMRUNS ]; do
        echo "Running SLOB for $i readers (run #$COUNTER)"

        $SLOBDIR/runit.sh 0 $i
        echo "Renaming AWR report"
        mv awr.txt $AWRDIR/SLOB-AWR-$TYPE-$COUNTER.0.$i
        COUNTER=$((COUNTER+1))
    done )
done

Redo Performance

Before running the redo generation test, I created two 32G redo logs the filesystem whose performance I wish to test. For redo testing you should use a large db_cache_size; I used 32G. I also increased log_buffer to reduce log buffer waits, but I have not been able to eliminate them entirely. SAP requires specific log_buffer settings so I don’t want to deviate from them too much, as I want this test to have at least some similarity to our production performance. Prior to each test run I performed N+1 log switches (where N is the number of configured redo logs) to clear out any pending redo.

One possible improvement others may wish to consider when testing redo generation performance would be to configure your UNDO tablespace to use the same filesystem type as your redo logs. I did not do so, so my max performance is somewhat constrained by writes to UNDO. Each test should be similarly constrained so I am not concerned about that at the moment.

#!/bin/sh
#
# Usage: ./testredo.sh FS RUNCOUNT
#
# Script assumes you have pre-created exactly two redo logs on the filesystem/device to be tested
# Script will create SLOB schema users in tablespace named $FS, and run each test $RUNCOUNT times

TYPE=$1
NUMRUNS=$2
BASEDIR=/oracle/SLOB
SLOBDIR=$BASEDIR/SLOB
DROPUSERS=$SLOBDIR/drop_users.sql
AWRDIR=$SLOBDIR/SLOBAWR
COUNTER=1

if [ -z "$TYPE" -o -z "$NUMRUNS" ]; then
    echo "Usage: $0 FS RUNCOUNT"
    exit 1
fi

mkdir $AWRDIR >& /dev/null

echo "Starting SLOB redo generation performance testing for $TYPE ($NUMRUNS runs)"
echo "Assuming two redo logs exist"

echo "Dropping existing users and bouncing database"

sqlplus -s / as sysdba<<EOF
@$DROPUSERS;
shutdown immediate;
startup;
exit;
EOF

echo "Setting up SLOB user schemata"
$SLOBDIR/setup.sh $TYPE 32

echo "Throwaway SLOB run to prime database"
$SLOBDIR/runit.sh 8 0

for i in 8 16 32; do (
    while [ $COUNTER -le $NUMRUNS ]; do
        echo "Running SLOB for $i writers (run #$COUNTER)"
        echo "Performing N+1 logswitch"
        sqlplus -s / as sysdba<<EOF2
        alter system switch logfile;
        alter system switch logfile;
        alter system switch logfile;
        exit;
EOF2

        $SLOBDIR/runit.sh $i 0 
        echo "Renaming AWR report"
        mv awr.txt $AWRDIR/SLOB-AWR-$TYPE-$COUNTER.$i.0
        COUNTER=$((COUNTER+1))
    done )
done

Read-Write Performance

Before running the read-write performance test, I replaced my redo logs with three 64M files on raw devices. I also decreased db_cache_size to 128M to help make sure reads are fulfilled from physical disk instead of cache. I left log_buffer as-is from the redo testing.

#!/bin/sh
#
# Usage: ./testrw.sh FS RUNCOUNT
#
# Script assumes you have pre-created exactly three small (64M) redo logs 
# Script will create SLOB schema users in tablespace named $FS, and run each test $RUNCOUNT times

TYPE=$1
NUMRUNS=$2
BASEDIR=/oracle/SLOB
SLOBDIR=$BASEDIR/SLOB
DROPUSERS=$SLOBDIR/drop_users.sql
AWRDIR=$SLOBDIR/SLOBAWR
COUNTER=1

if [ -z "$TYPE" -o -z "$NUMRUNS" ]; then
    echo "Usage: $0 FS RUNCOUNT"
    exit 1
fi

mkdir $AWRDIR >& /dev/null

echo "Starting SLOB read/write performance testing for $TYPE ($NUMRUNS runs)"
echo "Assuming three redo logs exist"

echo "Dropping existing users and bouncing database"

sqlplus -s / as sysdba<<EOF
@$DROPUSERS;
shutdown immediate;
startup;
exit;
EOF

echo "Setting up SLOB user schemata"
$SLOBDIR/setup.sh $TYPE 32

echo "Throwaway SLOB run to prime database"
$SLOBDIR/runit.sh 4 4

for i in 8 16 32; do (
    while [ $COUNTER -le $NUMRUNS ]; do
        echo "Running SLOB for $i read/write sessions (run #$COUNTER)"
        echo "Performing N+1 logswitch"
        sqlplus -s / as sysdba<<EOF2
        alter system switch logfile;
        alter system switch logfile;
        alter system switch logfile;
        alter system switch logfile;
        exit;
EOF2

        READERS=$((i/2))
        WRITERS=$READERS
        $SLOBDIR/runit.sh $WRITERS $READERS
        echo "Renaming AWR report"
        mv awr.txt $AWRDIR/SLOB-AWR-$TYPE-$COUNTER.$WRITERS.$READERS
        COUNTER=$((COUNTER+1))
    done )
done