Performance Analysis


So let me set the stage for this blog. I recently completed a storage assessment for a customer. Without boring you with the details one of the major areas of concern was the Exchange environment, I was provided with proem data from two active Exchange cluster member to be used for analysis. The proem data was collected on both cluster members over the course of a week at 30 second intervals – THAT’S ALOT OF DATA POINTS. Luckily the output was provided in .blg (binary log) format because I am not sure how well Excel would have handled opening a .CSV file with 2,755,284 rows – YES you read that correctly 2.7 million data points – and that was after I filtered the counters that I did not want.

The purpose of this post is to walk through how the data was ingested and mined to produce useful information. Ironically having such granular data point while annoying at the onset proved to be quite useful when modeling the performance.

First let’s dispense with some of the requirements:

  • Windows XP, 2003
  • Windows 2003 Resource Kit (required for the log.exe utility)
  • MSDE, MSSQL 2000 (I am sure you can use 2005 – but why – it is extreme overkill for what we are doing)
  • A little bit of skill using osql
    • If you are weak :) and using MSSQL 2000 Enterprise Manager can be used
  • A good SQL query tool
  • MS Excel (I am using 2007 – so the process and screen shots may not match exactly if you are using 2003, 2000, etc…)
    • Alternatively a SQL reporting tool like Crystal could be used. I choose Excel because the dataset can easily be manipulated once in a spreadsheet.
  • grew for Windows (http://gnuwin32.sourceforge.net/packages/grew.htm) – ensure this is in your path

OK – now that the requirements are out of the way let’s move on. I am going to make the assumption that the above software is installed and operational, otherwise this blog will get very, very long. NOTE: It is really important that you know the “sa” password for SQL server.

Beginning the process:

Step 1: Create a new data base (my example below uses the data base name “test” – you should use something a bit more descriptive)

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\bocchrj>osql -Usa
Password:
1> create database test
2> go
The CREATE DATABASE process is allocating 0.63 MB on disk ‘test’.
The CREATE DATABASE process is allocating 0.49 MB on disk ‘test_log’.
1> quit

C:\Documents and Settings\bocchrj>

NOTE: This DATABASE will be created in the default DATA location under the SQL SVR install directory. Make sure you have enough capacity – for my data set of 2.7 million rows the database (.mdf) was about 260 MB.

Step 2: Create an ODBC connection to the database

A nice tutorial on how to do this can be found here: http://www.truthsolutions.com/sql/odbc/creating_a_new_odbc_dsn.htm

Two things to note:

  • Use the USER DSN
  • Use SQL Server Authentication (NOT NT Authentication) the user name is “sa” and hopefully you remembered the password.

Step 3: Determine relevant counters

Run the following command on the .blg file: log -q perf.blg > out.txt

This will list all of the counters to the text file out.txt. Next open counters.txt in you favorite text editor and determine which counters are of interest (NOTE: Life is much easier if you import a specific counter class into the database, create new a new database for a new counter class)

Once you determine the counter class e.g. PhysicalDisk run: log -q perf.blg | grew PhysicalDisk > counters.txt

Step 3: Import the .blg into the newly created database.

  • NOTE: There are two considerations here

    • Are the data points contained in a single .blg file (if you have 2.7 million data points this is unlikely)? If they are the command to do the import is fairly simple:

      • log perf.blg -cf counter.txt -o SQL:DSNNAME!description

    • If the data points are contained in a number of files make sure that these files are housed in the a directory. You can use the following PERL script to automate the import process (NOTE: This requires that PERL be installed – http://www.activestate.com/Products/ActivePerl/)

#!/usr/bin/perl -w

# syntax: perl import_blg.pl input_dir odbc_connection_name $counter
# e.g. – perl import_blg_pl

#DO NOT EDIT BELOW THIS LINE
$dirtoget=”$ARGV[0]“;
$odbc=”$ARGV[1]“;
$counter=”$ARGV[2]“;
$l=0;

opendir(IMD, $dirtoget) || die(“Cannot open directory”);
@thefiles= readdir(IMD);
closedir(IMD);

foreach $f (@thefiles)
{
unless ( ($f eq “.”) || ($f eq “..”) )
{
$label=$l++;
system “log \”$dirtoget\/$f\” -cf $counter -o SQL:$odbc!$label”;
}
}

 

  • If the import is working properly you should see output similar to the following:

Input
—————-
File(s):
C:\perf_log – 30 second interval_06132201.blg (Binary)

Begin: 6/13/2007 22:01:00
End: 6/14/2007 1:59:30
Samples: 478

Output
—————-
File: SQL:DSNNAME!1.blg

Begin: 6/13/2007 22:01:00
End: 6/14/2007 1:59:30
Samples: 478

The command completed successfully.

Step 4: OK – The data should now be imported into the database. We will now look at the DB structure (table names) and run a test query. At this point I typically start using SQL Manager 2005 Lite but you can continue to use osql or Enterprise Manager (uggghhhh). For the purposes of cutting and pasting examples I used osql.

  • This is not a DB 101 tutorial but you will need to be connected to the database we created earlier.

  • Once connected run the following query (I cut out non relevant information in the interest of length):

C:\Documents and Settings\bocchrj>osql -Usa
Password:
1> use test
2> go
1> select * from subjects where type = ‘u’ order by name
2> go
CounterData
CounterDetails
DisplayToID

(3 rows affected)
1>

CounterData and CounterDetails are the two tables we are interested in:

Next lets run the query to display the first 100 rows of the CounterDetails table to verify that the data made its way from the .blg file to the database

1> select top 20 * from counterdetails
2> go
SHOULD SCROLL 20 RECORDS
(20 rows affected)
1>quit

Step 5: Determining what to query

Open counters.txt in your favorite browser and determine what you want to graph – there are a number of metrics, pick one you can get more complicated once you get the hang of the process.

e.g. When you open the text file you will see a number of rows that look like this – Take note of bold sections below, this is the one of the filters that will be used when selecting the working dataset

\test\PhysicalDisk(4)\Avg. Disk Bytes/Read
\test\PhysicalDisk(5)\Avg. Disk Bytes/Read
\test\PhysicalDisk(6 J:)\Avg. Disk Bytes/Read
\test\PhysicalDisk(1)\Avg. Disk Bytes/Read
\test\PhysicalDisk(_Total)\Avg. Disk Bytes/Read
\test\PhysicalDisk(0 C:)\Avg. Disk Write Queue Length
\test\PhysicalDisk(3 G:)\Avg. Disk Write Queue Length
\test\PhysicalDisk(4)\Avg. Disk Write Queue Length
etc…

Once you determine which performance counter is of interest open Excel.

Step 6: Understanding the anatomy of the SQL query

SELECT
CounterData.”CounterDateTime”, CounterData.”CounterValue”,
CounterDetails.”CounterName”, CounterDetails.”InstanceName”
FROM
{ oj “test“.”dbo”.”CounterData” CounterData INNER JOIN “test“.”dbo”.”CounterDetails” CounterDetails ON
CounterData.”CounterID” = CounterDetails.”CounterID”}
WHERE
CounterDetails.”InstanceName” = ‘3 G:‘ AND CounterDetails.”CounterName”=’Disk Writes/sec‘ AND CounterData.”CounterDateTime” like ‘%2007-06-13%’

  • The above query will return all the Disk Writes/sec on 6/13/2007 for the G: drive.
  • In the above query I have BOLDED the VARIABLES that should be modified when querying the database. The JOINS should NOT be modified. You may add additional criteria like multiple INSTANCENAME or COUNTERNAME fields to grab, etc…. Below you will see exactly how to apply the query.

Step 7: Run the query from Excel (NOTE: Screen shots are of Excel 2003, the look and feel will be different for other versions of Excel)

ex-query1

Once you select “From Microsoft Query” the next screen will appear

dsn

Select the DSN that you defined earlier. NOTE: Also uncheck the “Use the Query Wizard to create/edit queries. I will provide the query syntax which will make life much easier.Now you need to login to the database. Hopefully you remember the sa password.

login add

Once the login is successful – CLOSE the add tables window

Now you will see the MS Query Tool Click the SQL button on the toolbar. Enter the SQL query into the SQL text box and hit OK. You will recite a warning – just hit OK and continue.

Use the query syntax explained above.

sql

Once the query is complete it will return a screen that looks like this:

q

Now hit the 4th button from the left on the tool bar “Return Data” – this will place the data into Excel so that is can be manipulated:

e

Once the data is placed into Excel you can begin to graph and manipulate it.

out

I hope this was informative, if you find any errors in the process please place a comment to the post.

VN:F [1.9.17_1161]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.17_1161]
Rating: 0 (from 0 votes)

vmfs and rdm performance characteristics


It seems as if one of the most debated topics related to VMware and I/O performance is the mystery sounding the relative performance characteristics of vmfs volumes and rdm (Raw Device Mode) volumes.

Admittedly it is difficult to argue with the flexibility and operational benefits of vmfs volumes but I wanted to measure the characteristics of each approach and provide some documentation that could be leveraged when making the decision to use vmfs or rdm.? By no means are these test concluded but I thought as a gathered the data I would blog it so it could be used prior to me completing the whitepaper which all these tests will be part of.

Benchmark configuration:
The benchmarks contained in this document were performed in a lab environment with the following configuration:

  • Physical Server:? Dell dual CPU 2850 w/ 4 GB RAM
    • Windows 2003 SP2 Virtual Machine
    • Single 2.99 Ghz CPU
    • 256 MB RAM (RAM configured this low to remove the effects of kernel file system caching)
  • Disk array
    • EMC CLARiiON CX500
    • Dedicated RAID 1 Device
    • 2 LUNs Created on the RAID 1 Storage Group
    • Two dedicated 10 GB file systems
      • c:\benchmark\vmfs
        • 10 GB .vmdk created and vmfs and NTFS file system created
      • c:\benchmark\rdm
        • 10 GB rdm volume mapped to VM and NTFS file system created?

Benchmark tools:
Benchmark tests thus far were run using?two popular?disk and file system benchmarking tools.

IOzone Benchmarks:

HDtune benchmarks:

HD Tune: VMware Virtual disk Benchmark
Transfer Rate Minimum : 54.1 MB/sec
Transfer Rate Maximum : 543.7 MB/sec
Transfer Rate Average : 476.4 MB/sec
Access Time : 0.4 ms
Burst Rate : 83.3 MB/sec
CPU Usage : 36.9%

HD Tune: DGC RAID 1 Benchmark
Transfer Rate Minimum : 57.1 MB/sec
Transfer Rate Maximum : 65.3 MB/sec
Transfer Rate Average : 62.4 MB/sec
Access Time : 5.4 ms
Burst Rate : 83.9 MB/sec
CPU Usage : 13.8%

One thing that is very obvious is that VMFS makes extensive use of system/kernel cache.? This is most obvious in the HDtune benchmarks.? The increased CPU utilization is a bit of a concern, most likely due to the caching overhead.? I am going to test small block random writes while monitoring CPU overhead, my gut tells me that small block random writes to a VMFS volume will tax the CPU.? More to come….

VN:F [1.9.17_1161]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.17_1161]
Rating: 0 (from 0 votes)

Preliminary results: rsych to replicate virtual machines


So a couple of months ago or so I had the idea to test and document the use of rsych to replicate VMware virtual machines. Unfortunately my machine is running ESX 2.5.x and I had yet to upgrade to 3.0 so this information is a bit depreciated but I feel that it will be indicative of what I will see on ESX 3.0 (aka – VI3). On ESX 3.0 the process actually becomes much easier because all of the files (.vmx, .vmdk, .nvram) are contained in the same directory structure. So here is a simplistic representation of the commands required to replicate a VM on ESX 2.5, I am also in the process of building an automation script for ESX 3.0:

************ CREATE .REDO LOG ************
ware-cmd ~bocchrj/vmware/rh62_1/linux.vmx addredo scsi0:0

************ STARTING REPLICATION ************
rsync –verbose –progress –stats –compress /vmfs/VMs/rh62_1.vmdk esx2::rsyncVMs

************ REPLICATE VM CONFIG FILES ************
rsync –verbose –progress –stats –compress /home/bocchrj/vmware/rh62_1/* esx2::vmconfig/rh62_1

************ ADDING REDO.REDO LOG ************
vmware-cmd ~bocchrj/vmware/rh62_1/linux.vmx addredo scsi0:0

************ COMMITING REDO LOGS *************
vmware-cmd ~bocchrj/vmware/rh62_1/linux.vmx commit scsi0:0 1 0 1
vmware-cmd ~bocchrj/vmware/rh62_1/linux.vmx commit scsi0:0 0 0 0

************ DONE ************

The process worked well. I captured the output of the initial rsync and the second rsych cycle below:

Initial rsync cycle

Number of files: 1
Number of files transferred: 1
Total file size: 419430912 bytes
Total transferred file size: 419430912 bytes
Literal data: 419430912 bytes
Matched data: 0 bytes
File list size: 30
File list generation time: 0.152 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 85826493
Total bytes received: 38

sent 85826493 bytes received 38 bytes 432375.47 bytes/sec
total size is 419430912 speedup is 4.89

Second rsync cycle

Number of files: 1
Number of files transferred: 1
Total file size: 419430912 bytes
Total transferred file size: 419430912 bytes
Literal data: 1864192 bytes
Matched data: 417566720 bytes
File list size: 30
File list generation time: 0.135 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 344586
Total bytes received: 143405

sent 344586 bytes received 143405 bytes 4337.70 bytes/sec
total size is 419430912 speedup is 859.51

Looking forward to getting sometime to play with this on ESX 3.0, I am lobbying hard for the 48hr day :)

VN:F [1.9.17_1161]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.17_1161]
Rating: 0 (from 0 votes)

Mounting .vmdk files in Windows


Ever feel the need to mount a Windows or Linux .vmdk on windows, I have. Here are a couple of utilities that make it possible to open Linux and Windows .vmdks on Windows in read-only mode or read/write mode.

Virtual Disk Driver for Windows (http://chitchat.at.infoseek.co.jp/vmware/vdk.html)
Ext2IFS – Ext2/3 file system driver for windows. Required to mount Linux Virtual disks on windows (http://www.fs-driver.org/index.html)

VN:F [1.9.17_1161]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.17_1161]
Rating: 0 (from 0 votes)

Collecting system information with SYDI


I have been using SYDI for quite a while now. The 2.0 release of sydi-server has added some nice features, the ability to generate Word output from .xml files is huge. The only think that I found to be annoying is that the sydi-transform.vbs script does allow the input of an entire directory.

cscript.exe sydi-transform.vbs -xServer.xml -sServerhtml.xsl -oServer.html

This can make the transformation process from .xml to .html fairly labor intensive for a large environment. This little script solves that problem.

#!/usr/bin/perl -w

# syntax: perl transform.pl dir_where_xml_files_live
# e.g. – perl transform.pl ./xmlfiles
# Will walk the file system and output filename.xml.html for each .xml file
#
# Edit these variables to match the location on your system
$pathtotransform=”g:/sydi/tools/sydi-transform.vbs”;
$pathtoxsl=”g:/sydi/xml/serverhtml.xsl”;

#DO NOT EDIT BELOW THIS LINE
$dirtoget=”$ARGV[0]“;

opendir(IMD, $dirtoget) || die(“Cannot open directory”);
@thefiles= readdir(IMD);
closedir(IMD);

foreach $f (@thefiles)
{
unless ( ($f eq “.”) || ($f eq “..”) )
{
system “cscript $pathtotransform -x$f -s$pathtoxsl -o$dirtoget\/$f.html”;
}
}

VN:F [1.9.17_1161]
Rating: 0.0/10 (0 votes cast)
VN:F [1.9.17_1161]
Rating: 0 (from 0 votes)