EMC World 2010 Initial Thoughts

Sitting in room 153A at a rather rudimentary Cisco session, so I thought I would take a few minutes to write up my initial observations and comments on EMC World 2010 thus far (only 1 hour and 30 minutes in):

  • I am hoping that the Cloud message is less nebulous by the time I leave on Thursday
  • Green is a theme; nice work with the bamboo plates at breakfast.  Also enjoyed NOT getting five million data sheets in my EMC goodie bag.
  • No backpacks or laptop cases this year; they have been replaced by a generic EMC-branded bag.  Works for me because I can fit the giveaway bag into my normal bag so I don’t have to carry two bags all day.
    • Assumed this was due to cutbacks but I was told by the guy at the registration desk that it feels like there are more attendees this year than in the past.  He thought the number was somewhere around 10k attendees.
  • Social networking is very visible, with Twitter and FourSquare leading the way.
    • Not sure that everyone here gets the FourSquare thing.  EMC has created a ton of venues, but you would think that a swarm would be easily attainable at a technology conference with nearly 10k attendees.  I think a fair number of attendees may need a mobile device upgrade; the StarTAC is no longer an acceptable cellular device, and having the StarTAC car kit in your Mercedes does not justify holding onto the phone 🙂
  • Nice work with the mobile app
    • http://emc.tripbuilder.com/mobi
    • Who knows, maybe next year there will be an actual iPhone and Android app so we can jettison the session guide and save a few trees.
  • Finally, it is obvious that just as Apple owns the letter “I”, EMC has laid claim to the letter “V”
    • I understand, but does every product really need to be prefixed with a “V”?
  • Only one guru level session, very disappointing.
  • Looking forward to hearing about V-Plex, Unisphere and attending the session on automating Virtual Provisioning with Windows PowerShell
  • Hoping the V-Plex and Unisphere sessions are more than just marchitecture

Here is wishing everyone a great EMC World 2010.  If you don’t have FourSquare installed on your mobile device, I will assume you are either a StarTAC user or posting your updates to MySpace 🙂

The best thing and the worst thing….

It has been said that the best thing about the internet is that anyone can publish; unfortunately, the worst thing about the internet is also that anyone can publish.  Another fitting cliché is that opinions are like buttholes, and I propose that the blogosphere has become a public restroom, reeking from the stench of personal opinion backed by analogies, anecdotes, etc… and little fact (much like the opinions expressed in this blog).  By no means am I absolving myself from the claims made in this post; I am as guilty as the next guy when it comes to writing valueless content, but I do feel like I mix in some valuable content that is based on empirical data and facts.  I am all for some good rhetoric, but let’s face it people, we all like to hear ourselves talk regardless of how little value the commentary actually has.  When your platform is largely opinions based loosely on facts as defined in the “marchitecture” documentation, you have to be willing to accept the 50% of people who will agree with your perspective and the 50% who won’t.  Why does anyone even care what influences the author’s perspective, and why does it matter to the content consumer?  The assumption we all should make is that the content author is motivated by something, and this motivation can be pure or corrupt in nature.  The great thing is you can decide to either agree or disagree, offer up some additional conjecture or not; that is the beauty of free will.

A WORD OF CAUTION TO BLOG CONTENT CONSUMERS

Every person has a predisposition to one perspective or another, thus the concept of a non-biased view of the world, policy, product, etc… is made impossible by this little thing we call human nature.  But wait, it gets worse; beyond just human nature, there are what I believe to be two additional key aspects that influence behavior:

Indoctrination:  The belief system in which we participate (e.g. – WAFL vs. CoFW).  There is no doubt that a long-time NetApp employee, user, etc… who has been indoctrinated into the culture, ideology, thought process, etc… will believe that WAFL is a superior technology when compared to CoFW.  In contrast, a person indoctrinated into the EMC culture could likely argue why CoFW is the superior technology.  The problem is that both of these perspectives speak to the technology and not the use case.

Personal Gain (monetary or otherwise):  My favorite, because it has a huge impact; the so-called independent analysts in the technology community (who will remain unnamed due to the litigious nature of the world we live in) are really marketing mercenaries or blackmail artists, depending on your perspective.  This is not to say that analysts do not initiate coverage on technologies that they are not being paid to follow, but let’s just say that the coverage of technologies they are being paid to follow is a bit more substantial.  It is funny how, as human beings, our opinions tend to align with our goals.

So my word of caution is as follows: trust ONLY yourself (and yes, you can trust yourself; it is true that you likely have an agenda, but it is also true that this agenda is likely in your best interest), read lots of differing opinions and formulate your own.  Realize that reading information found on the web can be a dangerous thing if you don’t take what you have learned, internalize it and think for yourself.  My favorite example here is researching symptoms on WebMD: use the WebMD Symptom Checker, enter the req’d info (e.g. – male, 35-44 years), click submit, then drill down on the head; once you get to the symptom picker, choose Headache (worst ever) and take note of the only possible condition.  Enough said about the dangers of the internet and not thinking for yourself.  So PLEASE apply some modicum of logic, reason and realism when digesting opinionated content.

So what prompted this seemingly common-sense cautionary tale?  My opinionated colleague over at RecoveryMonkey.net posted an OPINION entitled “More FUD busting: Deduplication – is variable-block better than fixed-block, and should you care?” on his blog that received criticism from other opinionated ministers of public enlightenment and propaganda.  I have read through the posts and the short answer is that everyone is correct and equally adept at the art of FUD slinging; what a tragedy.  IMHO the market today (especially among the big boys) has parity +/- 1% (exclusive of the features that don’t work or that no one cares about, and yes, they exist in all products).  The 1% differentiation is often littered with caveats; the blogs outlining these caveats, workarounds, use cases, etc… are the valuable ones, so spend more time consuming that content and less time reading content that reminds me more of TMZ than a technical blog.

It should be fairly easy when consuming content to determine what is valuable and what is not, just read Scott Lowe’s Blog to see what good content looks like.

One final thought: was the FTC warning really necessary, really????  See what I mean about the litigious nature of our society.  It all starts with nationalizing health care; the next thing you know the FTC is commandeering your blog.  Where does it end?

WordPress MIME types

I wanted to expand the types of files that WordPress would allow me to upload and attach to a post.  I used a plugin called pjw-mime-config; here are a few things that I figured out beyond just using this plugin to add MIME types.

I originally wanted to upload and attach a PowerShell script to a post when WordPress responded with the following error:  “File type does not meet security guidelines. Try another.”  I googled the error and found that I needed to add additional MIME types to be accepted by WordPress, and pjw-mime-config was suggested as an easy way to add them.  I installed the plugin and fat-fingered the MIME type; I tried to remove the MIME type but it failed to delete.  I uninstalled the plugin and reinstalled it, thinking that would remove the MIME type, but no luck.  My thought at this point was that the MIME types must be stored in the WordPress database by the pjw-mime-config plugin, so I worked with a test WordPress installation: I exported the DB (db1.sql), installed pjw-mime-config, added a MIME type (foo, text/plain), exported the DB again (db2.sql) and then did a diff on the two SQL exports.  Sure enough, there was a pjw-mime-config row in the WordPress DB table wp_options; I uninstalled pjw-mime-config, deleted the row, reinstalled pjw-mime-config and all was good.
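If you want to reproduce the diff step in PowerShell rather than with a Unix diff, here is a minimal sketch; it assumes the two database exports are saved as db1.sql and db2.sql in the current directory.

$before = Get-Content .\db1.sql
$after = Get-Content .\db2.sql
# Show only the lines that exist in db2.sql (after the plugin added the MIME type) and not in db1.sql
Compare-Object -ReferenceObject $before -DifferenceObject $after |
    Where-Object { $_.SideIndicator -eq "=>" } |
    Select-Object -ExpandProperty InputObject

The row that shows up in the output is the plugin’s entry in wp_options.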

After installing pjw-mime-config the WordPress uploader accepted the file but gave an error from the upload dialog that referenced functions.php @ line 2258.  To resolve the error I edited ./wp-includes/functions.php and added the ps1 MIME type to the get_allowed_mime_types function (starts at approx. line 2275).

Data Profiling with Windows PowerShell

A customer asked me the other day about a method to do some data profiling against a file system, so I thought I would share the request, my suggestion and the little PowerShell script I crafted to do some data profiling.  The request read as follows:  “Do you know of any tools that will list all files in a directory, and all subs, and provide attributes like filename, path, owner, create date, modify date, last access date and maybe some other attributes in report format?”

My recommendation was a commercial product called TreeSize Professional by JAM Software; the product license costs ~ $50 and is worth every penny, scan speeds are good, it supports UNC paths and the reporting is intuitive.  Overall an excellent product.

As an alternative, below is a quick PowerShell script (also attached to this post as data_profile.ps1) that will create a CSV file with data profiling information; once the CSV file is created it can be opened in Excel (or your spreadsheet tool of choice) or imported into a DB and manipulated.

$root = "c:\\files"
$report = ".\report.csv"

$AllFiles = @()
foreach ($file in get-childitem $root  -recurse| Select-Object FullName, Root, Directory, Parent, Name, Extension, PSIsContainer, IsReadOnly, Length, CreationTime, LastAccessTime, LastWriteTime, Attributes)
{
$acl = get-acl $file.fullname | select-object path,owner,accesstostring,group
$obj = new-object psObject
#$obj | Add-Member -membertype noteproperty -name FilePathandName -Value $file.FullName
$obj | Add-Member -membertype noteproperty -name Root -Value $file.Root
$obj | Add-Member -membertype noteproperty -name Ditrectory -Value $file.Directory
$obj | Add-Member -membertype noteproperty -name Parent -Value $file.Parent
$obj | Add-Member -membertype noteproperty -name Name -Value $file.Name
$obj | Add-Member -membertype noteproperty -name Extension -Value $file.Extension
$obj | Add-Member -membertype noteproperty -name IsDIR -Value $file.PSIsContainer
$obj | Add-Member -membertype noteproperty -name IsReadOnly -Value $file.IsReadOnly
$obj | Add-Member -membertype noteproperty -name Size -Value $file.Length
$obj | Add-Member -membertype noteproperty -name CreationTime -Value $file.CreationTime
$obj | Add-Member -MemberType noteproperty -Name LastAccessTime -Value $file.LastAccessTime
$obj | Add-Member -MemberType noteproperty -Name LastWriteTime -Value $file.LastWriteTime
$obj | Add-Member -MemberType noteproperty -Name Attributes -Value $file.Attributes
#$obj | Add-Member -MemberType noteproperty -Name Path -Value $acl.path
$obj | Add-Member -MemberType noteproperty -Name Owner -Value $acl.owner
$obj | Add-Member -MemberType noteproperty -Name AccessToString -Value $acl.accesstostring
$obj | Add-Member -MemberType noteproperty -Name Group -Value $acl.group
$AllFiles += $obj
}
$AllFiles |Export-Csv $report –NoTypeInformation

The above script scans all files recursively starting at c:\files and outputs the results to report.csv.  One thing to note is that the scan stores all data in an array in memory; this is because the PowerShell Export-Csv cmdlet does not support appending to a CSV file (you gotta wonder what Microsoft talks about in design meetings).  I will likely create a version of the script that uses Out-File to write each row to the CSV file as the scan happens rather than storing everything in memory until the scan completes and then writing the entire array to report.csv; the goal here is to reduce the memory footprint during large scans.
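Here is a minimal sketch of what that streaming variant might look like, with a reduced set of columns for brevity; each object is converted to CSV text and written to the report with Out-File as it is produced, so the full result set never accumulates in memory.

$root = "c:\files"
$report = ".\report.csv"
$first = $true
Get-ChildItem $root -Recurse | ForEach-Object {
    $acl = Get-Acl $_.FullName
    # Build a row object (reduced column set for brevity)
    $obj = New-Object psObject
    $obj | Add-Member -MemberType NoteProperty -Name Name -Value $_.Name
    $obj | Add-Member -MemberType NoteProperty -Name Size -Value $_.Length
    $obj | Add-Member -MemberType NoteProperty -Name LastWriteTime -Value $_.LastWriteTime
    $obj | Add-Member -MemberType NoteProperty -Name Owner -Value $acl.Owner
    # ConvertTo-Csv on a single object returns two strings: the header row and the data row
    $csv = $obj | ConvertTo-Csv -NoTypeInformation
    if ($first) {
        $csv | Out-File $report                                     # write header + first data row
        $first = $false
    } else {
        $csv | Select-Object -Last 1 | Out-File $report -Append     # append data row only
    }
}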

The output of the script will be similar to the following:
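For reference, the header row of report.csv, with column order matching the properties added in the script above, looks like this; each file or directory scanned becomes one data row under this header:

"Root","Directory","Parent","Name","Extension","IsDIR","IsReadOnly","Size","CreationTime","LastAccessTime","LastWriteTime","Attributes","Owner","AccessToString","Group"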

Blackberry Issues

So about a week ago my Blackberry (8900) booted to a screen stating “Error [some number I can’t remember]:  Reload OS”, obviously not good.  So I broke out JLcmder (a must have for all hard core Blackberry hackers) and proceeded to wipe the OS and reload my BB.  Yesterday afternoon I am sitting at my desk, I look down at my Blackberry and it is sitting there with a white screen, nothing but a white screen.  I try a soft reboot, and back to the white screen; I try a battery pull, back to the white screen.  Then, in not my finest moment, I somehow rationalize that running over to the T-Mobile store will be the easiest/quickest fix; they waste a solid 30 mins of my life pulling the battery repeatedly and praying that it will boot, and I will never get those 30 mins back.  As usual, I returned to the office and hooked the BB up to a laptop to see if I could connect from JLcmder, no luck.  I removed the battery, SIM card and my MicroSD memory card, replaced the battery and rebooted, and my BB returned.  I then started scouring the forums; it turns out that a few others had seen an issue where a corrupted SD card caused the white screen of death.  Last night I formatted my SD card (FAT32), placed it back into my BB and the phone booted fine (happy about that).  When will I learn never to call my carrier for technical support?  (I have been with Verizon, AT&T and now T-Mobile and they are all the same.  When you go to a BB specialist and the first thing they tell you to do is pull your battery, you have to wonder how special he or she is.)  Hope this helps someone.

What have I been up to… Project Hive…

Obviously my post frequency has dramatically decreased; this is due to a couple of factors.  First, I am busy, so I have less time to turn my experiences into easy-to-digest blog posts, and second, a few of my comrades and I have been developing something we call “Project Hive”.  As you can probably tell from many of my blog posts, most of my work in recent years has been associated with EMC technologies.  Throughout the years we realized that while there are some good framework tools out there, they are costly, require significant customization and often don’t solve the common day-to-day operational issues that system administrators face.  The goal of “Project Hive” is to dramatically simplify the common tasks associated with managing EMC technologies.  Being intimately familiar with these tasks, we have developed a platform based on distributed collection, aggregation and presentation, which we call the “Honeycomb”.  Each Honeycomb contains modules we call “Workers”, which are responsible for the collection, aggregation and analysis of data from discrete infrastructure components; all Workers are centrally managed on the Honeycomb and use standards-based methods to collect data (i.e. – WMI, SSH, SNMP, APIs, etc…).  “Project Hive” is a very active project and we are continually adding functionality to existing Workers and building new Workers as time permits or requirements dictate.
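To give a feel for the kind of standards-based collection a Worker performs, here is a purely hypothetical PowerShell sketch of a WMI-based disk inventory pulled from a couple of hosts; the host names and the choice of WMI class are illustrative only and are not part of Project Hive.

# Hypothetical example: collect basic disk capacity data from remote hosts via WMI
$targets = "server01", "server02"   # illustrative host names
foreach ($target in $targets) {
    Get-WmiObject -Class Win32_LogicalDisk -ComputerName $target -Filter "DriveType=3" |
        Select-Object @{Name="Host";Expression={$target}}, DeviceID, Size, FreeSpace
}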

Any EMC customer who has been through an upgrade is familiar with the EMCGrab process (the process of running the EMCGrab utility on each individual SAN-attached host within the environment and providing the output to EMC so they can validate the host environment prior to the upgrade).

In a reasonably sized environment this process can be tedious and time consuming; one of our released Workers centralizes and automates the EMCGrab process.  I recently created a video which contrasts the process of running an EMCGrab manually on an individual host vs. using the Hive Worker.  My hope is to publish more of these videos in the future, but as you can imagine they take a bit of time to produce.  If you are looking for more information, contact the Project Hive team at dev@projecthive.info.

A hi-resolution video is available here.

EMC CX3-80 FC vs EMC CX4-120 EFD

This post is a high-level overview of some extensive testing conducted on the EMC (CLARiiON) CX3-80 with 15K RPM FC (fibre channel) disks and the EMC (CLARiiON) CX4-120 with EFD (Enterprise Flash Drives), formerly known as SSD (solid state disk).

Figure 1:  CX4-120 with EFD test configuration.

Figure 2:  CX3-80 with 15K RPM FC test configuration.

Figure 3:  IOPs Comparison

Figure 4:  Response Time

Figure 5:  IOPs Per Drive

Notice that the CX3-80 15K FC drives are servicing ~ 250 IOPs per drive, which exceeds the theoretical maximum of 180 IOPs for a 15K FC drive; this is possible because write caching allows the array to acknowledge writes from cache and destage them to disk later.  Note that cache is disabled for the CX4-120 EFD tests.  This matters because a high write I/O load can cause something known as forced cache flushes, which can dramatically impact the overall performance of the array; because cache is disabled on EFD LUNs, forced cache flushes are not a concern.

The table below provides a summary of the test configuration and findings:

Array                                   CX3-80               CX4-120
Configuration                           (24) 15K FC Drives   (7) EFD Drives
Cache                                   Enabled              Disabled
Footprint                               (baseline)           ~42% drive footprint reduction
Sustained Random Read Performance       (baseline)           ~12x increase over 15K FC
Sustained Random Write Performance      (baseline)           ~5x increase over 15K FC

In summary, EFD is a game-changing technology.  There is no doubt that for small block random read and write workloads (e.g. – Exchange, MS SQL, Oracle, etc…) EFD dramatically improves performance and reduces the risk of performance issues.

This post is intended to be an overview of the exhaustive testing that was performed.  I have results with a wide range of transfer sizes beyond the 2k and 4k results shown in this post, and I also have Jetstress results.  If you are interested in data that you don’t see in this post, please email me at rbocchinfuso@gmail.com.