VSS – Error – “Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive”

Background

Reviewing errors on a couple of our servers and consistently seeing errors sourced to VSS.

BTW, VSS is a Microsoft Windows service; the acronym stands for Volume Shadow Copy Service.

Error

Error Image

unexpectederror-deviceiocontrol-genericfloppydrive-20170118-0636pm-brushedup

Error Message



Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive#6&2cb9d9b7&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b} - 0000000000000474,0x00560000,0000000000000000,0,0000003E699986F0,4096,[0]).  hr = 0x80070001, Incorrect function.
. 

Operation:
   Exposing Recovered Volumes
   Locating shadow-copy LUNs
   PostSnapshot Event
   Executing Asynchronous Operation

Context:
   Device: \\?\fdc#generic_floppy_drive#6&2cb9d9b7&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
   Examining Detected Volume: Existing - \\?\fdc#generic_floppy_drive#6&2cb9d9b7&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
   Execution Context: Provider
   Provider Name: VMware Snapshot Provider
   Provider Version: 1.0.0
   Provider ID: {564d7761-7265-2056-5353-2050726f7669}
   Current State: DoSnapshotSet


Error Explanation

The event source is VSS and the event ID is 12289.

The event description reads “Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive …  hr = 0x80070001, Incorrect function”.

The same device path appears again in the Context section. The GUID at the end of that path is {53f5630d-b6bf-11d0-94f2-00a0c91efb8b}.

 

Remediation

NetApp Community

dailyFresh

Link

Did you, by any chance, create a VM with a floppy drive, install Windows, and then one day remove the floppy drive from the VM?

In that case disable the floppy drive in the Device Manager of Windows and see if the error goes away.

 

Steps

  1. Launch Control Panel \ Device Manager
  2. Select Floppy disk drives \ Floppy disk drive
    • Right-click on your selection
    • From the drop-down menu, select Disable
 Device Manager – Original

devicemanager-floppydiskdrive-20170118-0641pm

 

 

 Device Manager – Disabling the device ….

disablefloppydiskdrive

 

 Device Manager – Disabled Device

devicemanager-floppydiskdrive-20170118-0642pm

 

Knowledge Base

Here are some of the knowledge base articles available on the Net.

 

Device Id

Removable Drives

smallvoid.com

Link

From smallvoid.com, here are the registry keys for the various types of removable storage.

Device Type         Registry Key
CD and DVD Drives   {53f56308-b6bf-11d0-94f2-00a0c91efb8b}
Floppy Drives       {53f56311-b6bf-11d0-94f2-00a0c91efb8b}
Removable Disks     {53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
Tape Drives         {53f5630b-b6bf-11d0-94f2-00a0c91efb8b}

 

 

Error

0x80070001

Error 0x80070001 often means that one is trying to access a device that does not exist.
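
For readers who want to see where the “Incorrect function” text comes from, here is a minimal sketch that splits the HRESULT into its standard fields. The function name decode_hresult is mine, not part of any Windows API; the bit layout (failure bit 31, facility in bits 16 to 26, code in bits 0 to 15) is the documented HRESULT layout.

```python
# Illustrative helper (hypothetical name): split an HRESULT into its fields.
FACILITY_WIN32 = 7          # facility used for wrapped Win32 error codes
ERROR_INVALID_FUNCTION = 1  # Win32 error 1, message "Incorrect function."


def decode_hresult(hr: int) -> dict:
    """Return the failure flag, facility, and code of a 32-bit HRESULT."""
    hr &= 0xFFFFFFFF
    return {
        "failed": bool(hr >> 31),        # bit 31 set means failure
        "facility": (hr >> 16) & 0x7FF,  # bits 16-26
        "code": hr & 0xFFFF,             # bits 0-15
    }


fields = decode_hresult(0x80070001)
print(fields)
print(fields["facility"] == FACILITY_WIN32, fields["code"] == ERROR_INVALID_FUNCTION)
```

In other words, 0x80070001 is the Win32 error 1 wrapped as an HRESULT, which matches the “Incorrect function” wording in the event text.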

Please refer to the following KB articles

  1. Windows backup or restore errors 0x80070001, 0x81000037, or 0x80070003
    Link

Giving Credit

Though dailyFresh would like to keep an air of anonymity, he still gets credit.

 

Storage – Strip Unit Size

Background

A Storage disk’s Strip Unit Size is often taken into consideration as one considers storage alignment.

While finishing up and closing my open Google Chrome tabs, I went back and read:

Disk Partition Alignment (Sector Alignment) for SQL Server: Part 4: Essentials (Cheat Sheet)
http://blogs.msdn.com/b/jimmymay/archive/2008/12/04/disk-partition-alignment-sector-alignment-for-sql-server-part-4-essentials-cheat-sheet.aspx

Jimmy May’s paper principally deals with Microsoft products (SQL Server on Windows), but his formula is quite general.

 

Strip Unit Size – What does it matter?

Let us see how it plays out:

Here is Jimmy May’s formula:

Three Values - Two Essential Correlations

Jimmy’s document says:

  • Perform these calculations for each partition which must result in integer values
  • Of the two, the first is far more important.  Use the information below to divine this information.

And, here is how he says to get “File Allocation Unit Size” and also “Starting Offset”:

GetJimmysNumbers

How to get the “Stripe Unit Size” is left to the inquiring mind, as that number does not come from the OS (Microsoft Windows, in this case) but from the storage vendor and each application.

Matrix – Vendor

Here are some numbers for various vendors and models:

Strip Unit Size
Vendor   Model          Strip Unit Size
EMC      CLARiiON       64 KB
NetApp   (all models)   4 KB
HP       (all models)   64 KB to 256 KB

Matrix – Application

Here are popular applications along with their Block Size.

Block Size
Vendor          Application   Block Size
Microsoft       SQL Server    64 KB
Hadoop          HBase         64 KB
MySQL/Percona   InnoDB        512 bytes

Matrix – Application – MySQL/InnoDB

http://www.percona.com/doc/percona-server/5.5/scalability/innodb_io_55.html?id=percona-server:features:innodb_io_55

This variable changes the size of transaction log records. The default size of 512 bytes is good in most situations. However, setting it to 4096 may be a good optimization with SSD cards. While settings other than 512 and 4096 are possible, as a practical matter these are really the only two that it makes sense to use. Clean restart and removal of the old logs is needed for the variable innodb_log_block_size to be changed.

Calculation

Calculation – NetApp

  1. NetApp’s default Strip Unit Size is 4K, and Microsoft’s SQL Server best practice suggests a File Allocation Unit Size of 64K.
  2. It does not take much calculation to see that 4K / 64K yields a fraction (1/16), not an integer.
  3. Please keep in mind that for NTFS the default File Allocation Unit Size is 4K, in which case one gets a whole number (Strip Unit Size / File Allocation Unit Size = 4K / 4K = 1).
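
The arithmetic above can be sketched in a few lines. This is only an illustration of the “must be an integer” test, using the sizes quoted in this post; the helper names are mine.

```python
from fractions import Fraction

KB = 1024


def unit_ratio(strip_unit_bytes: int, alloc_unit_bytes: int) -> Fraction:
    """Strip Unit Size divided by File Allocation Unit Size, as an exact fraction."""
    return Fraction(strip_unit_bytes, alloc_unit_bytes)


def is_whole(value: Fraction) -> bool:
    """True when the ratio is an integer, which is what the formula asks for."""
    return value.denominator == 1


# Sizes taken from the tables and the calculation in this post.
cases = {
    "NetApp 4K strip, 64K allocation unit": unit_ratio(4 * KB, 64 * KB),
    "NetApp 4K strip, 4K allocation unit": unit_ratio(4 * KB, 4 * KB),
    "EMC 64K strip, 64K allocation unit": unit_ratio(64 * KB, 64 * KB),
}

for label, value in cases.items():
    print(f"{label}: {value} (integer: {is_whole(value)})")
```

Using Fraction rather than floating-point division keeps the result exact, so the integer test does not depend on rounding.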

Microsoft – SQL Server – Storage – Disk Alignment (via DiskPart Scripting)

Introduction:

As part of a Storage I/O review, the question of whether our disks are properly aligned came back up.

The specific Knowledge Base article that I was pointed to is :

How to diagnose misaligned I/O on Windows hosts
https://kb.netapp.com/support/index?page=content&id=1010803

It is a recent NetApp article; its publish date is 2013.02.27.

Though a short article, there is a lot in it.

There are two areas that I would like to cover in this posting: Starting Offset and Partition Style.

Starting Offset:

Depending on the version of Windows, the Partition Style, and your LUN size, the gold standard for “Starting Offset” appears to vary a bit.

Here are the Numbers published by NetApp:

  • For Windows MBR, this number should be 32256; a 31.5 KB offset is used when the LUN is created with the Windows LUN type (31.5 * 1024 = 32256).
  • For Windows GPT, this number should be 65535 bytes for LUNs smaller than 4GB or 1048576 bytes for LUNs that are 4GB or larger.
  • For Windows 2008+, this number should be 65535 bytes for LUNs smaller than 4GB or 1048576 bytes for LUNs that are 4GB or larger.

In summary:

  • Windows MBR: 32256
  • Windows GPT (and Windows 2008+): 65535 if the LUN is smaller than 4 GB; otherwise 1048576

Detection:

One way to determine your Starting Offset:

  • Windows Management Instrumentation (WMI)

WMI

Syntax:


wmic partition get BlockSize, StartingOffset, Name

 

Interpretation:

OS version

Starting Offset

  • On the disk we are most interested in, Disk 4, our Starting Offset is 1048576.  That number matches up with NetApp guidance.
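
If you have many partitions to check, the comparison can be scripted. Below is a rough sketch that parses the text printed by the wmic command above. It assumes the default table layout with the columns in the order BlockSize, Name, StartingOffset; the sample text is made up for illustration and is not output from our servers.

```python
import re

# One data row: a number, a name that may contain spaces and commas, a number.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s*$")


def parse_wmic_partitions(text: str) -> list:
    """Turn wmic's table output into a list of dicts; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "block_size": int(match.group(1)),
                "name": match.group(2),
                "starting_offset": int(match.group(3)),
            })
    return rows


# Hypothetical sample, shaped like the wmic output (values are illustrative).
SAMPLE = """BlockSize  Name                   StartingOffset
512        Disk #0, Partition #0  32256
512        Disk #4, Partition #0  1048576
"""

EXPECTED_OFFSET = 1048576  # NetApp figure quoted above for LUNs of 4GB or larger

for row in parse_wmic_partitions(SAMPLE):
    verdict = "matches" if row["starting_offset"] == EXPECTED_OFFSET else "differs from"
    print(f'{row["name"]}: {row["starting_offset"]} {verdict} {EXPECTED_OFFSET}')
```

On a real host you would feed the function the captured output of the wmic command instead of SAMPLE.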

Partition Style:

Microsoft supports a couple of partition styles, MBR and GPT.

  • MBR (Master Boot Record): the legacy partition style
  • GPT (GUID Partition Table): the newer partition style

There are a couple of ways to determine your partition style:

  • GUI – Disk Management
  • DiskPart / list disk

Partition Style – Detect via Disk Management

Here are the steps:

  • Launch “Computer Management”
  • On the left panel, navigate to “Storage” \ “Disk Management”
  • The list of available disks is displayed on the right panel
  • On the right panel, select the disk you are interested in; make sure you select the physical disk, and not the logical disk
  • Right-click on your selected physical disk, and select “Properties” from the drop-down menu
  • Notice that the name of the window that shows up will vary based on whether this is a local disk or a SAN disk, and on the disk vendor
  • Proceed to the “Volumes” tab

Here is what shows up for us:

NetApp Device Properties

Partition Style – Via DiskPart / list disk

Here are the steps:

  • Launch OS Shell
  • Start diskpart interactively (diskpart)
  • Issue “list disk”

DiskPart - List disk

 

To determine which disks are GPT, follow the GPT column.

If a disk has an asterisk in that column, it is GPT; otherwise, it is not.

The two checks we performed conclusively affirm that our “Disk 4” is in fact MBR.  How could this be?

So I went back and looked at our scripts:


select disk 4
create partition primary align=1024
assign letter=V
format fs=ntfs unit=64k label="Disk - Temp" quick

The script looks good.

 

I Googled some more and found out what is wrong:

  • The “create partition” syntax does not allow us to directly set the Partition Style

 

To correct your script for the future, have it resemble something along the lines of:


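rem WARNING: "clean" destroys everything on the selected disk
select disk 4
clean
rem Convert the (now empty) disk to the GPT partition style
convert gpt
offline disk
online disk
attribute disk clear readonly
rem Align the partition on a 1024 KB boundary
create partition primary align=1024
assign letter=V
format fs=ntfs unit=64k label="Disk - Temp" quick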

Please exercise extraordinary care when preparing to run the script above:

  • Make sure you have selected the right disk (select disk 4)
  • Notice the use of “clean”: it destroys everything on the disk
  • “convert gpt” converts the disk to GPT; the default is MBR
  • Notice the use of “attribute disk clear readonly”: it clears the disk’s read-only attribute so that the disk can be partitioned and formatted

Why GPT:

Performance:

Is there a performance penalty with choosing either MBR or GPT as your Partition Style? I Googled for help, and there does not appear to be.

Please keep in mind that the partition style does not affect the way your data is written out; it has more to do with your partition table.

And, Microsoft will not let you get away with a wrong choice (i.e. MBR) for disks bigger than 2TB.
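
As a quick back-of-the-envelope check of where the 2TB ceiling comes from: MBR partition entries store sector numbers in 32 bits. The sketch below assumes 512-byte logical sectors, which is my assumption and not something measured on our LUNs.

```python
def mbr_limit_bytes(sector_bytes: int = 512) -> int:
    """Largest size addressable with 32-bit sector numbers (assumed sector size)."""
    max_sectors = 2 ** 32  # MBR partition entries hold 32-bit sector values
    return max_sectors * sector_bytes


limit = mbr_limit_bytes()
print(limit, "bytes =", limit // 2 ** 40, "TiB")
```

With a different logical sector size the ceiling moves accordingly, which is why the sector size is a parameter here.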

Technical: NetApp – MPIO – Path Details

As part of NetApp diagnostics, you might need to dig deep into which paths are actually being used.

MPIO Path Details

  1. Launch Computer Management
  2. In the left panel, go to Storage \ Data OnTap(R) DSM Management \ Disk Management
  3. In the right panel, select the disk
  4. Right-click on the Nth disk and, in the ensuing drop-down menu, select “Properties”
  5. The “NETAPP LUN Multi-Path Disk Device Properties” window appears
  6. The paths are listed in the “This device has the following paths” section
  7. Double-click on the path you want to dig into …
  8. The “MPIO Path Details” window appears

NetApp - MPIO Path Details

The following areas are displayed:

  • Number of Reads
  • Number of Writes
  • Bytes Read
  • Bytes Written

 

Note that in the instructions above, you cannot select the “Logical Disk”; select the physical disk instead.

 

Computer Management

 

So in the screen above, please select “Disk 4” and right click on that selection.

 

Microsoft – SQL Server – Datafiles – Log File Write Patterns

One of the first emails I received this morning detailed one of my wrong assumptions about determining NetApp LUN (mis)alignment:

NetApp Lun Aligning
https://danieladeniji.wordpress.com/2012/12/13/netapp-lun-aligning/

And, so I read up some more and tried augmenting that blog post with any new data on the Net.

I still came away with something that I did not fully disclose earlier: seemingly, LUNs dedicated to MS SQL Server log files register as misaligned when ‘profiled’ within NetApp.

The specific NetApp commands for validating alignment:

  • priv set diag; lun alignment show <lun-name>
  • lun show -v <lun-name>

So what to do, but take to Google.  It is a bit difficult to come up with a good Google search term, but I managed to do OK.

Here are some relevant entries:

Since both Kendra (on May 29th, 2012) and MS Premier SQL Server Engineering (on May 23rd, 2013) say to use SysInternals’ Process Monitor, and I am a big fan of Mark Russinovich, I took to it.

In SysInternals – Process Monitor, filtered for :

  • Process Name –> sqlservr.exe
  • Operation –> WriteFile

Here is SysInternal’s Process Monitor results page:

MS SQL Server - Log File - Write Patterns

This much is obvious:

  • In the “Details” column, the length of most log file writes is 61,440 bytes (60 KB)
  • This holds true for entries written to both our local drive (D:) and our network drive (Y:)
  • True to how MS SQL Server log files are written, our entries are written sequentially: adding the length to the offset gives the offset on the next line, i.e. 127,044,608 (offset) + 61,440 (length) = 127,106,048 (the offset of the next write)
  • Other important facts are the I/O flags: Non-cached and Write-Through.  SQL Server’s writes are not cached; they are persisted directly to disk
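
The offset arithmetic in the list above is easy to check, and it is worth looking at the same numbers against a 4K block. The sketch below is only an illustration: the 4096-byte block size is my assumption (taken from the 4k NetApp figure quoted elsewhere in these notes), and the offsets are positions within the log file, not positions on the LUN.

```python
def describe_write(offset: int, length: int, block: int = 4096) -> dict:
    """Summarise one WriteFile entry from Process Monitor (assumed 4K block)."""
    return {
        "next_offset": offset + length,      # where a sequential writer continues
        "offset_remainder": offset % block,  # 0 would mean the write starts on a block boundary
        "length_remainder": length % block,  # 0 means the write is a whole number of blocks long
    }


# Offset and length taken from the Process Monitor capture described above.
summary = describe_write(127_044_608, 61_440)
print(summary)
```

The length is a whole number of 4K blocks, but the starting offset within the file is not on a 4K boundary, which at least fits with the filer flagging these writes.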

The fact that log entries are persisted directly to disk might explain what we see on our SAN.  The SAN reports that our writes are misaligned: it seems to expect writes to arrive in bursts, but we write out to the LUN on each transaction commit.

In conclusion, it appears that LUNs holding SQL Server log files will, more likely than not, be reported as misaligned.

Addendum

Addendum – 2013.04.02

Data Storage for VDI – Part 8 – Misalignment
http://storagewithoutborders.com/2010/07/21/data-storage-for-vdi-part-8-misalignment/

On March 3rd, 2013, John Martin said…

Partial writes on an Oracle redo log file (or any other file which is written to sequentially) are handled pretty well by the existing partial write mechanisms inside of ONTAP. For the most part these are held in memory until the subsequent writes to the log file come in and these are combined internally into a single 4K block. The real killer is partial overwrites which I think I covered off in a blog post here

 

NetApp – Data Collection Tool for Windows (ONTAPWinDC.exe)

Microsoft Visual C++ 2008

Ran OnTapWinDC, but unfortunately received an error message:

The application has failed to start because its side-by-side configuration is incorrect. Please see the application
event log or use the command-line sxstrace.exe tool for more detail.

To fix this, download and install the “Microsoft Visual C++ 2008 SP1 redistributable package”. For x64 versions, download the following:

http://www.microsoft.com/downloads/details.aspx?familyid=BA9257CA-337F-4B40-8C14-157CFDFFEE4E&displaylang=en

I installed “Microsoft Visual C++ 2008 SP1”.  And, re-ran OnTapWinDC.

And, my God it launched successfully.

But when I tried gathering data, it showed the error listed below: Authentication failed <Filer Name>

NetApp - OnTapWinDC

I spent all evening trying to fix this one. Missed the last bus and everything in between.

I knew I had a problem with the network firewall, and not just with simple user-name and password authentication.

I tried setting up SSH tunneling with free SSH servers for MS Windows.  I had one working with Cygwin, but was not in the mood to set up another one.

In the end, I resorted to using a pre-existing Linux one.

To create the SSH Tunnel that will facilitate SSH Access to the Filers:

 

Syntax:


  plink [SSHServer] -ssh -P 22 -C -L [BridgeAddress:]BridgePort:DestinationAddress:22

Example:


 plink SSHServer -ssh -P 22 -C -L 127.0.0.1:22:FilerHRDB:22

To ensure that your connections are set up, use putty and see if you are able to connect to the Filers:

  • Run putty
  • Target hostname : localhost
  • Target Port number: 22

Putty screen:

putty - localhost - port 22
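
If you would rather script this check than open putty, a plain TCP connect to the local end of the tunnel tells you whether anything is listening there. This is a minimal sketch; port_is_open is my own helper name, and it only proves that the port accepts connections, not that the Filer is answering on the far side.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, and so on.
        return False


if __name__ == "__main__":
    # Local end of the tunnel created by the plink example above.
    print("tunnel listening:", port_is_open("127.0.0.1", 22))
```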

The system came back with the Filer asking me to authenticate.  I should have enlisted the help of the best NetApp engineer in the world and configured the Filer’s SSH key authentication; then I would not have to enter passwords.

I entered the password and we were good.

Issue a couple of NetApp Host Commands:


version

uptime

netapp - validation

Run netstat -ano to review your Network Connections:

netstat -ano | find ":22 "

Results

Netstat -ano | find ":22 "

netstat - port - 22 - port 22

And, then returned to the tool:

  • Filer Name/IP –> Make sure to use localhost, as the SSH tunnel will redirect the connection

NetApp - OnTapWinDC - User Authentication - post ssh

But, still no luck.  The same error message: Authentication Failed.

I am here thanking Craig:

Gotcha with NetApp’s OnTapWinDcTool
http://www.humblecraig.com/?tag=ontapwindc

Basically, he says that the tool actually connects to the Filer over HTTP, and not SSH. To fix the problem, connect to your Filer and enable HTTP:

options httpd.admin.enable on

To facilitate HTTP Access over TCP/IP Tunneling:


  plink SSHServer -ssh -P 22 -C -L 127.0.0.1:80:FilerHRDB:80

To validate the tunnel, issue:

   netstat -ano | find ":80"

And, you should see at minimum the following entries:

netstat - port - 80

The data is easy enough to read:

  • Column – 1 – Protocol {TCP}
  • Column – 2 – Local IP Address and Port Number {localhost:80}
  • Column – 3 – Foreign (destination) IP Address and Port Number
  • Column – 4 – State {LISTENING}
  • Column – 5 – Process ID {13080}

The Process ID is important.  To terminate the SSH connection, issue a kill request directed at that process ID (for example, taskkill /PID 13080).
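
To pull the process ID out without reading the columns by eye, the netstat line can be split on whitespace. The sketch below assumes the five-column TCP layout described above; the sample line is illustrative (only the port and the PID 13080 come from our output).

```python
def parse_netstat_line(line: str):
    """Split one 'netstat -ano' TCP line into named fields; return None for other lines."""
    parts = line.split()
    if len(parts) != 5 or parts[0].upper() != "TCP":
        return None
    protocol, local, foreign, state, pid = parts
    return {
        "protocol": protocol,
        "local": local,
        "foreign": foreign,
        "state": state,
        "pid": int(pid),
    }


# Illustrative line shaped like the netstat output for the tunnel on port 80.
sample = "  TCP    127.0.0.1:80    0.0.0.0:0    LISTENING    13080"
entry = parse_netstat_line(sample)
print(entry["local"], entry["state"], entry["pid"])
```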

Once we had the SSH tunnel to the Filer’s port 80 in place, we were good.

NetApp – Performance and Statistics Collector (PerfStats)

NetApp – Performance and Statistics Collector (PerfStats) – version 7

Command Line Parameters

-f

[-f controllername[,controllername1,controllername2,...]]
  • Name of the filer (controller); separate multiple names with commas

 

-t

 [-t time] (sample time per iteration, default 2)
  • Duration of each iteration in minutes

-i,m

[-i n[,m]] (repeat n times with m minutes between samples, 
                     defaults: n=1,m=0)
  • Number of iterations (n) and wait time in minutes between iterations (m)
  • Make sure that there are no spaces between the two numbers
  • The defaults are n=1 and m=0; that is, 1 iteration and no wait time
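
When planning a collection window, it helps to estimate how long a run will take from the -t and -i values. This is a rough sketch under my own assumptions: the wait happens only between iterations, and connection and setup time is ignored. It is not something perfstat itself reports.

```python
def perfstat_minutes(sample_minutes: int = 2, iterations: int = 1, wait_minutes: int = 0) -> int:
    """Estimated run time: n samples of t minutes plus (n - 1) waits of m minutes."""
    waits = max(iterations - 1, 0)
    return iterations * sample_minutes + waits * wait_minutes


# Equivalent of "-t 2 -i 4,1": four 2-minute samples with 1 minute between them.
print(perfstat_minutes(sample_minutes=2, iterations=4, wait_minutes=1), "minutes")
```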

-l

 [-l login[:password]] (rsh/ssh login and password for rsh only)
  • Login Account to connect to Filer

-S pw:

[-S pw:|kf:]

Processing Steps:

Versioning

  • The version is stated
  • In our case, the information stated is “PerfStat v7.38 (10-2012)”

Begin Iteration <Iteration>

  • Indicates the beginning of the Iteration
  • In our case, the information stated is “Begin Iteration <nth>”


Checking filer <filer>

  • Checking filer …. Establishes a connection with the filer noted
  • In our case, the information stated is “Checking filer filerHR”


Prestats on <filer>; OS: ONTAP<version>

  • This step connects to the Filer and kicks off statistics gathering on the Filer
  • In our case, the information stated is “Prestats on filerHR; OS: ONTAP8.0.2”


Sleep for <performance duration> minute(s)…

  • Once performance gathering is initiated on the Filer, this step waits for the iteration duration
  • In our case, the information stated is “Sleep for 2 minutes”


Poststats on <filer>; OS: ONTAP<version>

  • This step connects to the Filer and “concludes” statistics gathering on the Filer
  • In our case, the information stated is “Poststats on filerHR; OS: ONTAP8.0.2”


End Iteration <Iteration>

  • Indicates the completion of the Iteration
  • In our case, the information stated is “End Iteration <nth>”

 


Sleep <n> seconds

  • This indicates how long to wait between iterations
  • In our case, the information stated is “Sleeping 60 seconds”

Sample Code (baseline):



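rem Usage: getNetAppFilerPerfData filerName
if "%~1"=="" (echo Usage: %~n0 filerName & exit /b 1)

rem Create the output folder on first use
if not exist perfData mkdir perfData

rem Build a yyyymmdd stamp; assumes "date /t" prints something like "Fri 03/01/2013"
for /F "tokens=2,3,4 delims=/ " %%i in ('date /t') do set d=%%k%%i%%j
rem Build an hhmm stamp from the "current time" line printed by "time"
for /F "tokens=5-8 delims=:. " %%i in ('echo.^| time ^| find "current" ') do set t=%%i%%j
rem Left-pad single-digit hours with a zero, then keep the first four characters
set t=%t%_
if "%t:~3,1%"=="_" set t=0%t%
set t=%t:~0,4%
set "fname=perfData\%1__%d%%t%.perfdata"
echo %fname%

Time /T

rem 4 iterations of 2 minutes each, with 1 minute between iterations
perfstat -f %1 -t 2 -i 4,1 -l root  -S pw:rootpwd > %fname%

Time /T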

Sample invocation:



  Syntax:

     getNetAppFilerPerfData <filerName>

  Sample:

    getNetAppFilerPerfData filerHR

Output:

PerfStats - Output - 20130301
