NetApp Architecture

The NetApp architecture consists of hardware, the Data ONTAP operating system and the network. I have already shown you a diagram of a common NetApp setup, but now I will go into more detail.

Hardware

NetApp has a range of filers to fit most companies and budgets. The filer itself may have the following

The filer can be attached to a number of disk enclosures (shelves), which expand the storage available. These disk enclosures are attached via FC. As mentioned above, the disk enclosures can support the following disks

FCP: Fibre Channel disks, very fast but expensive
SAS: Serial Attached SCSI disks, again very fast but expensive; due to replace the FC disks
SATA: Serial ATA disks, slower but cheaper; ideal for QA and DEV environments

One note to remember is that the filer that connects to the top module of a shelf controls (owns) the disks in that shelf under normal circumstances (i.e. non-failover).

The filers can make use of VIFs (Virtual Interfaces), which come in two flavors

Single-mode VIF
  • 1 active link, others are passive, standby links
  • Failover when link is down
  • No configuration on switches
Multi-mode VIF
  • Multiple links are active at the same time
  • Load balancing and failover
  • Load balancing based on IP address, MAC address or round robin
  • Requires support & configuration on switches
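
To make the difference between the two VIF flavors concrete, here is a minimal Python sketch of how a link could be chosen in each mode. This is only an illustration of the ideas in the list above: the function names are my own, and the CRC-based hash is an assumption, not the algorithm Data ONTAP actually uses.

```python
import zlib
from itertools import count

def single_mode_pick(links_up):
    """Single-mode: use the first link that is up; the rest stay on standby."""
    for index, up in enumerate(links_up):
        if up:
            return index
    return None  # every link is down

def multi_mode_pick_by_ip(links_up, src_ip, dst_ip):
    """Multi-mode, IP based: hash the address pair onto one of the active links.

    The CRC32 hash is an assumption made for illustration only.
    """
    active = [i for i, up in enumerate(links_up) if up]
    if not active:
        return None
    key = f"{src_ip}-{dst_ip}".encode()
    return active[zlib.crc32(key) % len(active)]

def make_round_robin(links_up):
    """Multi-mode, round robin: hand out the active links in turn."""
    active = [i for i, up in enumerate(links_up) if up]
    counter = count()

    def pick():
        return active[next(counter) % len(active)] if active else None

    return pick
```

With three links where the middle one is down, `make_round_robin([True, False, True])` returns a function that hands out links 0 and 2 alternately, while `single_mode_pick` on the same input always returns link 0.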

Software

I have already touched on the operating system, Data ONTAP. The latest version is currently version 8, which fully supports grid technology (GX in version 7). It is fully compatible with Intel and AMD architectures, supports 64-bit, and borrows ideas from FreeBSD.

All additional NetApp products are activated via licenses. Some require the filer to be rebooted, so check the documentation.

Management of the filer can be accessed via any of the following

Storage Terminology

When talking about storage you will probably come across two solutions

NAS
(Network Attached Storage)

NAS storage speaks to a file, so the protocol is a file-based one. Data is made to be shared; examples are

  • NFS (Unix)
  • CIFS or SMB (Windows)
  • FTP, HTTP, WebDAV, DAFS
SAN
(Storage Area Network)

SAN storage speaks to a LUN (Logical Unit Number) and accesses it via data blocks. Sharing is difficult; examples are

  • SCSI
  • iSCSI
  • FCAL/FCP
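
The file versus block distinction can be sketched in a few lines of Python. This is a toy model and not a real protocol implementation: the class names, the tiny block size and the path used in the usage note are all hypothetical.

```python
BLOCK_SIZE = 4  # bytes per block, kept tiny for illustration

class NasShare:
    """File-level access: the storage server owns the file system."""

    def __init__(self):
        self.files = {}  # path -> file contents

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]

class Lun:
    """Block-level access: the host sees numbered blocks and nothing else."""

    def __init__(self, block_count):
        self.blocks = [bytes(BLOCK_SIZE)] * block_count  # all zeroes

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly one block long")
        self.blocks[lba] = data

    def read_block(self, lba):
        return self.blocks[lba]
```

A NAS client asks for a name, for example `share.read("/vol/docs/a.txt")`, whereas a SAN host asks for a block number, for example `lun.read_block(3)`, and has to put its own file system on top of the LUN to make sense of the blocks.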

A number of terms are associated with the above solutions; I have already discussed some of them in my EMC section

| Terminology | Solution | Description |
| --- | --- | --- |
| share/export | NAS | CIFS servers make data available via shares; a Unix server makes data available via exports |
| Drive mapping/mounting | NAS | CIFS clients typically map a network drive to access data stored on a storage server; Unix clients typically mount the remote resource |
| LUN | SAN | Logical Unit Number; basically a disk presented by a SAN to a host. When attached it looks like a locally attached disk |
| Target | SAN | The machine that offers a disk (LUN) to another machine, in other words the SAN |
| Initiator | SAN | The machine that expects to see the disk (LUN), i.e. the host OS; appropriate initiator software will be required |
| Fabric | SAN | One or more fibre switches with targets and initiators connected to them are referred to as a fabric. Cisco, McData and Brocade are well-known fabric switch makers. See my EMC architecture section for more details |
| HBA | SAN | Host Bus Adapter, the hardware that connects the server or SAN to the fabric switches. There are also iSCSI HBAs |
| Multipathing (MPIO) | SAN | The use of redundant storage network components responsible for the transfer of data between the server and the storage (cabling, adapters, switches and software) |
| Zoning | SAN | The partitioning of a fabric into smaller subsets to restrict interference, add security and simplify management; it is like VLANs in networking. See my EMC zoning section for more details |
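
Zoning is easier to picture with a small example. The Python sketch below treats each zone as a set of port names and lets an initiator see a target only when both sit in the same zone. The zone and port names are made up for illustration, and real switches identify members by WWN or switch port rather than by friendly names.

```python
def can_see(zones, initiator, target):
    """Return True if some zone contains both the initiator and the target."""
    return any(
        initiator in members and target in members
        for members in zones.values()
    )

# Hypothetical fabric: two hosts, two filer ports, two zones.
zones = {
    "zone_a": {"host1_hba0", "filer1_0a"},
    "zone_b": {"host2_hba0", "filer1_0b"},
}
```

Here `can_see(zones, "host1_hba0", "filer1_0a")` is true, but host1 cannot see `filer1_0b` because no zone contains both, which is the same isolation a VLAN gives you on an Ethernet network.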

Below is a typical SAN setup using NetApp hardware

NetApp Terminology

Now that we know how a NetApp is configured from a hardware point of view, we need to know how to present the storage to the outside world. First, some NetApp terminology explained

Disk

This is the physical disk itself. Normally the disk will reside in a disk enclosure, and it will have a pathname like 2a.17

  • 2a = SCSI adapter
  • 17 = disk SCSI ID
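
If you script against disk names, the two parts can be split apart as below. This is a minimal sketch that assumes the simple `adapter.id` form shown above; the function name is my own and other naming schemes are not handled.

```python
def parse_disk_name(name):
    """Split a disk name such as '2a.17' into (adapter, disk_id)."""
    adapter, sep, disk_id = name.partition(".")
    if not sep or not adapter or not disk_id.isdigit():
        raise ValueError(f"unexpected disk name: {name!r}")
    return adapter, int(disk_id)
```

For the example above, `parse_disk_name("2a.17")` returns `("2a", 17)`.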

Any disk that is classed as a spare can be used in any group to replace a failed disk.

Disks are assigned to a specific pool. Parity disks do not contain any data.

RAID Group (Pool)

Normally there are three pools: 0, 1 and spare

  • 0 = normal pool
  • 1 = mirror pool (if SyncMirror is enabled)
  • spare = spare disks that can be used for growth and replacement of failed disks
Aggregate

A collection of disks that uses one of the RAID levels below. An aggregate can contain up to 1176 disks, and you can have many aggregates, each with a different RAID level. An aggregate can contain many volumes (see volumes below).

  • RAID-4
  • RAID-DP (RAID-6) better fault tolerance
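
RAID-4 keeps the parity for each stripe on a dedicated disk, and that parity is simply the XOR of the data blocks in the stripe. The Python sketch below shows why one lost disk can be rebuilt from the survivors. The stripe contents are made up, and it covers only the single-parity case; RAID-DP adds a second, diagonal parity that is not modelled here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data_disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # made-up stripe contents
parity = xor_blocks(data_disks)                       # lives on the parity disk

# Lose the second data disk, then rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
```

After this runs, `rebuilt` equals the contents of the lost disk, `data_disks[1]`.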

One point to remember is that an aggregate can grow but cannot shrink. The disadvantage of RAID-4 is that a bottleneck can occur on the dedicated parity disk, which is normally the first disk to fail because it is used the most. However, the NVRAM helps out by only writing to the disks every 10 seconds or when the NVRAM is 50% full.
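
The NVRAM rule amounts to a simple "whichever comes first" check. Here is a hedged Python sketch of that decision; the function and parameter names are my own, and the real logic inside Data ONTAP is certainly more involved.

```python
FLUSH_INTERVAL_S = 10   # from the text: write to the disks every 10 seconds
FLUSH_FILL_PCT = 50     # from the text: or when the NVRAM is 50% full

def should_flush(seconds_since_flush, bytes_logged, nvram_bytes):
    """Return True when either flush condition from the text is met."""
    timed_out = seconds_since_flush >= FLUSH_INTERVAL_S
    half_full = bytes_logged * 100 >= FLUSH_FILL_PCT * nvram_bytes
    return timed_out or half_full
```

A quiet filer flushes on the timer, while a busy one flushes as soon as half of the NVRAM is in use, so the parity disk sees fewer, larger writes either way.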

Plex

When an aggregate is mirrored it will have two plexes; when thinking of plexes, think of mirroring. A mirrored aggregate can be split into two plexes.

Volume (Flexible)

This is more or less like a traditional volume in other LVMs. It is a logical space within an aggregate that contains the actual data, and it can be grown or shrunk as needed.

LUN

The Logical Unit Number is what is presented to the host to allow access to the volume.
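
The aggregate and volume relationship, including the grow and shrink rules, can be summarised as a small Python data model. This is only a sketch: the class and field names are hypothetical, and it does not check that the volumes actually fit inside the aggregate.

```python
from dataclasses import dataclass, field

@dataclass
class Volume:
    name: str
    size_gb: int

    def resize(self, new_size_gb):
        """Flexible volumes can be grown or shrunk."""
        if new_size_gb <= 0:
            raise ValueError("size must be positive")
        self.size_gb = new_size_gb

@dataclass
class Aggregate:
    name: str
    disk_count: int
    volumes: dict = field(default_factory=dict)

    def add_disks(self, extra):
        """Aggregates can grow but cannot shrink."""
        if extra <= 0:
            raise ValueError("an aggregate cannot shrink")
        self.disk_count += extra

    def create_volume(self, name, size_gb):
        """Create a volume inside this aggregate and return it."""
        self.volumes[name] = Volume(name, size_gb)
        return self.volumes[name]
```

For example, `Aggregate("aggr1", 16).create_volume("vol1", 100)` gives a volume that can later be resized down to 50 GB, whereas calling `add_disks(-2)` on the aggregate raises an error.
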
WAFL

Write Anywhere File Layout is the filesystem used; it uses inodes just like Unix. Disks are not formatted, they are zeroed.

By default WAFL reserves 10% of disk space (unreclaimable)

Snapshot

A frozen, read-only image of a volume or aggregate that reflects the state of the file system at the time the snapshot was created. Snapshot features are

  • Up to 255 snapshots per volume
  • can be scheduled
  • Maximum space occupied can be specified (default 20%)
  • File permissions are handled
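
To see what the default percentages mean in practice, here is a rough Python calculation. It assumes a single volume spanning all of the space and applies the 20% snapshot limit to whatever is left after the 10% WAFL reserve. That stacking is my simplification for illustration, not an exact description of how a filer accounts for space.

```python
WAFL_RESERVE_PCT = 10   # from the text: 10% of disk space, unreclaimable
SNAP_RESERVE_PCT = 20   # from the text: default snapshot space limit

def space_budget(raw_gb, snap_reserve_pct=SNAP_RESERVE_PCT):
    """Return (after_wafl_gb, snapshot_gb, data_gb) for a given raw size."""
    after_wafl = raw_gb * (100 - WAFL_RESERVE_PCT) / 100
    snapshot = after_wafl * snap_reserve_pct / 100
    return after_wafl, snapshot, after_wafl - snapshot
```

Under these assumptions, 1000 GB of raw space leaves 900 GB after the WAFL reserve, of which 180 GB is set aside for snapshots and 720 GB is left for data.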

Snapshots in the NetApp world are very fast. Basically a snapshot records all the blocks that are associated with the files, and this data is never actually changed: if a block is changed a new block is written, and the snapshot still points to the old block. NetApp has two products, SnapDrive and SnapManager, that deal with consistency problems where data has not actually been written to disk but is cached in memory buffers; you might want to take a look at these products.
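
The pointer behaviour described above can be mimicked with a toy Python model. It is a sketch only: the class, its fields and the snapshot name in the usage note are hypothetical, and a real filer tracks blocks through inodes rather than a dictionary.

```python
class ToyVolume:
    """Toy volume: the active file system maps file names to block IDs."""

    def __init__(self):
        self.blocks = {}      # block ID -> contents, never overwritten
        self.active = {}      # file name -> block ID
        self.snapshots = {}   # snapshot name -> frozen copy of the map
        self._next_id = 0

    def write(self, name, data):
        # A change always goes to a new block; old blocks stay untouched.
        self.blocks[self._next_id] = data
        self.active[name] = self._next_id
        self._next_id += 1

    def snapshot(self, snap_name):
        # Only the pointers are copied, which is why this is fast.
        self.snapshots[snap_name] = dict(self.active)

    def read(self, name, snap_name=None):
        table = self.snapshots[snap_name] if snap_name else self.active
        return self.blocks[table[name]]
```

If you write a file, take a snapshot called `hourly.0` and then write the file again, `read("a")` returns the new data while `read("a", "hourly.0")` still returns the old data, because the snapshot keeps pointing at the original block.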

There are three additional replication products that you can use

SyncMirror
  • real time replication of data
  • maximum distance of up to 35km
  • Fibre Channel or DWDM protocol
  • Synchronous

SyncMirror is used primarily for data redundancy

SnapMirror
  • long distance DR data consolidation
  • no limit on distance
  • IP protocol (WAN/LAN)
  • Asynchronous mirror (> 1 minute)

SnapMirror is used primarily for disaster recovery

SnapVault
  • disk-to-disk backup, restore HSM
  • no limit on distance
  • IP protocol (WAN/LAN)
  • Asynchronous mirror (> 1 hour)

SnapVault is used primarily for backup/restore
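
The three products can be compared by distance and by how far behind the copy is allowed to be. The Python sketch below encodes the figures from the lists above. It reads "> 1 minute" and "> 1 hour" as the smallest lag each product gives you, which is my interpretation, and it ignores what each product is meant for (redundancy, disaster recovery or backup).

```python
PRODUCTS = [
    # name, maximum distance in km (None = no limit), minimum lag in seconds
    ("SyncMirror", 35, 0),
    ("SnapMirror", None, 60),
    ("SnapVault", None, 3600),
]

def candidates(distance_km, tolerable_lag_s):
    """List the products whose distance limit and minimum lag fit the need."""
    return [
        name
        for name, max_km, min_lag_s in PRODUCTS
        if (max_km is None or distance_km <= max_km)
        and min_lag_s <= tolerable_lag_s
    ]
```

For a site 500 km away that can tolerate being five minutes behind, `candidates(500, 300)` returns only SnapMirror. For a site 10 km away that must stay fully in step, `candidates(10, 0)` returns only SyncMirror.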