Avamar sizing brain dump

Avamar DS18 = Utility Node + Spare Node + 16 Active Data Nodes

For a 3.3 TB Gen-3 Grid

  • Raw Capacity ~102 TB
  • Spare Node ~6 TB
  • RAID5 ~15 TB
  • Checkpoint  / GC ~28 TB
  • RAIN ~3 TB
  • Available for Active Backups ~49 TB
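As a sanity check, the available figure is just straight subtraction of the overheads. A quick sketch using the approximate numbers above (nothing official):

```python
# Rough sanity check of the ~49 TB usable figure above (illustrative only;
# the overhead values are the approximate estimates listed, not official numbers).
raw_tb        = 102   # 16 active data nodes + spare, 3.3 TB Gen-3 grid
spare_tb      = 6     # spare node capacity held in reserve
raid_tb       = 15    # RAID overhead
checkpoint_tb = 28    # checkpoint / garbage-collection reserve
rain_tb       = 3     # RAIN parity overhead

available_tb = raw_tb - spare_tb - raid_tb - checkpoint_tb - rain_tb
print(f"Available for active backups: ~{available_tb} TB")  # ~50 TB, in line with the ~49 TB above
```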

RAID Configuration:

  • RAID 1 for 3.3 TB node
  • RAID 5 for 2 TB nodes
  • RAID 1 for 1 TB nodes

How to calculate the required capacity:

  • Seed (Initial backups)
    + Daily Change * Retention in Days
    + RAIN
    = GSAN Utilization

  • Need min available space for 4 checkpoints
  • 3 checkpoints maintained by default
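A minimal sketch of that capacity math; the seed size, change rate, retention, and RAIN figure below are placeholders, not recommendations:

```python
# Hypothetical inputs -- substitute numbers from your own assessment.
seed_tb         = 10.0   # initial (deduped) backup size
daily_change_tb = 0.03   # new unique data stored per day
retention_days  = 30
rain_tb         = 1.0    # assumed RAIN overhead for this data set (config-dependent)

gsan_utilization_tb = seed_tb + daily_change_tb * retention_days + rain_tb
print(f"Estimated GSAN utilization: {gsan_utilization_tb:.1f} TB")
# On top of this, keep enough free space for 4 checkpoints (3 are kept by default).
```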

Data Gathering

Note: agent-only vs. data store depends on the desired RTO

  • xfer_rate = network rate (Gb/hr) * 0.70
  • data_size = total size of the data set to be backed up
  • restore_time = data_size * 0.65 / xfer_rate

If RTO < restore_time, then data store; else agent only.
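Here is that decision logic as a quick sketch, treating the link rate in GB/hr for simplicity; the client numbers are made up:

```python
def restore_time_hours(data_size_gb, link_rate_gb_per_hr):
    """Restore-time estimate using the factors from the bullets above."""
    xfer_rate = link_rate_gb_per_hr * 0.70     # usable rate = 70% of line rate
    return data_size_gb * 0.65 / xfer_rate

# Hypothetical remote client: 2 TB data set, ~450 GB/hr line rate, 8-hour RTO.
data_size_gb, link_rate_gb_per_hr, rto_hours = 2000, 450, 8

t = restore_time_hours(data_size_gb, link_rate_gb_per_hr)
print(f"Estimated restore time: {t:.1f} hours")
print("Deploy a local data store" if rto_hours < t else "Agent-only is sufficient")
```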

Always use 3.3 TB nodes when configuring unless additional nodes are required to increase the ingestion rate.

Use the default de-dupe rate unless a POC or assessment has been performed.

Sizing Considerations:

  • Data Types
    • File Systems
    • Databases
    • Large Clients > 2 TB
    • Dense File Systems (excluding EMC Celerra and NetApp)
  • Organic Growth
  • RTO
  • Replication Window
  • Maintenance Window

Non-RAIN nodes must be replicated; this includes single-node Avamar deployments and 1×2 configurations (1 utility node and 2 data store nodes, which is a non-RAIN config).

**** Remember this: As a general rule it seems that transactional databases are better suited to being backed up to Data Domain and NOT to Avamar, as hashing database data is generally very slow.

VMware (specifically using the VMware Storage APIs) and CIFS are well suited for Avamar.

Data save rates:

  • 100 – 150 GB/hr per avtar stream on latest server types
    • Note: it is possible to launch multiple avtar daemons with some tweaking, but an out-of-the-box install only launches a single avtar process.
  • VM guest backups can be slower
  • Default assumption is that the chunk-compress-hash process runs at a rate of 100 GB/hr
    • This is the process that bottlenecks database backups (ideally the avtar stream rate should match the chunk-compress-hash rate)

Scan rate:

  • ~ 1 million files per hour
    • 1 TB of file data will take about 1 hour to back up
    • 1 TB DB will take ~ 10 hours to complete
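Putting the save rates and scan rate together, a back-of-the-envelope backup-window estimate might look like this (the client mix below is hypothetical):

```python
# Back-of-the-envelope backup window using the save and scan rates above.
SCAN_RATE_FILES_PER_HR = 1_000_000   # ~1 million files/hour
FILE_RATE_GB_PER_HR    = 1000        # ~1 TB of file data per hour
DB_RATE_GB_PER_HR      = 100         # chunk-compress-hash bottleneck for databases

# Hypothetical client mix.
file_data_gb, file_count = 2000, 1_500_000
db_data_gb = 500

file_hours = max(file_data_gb / FILE_RATE_GB_PER_HR,    # data-driven estimate
                 file_count / SCAN_RATE_FILES_PER_HR)   # scan-driven estimate
db_hours = db_data_gb / DB_RATE_GB_PER_HR
print(f"File systems: ~{file_hours:.1f} h, databases: ~{db_hours:.1f} h")
```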

Performance:

  • 1 TB/hr per node in the grid (all file data)
  • At an 80% file / 20% DB mix (e.g., 800 GB file and 200 GB DB per TB ingested), the performance level drops off to ~0.5 TB/hr per node
  • E.g. – DS18 perf will be ~ 15-16 TB/hr (all file data)
  • Per-node ingest rate ~ 8 GB/hr
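A quick illustration of how the per-node rule of thumb scales out to a grid (16 active nodes matches a DS18):

```python
def grid_ingest_tb_per_hr(active_nodes, all_file_data=True):
    """Rule-of-thumb grid ingest: ~1 TB/hr per active node for pure file data,
    dropping to ~0.5 TB/hr per node at an 80/20 file/DB mix (see above)."""
    per_node_tb_per_hr = 1.0 if all_file_data else 0.5
    return active_nodes * per_node_tb_per_hr

print(grid_ingest_tb_per_hr(16))                       # DS18, all file data: ~16 TB/hr
print(grid_ingest_tb_per_hr(16, all_file_data=False))  # 80/20 file/DB mix:   ~8 TB/hr
```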

Restores:

Data Fetch Process

  • Per node assumption
    • Chunk size 24 KB
    • each chunk is referenced in a hash index stripe
    • Speed:
      • 5 MB/s
      • 18 GB/hr (compressed chunk)
      • 25 GB/hr (rehydrated chunk)
  • E.g. – A DS18 will restore at a rate of .5 TB/hr
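Scaling the per-node fetch numbers across the grid gives roughly that figure; a small sketch:

```python
# Per-node rehydrated fetch rate from the bullets above, scaled across the grid.
REHYDRATED_GB_PER_HR_PER_NODE = 25

def grid_restore_tb_per_hr(active_nodes):
    return active_nodes * REHYDRATED_GB_PER_HR_PER_NODE / 1000

# 16 active data nodes in a DS18 -> ~0.4 TB/hr, in the ballpark of the ~0.5 TB/hr above.
print(f"DS18 restore estimate: ~{grid_restore_tb_per_hr(16):.1f} TB/hr")
```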

NDMP Sizing:

  • Size of the NDMP data set
  • Type of filer (Celerra or NetApp)
  • Number of volumes, file systems, qtrees
  • Size of volumes
  • Number of files per volume / file system

L-0 Fulls only happen once (we don’t want to size for them).

Size for the L-1 incrementals, which will happen in perpetuity following the completion of the L-0 full.

  • Important L-1 sizing data
    • Number of files in the L-1 backup
    • Backup window

2 Accelerator Nodes

Config   Max Files (Celerra / NetApp)   Max Data (Celerra / NetApp)   Max Streams (Celerra / NetApp)
6 GB     5 m / 30 m                     4-6 TB / 4-6 TB               1-2 / 1-2
36 GB    40 m / 60 m                    8-12 TB / 8-12 TB             4 / 4

NDMP throughput ~ 100 – 150 GB/hr

Assumed DeDupe Rates:

  • File data
    • Initial backup:  70% commonality (30% of the data is unique)
      • e.g. – 30% of 10 TB = 3 TB stored
    • Subsequent backups:  .3% daily change
      • e.g. – .3% of 10 TB = 30 GB stored per day
  • Database data
    • Initial backup:  35% commonality (65% of the data is unique)
      • e.g. – 65% of 10 TB = 6.5 TB stored
    • Subsequent backups:  4% daily change
      • e.g. – 4% of 10 TB = 400 GB stored per day
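Combining these assumed dedupe rates with the seed-plus-retention formula from earlier, the stored capacity for a 10 TB data set might be estimated like this (the 30-day retention is a placeholder):

```python
def stored_tb(dataset_tb, initial_unique_fraction, daily_change_fraction, retention_days):
    """Seed plus daily incrementals kept over the retention period (RAIN overhead excluded)."""
    seed = dataset_tb * initial_unique_fraction
    incrementals = dataset_tb * daily_change_fraction * retention_days
    return seed + incrementals

retention = 30  # placeholder retention, in days
print(f"File data, 10 TB: ~{stored_tb(10, 0.30, 0.003, retention):.1f} TB stored")  # 3 TB + 0.9 TB
print(f"DB data,   10 TB: ~{stored_tb(10, 0.65, 0.04,  retention):.1f} TB stored")  # 6.5 TB + 12 TB
```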

Tip:  Based on scan rate and the amount of data stored for DB backups you can see why Avamar may not be the best choice for DB backups.

NDMP Tips:

  • Avamar NDMP accelerator node should be on the same LAN segment as the filer and the same switch when possible
  • No Include/Exclude rules are supported
  • Able to run up to 4 NDMP backups simultaneously
    • most effective with large files
    • minimum of 4 GB of memory per accelerator node per stream
    • the 4 simultaneous NDMP backups are scheduled as group backups

Desktop / Laptop

Sizing:

  • Number of clients
  • Amount of data per client
    • user files
    • DB/PST files

DS18 can support ~ 5000 clients

The default number of streams per node is 18 (17 are usable; one should be reserved for restores).

That completes the brain dump.  Wish I had more but that is all for now.
