Avamar sizing brain dump

Avamar DS18 = Utility Node + Spare Node + 16 Active Data Nodes

For a 3.3 TB Gen-3 Grid

  • Raw Capacity ~102 TB
  • Spare Node ~6 TB
  • RAID5 ~15 TB
  • Checkpoint  / GC ~28 TB
  • RAIN ~3 TB
  • Available for Active Backups ~49 TB
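As a sanity check, the available figure is just straight subtraction of the overheads. A quick sketch using the approximate numbers above (nothing official):

```python
# Rough sanity check of the ~49 TB usable figure above (illustrative only;
# the overhead values are the approximate estimates listed, not official numbers).
raw_tb        = 102   # 16 active data nodes + spare, 3.3 TB Gen-3 grid
spare_tb      = 6     # spare node capacity held in reserve
raid_tb       = 15    # RAID overhead
checkpoint_tb = 28    # checkpoint / garbage-collection reserve
rain_tb       = 3     # RAIN parity overhead

available_tb = raw_tb - spare_tb - raid_tb - checkpoint_tb - rain_tb
print(f"Available for active backups: ~{available_tb} TB")  # ~50 TB, in line with the ~49 TB above
```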

RAID Configuration:

  • RAID 1 for 3.3 TB node
  • RAID 5 for 2 TB nodes
  • RAID 1 for 1 TB nodes

How to calculate the required capacity:

  • Seed (Initial backups)
    + Daily Change * Retention in Days
    + RAIN
    = GSAN Utilization

  • Need min available space for 4 checkpoints
  • 3 checkpoints maintained by default
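A minimal sketch of that capacity math; the seed size, change rate, retention, and RAIN figure below are placeholders, not recommendations:

```python
# Hypothetical inputs -- substitute numbers from your own assessment.
seed_tb         = 10.0   # initial (deduped) backup size
daily_change_tb = 0.03   # new unique data stored per day
retention_days  = 30
rain_tb         = 1.0    # assumed RAIN overhead for this data set (config-dependent)

gsan_utilization_tb = seed_tb + daily_change_tb * retention_days + rain_tb
print(f"Estimated GSAN utilization: {gsan_utilization_tb:.1f} TB")
# On top of this, keep enough free space for 4 checkpoints (3 are kept by default).
```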

Data Gathering

Note: agent-only vs. data store depends on the desired RTO

  • xfer_rate = network rate (Gb/hr) * 0.70
  • data_size = total size of the data set to be backed up
  • restore_time = data_size * 0.65 / xfer_rate

If RTO < restore_time, then data store; else agent only.
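Here is that decision logic as a quick sketch, treating the link rate in GB/hr for simplicity; the client numbers are made up:

```python
def restore_time_hours(data_size_gb, link_rate_gb_per_hr):
    """Restore-time estimate using the factors from the bullets above."""
    xfer_rate = link_rate_gb_per_hr * 0.70     # usable rate = 70% of line rate
    return data_size_gb * 0.65 / xfer_rate

# Hypothetical remote client: 2 TB data set, ~450 GB/hr line rate, 8-hour RTO.
data_size_gb, link_rate_gb_per_hr, rto_hours = 2000, 450, 8

t = restore_time_hours(data_size_gb, link_rate_gb_per_hr)
print(f"Estimated restore time: {t:.1f} hours")
print("Deploy a local data store" if rto_hours < t else "Agent-only is sufficient")
```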

Always use 3.3 TB nodes when configuring unless additional nodes are required to increase the ingestion rate.

Use the default de-dupe rate unless a POC or assessment has been performed.

Sizing Considerations:

  • Data Types
    • File Systems
    • Databases
    • Large Clients > 2 TB
    • Dense File Systems (excluding EMC Celerra and NetApp)
  • Organic Growth
  • RTO
  • Replication Window
  • Maintenance Window

Non-RAIN nodes must be replicated; this includes single-node Avamar deployments and 1×2 configurations (1 utility node and 2 data store nodes, which is a non-RAIN config).

**** Remember this: As a general rule it seems that transactional databases are better suited to being backed up to Data Domain and NOT to Avamar, as hashing database data is generally very slow.

VMware (specifically using the VMware Storage APIs) and CIFS are well suited for Avamar.

Data save rates:

  • 100 – 150 GB/hr per avtar stream on latest server types
    • Note: it is possible to launch multiple avtar daemons with some tweaking, but an out-of-the-box install only launches a single avtar process.
  • VM guest backups can be slower
  • Default assumption is that the chunk-compress-hash process runs at a rate of 100 GB/hr
    • This is the process that bottlenecks database backups (ideally the avtar stream rate should match the chunk-compress-hash rate)

Scan rate:

  • ~ 1 million files per hour
    • 1 TB of file data will take about 1 hour to back up
    • 1 TB DB will take ~ 10 hours to complete
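Putting the save rates and scan rate together, a back-of-the-envelope backup-window estimate might look like this (the client mix below is hypothetical):

```python
# Back-of-the-envelope backup window using the save and scan rates above.
SCAN_RATE_FILES_PER_HR = 1_000_000   # ~1 million files/hour
FILE_RATE_GB_PER_HR    = 1000        # ~1 TB of file data per hour
DB_RATE_GB_PER_HR      = 100         # chunk-compress-hash bottleneck for databases

# Hypothetical client mix.
file_data_gb, file_count = 2000, 1_500_000
db_data_gb = 500

file_hours = max(file_data_gb / FILE_RATE_GB_PER_HR,    # data-driven estimate
                 file_count / SCAN_RATE_FILES_PER_HR)   # scan-driven estimate
db_hours = db_data_gb / DB_RATE_GB_PER_HR
print(f"File systems: ~{file_hours:.1f} h, databases: ~{db_hours:.1f} h")
```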

Performance:

  • 1 TB/hr per node in the grid (all file data)
  • At an 80% file / 20% DB mix (e.g., 800 GB file and 200 GB DB per TB ingested), the performance level drops off to ~0.5 TB/hr per node
  • E.g. – DS18 perf will be ~ 15-16 TB/hr (all file data)
  • Per-node ingest rate ~ 8 GB/hr
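A quick illustration of how the per-node rule of thumb scales out to a grid (16 active nodes matches a DS18):

```python
def grid_ingest_tb_per_hr(active_nodes, all_file_data=True):
    """Rule-of-thumb grid ingest: ~1 TB/hr per active node for pure file data,
    dropping to ~0.5 TB/hr per node at an 80/20 file/DB mix (see above)."""
    per_node_tb_per_hr = 1.0 if all_file_data else 0.5
    return active_nodes * per_node_tb_per_hr

print(grid_ingest_tb_per_hr(16))                       # DS18, all file data: ~16 TB/hr
print(grid_ingest_tb_per_hr(16, all_file_data=False))  # 80/20 file/DB mix:   ~8 TB/hr
```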

Restores:

Data Fetch Process

  • Per node assumption
    • Chunk size 24 KB
    • each chunk is referenced in a hash index stripe
    • Speed:
      • 5 MB/s
      • 18 GB/hr (compressed chunk)
      • 25 GB/hr (rehydrated chunk)
  • E.g. – A DS18 will restore at a rate of .5 TB/hr
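Scaling the per-node fetch numbers across the grid gives roughly that figure; a small sketch:

```python
# Per-node rehydrated fetch rate from the bullets above, scaled across the grid.
REHYDRATED_GB_PER_HR_PER_NODE = 25

def grid_restore_tb_per_hr(active_nodes):
    return active_nodes * REHYDRATED_GB_PER_HR_PER_NODE / 1000

# 16 active data nodes in a DS18 -> ~0.4 TB/hr, in the ballpark of the ~0.5 TB/hr above.
print(f"DS18 restore estimate: ~{grid_restore_tb_per_hr(16):.1f} TB/hr")
```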

NDMP Sizing:

  • Size of the NDMP data set
  • Type of filer (Celerra or NetApp)
  • Number of volumes, file systems, qtrees
  • Size of volumes
  • Number of files per volume / file system

L-0 Fulls only happen once (we don’t want to size for them).

Size for the L-1 incrementals, which will happen in perpetuity following the completion of the L-0 full.

  • Important L-1 sizing data
    • Number of files in the L-1 backup
    • Backup window

2 Accelerator Nodes

Config   Max Files (Celerra / NetApp)   Max Data (Celerra / NetApp)   Max Streams (Celerra / NetApp)
6 GB     5 m / 30 m                     4-6 TB / 4-6 TB               1-2 / 1-2
36 GB    40 m / 60 m                    8-12 TB / 8-12 TB             4 / 4

NDMP throughput ~ 100 – 150 GB/hr

Assumed DeDupe Rates:

  • File data
    • Initial backup:  70% commonality (30% of the data is unique)
      • e.g. – 30% of 10 TB = 3 TB stored
    • Subsequent backups:  .3% daily change
      • e.g. – .3% of 10 TB = 30 GB stored per day
  • Database data
    • Initial backup:  35% commonality (65% of the data is unique)
      • e.g. – 65% of 10 TB = 6.5 TB stored
    • Subsequent backups:  4% daily change
      • e.g. – 4% of 10 TB = 400 GB stored per day
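Combining these assumed dedupe rates with the seed-plus-retention formula from earlier, the stored capacity for a 10 TB data set might be estimated like this (the 30-day retention is a placeholder):

```python
def stored_tb(dataset_tb, initial_unique_fraction, daily_change_fraction, retention_days):
    """Seed plus daily incrementals kept over the retention period (RAIN overhead excluded)."""
    seed = dataset_tb * initial_unique_fraction
    incrementals = dataset_tb * daily_change_fraction * retention_days
    return seed + incrementals

retention = 30  # placeholder retention, in days
print(f"File data, 10 TB: ~{stored_tb(10, 0.30, 0.003, retention):.1f} TB stored")  # 3 TB + 0.9 TB
print(f"DB data,   10 TB: ~{stored_tb(10, 0.65, 0.04,  retention):.1f} TB stored")  # 6.5 TB + 12 TB
```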

Tip:  Based on scan rate and the amount of data stored for DB backups you can see why Avamar may not be the best choice for DB backups.

NDMP Tips:

  • Avamar NDMP accelerator node should be on the same LAN segment as the filer and the same switch when possible
  • No Include/Exclude rules are supported
  • Able to run up to 4 NDMP backups simultaneously
    • most effective with large files
    • minimum of 4 GB of memory per accelerator node per stream
    • the 4 simultaneous NDMP backups are scheduled as group backups

Desktop / Laptop

Sizing:

  • Number of clients
  • Amount of data per client
    • user files
    • DB/PST files

DS18 can support ~ 5000 clients

The default number of streams per node is 18 (17 are usable; one should be reserved for restores).

That completes the brain dump.  Wish I had more but that is all for now.
