Open source backup and data de-duplication virtual appliance

By rbocchinfuso - Last updated: Monday, January 15, 2007

While I have been a bit quiet lately, I have been working on two new projects. One is a BackupPC virtual appliance: an agentless backup and recovery system that also de-duplicates. There is a lot of discussion around technologies like Data Domain, Diligent, EMC/Avamar, and Asigra; this virtual appliance provides similar functionality for FREE. I am also working on a web-based UI for this appliance to simplify setup and sync integration for replication.

Setup of the BackupPC virtual appliance is quite simple.

  1. Start the BackupPC Virtual Appliance
  2. The virtual appliance is set up to use bridged networking, so it should receive a DHCP address on your network
  3. Log into the BackupPC virtual appliance
    • Username: root
    • Password: backuppc
  4. Identify network IP address
    • ifconfig -a
  5. Open your desktop browser and point it to http://ip_address/cgi-bin/BackupPC_Admin
    • Username: admin
    • Password: password
  6. Create a user “backuppc” on the Windows hosts that you want to back up. Backups are performed via SMB, so the “backuppc” user should have read access to C$ and any other shares you want to back up.
    • Note: Don’t forget to set the user’s password
  7. vi /b2d_target/conf/config.pl
    • You will need to modify 3 variables in this file
      • $Conf{SmbShareName} = 'C$';
        • These are the shares that you want to back up
      • $Conf{SmbShareUserName} = 'backuppc';
      • $Conf{SmbSharePasswd} = 'backuppc';
        • The SmbShareUserName and SmbSharePasswd should match the username and password that you created on your Windows host. Use plain single quotes in this file, not typographic quotes.
  8. Add hosts to back up.
    • vi /b2d_target/conf/hosts
      • Follow the syntax in the file - the use of IP addresses is OK
  9. Reload the backup configuration file from the web UI
    • Select “Admin Options” and “Reload Config”
  10. You are now ready to start a backup
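
Steps 6 through 8 can also be scripted. The following is a minimal sketch, not something shipped with the appliance: the function names add_backup_host and check_smb_share are hypothetical, the hosts line format (host name, DHCP flag, user) is an assumption based on the header line in BackupPC's stock hosts file, and check_smb_share assumes smbclient is installed and on the PATH.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the appliance.

# Append a host to a BackupPC hosts file unless it is already listed.
# Assumed line format: <host> <dhcp flag> <user>
add_backup_host() {
    hosts_file=$1   # e.g. /b2d_target/conf/hosts
    host=$2         # host name or IP address of the Windows machine
    user=$3         # BackupPC user responsible for this host

    # Skip the append if a line already starts with this host name.
    if awk -v h="$host" '$1 == h { found = 1 } END { exit !found }' "$hosts_file"; then
        echo "$host is already listed in $hosts_file"
        return 0
    fi
    printf '%s\t0\t%s\n' "$host" "$user" >> "$hosts_file"
    echo "added $host to $hosts_file"
}

# Confirm that the SMB account can list a share before the first backup.
# Assumes smbclient is installed; returns non-zero if the listing fails.
check_smb_share() {
    host=$1; share=$2; user=$3; pass=$4
    smbclient "//$host/$share" -U "$user%$pass" -c 'ls' > /dev/null
}
```

For example, add_backup_host /b2d_target/conf/hosts 192.168.1.50 admin would register a host (the address and user are placeholders), and check_smb_share 192.168.1.50 'C$' backuppc backuppc would confirm the share is readable with the account from step 6. A "Reload Config" from the web UI is still needed afterwards.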

For detailed usage documentation see: http://backuppc.sourceforge.net/

NOTE: [THIS IS OPTIONAL]

Finally, the virtual machine is configured with a 300GB backup target disk (/dev/hdb2). This is Hard Disk 2 (IDE 0:1). If you need more space, follow this procedure:

  1. Shut down the virtual machine
  2. Remove Hard Disk 2 and add a new, larger virtual disk - see the VMware documentation for more detail
  3. Start the virtual machine and log in as root
  4. /etc/init.d/backuppc stop
  5. mkfs -t ext3 /dev/hdb1
  6. mount -a
    • verify that the new device is mounted on /b2d_target
      • df -k
  7. /etc/init.d/backuppc start
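
Steps 4 through 7 could be wrapped in a small script along these lines. This is a sketch under assumptions: the names run, replace_backup_disk and DRY_RUN are hypothetical, and it presumes a partition already exists on the new disk. The device is deliberately a required argument, since the text above mentions both /dev/hdb2 and /dev/hdb1; check which one applies to your disk (for example with fdisk -l) before running anything that formats it.

```shell
#!/bin/sh
# Sketch only: wraps steps 4-7 above. The device must be supplied by the caller.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

replace_backup_disk() {
    device=$1                   # partition on the new disk, e.g. /dev/hdb1
    target=${2:-/b2d_target}    # mount point used by the appliance

    if [ -z "$device" ]; then
        echo "usage: replace_backup_disk <device> [mount point]" >&2
        return 1
    fi

    run /etc/init.d/backuppc stop  || return 1
    run mkfs -t ext3 "$device"     || return 1   # destroys any data on the device
    run mount -a                   || return 1
    run df -k "$target"            || return 1   # confirm the new disk is mounted
    run /etc/init.d/backuppc start
}
```

Setting DRY_RUN=1 before calling replace_backup_disk /dev/hdb1 prints the five commands without touching the disk, which is a reasonable first pass before running it for real as root.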

Please download and try this virtual machine, and let me know if you find any issues with the documentation. Enjoy!



6 Responses to “Open source backup and data de-duplication virtual appliance”

Comment from Dale
Time April 27, 2007 at 11:35 pm

What is the distro that this VM is built on? It does not have the RsyncP Perl module, so I tried a CPAN install of File::RsyncP, and it failed because it needs gcc installed. I am now trying to find which RPM I need to install for gcc.

Comment from spiffed
Time November 7, 2007 at 1:02 pm

The startup banner lists this as CentOS 4.4 (or RHEL 4.4).

Comment from Don
Time January 18, 2008 at 12:42 pm

What about doing backups from a Unix client? Any plans to support backups from Unix via tar, cpio, etc.? What sort of compression ratios are you seeing?

Comment from Jason White
Time March 3, 2008 at 5:09 pm

Just out of curiosity, does the backup process itself do the dedupe or is it handled by the file system? Is this a block level or file level dedupe?

Comment from Davish
Time April 21, 2009 at 1:11 am

How do you implement de-duplication?

Comment from Davish
Time May 5, 2009 at 9:57 pm

Reply to comment #4: The backup process itself can never dedupe. We always need either some intermediate software to deduplicate backup data before it goes to the backup server (in the case of inline dedupe), or some software on the backup server that runs periodically to remove duplicate entries and link them up properly for later reconstruction of the original data (in the case of post-process deduplication). For further clarification, mail me at: davish.bhardwaj@nechclst.in
