Open source backup and data de-duplication virtual appliance


While I have been a bit quiet lately, I have been working on two new projects. One is a BackupPC virtual appliance: an agentless backup and recovery system that also de-duplicates. There is a lot of discussion around technologies like Data Domain, Diligent, EMC/Avamar, and Asigra; this virtual appliance provides similar functionality for free. I am also working on a web-based UI for this appliance to simplify setup and sync integration for replication.

Setup of the BackupPC virtual appliance is quite simple.

  1. Start the BackupPC Virtual Appliance
  2. The virtual appliance is set up to use bridged networking, so it should receive a DHCP address on your network
  3. Log into the BackupPC virtual appliance
    • Username: root
    • Password: backuppc
  4. Identify the appliance's IP address
    • ifconfig -a
  5. Open your desktop browser and point it to http://ip_address/cgi-bin/BackupPC_Admin
    • Username: admin
    • Password: password
  6. Create a user "backuppc" on the Windows hosts that you want to back up. Backups are performed via SMB, so the "backuppc" user needs read access to the administrative shares you plan to back up (C$ and so on).
    • Note: Don't forget to set the user's password
  7. vi /b2d_target/conf/config.pl
    • You will need to modify 3 variables in this file
      • $Conf{SmbShareName} = 'C$';
        • This is the share that you want to back up (use a list such as ['C$', 'D$'] for more than one)
      • $Conf{SmbShareUserName} = 'backuppc';
      • $Conf{SmbSharePasswd} = 'backuppc';
        • The SmbShareUserName and SmbSharePasswd should match the username and password that you created on your Windows hosts.
  8. Add the hosts to back up.
    • vi /b2d_target/conf/hosts
      • Follow the syntax in the file – the use of IP addresses is OK
  9. Reload the backup configuration file from the web UI
    • Select “Admin Options” and “Reload Config”
  10. You are now ready to start a backup
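
Step 8 can also be scripted when there are many hosts to add. The following is a minimal sketch, not part of the appliance: the function name add_backup_host is made up for illustration, the column layout (host, dhcp flag, user) is assumed from the stock BackupPC hosts file, and the default user "admin" is assumed to be the web UI login from step 5. Check the header comments in your own hosts file before relying on it.

```shell
#!/bin/sh
# Sketch: append a host entry to the BackupPC hosts file.
# Assumption: columns are "host dhcp user", as in the stock BackupPC hosts file.
HOSTS_FILE="${HOSTS_FILE:-/b2d_target/conf/hosts}"

add_backup_host() {
    host="$1"            # hostname or IP address of the machine to back up
    user="${2:-admin}"   # BackupPC user for this host (assumed: web UI login)

    # Skip hosts that are already listed (match on the first column).
    if awk -v h="$host" '$1 == h { found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        echo "already listed: $host"
        return 0
    fi

    # dhcp flag 0: the host is reached by its name or fixed address.
    printf '%s\t0\t%s\n' "$host" "$user" >> "$HOSTS_FILE"
    echo "added: $host"
}
```

For example, `add_backup_host 192.168.1.20` appends one line for that address. The reload in step 9 is still needed afterwards.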

For detailed usage documentation see: http://backuppc.sourceforge.net/

NOTE: [THIS IS OPTIONAL]

Finally, the virtual machine is configured with a 300GB backup target disk (/dev/hdb2). This is Hard Disk 2 (IDE 0:1). If you need more space, use the following procedure:

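  1. Shut down the virtual machine
  2. Remove Hard Disk 2 and add a new, larger virtual disk; see the VMware docs for more detail
  3. Power the virtual machine back on and log in as root
  4. /etc/init.d/backuppc stop
  5. mkfs -t ext3 /dev/hdb1
    • Note: a new virtual disk is blank, so create a partition on it first (for example with fdisk /dev/hdb). The description above names /dev/hdb2, so check /etc/fstab and format whichever partition is mounted on /b2d_target.
  6. mount -a
    • Verify that the new device is mounted on /b2d_target
      • df -k
  7. /etc/init.d/backuppc start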
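
Because mkfs destroys whatever is on the device, it can help to rehearse steps 4 through 7 first. Here is a small sketch that prints the commands by default and only runs them when DRY_RUN=0 is set. The names run and rebuild_target are made up for illustration, and the device and mount point passed in are assumptions that should be confirmed against /etc/fstab.

```shell
#!/bin/sh
# Sketch of the disk-replacement steps; prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Echo the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_target() {
    device="$1"   # partition to format, e.g. /dev/hdb1 (confirm in /etc/fstab)
    target="$2"   # mount point of the backup disk, e.g. /b2d_target

    # Each step runs only if the previous one succeeded.
    run /etc/init.d/backuppc stop &&
    run mkfs -t ext3 "$device" &&
    run mount -a &&
    run df -k "$target" &&
    run /etc/init.d/backuppc start
}
```

Calling `rebuild_target /dev/hdb1 /b2d_target` lists the five commands in order. Rerun with DRY_RUN=0 once the device name has been checked.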

Please download and try this virtual machine, and let me know if you find any issues with the documentation. Enjoy!



10 thoughts on “Open source backup and data de-duplication virtual appliance”

  1. What distro is this VM built on? It does not have the RsyncP Perl module, so I tried to install File::RsyncP from CPAN, and it failed because it needs gcc. I am now trying to find which RPM I need to install for gcc.
  2. The startup banner lists this as CentOS 4.4 (or RHEL 4.4).
  3. What about doing backups from a Unix client? Any plans to support backups from Unix via tar, cpio, etc.? What sort of compression ratios are you seeing?
  4. Just out of curiosity, does the backup process itself do the dedupe, or is it handled by the file system? Is this block-level or file-level dedupe?
  5. How do you implement de-duplication?
  6. Reply to comment #4: The backup process can never dedupe by itself. We always need either some intermediate software that deduplicates the backup data before it goes to the backup server (inline dedupe), or some software on the backup server that runs periodically to remove duplicate entries and link them up properly for later reconstruction of the original data (post-process deduplication). For further clarification, mail me at davish.bhardwaj@nechclst.in
  7. This looks very good. Perhaps an OpenVZ template would be nice too; then it could be a virtual appliance for the free virtualisation technology Proxmox.

  8. Love the idea. I see the original is from 2007. Is there a newer file that includes any recent upgrades from BackupPC?

    Thanks.

    Jay

  9. Pingback: Deduplication – An Open Source Approach « www.MICKOTOOLE.com

  10. Unfortunately, creating a new virtual appliance has been on my list, but I just have not had the time.