Cloning Directories in Ruby
Recursively Cloning Directories in Ruby using Hard Links
Hard links are supported under Windows 7 on NTFS volumes, so you can clone huge directories or files without using any extra disk space for the file contents. The original and the clone are equal and apps can’t tell the difference between them, because we are using true hard links (not shortcuts or symbolic links). Both names must live on the same volume.
Built-in Windows Command
The Windows command for cloning a file is simply (note that the new link name comes first, then the existing file)
mklink /H clone original
If you delete the original file, the clone is still there and indistinguishable from the original file. You can clone as many times as you like and no extra disk space is used.
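The same behaviour can be checked from Ruby. What follows is only a minimal sketch, assuming a filesystem that supports hard links; the file names and the `hard_link_demo` helper are made up for illustration and are not part of the script below.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: creates a file, hard links it, deletes the original
# and reports what is still visible through the clone.
def hard_link_demo
  Dir.mktmpdir do |dir|
    original = File.join(dir, "original.txt")
    clone    = File.join(dir, "clone.txt")

    File.write(original, "hello")
    FileUtils.ln(original, clone)            # hard link, not a copy

    same_file  = File.identical?(original, clone)
    link_count = File.stat(original).nlink   # number of names for this file

    FileUtils.rm(original)                   # remove the first name only
    { :same_file => same_file,
      :link_count => link_count,
      :clone_content => File.read(clone) }
  end
end

result = hard_link_demo
puts result.inspect
```

The clone stays readable after the original name is removed because both names point at the same data on disk.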
Ruby code to achieve the same
I’m not aware of a way of recursively hard linking directories (files as hard links, sub directories as real directories) in Windows using the standard commands (for Linux, see the update further down), so here is a recursive directory cloning utility written in Ruby. It uses the FileUtils.ln method to do the file cloning. This works under both Windows 7 and Linux.
# ln_r
# Copy a directory recursively creating hardlinks for files and real dirs for directories
# Andy Bulka, Reilly Beacom
# version 1.5
require 'fileutils'
require 'optparse'
def ln_r source, target, options = {:verbose => true, :report => true, :countsize => false, :deletetarget => true}
  verbose = options[:verbose]
  puts "ln_r copying and hard linking from #{source} to #{target}" if verbose
  puts "..." if verbose
  raise "source not a directory" unless File.directory?(source)

  # Add trailing slash
  source = File.join(source, "")
  target = File.join(target, "")

  # Ensure target dir exists before we start and delete destination files
  FileUtils.mkdir_p target
  FileUtils.rm_r Dir.glob(File.join(target, '*')) if options[:deletetarget]

  total_file_sizes = 0
  Dir.glob(File.join(source, '**/*')).each do |source_path|
    # Swap the source prefix for the target prefix. The prefix is escaped
    # so that regexp metacharacters in the path are matched literally.
    target_path = source_path.sub(/\A#{Regexp.escape(source)}/) { target }
    if File.file? source_path
      FileUtils.mkdir_p File.dirname(target_path)
      FileUtils.ln source_path, target_path
      total_file_sizes += File.size(source_path) if options[:countsize]
      puts "created hard link #{target_path} (source: #{source_path})" if verbose
    else
      FileUtils.mkdir_p target_path
      puts "created directory " + target_path if verbose
    end
  end
  puts "Done copying/linking." if verbose

  def number_with_delimiter(number, delimiter=",")
    number.to_s.gsub(/(\d)(?=(\d\d\d)+(?!\d))/, "\\1#{delimiter}")
  end
  # total_file_sizes is in bytes; report it in megabytes
  puts "Space saved by linking: #{number_with_delimiter(total_file_sizes / (1024 * 1024))} MB" if options[:countsize]

  if options[:report]
    puts
    puts "---- RESULT: SOURCE DIRECTORY"
    puts Dir.glob(File.join(source, '**/*'))
    puts "---- TARGET DIRECTORY"
    puts Dir.glob(File.join(target, '**/*'))
    puts "---- RESULT END"
    puts
  end
end
# This hash will hold all of the options
# parsed from the command-line by
# OptionParser.
options = {}
optparse = OptionParser.new do |opts|
  # Set a banner, displayed at the top
  # of the help screen.
  opts.banner = "Usage: ln_r.rb [options] source_dir target_dir"
  # Define the options, and what they do
  options[:verbose] = false
  opts.on( '-v', '--verbose', 'Output more information' ) do
    options[:verbose] = true
  end
  options[:test] = false
  opts.on( '-t', '--test', 'Run test copy on test data dirs - Andy only' ) do
    options[:test] = true
  end
  options[:report] = false
  opts.on( '-r', '--report', 'Display directory of source and target dirs after finish' ) do
    options[:report] = true
  end
  options[:countsize] = false
  opts.on( '-c', '--countsize', 'Display bytes saved by using hard linking' ) do
    options[:countsize] = true
  end
  options[:deletetarget] = true
  opts.on( '-d', '--dontdeletetarget', 'Dont rm -r * target directory first' ) do
    options[:deletetarget] = false
  end
  #options[:logfile] = nil
  #opts.on( '-l', '--logfile FILE', 'Write log to FILE' ) do|file|
  #  options[:logfile] = file
  #end
  # This displays the help screen, all programs are
  # assumed to have this option.
  opts.on( '-h', '--help', 'Display this screen' ) do
    puts opts
    exit
  end
end
# Parse the command-line. Remember there are two forms
# of the parse method. The 'parse' method simply parses
# ARGV, while the 'parse!' method parses ARGV and removes
# any options found there, as well as any parameters for
# the options. What's left is the source and target directories.
optparse.parse!
puts "Being verbose" if options[:verbose]
#puts "Logging to file #{options[:logfile]}" if options[:logfile]
if options[:test]
  ln_r "LinkTests1/dirA", "LinkTests1/dirB", options
  ln_r "LinkTests2/dirA", "LinkTests2/dirB/MyCopy/Fred", options
  exit
end
if ARGV.length < 2
  puts optparse.help
  exit
end
ln_r ARGV[0], ARGV[1], options
You can invoke it using:
ruby ln_r.rb [options] source_dir target_dir
    -v, --verbose            Output more information
    -t, --test               Run test copy on test data dirs - Andy only
    -r, --report             Display directory of source and target dirs after finish
    -c, --countsize          Display bytes saved by using hard linking
    -d, --dontdeletetarget   Dont rm -r * target directory first
    -h, --help               Display this screen
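To check that a run really produced hard links rather than copies, something like the following can be used. It is only a sketch: `hard_linked_tree?` is a hypothetical helper, not part of ln_r.rb, and the demo tree is built by hand as a stand-in for running the script.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: true if every file under source has a counterpart
# under target that is the same underlying file, and every sub directory
# exists as a directory. Like the script, it skips dot files.
def hard_linked_tree?(source, target)
  Dir.chdir(source) { Dir.glob('**/*') }.all? do |relative|
    source_path = File.join(source, relative)
    target_path = File.join(target, relative)
    if File.file?(source_path)
      File.identical?(source_path, target_path)
    else
      File.directory?(target_path)
    end
  end
end

ok = Dir.mktmpdir do |dir|
  source = File.join(dir, "dirA")
  target = File.join(dir, "dirB")
  FileUtils.mkdir_p File.join(source, "sub")
  File.write File.join(source, "sub", "a.txt"), "data"

  # Stand-in for: ruby ln_r.rb dirA dirB
  FileUtils.mkdir_p File.join(target, "sub")
  FileUtils.ln File.join(source, "sub", "a.txt"), File.join(target, "sub", "a.txt")

  hard_linked_tree?(source, target)
end
puts ok
```

File.identical? compares the underlying file rather than the contents, so a plain copy with the same bytes does not pass the check.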
Linux built in command
Update Feb 2013
It's true that you can achieve the above in Linux with
cp -lr from to
where -r means recursive copy and -l means hard link the files instead of copying them.
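Ruby's own standard library has something close to this as well: FileUtils.cp_lr, assuming Ruby 2.6 or later. The sketch below uses made-up directory names in a temporary directory, and the target is assumed not to exist yet (if it already exists as a directory, the source is linked inside it instead). I have not checked how it behaves under Windows.

```ruby
require 'fileutils'
require 'tmpdir'

linked = Dir.mktmpdir do |dir|
  from = File.join(dir, "from")
  to   = File.join(dir, "to")            # does not exist before the call
  FileUtils.mkdir_p File.join(from, "sub")
  File.write File.join(from, "sub", "a.txt"), "data"

  # Directories are created for real, files are hard linked
  FileUtils.cp_lr from, to

  File.directory?(File.join(to, "sub")) &&
    File.identical?(File.join(from, "sub", "a.txt"), File.join(to, "sub", "a.txt"))
end
puts linked
```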
What about Windows COPY
Under Windows, the COPY command has no such linking option, so the above Ruby script may be of help. Alternatively, you can use a Windows port of the Linux cp command, which seems to work OK. See Port of the most important GNU utilities to Windows.
Comments
Posted by Reilly on Dec 20th, 2011
The unix/linux/posix command is:
cp -al dir1 dir2
"copy [cp] with archive [a] (recursive) and link [l]"
Posted by admin on Dec 20th, 2011
I wonder if Win 7 / NTFS has similar flags on the DOS copy command?