1 Oct, 2021
james@Jamess-MacBook Desktop % mkdir rsync-test
james@Jamess-MacBook Desktop % cd rsync-test
james@Jamess-MacBook rsync-test % mkdir -p old current diff
james@Jamess-MacBook rsync-test % echo '1' > old/1.txt
james@Jamess-MacBook rsync-test % rsync -aHxv old/ current/
building file list ... done
./
1.txt
sent 141 bytes received 48 bytes 378.00 bytes/sec
total size is 2 speedup is 0.01
james@Jamess-MacBook rsync-test % echo '2' > current/2.txt
james@Jamess-MacBook rsync-test % echo '3' > old/3.txt
james@Jamess-MacBook rsync-test % rsync -aHxv --compare-dest=../old/ current/ diff/
building file list ... done
./
2.txt
sent 156 bytes received 48 bytes 408.00 bytes/sec
total size is 4 speedup is 0.02
james@Jamess-MacBook rsync-test % ls diff
2.txt
james@Jamess-MacBook rsync-test % ls old
1.txt 3.txt
james@Jamess-MacBook rsync-test % ls current
1.txt 2.txt
james@Jamess-MacBook rsync-test % ls diff
2.txt
james@Jamess-MacBook rsync-test % rm -r diff
james@Jamess-MacBook rsync-test % rsync -aHxv --compare-dest=../current/ old/ diff/
building file list ... done
created directory diff
./
3.txt
sent 156 bytes received 48 bytes 408.00 bytes/sec
total size is 4 speedup is 0.02
james@Jamess-MacBook rsync-test %
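As a cross-check on the two runs above, diff -rq reports which files exist on only one side of old/ and current/. This is a minimal sketch and not part of the original session: it rebuilds the same layout in a temporary directory (the work and report variable names are my own) and assumes a diff that supports -r and -q, as the GNU and BSD versions do.

```shell
#!/bin/sh
# Rebuild the old/ and current/ layout from the session in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p old current
echo '1' > old/1.txt
cp old/1.txt current/1.txt   # stands in for the first rsync run
echo '2' > current/2.txt
echo '3' > old/3.txt

# diff exits non-zero when the trees differ, so capture its output
# rather than relying on its exit status.
report=$(diff -rq old current || true)
printf '%s\n' "$report"
```

The report should name 2.txt as present only in current and 3.txt as present only in old, which matches what the two --compare-dest runs copied into diff/. Unlike rsync, this check looks only at presence and content, not at modification times.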
Using rsync with --compare-dest this way leaves lots of empty directories in diff/, so you might expect the --prune-empty-dirs flag to help, but it doesn't, as explained here: https://lists.samba.org/archive/rsync/2009-January/022488.html.
Instead I run these commands to prune diff/ manually afterwards:
WARNING: Make sure you run these in the right directory, otherwise you might delete things from the wrong place.
cd diff
find . -type f -name .DS_Store -delete
find . -type d -empty -delete
With find, -delete also implies -depth, so the contents of each directory are processed before the directory itself.
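Because -delete cannot be undone, it can help to run the same find expression with -print first and check the list. The following is a minimal sketch on a throwaway directory rather than a real diff/; the scratch variable and the nested a/b/c layout are made up for illustration.

```shell
#!/bin/sh
# Build a throwaway tree with a chain of directories that holds only a .DS_Store.
scratch=$(mktemp -d)
mkdir -p "$scratch/diff/a/b/c" "$scratch/diff/keep"
touch "$scratch/diff/a/b/c/.DS_Store" "$scratch/diff/keep/2.txt"
cd "$scratch/diff" || exit 1

# Preview: same tests, but -print instead of -delete.
find . -type f -name .DS_Store -print

# Now delete the .DS_Store files for real.
find . -type f -name .DS_Store -delete

# -delete implies -depth, so c is removed before b is tested, and b before a.
# The whole empty chain goes in a single pass.
find . -type d -empty -delete

# Show what is left.
remaining=$(find . | sort)
printf '%s\n' "$remaining"
```

After this runs, only keep/ and keep/2.txt should remain under the scratch diff/ directory.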
If you want to find duplicates from a source directory anywhere in another directory, you can use rmlint:
Use data as the master directory. Find only duplicates in backup that are also in data. Do not delete any files in data:
mkdir -p data backup
echo 'one' > data/1.txt
echo 'one' > backup/1.txt
echo 'two' > backup/2.txt
echo 'two' > backup/2b.txt
rmlint backup // data --keep-all-tagged --must-match-tagged -T 'df' -g
./rmlint.sh -d
% tree
.
├── backup
│ ├── 2.txt
│ └── 2b.txt
├── data
│ └── 1.txt
└── rmlint.json
2 directories, 4 files
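Before running the generated rmlint.sh, it can be reassuring to check independently which files in backup have a byte-identical twin in data. The sketch below is not rmlint and says nothing about how rmlint works internally. It simply compares every pair of files with cmp, which is fine for a small test tree but slow for a large one, and it assumes file names without newlines. The scratch, candidate, original and matches names are illustrative.

```shell
#!/bin/sh
# Recreate the small test layout in a scratch directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p data backup
echo 'one' > data/1.txt
echo 'one' > backup/1.txt
echo 'two' > backup/2.txt
echo 'two' > backup/2b.txt

# For each file in backup, look for any file in data with identical bytes.
matches=$(
  find backup -type f | sort | while IFS= read -r candidate; do
    find data -type f | while IFS= read -r original; do
      if cmp -s "$candidate" "$original"; then
        printf '%s\n' "$candidate"
        break   # one match in data is enough
      fi
    done
  done
)
printf '%s\n' "$matches"
```

Only backup/1.txt is listed. backup/2.txt and backup/2b.txt duplicate each other but have no match in data, which is consistent with both surviving in the tree output above.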
If you want to do something more complicated, like excluding some of the files in backup from de-duplication, you can do something like this:
mkdir -p data backup backup/photos.photoslibrary backup/photos.photolibrary
echo 'one' > data/1.txt
echo 'one' > backup/1.txt
echo 'two' > backup/2.txt
echo 'two' > backup/2b.txt
echo 'three' > data/3.txt
echo 'three' > backup/photos.photolibrary/3.txt
echo 'three' > backup/photos.photoslibrary/3.txt
find backup -type f -not -path '*.photo*library/*' -print0 | rmlint -0 // data --keep-all-tagged --must-match-tagged -T 'df' -g
./rmlint.sh -d
tree
.
├── backup
│ ├── 2.txt
│ ├── 2b.txt
│ ├── photos.photolibrary
│ │ └── 3.txt
│ └── photos.photoslibrary
│ └── 3.txt
├── data
│ ├── 1.txt
│ └── 3.txt
└── rmlint.json
3 directories, 6 files
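To see which files the -not -path filter hands to rmlint, the find half of the pipeline can be run on its own with -print. A minimal sketch, using a scratch copy of the backup side of the test layout (the scratch and candidates names are illustrative):

```shell
#!/bin/sh
# Recreate the backup side of the test layout in a scratch directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p backup/photos.photoslibrary backup/photos.photolibrary
echo 'one' > backup/1.txt
echo 'two' > backup/2.txt
echo 'three' > backup/photos.photolibrary/3.txt
echo 'three' > backup/photos.photoslibrary/3.txt

# Same filter as in the rmlint pipeline, but printing names instead of
# passing them on with -print0.
candidates=$(find backup -type f -not -path '*.photo*library/*' -print | sort)
printf '%s\n' "$candidates"
```

Nothing under either photo library directory appears in the list, so rmlint never sees those files and leaves them alone, as the tree output above shows.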
Copyright James Gardner 1996-2020 All Rights Reserved.