| -*- indented-text -*- |
| |
| Notes towards a new version of rsync |
| Martin Pool <mbp@samba.org>, September 2001. |
| |
| |
| Good things about the current implementation: |
| |
| - Widely known and adopted. |
| |
| - Fast/efficient, especially for moderately small sets of files over |
| slow links (transoceanic or modem). |
| |
| - Fairly reliable. |
| |
| - The choice of running over a plain TCP socket or tunneling over |
| ssh. |
| |
| - rsync operations are idempotent: you can always run the same |
| command twice to make sure it worked properly without any fear. |
| (Are there any exceptions?) |
| |
| - Small changes to files cause small deltas. |
| |
| - There is a way to evolve the protocol to some extent. |
| |
| - rdiff and rsync --write-batch allow generation of standalone patch |
| sets. rsync+ is pretty cheesy, though. xdelta seems cleaner. |
| |
| - Process triangle is creative, but seems to provoke OS bugs. |
| |
| - "Morning-after property": you don't need to know anything on the |
| local machine about the state of the remote machine, or about |
| transfers that have been done in the past. |
| |
| - You can easily push or pull simply by switching the order of |
| files. |
| |
| - The "modules" system has some neat features compared to |
| e.g. Apache's per-directory configuration. In particular, because |
| you can set a userid and chroot directory, there is strong |
| protection between different modules. I haven't seen any calls |
| for a more flexible system. |
| |
| |
| Bad things about the current implementation: |
| |
| - Persistent and hard-to-diagnose hang bugs remain |
| |
| - Protocol is sketchily documented, tied to this implementation, and |
| hard to modify/extend |
| |
| - Both the program and the protocol assume a single non-interactive |
| one-way transfer |
| |
| - A list of all files is held in memory for the entire transfer, |
| which cripples scalability to large file trees |
| |
| - Opening a new socket for every operation causes problems, |
| especially when running over SSH with password authentication. |
| |
| - Renamed files are not handled: the old file is removed, and the |
| new file created from scratch. |
| |
| - The versioning approach assumes that future versions of the |
| program know about all previous versions, and will do the right |
| thing. |
| |
| - People always get confused about ':' vs '::' |
| |
| - Error messages can be cryptic. |
| |
| - Default behaviour is not intuitive: in too many cases rsync will |
| happily do nothing. Perhaps -a should be the default? |
| |
| - People get confused by trailing slashes, though it's hard to think |
| of another reasonable way to make this necessary distinction |
| between a directory and its contents. |
| |
| |
| Protocol philosophy: |
| |
| *The* big difference between rsync and protocols like HTTP, FTP, |
| and NFS is that their fundamental operations are "read this file", |
| "delete this file", and "make this directory", whereas rsync's is |
| "make this directory like this one". |
| |
| |
| Questionable features: |
| |
| These are neat, but not necessarily clean or worth preserving. |
| |
| - The remote rsync can be wrapped by some other program, such as in |
| tridge's rsync-mail scripts. The general feature of sending and |
| retrieving mail over rsync is good, but this is perhaps not the |
| right way to implement it. |
| |
| |
| Desirable features: |
| |
| These don't really require architectural changes; they're just |
| something to keep in mind. |
| |
| - Synchronize ACLs and extended attributes |
| |
| - Anonymous servers should be efficient |
| |
| - Code should be portable to non-UNIX systems |
| |
| - Should be possible to document the protocol in RFC form |
| |
| - --dry-run option |
| |
| - IPv6 support. Pretty straightforward. |
| |
| - Allow the basis and destination files to be different. For |
| example, you could use this when you have a CD-ROM and want to |
| download an updated image onto a hard drive. |
| |
| - Efficiently interrupt and restart a transfer. We can write a |
| checkpoint file that says where we're up to in the filesystem. |
| Alternatively, as long as transfers are idempotent, we can just |
| restart the whole thing. [NFSv4] |
| |
| - Scripting support. |
| |
| - Propagate atimes and do not modify them. This is very ugly on |
| Unix. It might be better to try to add O_NOATIME to kernels, and |
| call that. |
| |
| - Unicode. Probably just use UTF-8 for everything. |
| |
| - Open authentication system. Can we use PAM? Is SASL an adequate |
| mapping of PAM to the network, or useful in some other way? |
| |
| - Resume interrupted transfers without the --partial flag. We need |
| to leave the temporary file behind, and then know to use it. This |
| leaves a risk of large temporary files accumulating, which is not |
| good. Perhaps it should be off by default. |
| |
| - tcpwrappers support. Should be trivial; can already be done |
| through tcpd or inetd. |
| |
| - SOCKS support built in. It's not clear this is any better than |
| just linking against the socks library, though. |
| |
| - When run over SSH, invoke with predictable command-line arguments, |
| so that people can restrict what commands sshd will run. (Is this |
| really required?) |
| |
| - Comparison mode: give a list of which files are new, gone, or |
| different. Set return code depending on whether anything has |
| changed. |
| |
| - Internationalized messages (gettext?) |
| |
| - Optionally use real regexps rather than globs? |
| |
| - Show overall progress. Pretty hard to do, especially if we insist |
| on not scanning the directory tree up front. |
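
The comparison mode listed above could be sketched roughly as follows. This is only an illustration: the names compare_trees and exit_code are hypothetical, both trees are assumed to be local directories, "different" is judged by size and whole-second mtime only, and the 0/1 return convention is an assumption rather than anything rsync defines. It also scans both trees up front, which the notes elsewhere would prefer to avoid.

```python
import os


def compare_trees(src, dst):
    """Return (new, gone, different) lists of relative file paths.

    new: only in src.  gone: only in dst.
    different: in both, but size or mtime differs.
    """
    def scan(root):
        found = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                st = os.stat(full)
                rel = os.path.relpath(full, root)
                found[rel] = (st.st_size, int(st.st_mtime))
        return found

    a, b = scan(src), scan(dst)
    new = sorted(set(a) - set(b))
    gone = sorted(set(b) - set(a))
    different = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return new, gone, different


def exit_code(new, gone, different):
    # Hypothetical convention: 0 if the trees match, 1 if anything changed.
    return 1 if (new or gone or different) else 0
```

A caller would print the three lists and pass exit_code(...) to sys.exit, so that scripts can test whether anything changed.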
| |
| |
| Regression testing: |
| |
| - Support automatic testing. |
| |
| - Have hard internal timeouts against hangs. |
| |
| - Be deterministic. |
| |
| - Measure performance. |
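
As a rough illustration of guarding a test run against hangs, a harness can impose a deadline from the outside. This is not the internal timeout the list asks for, only a test-side approximation; run_with_deadline is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds):
    """Run argv; return its exit status, or None if the deadline passed."""
    try:
        return subprocess.run(argv, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before re-raising,
        # so a hung process does not outlive the test.
        return None
```

A test would then treat a None result as a failure and move on, so that one hang cannot stall the whole suite.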
| |
| |
| Hard links: |
| |
| At the moment, we can recreate hard links, but it's a bit |
| inefficient: it depends on holding a list of all files in the tree. |
| Every time we see a file with a linkcount >1, we need to search for |
| another known name that has the same (fsid,inum) tuple. We could do |
| that more efficiently by keeping a list of only files with |
| linkcount>1, and removing files from that list as all their names |
| become known. |
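
A minimal sketch of that idea, assuming the caller walks the tree and hands over each path with its stat result. HardLinkTracker is a hypothetical name, and (st_dev, st_ino) stands in for the (fsid,inum) tuple.

```python
class HardLinkTracker:
    """Remember only inodes that still have names we have not seen."""

    def __init__(self):
        # (st_dev, st_ino) -> [first name seen, names still expected]
        self._pending = {}

    def see(self, path, st):
        """Return an earlier name to link to, or None if path is the first."""
        if st.st_nlink <= 1:
            return None  # not hard-linked: never stored
        key = (st.st_dev, st.st_ino)
        entry = self._pending.get(key)
        if entry is None:
            self._pending[key] = [path, st.st_nlink - 1]
            return None
        entry[1] -= 1
        if entry[1] == 0:
            del self._pending[key]  # all names known: forget the inode
        return entry[0]

    def __len__(self):
        return len(self._pending)
```

One caveat of this sketch: if some of a file's names lie outside the tree being transferred, its entry is never removed, so memory use is bounded by the number of multiply-linked files rather than by zero.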
| |
| |
| Command-line options: |
| |
| We have rather a lot at the moment. We might get more if the tool |
| becomes more flexible. Do we need a .rc or configuration file? |
| That wouldn't really fit with its pattern of use: cp and tar don't |
| have them, though ssh does. |
| |
| |
| Scripting issues: |
| |
| - Perhaps support multiple scripting languages: candidates include |
| Perl, Python, Tcl, Scheme (guile?), sh, ... |
| |
| - Simply running a subprocess and looking at its stdout/exit code |
| might be sufficient, though it could also be pretty slow if it's |
| called often. |
| |
| - There are security issues about running remote code, at least if |
| it's not running in the user's own account. So we can either |
| disallow it, or use some kind of sandbox system. |
| |
| - Python is a good language, but the syntax is not so good for |
| giving small fragments on the command line. |
| |
| - Tcl is broken Lisp. |
| |
| - Lots of sysadmins know Perl, though Perl can give some bizarre or |
| confusing errors. The built in stat operators and regexps might |
| be useful. |
| |
| - Sadly probably not enough people know Scheme. |
| |
| - sh is hard to embed. |
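
The subprocess option above might look something like the sketch below. The convention that exit status 0 means "transfer this file" is an assumption made for illustration, and should_transfer and SKIP_BACKUPS are hypothetical names. The hook here happens to be a Python one-liner, but any executable would do.

```python
import subprocess
import sys


def should_transfer(hook_argv, path):
    """Ask an external hook about path; exit status 0 means transfer it."""
    result = subprocess.run(hook_argv + [path], stdout=subprocess.DEVNULL)
    return result.returncode == 0


# Hypothetical hook: skip editor backup files whose names end in '~'.
SKIP_BACKUPS = [
    sys.executable, "-c",
    "import sys; sys.exit(1 if sys.argv[1].endswith('~') else 0)",
]
```

This starts one process per file, which is exactly the cost the notes worry about when the hook is called often.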
| |
| |
| Scripting hooks: |
| |
| - Whether to transfer a file |
| |
| - What basis file to use |
| |
| - Logging |
| |
| - Whether to allow transfers (for public servers) |
| |
| - Authentication |
| |
| - Locking |
| |
| - Cache |
| |
| - Generating backup path/name. |
| |
| - Post-processing of backups, e.g. to do compression. |
| |
| - After transfer, before replacement: so that we can spit out a diff |
| of what was changed, or kick off some kind of reconciliation |
| process. |
| |
| |
| VFS: |
| |
| Rather than talking straight to the filesystem, rsyncd talks through |
| an internal API. Samba has one. Is it useful? |
| |
| - Could be a tidy way to implement cached signatures. |
| |
| - Keep files compressed on disk? |
| |
| |
| Interactive interface: |
| |
| - Something like ncFTP, or integration into GNOME-vfs. Probably |
| hold a single socket connection open. |
| |
| - Can either call us as a separate process, or as a library. |
| |
| - The standalone process needs to produce output in a form easily |
| digestible by a calling program, like the --emacs feature some |
| programs have. Same goes for progress reporting: rpm outputs a |
| series of hash symbols, which are easier for a GUI to handle than |
| "\r30% complete" strings. |
| |
| - Yow! emacs support. (You could probably build that already, of |
| course.) I'd like to be able to write a simple script on a remote |
| machine that rsyncs it to my workstation, edits it there, then |
| pushes it back up. |
| |
| |
| Pie-in-the-sky features: |
| |
| These might have a severe impact on the protocol, and are not |
| clearly in our core requirements. It looks like in many of them |
| having scripting hooks will allow us to support them without |
| building them into the core. |
| |
| - Transport over UDP multicast. The hard part is handling multiple |
| destinations which have different basis files. We can look at |
| multicast-TFTP for inspiration. |
| |
| - Conflict resolution. Possibly general scripting support will be |
| sufficient. |
| |
| - Integrate with locking. It's hard to see a good general solution, |
| because Unix systems have several locking mechanisms, and grabbing |
| the lock from programs that don't expect it could cause deadlocks, |
| timeouts, or other problems. Scripting support might help. |
| |
| - Replicate in place, rather than to a temporary file. This is |
| dangerous in the case of interruption, and it also means that the |
| delta can't refer to blocks that have already been overwritten. |
| On the other hand we could semi-trivially do this at first by |
| simply generating a delta with no copy instructions. |
| |
| - Replicate block devices. Most of the difficulties here are to do |
| with replication in place, though on some systems we will also |
| have to do I/O on block boundaries. |
| |
| - Peer to peer features. Flavour of the year. Can we think about |
| ways for clients to smoothly and voluntarily become servers for |
| content they receive? |
| |
| - Imagine a situation where the destination has a much faster link |
| to the cloud than the source. In this case, Mojo Nation downloads |
| interleaved blocks from several slower servers. The general |
| situation might be a way for a master rsync process to farm out |
| tasks to several subjobs. In this particular case they'd need |
| different sockets. This might be related to multicast. |
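
For the "replicate in place" item above, the semi-trivial first step (a delta with no copy instructions) can be sketched as below. The delta representation, a list of ("literal", bytes) pairs, is purely hypothetical and is not rsync's wire format.

```python
def literal_only_delta(new_data, chunk=4096):
    """Build a delta that never copies from the basis file."""
    return [("literal", new_data[i:i + chunk])
            for i in range(0, len(new_data), chunk)]


def apply_in_place(path, delta):
    """Overwrite the existing file at path, then trim any leftover tail."""
    with open(path, "r+b") as f:
        for op, payload in delta:
            if op != "literal":
                # A copy could refer to a block we have already overwritten.
                raise ValueError("only literal ops are safe in place")
            f.write(payload)
        f.truncate()  # cut the file at the current write position
```

Such a delta is as large as the new file, so this trades away all of the bandwidth savings. An interrupted run still leaves the destination half old and half new.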
| |
| |
| Unlikely features: |
| |
| - Allow remote source and destination. If this can be cleanly |
| designed into the protocol, perhaps with the remote machine acting |
| as a kind of echo, then it's good. It's uncommon enough that we |
| don't want to shape the whole protocol around it, though. |
| |
| In fact, in a triangle of machines there are two possibilities: |
| all traffic passes from remote1 to remote2 through local, or local |
| just sets up the transfer and then remote1 talks to remote2. FTP |
| supports the second but it's not clearly good. There are some |
| security problems with being able to instruct one machine to open |
| a connection to another. |
| |
| |
| In favour of evolving the protocol: |
| |
| - Keeping compatibility with existing rsync servers will help with |
| adoption and testing. |
| |
| - We should at the very least be able to fall back to the old |
| protocol. |
| |
| - Error handling is not so good. |
| |
| |
| In favour of using a new protocol: |
| |
| - Maintaining compatibility might soak up development time that |
| would better go into improving a new protocol. |
| |
| - If we start from scratch, it can be documented as we go, and we |
| can avoid design decisions that make the protocol complex or |
| implementation-bound. |
| |
| |
| Error handling: |
| |
| - Errors should come back reliably, and be clearly associated with |
| the particular file that caused the problem. |
| |
| - Some errors ought to cause the whole transfer to abort; some are |
| just warnings. If any errors have occurred, then rsync ought to |
| return an error. |
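
One possible shape for this, as a sketch only: every error carries the path that caused it, fatal errors abort by raising, and the exit status is nonzero if anything at all was reported. All names here are hypothetical, as is the choice of 1 as the error status.

```python
from dataclasses import dataclass, field


@dataclass
class TransferError:
    path: str          # the file that caused the problem
    message: str
    fatal: bool = False


class FatalTransferError(Exception):
    """Raised for errors that should abort the whole transfer."""


@dataclass
class ErrorLog:
    errors: list = field(default_factory=list)

    def report(self, path, message, fatal=False):
        # Record first, so fatal errors also show up in the final summary.
        self.errors.append(TransferError(path, message, fatal))
        if fatal:
            raise FatalTransferError(f"{path}: {message}")

    def exit_status(self):
        # Nonzero if any error occurred, even a mere warning.
        return 1 if self.errors else 0
```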
| |
| |
| Concurrency: |
| |
| - We want to keep the CPU, filesystem, and network as full as |
| possible as much of the time as possible. |
| |
| - We can do nonblocking network IO, but not so for disk. |
| |
| - On the destination, it makes sense to generate signatures and |
| apply patches at the same time. |
| |
| - Can structure this with nonblocking, threads, separate processes, |
| etc. |
| |
| |
| Uses: |
| |
| - Mirroring software distributions |
| |
| - Synchronizing laptop and desktop |
| |
| - NFS filesystem migration/replication. See |
| http://www.ietf.org/proceedings/00jul/00july-133.htm#P24510_1276764 |
| |
| - Sync with PDA |
| |
| - Network backup systems |
| |
| - CVS filemover |
| |
| |
| Conflict resolution: |
| |
| - Requires application-specific knowledge. We want to provide |
| mechanism, rather than policy. |
| |
| - Possibly allowing two-way migration across a single connection |
| would be useful. |
| |
| |
| Moved files: |
| |
| - There's no trivial way to detect renamed files, especially if they |
| move between directories. |
| |
| - If we had a picture of the remote directory from last time on |
| either machine, then the inode numbers might give us a hint about |
| files which may have been renamed. |
| |
| - Files that are renamed and not modified can be detected by |
| examining the directory listing, looking for files with the same |
| size/date as the origin. |
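
The size/date heuristic in the last item could be sketched as follows, assuming each side's directory listing is already available as a dict mapping path to (size, mtime). The name rename_candidates is hypothetical. Only unambiguous matches are reported, and each is a hint to confirm (for example by checksum), not proof of a rename.

```python
from collections import defaultdict


def rename_candidates(dest_listing, src_listing):
    """Pair paths found only on one side that share the same (size, mtime).

    Returns sorted (old_path_on_dest, new_path_on_src) pairs.
    """
    vanished = defaultdict(list)   # on the destination only
    appeared = defaultdict(list)   # on the source only
    for path, key in dest_listing.items():
        if path not in src_listing:
            vanished[key].append(path)
    for path, key in src_listing.items():
        if path not in dest_listing:
            appeared[key].append(path)
    # Keep a pair only when exactly one file on each side has that key.
    return sorted(
        (vanished[k][0], appeared[k][0])
        for k in vanished
        if len(vanished[k]) == 1 and len(appeared.get(k, ())) == 1
    )
```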
| |
| |
| Filesystem migration: |
| |
| NFSv4 probably wants to migrate file locks, but that's not really |
| our problem. |
| |
| |
| Atomic updates: |
| |
| The NFSv4 working group wants atomic migration. Most of the |
| responsibility for this lies on the NFS server or OS. |
| |
| If migrating a whole tree, then we could do a nearly-atomic rename |
| at the end. This ties in to having separate basis and destination |
| files. |
| |
| There's no way in Unix to replace a whole set of files atomically. |
| However, if we get them all onto the destination machine and then do |
| the updates quickly it would greatly reduce the window. |
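
A sketch of that last idea, assuming the new files have already been received into a staging directory on the same filesystem as the destination. The name commit_staged is hypothetical. Each file is replaced by a single rename, so the window is the time taken by the loop rather than by the network transfer.

```python
import os


def commit_staged(staging_dir, dest_dir):
    """Move every staged file into place, one rename per file."""
    moved = []
    for dirpath, _dirnames, filenames in os.walk(staging_dir):
        for name in filenames:
            staged = os.path.join(dirpath, name)
            rel = os.path.relpath(staged, staging_dir)
            target = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Atomic per file on POSIX when both paths share a filesystem.
            os.replace(staged, target)
            moved.append(rel)
    return sorted(moved)
```

The set as a whole is still not updated atomically: a reader can see some new files alongside some old ones while the loop runs.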
| |
| |
| Scalability: |
| |
| We should aim to work well on machines in use in a year or two. |
| That probably means transfers of many millions of files in one |
| batch, and gigabytes or terabytes of data. |
| |
| For argument's sake: at the low end, we want to sync ten files for a |
| total of 10kB across a 1kB/s link. At the high end, we want to sync |
| 1e9 files for 1TB of data across a 1GB/s link. |
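
To put rough numbers on those two endpoints, reading the figures as decimal kB, GB and TB, and ignoring protocol overhead and any delta savings: the low end is about ten seconds of raw transfer, the high end about a thousand. The 100 bytes per file-list entry below is an assumed, illustrative figure, not a measurement of rsync.

```python
def transfer_seconds(total_bytes, bytes_per_second):
    """Raw transfer time with no overhead and no delta compression."""
    return total_bytes / bytes_per_second


LOW_END_SECONDS = transfer_seconds(10 * 1000, 1000)    # 10 kB over 1 kB/s
HIGH_END_SECONDS = transfer_seconds(10**12, 10**9)     # 1 TB over 1 GB/s
AVERAGE_FILE_BYTES = 10**12 // 10**9                   # 1 TB over 1e9 files

# ASSUMPTION: about 100 bytes per in-memory file-list entry.
ASSUMED_ENTRY_BYTES = 100
FILE_LIST_BYTES = 10**9 * ASSUMED_ENTRY_BYTES          # whole list in memory

print(LOW_END_SECONDS, HIGH_END_SECONDS, AVERAGE_FILE_BYTES, FILE_LIST_BYTES)
```

Under that assumption, the high-end file list alone would need around 100 GB, a tenth of the size of the data being moved. That is one way to see why holding the whole list in memory does not scale.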
| |
| On the whole CPU usage is not normally a limiting factor, if only |
| because running over SSH burns a lot of cycles on encryption. |
| |
| Perhaps have resource throttling without relying on rlimit. |
| |
| |
| Streaming: |
| |
| A big attraction of rsync is that there are few round-trip delays: |
| basically only one to get started, and then everything is |
| pipelined. Round-trip delays are a problem with FTP and NFS (at |
| least up to v3). NFSv4 can pipeline operations, but building on |
| that is probably a bit complicated. |
| |
| |
| Related work: |
| |
| - mirror.pl |
| |
| - ProFTPd |
| |
| - Apache |
| |
| - BitTorrent -- p2p mirroring |
| http://bitconjurer.org/BitTorrent/ |