librsync
2.0.2
# Streaming jobs {#api_streaming}

A key design requirement for librsync is that it should handle data as
and when the hosting application requires it. librsync can be used
inside applications that do non-blocking IO or filtering of network
streams, because it never does IO directly and never needs to block
waiting for data.

Arbitrary-length input and output buffers are passed to the
library by the application, through an instance of ::rs_buffers_t. The
library proceeds as far as it can, and returns an ::rs_result value
indicating whether it needs more data or space.

All the state needed by the library to resume processing when more
data is available is kept in a small opaque ::rs_job_t structure.
After creation of a job, repeated calls to rs_job_iter() in between
filling and emptying the buffers keep data flowing through the
stream. The ::rs_result values returned may indicate

- ::RS_DONE: processing is complete
- ::RS_BLOCKED: processing has blocked pending more data
- one of various possible errors in processing (see ::rs_result.)

These can be converted to a human-readable string by rs_strerror().

\note Smaller buffers have high relative handling costs. Application
performance will be improved by using buffers of at least 32kb or so
on each call.

\sa \ref api_whole - Simpler but more limited interface than the streaming
interface.

\sa \ref api_pull - Intermediate-complexity callback interface.

\sa \ref api_callbacks - for reading from the basis file
when doing a "patch" operation.


## Creating Jobs

All streaming librsync jobs are initiated using a `_begin`
function to create a ::rs_job_t object, passing in any necessary
initialization parameters.
The various jobs available are:

- rs_sig_begin(): Calculate the signature of a file.
- rs_loadsig_begin(): Load a signature into memory.
- rs_delta_begin(): Calculate the delta between a signature and a new
  file.
- rs_patch_begin(): Apply a delta to a basis to recreate the new
  file.

The patch job accepts the patch as input, and uses a callback to look up
blocks within the basis file.

You must configure read, write and basis callbacks after creating the
job but before it is run.

If the signature file size (or the number of chunks) is known in advance,
you can set `job->sig_file_bytes` to the signature file size, or set
`job->estimated_signature_count`, before running the job.
If both are set, `estimated_signature_count` is used.
This preallocates the memory needed for signature sums instead of
calling realloc for each block.


## Running Jobs

The work of the operation is done when the application calls
rs_job_iter(). This includes reading from input files via the callback,
running the rsync algorithms, and writing output.

The IO callbacks are only called from inside rs_job_iter(). If any of
them return an error, rs_job_iter() will generally return the same error.

When librsync needs to do input or output, it calls one of the callback
functions. rs_job_iter() returns when the operation has completed or
failed, or when one of the IO callbacks has blocked.

rs_job_iter() will usually be called in a loop, perhaps alternating
librsync processing with other application functions.


## Deleting Jobs

A job is deleted and its memory freed up using rs_job_free().

This is typically called when the job has completed or failed. It can be
called earlier if the application decides it wants to cancel
processing.
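
The loop of filling input, iterating, and draining output described above can be sketched with a toy job. This is only an illustration under stated assumptions: `toy_result`, `toy_buffers_t`, `toy_job_iter()` and `toy_run()` are hypothetical stand-ins that mirror the shape of ::rs_buffers_t and rs_job_iter(); they are not part of librsync, and the toy "job" merely upper-cases its input. With the real library, the job would come from one of the `_begin` functions and be released with rs_job_free() once the loop ends.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not librsync's API. */
typedef enum { TOY_DONE, TOY_BLOCKED } toy_result;

typedef struct {
    const char *next_in;   /* next unread input byte */
    size_t avail_in;       /* unread input bytes remaining */
    int eof_in;            /* nonzero once no more input will follow */
    char *next_out;        /* where the next output byte goes */
    size_t avail_out;      /* free space remaining in the output buffer */
} toy_buffers_t;

/* Toy job step: upper-case bytes from input to output. It never does IO
 * itself; it just goes as far as the buffers allow and then returns. */
static toy_result toy_job_iter(toy_buffers_t *b)
{
    while (b->avail_in > 0 && b->avail_out > 0) {
        *b->next_out++ = (char)toupper((unsigned char)*b->next_in++);
        b->avail_in--;
        b->avail_out--;
    }
    if (b->avail_in == 0 && b->eof_in)
        return TOY_DONE;
    return TOY_BLOCKED;    /* needs more input or more output space */
}

/* Application-side loop: feed `src` in slices of at most `chunk` bytes,
 * drain a small output buffer into `dst` after every call.
 * `dst` must have room for at least `src_len` bytes.
 * Returns the number of bytes written to `dst`. */
static size_t toy_run(const char *src, size_t src_len, char *dst, size_t chunk)
{
    char outbuf[8];
    toy_buffers_t b = { NULL, 0, 0, outbuf, sizeof outbuf };
    size_t fed = 0, written = 0, produced;
    toy_result r;

    assert(chunk > 0);
    do {
        /* Refill the input side once the job has consumed it all. */
        if (b.avail_in == 0 && fed < src_len) {
            size_t n = src_len - fed;
            if (n > chunk)
                n = chunk;
            b.next_in = src + fed;
            b.avail_in = n;
            fed += n;
        }
        b.eof_in = (fed == src_len);

        r = toy_job_iter(&b);

        /* Drain whatever was produced and reset the output side. */
        produced = sizeof outbuf - b.avail_out;
        memcpy(dst + written, outbuf, produced);
        written += produced;
        b.next_out = outbuf;
        b.avail_out = sizeof outbuf;
    } while (r == TOY_BLOCKED);

    return written;
}
```

Calling `toy_run()` with a chunk size smaller or larger than the 8-byte output buffer exercises both reasons for blocking: running out of input and running out of output space. The tiny buffer is deliberate for the demonstration; as noted earlier, real applications are better served by much larger buffers.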

rs_job_free() does not delete the output of the job, such as the sumset
loaded into memory. It does delete the job's statistics.



## State Machine Internals

Internally, the operations are implemented as state machines that move
through various states as input and output buffers become available.

All computers and programs are state machines. So why is the
representation as a state machine a little more explicit (and perhaps
verbose) in librsync than other places? Because we need to be able to
let the real computer go off and do something else like waiting for
network traffic, while still remembering where it was in the librsync
state machine.

librsync will never block waiting for IO, unless the callbacks do
that.

The current state is represented by the private field
::rs_job_t::statefn, which points to a function with a name like
`rs_OPERATION_s_STATE`. Every time librsync tries to make progress,
it will call this function.

The state function returns one of the ::rs_result values. The
most important values are

* ::RS_DONE: Completed successfully.

* ::RS_BLOCKED: Cannot make further progress at this point.

* ::RS_RUNNING: The state function has neither completed nor blocked but
  wants to be called again. **XXX**: Perhaps this should be removed?

States need to correspond to suspension points. The only place the
job can resume after blocking is at the entry to a state function.

Therefore states must be "all or nothing" in that they can either
complete, or restart without losing information.

Basically every state needs to work from one input buffer to one
output buffer.

States should generally never return ::RS_DONE directly.
Instead, they
should call rs__job_done(), which sets the state function to
rs__s_done(). This makes sure that any pending output is flushed out
before ::RS_DONE is returned to the application.
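
The state-function pattern above can be sketched in a few lines of C. This is a toy model, not librsync's internals: every `toy_` name, the two-byte `H:` header, and the fixed body length are assumptions made purely for illustration. It shows a job whose current state is a function pointer, states that either complete or restart without losing information, and a `toy_job_done()` helper that switches to a final state instead of returning the "done" result directly. The real rs__s_done() also flushes pending output, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this only mimics the pattern. */
typedef enum { TOY_DONE, TOY_BLOCKED, TOY_RUNNING } toy_result;

typedef struct toy_job toy_job_t;
typedef toy_result (*toy_statefn)(toy_job_t *);

struct toy_job {
    toy_statefn statefn;   /* current state: the only resume point */
    const char *in;        /* next unread input byte */
    size_t avail_in;       /* unread input bytes */
    char *out;             /* next output position */
    size_t avail_out;      /* free output space */
    size_t remaining;      /* body bytes still to copy */
};

static toy_result toy_s_done(toy_job_t *job)
{
    (void)job;
    return TOY_DONE;       /* the only state that reports completion */
}

/* States call this rather than returning TOY_DONE themselves. */
static toy_result toy_job_done(toy_job_t *job)
{
    job->statefn = toy_s_done;
    return TOY_RUNNING;
}

/* Copy one body byte per call, or block if a buffer is exhausted. */
static toy_result toy_s_body(toy_job_t *job)
{
    if (job->remaining == 0)
        return toy_job_done(job);
    if (job->avail_in == 0 || job->avail_out == 0)
        return TOY_BLOCKED;
    *job->out++ = *job->in++;
    job->avail_in--;
    job->avail_out--;
    job->remaining--;
    return TOY_RUNNING;
}

/* "All or nothing": write the whole header, or write nothing and block
 * so the state can simply be re-entered later. */
static toy_result toy_s_header(toy_job_t *job)
{
    if (job->avail_out < 2)
        return TOY_BLOCKED;
    memcpy(job->out, "H:", 2);
    job->out += 2;
    job->avail_out -= 2;
    job->statefn = toy_s_body;
    return TOY_RUNNING;
}

static void toy_job_init(toy_job_t *job, size_t body_len)
{
    memset(job, 0, sizeof *job);
    job->statefn = toy_s_header;
    job->remaining = body_len;
}

/* Keep calling the current state until it blocks or finishes. */
static toy_result toy_job_iter(toy_job_t *job)
{
    toy_result r;
    do {
        r = job->statefn(job);
    } while (r == TOY_RUNNING);
    return r;
}
```

Because each state checks its preconditions before touching the buffers, the caller can supply more input or output space after a blocked result and call `toy_job_iter()` again; the job resumes at the entry of whichever state it was in.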