Version: | 2.0.3 |
---|---|
Date: | 2004-07-29 |
Summary: | high-level FTP client library for Python |
Keywords: | FTP, ftplib substitute, virtual filesystem |
Author: | Stefan Schwarzer <sschwarzer@sschwarzer.net> |
The ftputil module is a high-level interface to the ftplib module. The FTPHost objects generated from it allow many operations similar to those of os and os.path.
Examples:
    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')  # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()
Also, there are FTPHost.lstat and FTPHost.stat to request size and modification time of a file. The latter can also follow links, similar to os.stat. Even FTPHost.path.walk works.
The distribution contains a custom UserTuple module to provide stat results with Python versions 2.0 and 2.1.
The exceptions are in the namespace of the ftputil package (e. g. ftputil.TemporaryError). They are organized as follows:
    FTPError
        FTPOSError(FTPError, OSError)
            TemporaryError(FTPOSError)
            PermanentError(FTPOSError)
            ParserError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            RootDirError(InternalError)
            InaccessibleLoginDirError(InternalError)
        TimeShiftError(FTPError)
and are described here:
FTPError
is the root of the exception hierarchy of the module.
FTPOSError
is derived from OSError. This is for similarity between the os module and FTPHost objects. Compare
    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...
with
    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...
Imagine a function
    def func(path, file):
        ...
which works on the local file system and catches OSErrors. If you change the parameter list to
    def func(path, file, os=os):
        ...
where os denotes the os module, you can call the function also as
    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)
to use the same code for a local and remote file system. Another similarity between OSError and FTPOSError is that the latter holds the FTP server return code in the errno attribute of the exception object and the error text in strerror.
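The following is a minimal, self-contained sketch of this pattern. It does not use ftputil at all: FakeHost is a hypothetical stand-in for an FTPHost object, and ensure_dir is an invented example function. It only shows that code written against the os module keeps working when another object with compatible methods and OSError-compatible exceptions is passed in.

```python
import os
import tempfile


class FakeHost(object):
    """Hypothetical stand-in for an FTPHost object; it only records calls."""

    def __init__(self):
        self.created = []

    def mkdir(self, path, mode=None):
        # Accept and ignore `mode`, as described for FTPHost.mkdir.
        if path in self.created:
            # Mimic an error that carries a code and a text.
            raise OSError(550, "550 %s: File exists." % path)
        self.created.append(path)


def ensure_dir(path, os=os):
    """Try to create `path`; return False if the call raises OSError."""
    try:
        os.mkdir(path)
    except OSError:
        return False
    return True


# Local file system: the real os module is used by default.
local_dir = os.path.join(tempfile.mkdtemp(), "newdir")
print(ensure_dir(local_dir))   # the directory does not exist yet
print(ensure_dir(local_dir))   # now it exists, so os.mkdir raises OSError

# "Remote" file system: any object with a compatible mkdir works.
fake_host = FakeHost()
print(ensure_dir("newdir", os=fake_host))
print(ensure_dir("newdir", os=fake_host))
```

With a real connection, an FTPHost instance would take the place of fake_host.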
TemporaryError
is raised for FTP return codes from the 4xx category. This corresponds to ftplib.error_temp (though TemporaryError and ftplib.error_temp are not identical).
PermanentError
is raised for 5xx return codes from the FTP server (again, that's similar but not identical to ftplib.error_perm).
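To illustrate the two categories, here is a small sketch that sorts FTP reply strings by their return code. The function classify_reply is hypothetical and is not part of ftputil or ftplib; it only mirrors the 4xx/5xx distinction described above.

```python
def classify_reply(reply):
    """Return (code, category, text) for an FTP reply string.

    Hypothetical helper: `category` is "temporary" for 4xx codes,
    "permanent" for 5xx codes and "other" for everything else.
    """
    reply = reply.strip()
    code = int(reply[:3])
    if 400 <= code < 500:
        category = "temporary"
    elif 500 <= code < 600:
        category = "permanent"
    else:
        category = "other"
    return code, category, reply


for line in ("421 Service not available.",
             "550 notthere: No such file or directory.",
             "226 Transfer complete."):
    print(classify_reply(line))
```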
ParserError
is used for errors during the parsing of directory listings from the server. This exception is used by the FTPHost methods stat, lstat, and listdir.
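As a rough idea of what parsing a listing line involves, the sketch below splits one Unix-style line into a few fields. It is not ftputil's parser: parse_unix_line and ListingParseError are hypothetical names, and the sketch assumes nine whitespace-separated columns and ignores dates and symbolic links.

```python
class ListingParseError(Exception):
    """Hypothetical stand-in; ftputil itself raises ParserError."""


def parse_unix_line(line):
    """Extract mode string, size and name from a Unix-style listing line."""
    # Split into at most nine fields so that names with spaces survive.
    parts = line.split(None, 8)
    if len(parts) != 9:
        raise ListingParseError("unexpected listing line: %r" % line)
    mode_string, size_string, name = parts[0], parts[4], parts[8]
    try:
        size = int(size_string)
    except ValueError:
        raise ListingParseError("invalid size in line: %r" % line)
    return {
        "mode": mode_string,
        "size": size,
        "name": name,
        "is_dir": mode_string.startswith("d"),
    }


entry = parse_unix_line("-rw-r--r-- 1 45854 200 4604 Jan 19 23:11 index.html")
print(entry["name"], entry["size"], entry["is_dir"])
```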
FTPIOError
denotes an I/O error on the remote host. This appears mainly with file-like objects which are retrieved by invoking FTPHost.file (FTPHost.open is an alias). Compare
    >>> try:
    ...     f = open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory
with
    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('notthere')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 notthere: No such file or directory.
As you can see, both code snippets are similar. (However, the error codes aren't the same.)
InternalError
subsumes exception classes for signaling errors due to limitations of the FTP protocol or the concrete implementation of ftputil.
RootDirError
Because of the implementation of the lstat method it is not possible to do a stat call on the root directory /. If you know any way to do it, please let me know. :-)
InaccessibleLoginDirError
This exception is only raised if both of the following conditions are met:
TimeShiftError
is used to denote errors which relate to setting the time shift, e. g. trying to set a value which is not a multiple of a full hour.
FTPHost instances may be generated with the following call:
host = ftputil.FTPHost(host, user, password, account, session_factory=ftplib.FTP)
The first four parameters are strings with the same meaning as for the FTP class in the ftplib module. The keyword argument session_factory may be used to generate FTP connections with other factories than the default ftplib.FTP. For example, the M2Crypto distribution uses a secure FTP class which is derived from ftplib.FTP.
In fact, all positional and keyword arguments other than session_factory are passed to the factory to generate a new background session (which happens for every remote file that is opened; see below).
This functionality of the constructor also allows wrapping ftplib.FTP objects to do something that wouldn't be possible with the ftplib.FTP constructor alone.
As an example, assume you want to connect to a port other than the default one, but ftplib.FTP offers this only by means of its connect method, not via its constructor. The solution is to provide a wrapper class:
    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # try not to use MySession() as factory, - use the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual
On login, the format of the directory listings (needed for stat'ing files and directories) should be determined automatically. If not, you may use the method set_directory_format to set the format "manually".
curdir, pardir, sep
are strings which denote the current and the parent directory on the remote server. sep identifies the path separator. Though RFC 959 (File Transfer Protocol) notes that these values may be server dependent, the Unix counterparts seem to work well in practice, even for non-Unix servers.
set_time_shift(time_shift)
sets the so-called time shift value (measured in seconds). The time shift is the difference between the local time of the server and the local time of the client at a given moment, i. e. by definition
time_shift = server_time - client_time
Setting this value is important if upload_if_newer and download_if_newer should work correctly even if the time zone of the FTP server differs from that of the client (where ftputil runs). Note that the time shift value can be negative.
If the time shift value is invalid, e. g. not a multiple of a full hour, or with an absolute value larger than 24 hours, a TimeShiftError is raised.
See also synchronize_times for a way to set the time shift with a simple method call.
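The validity rules for a time shift value can be written down in a few lines. The function below is a hypothetical, standalone check and not ftputil code; it returns a boolean, whereas set_time_shift is documented to raise TimeShiftError.

```python
HOUR = 60 * 60  # seconds


def is_valid_time_shift(time_shift):
    """Return True if `time_shift` (in seconds) satisfies both rules."""
    # Rule 1: the value must be a multiple of a full hour.
    if time_shift % HOUR != 0:
        return False
    # Rule 2: its absolute value must not exceed 24 hours.
    return abs(time_shift) <= 24 * HOUR


for value in (0, 2 * HOUR, -5 * HOUR, 1800, 25 * HOUR):
    print(value, is_valid_time_shift(value))
```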
time_shift()
returns the currently set time shift value. See set_time_shift (above) for its definition.
synchronize_times()
synchronizes the local times of the server and the client, so that upload_if_newer and download_if_newer work as expected, even if the client and the server are in different time zones. For this to work, all of the following conditions must be true:
If you can't fulfill these conditions, you can nevertheless set the time shift value manually with set_time_shift. Trying to call synchronize_times if the above conditions aren't true results in a TimeShiftError exception.
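If you determine the time shift yourself, a measured difference between server and client clocks will rarely be an exact number of hours. One plausible approach, shown here only as a sketch, is to round the difference to the nearest full hour before passing it to set_time_shift. The function estimate_time_shift is hypothetical, and the sketch does not claim to reproduce what synchronize_times does internally.

```python
HOUR = 60 * 60  # seconds


def estimate_time_shift(server_time, client_time):
    """Round server_time - client_time (seconds) to the nearest full hour."""
    difference = server_time - client_time
    return int(round(difference / float(HOUR))) * HOUR


# Server clock about two hours ahead of the client, plus 40 s of delay.
print(estimate_time_shift(1000 + 2 * HOUR + 40, 1000))
# Server clock about one hour behind the client.
print(estimate_time_shift(1000 - HOUR - 25, 1000))
```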
file(path, mode='r')
returns a file-like object that is connected to the path on the remote host. This path may be absolute or relative to the current directory on the remote host (this directory can be determined with the getcwd method). As with local file objects the default mode is "r", i. e. reading text files. Valid modes are "r", "rb", "w", and "wb".
open(path, mode='r')
is an alias for file (see above).
copyfileobj(source, target, length=64*1024)
copies the contents from the file-like object source to the file-like object target. The only difference to shutil.copyfileobj is the default buffer size.
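The chunked copy that such a function performs can be sketched with in-memory file objects. The code below is a standalone illustration in the spirit of shutil.copyfileobj, using the 64 KiB default from the signature above; it is not taken from ftputil's source.

```python
import io


def copyfileobj(source, target, length=64 * 1024):
    """Copy from `source` to `target` in chunks of at most `length` bytes."""
    while True:
        chunk = source.read(length)
        if not chunk:
            # An empty result signals the end of the source.
            break
        target.write(chunk)


source = io.BytesIO(b"x" * 200000)   # larger than one chunk
target = io.BytesIO()
copyfileobj(source, target)
print(len(target.getvalue()))
```

Any two file-like objects with read and write methods can be combined this way, which is what makes copying between a remote and a local file object possible.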
close()
closes the connection to the remote host. After this, no more interaction with the FTP server is possible without using a new FTPHost object.
getcwd()
returns the absolute current directory on the remote host. This method acts similarly to os.getcwd.
chdir(directory)
sets the current directory on the FTP server. This resembles os.chdir, as you may have expected. :-)
mkdir(path, [mode])
makes the given directory on the remote host. In the current implementation, this doesn't construct "intermediate" directories which don't already exist. The mode parameter is ignored. This is for compatibility with os.mkdir if an FTPHost object is passed into a function instead of the os module (see the subsection on Python exceptions above for an explanation).
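Since intermediate directories are not created, a caller who needs them has to create each level in turn. The helper below is a hypothetical sketch that only computes the sequence of paths, outermost first, using Unix path conventions; it performs no FTP operations.

```python
import posixpath


def directories_to_create(path):
    """Return `path` and all of its ancestors, outermost first."""
    path = posixpath.normpath(path)
    result = []
    while path not in ("", ".", "/"):
        parent = posixpath.dirname(path)
        if parent == path:
            # Nothing left to strip, e.g. for a path like "//".
            break
        result.append(path)
        path = parent
    result.reverse()
    return result


print(directories_to_create("a/b/c"))
print(directories_to_create("/pub/new/dir"))
```

With an FTPHost object, each returned path that host.path.isdir does not report as existing could then be passed to host.mkdir, in order.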
rmdir(path)
removes the given remote directory.
In previous versions of ftputil, it depended on the remote server whether non-empty directories could be deleted. By default, ftputil version 2.0 and up allows deleting only empty directories.
If you want to update to ftputil 2.0 with minimal changes to your source code, pass in an additional parameter _remove_only_empty=False. Note that this is deprecated and will probably be unsupported in future ftputil versions.
remove(path)
removes a file on the remote host (similar to os.remove).
unlink(path)
is an alias for remove.
rename(source, target)
renames the source file (or directory) on the FTP server.
listdir(path)
returns a list containing the names of the files and directories in the given path; similar to os.listdir.
upload(source, target, mode='')
copies a local source file (given by a filename, i. e. a string) to the remote host under the name target. Both source and target may be absolute paths or relative to their corresponding current directory (on the local or the remote host, respectively). The mode may be "" or "a" for ASCII uploads or "b" for binary uploads. ASCII mode is the default (again, similar to regular local file objects).
download(source, target, mode='')
performs a download from the remote source to a target file. Both source and target are strings. Additionally, the description of the upload method applies here, too.
upload_if_newer(source, target, mode='')
is similar to the upload method. The only difference is that the upload is only invoked if the time of the last modification for the source file is more recent than that of the target file, or the target doesn't exist at all. If an upload actually happened, the return value is a true value, else a false value.
Note that this method only checks the existence and/or the modification time of the source and target file; it can't recognize a change in the transfer mode, e. g.
    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again
    host.upload_if_newer('source_file', 'target_file', 'b')
Similarly, if a transfer is interrupted, the remote file will have a newer modification time than the local file, and thus the transfer won't be repeated if upload_if_newer is used a second time. There are (at least) two possibilities after a failed upload:
If it seems that a file is uploaded unnecessarily, read the subsection on time shift settings.
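The decision described above, including the effect of the time shift, can be sketched as a pure function. This is an illustration under assumptions, not ftputil's implementation: source_is_newer is a hypothetical name, times are in seconds, and the remote modification time is converted to client time using the definition time_shift = server_time - client_time.

```python
def source_is_newer(source_mtime, target_mtime, time_shift=0):
    """Decide whether a local source should be uploaded.

    source_mtime: modification time of the local file (client clock)
    target_mtime: modification time of the remote file (server clock),
                  or None if the remote file doesn't exist
    time_shift:   server_time - client_time
    """
    if target_mtime is None:
        return True
    # Express the remote time in terms of the client clock before comparing.
    return source_mtime > target_mtime - time_shift


HOUR = 60 * 60
# The server clock is two hours ahead. A file uploaded at client time 1000
# therefore shows a remote modification time of 1000 + 2 * HOUR.
remote_mtime = 1000 + 2 * HOUR
# The local file was changed again at client time 1500.
print(source_is_newer(1500, remote_mtime))                       # shift ignored
print(source_is_newer(1500, remote_mtime, time_shift=2 * HOUR))  # shift applied
```

Without the time shift the changed file looks older than the remote copy and is skipped; with the correct shift it is recognized as newer.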
download_if_newer(source, target, mode='')
corresponds to upload_if_newer but performs a download from the server to the local host. Read the descriptions of download and upload_if_newer for more. If a download actually happened, the return value is a true value, else a false value.
If it seems that a file is downloaded unnecessarily, read the subsection on time shift settings.
The methods lstat and stat (and others) rely on the directory listing format used by the FTP server. When connecting to a host, FTPHost's constructor tries to guess the right format, which mostly succeeds. However, if you get strange results (or exceptions), it may be necessary to set the directory format "manually". This is done - immediately after connecting - with
set_directory_format(server_format)
server_format is one of the strings "unix" or "ms". To choose the correct format, you have to start a command line FTP client and request a directory listing (most clients do this with the DIR command).
If the resulting lines look like
    drwxr-sr-x   2 45854    200           512 Jul 30 17:14 image
    -rw-r--r--   1 45854    200          4604 Jan 19 23:11 index.html
use "unix" as the argument.
If the output looks like
    12-07-01  02:05PM       <DIR>          XPLaunch
    07-17-00  02:08PM             12266720 digidash.exe
use "ms" as the argument string server_format.
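If you have saved a listing line from the command line client, the distinction between the two formats can also be checked mechanically. The sketch below uses two simple regular expressions; guess_directory_format and both patterns are hypothetical and say nothing about how FTPHost's constructor guesses the format.

```python
import re

# File type character followed by nine permission characters.
UNIX_LINE = re.compile(r"^[-bcdlps][-rwxsStT]{9}\s")
# Date as nn-nn-nn followed by a 12-hour time such as 02:05PM.
MS_LINE = re.compile(r"^\d{2}-\d{2}-\d{2}\s+\d{2}:\d{2}[AP]M\s")


def guess_directory_format(line):
    """Return "unix", "ms" or None for a single listing line."""
    if UNIX_LINE.match(line):
        return "unix"
    if MS_LINE.match(line):
        return "ms"
    return None


print(guess_directory_format(
    "drwxr-sr-x 2 45854 200 512 Jul 30 17:14 image"))
print(guess_directory_format(
    "12-07-01 02:05PM <DIR> XPLaunch"))
```

The returned string is in the form expected as the server_format argument of set_directory_format.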
If none of the above settings help, contact me. It would be very helpful if you could provide an example listing (DIR's output).
If calling lstat or stat yields wrong modification dates or times, look at the methods that deal with time zone differences (time shift).
lstat(path)
returns an object similar to that from os.lstat (a "tuple" with additional attributes; see the documentation of the os module for details). However, due to the nature of the application, there are some important aspects to keep in mind:
Currently, ftputil recognizes the MS Robin FTP server. Otherwise, a format commonly used by Unix servers is assumed. If you need to parse output from another server type, please contact me under the email address at the end of this text.
FTPHost objects contain an attribute named path, similar to os.path. The following methods can be applied to the remote host with the same semantics as for os.path:
    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)
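Several of these methods are purely lexical, i. e. they only manipulate path strings. Assuming the remote host uses the Unix conventions for curdir, pardir and sep mentioned earlier, their results should match those of the standard posixpath module, which is used here as a stand-in so that the example runs without an FTP connection.

```python
import posixpath

joined = posixpath.join("newdir", "index.html")
normalized = posixpath.normpath("pub/./docs/../index.html")
head_tail = posixpath.split("/pub/index.html")
root_ext = posixpath.splitext("index.html")

print(joined)       # newdir/index.html
print(normalized)   # pub/index.html
print(head_tail)    # ('/pub', 'index.html')
print(root_ext)     # ('index', '.html')
```

Methods such as exists, isdir or getsize, in contrast, need information from the server and have no offline equivalent.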
FTPFile objects as returned by a call to FTPHost.file (or FTPHost.open) have the following methods - with the same arguments and semantics as for local files:
    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()
and the attribute closed. For details, see the section File objects in the Library Reference.
Note that ftputil supports both binary mode and text mode with the appropriate line ending conversions.
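To give an idea of what a line ending conversion involves: in FTP's ASCII transfer type, lines travel as CR LF pairs, while a Unix client stores them with a single LF. The two functions below are a hypothetical sketch that assumes LF as the local convention; they are not ftputil's implementation.

```python
def lf_to_crlf(text):
    """Convert local LF line endings to CR LF for an ASCII upload."""
    # Normalize first so that existing CR LF pairs are not doubled.
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def crlf_to_lf(text):
    """Convert CR LF line endings from an ASCII download to local LF."""
    return text.replace("\r\n", "\n")


print(repr(lf_to_crlf("first\nsecond\n")))
print(repr(crlf_to_lf("first\r\nsecond\r\n")))
```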
See http://www.sschwarzer.net/python/python_software.html#ftputil . Announcements will also be sent to the mailing list (see the question below). Announcements on major updates will also be posted to the newsgroup comp.lang.python .
Yes, you can subscribe at http://codespeak.net/mailman/listinfo/ftputil or read the archives at http://codespeak.net/pipermail/ftputil/ .
Before reporting a bug, make sure that you already tried the latest version of ftputil. There the bug might have already been fixed.
Please send your bug report to the ftputil mailing list (see above for the address) or to me (email address at the top of this file). In either case you must not include confidential information (user id, password, file names, etc.) in the mail.
When reporting a bug, please provide the following information:
By default, an instantiated FTPHost object connects on the usual FTP ports. If you have to use a different port, refer to the section FTPHost construction.
You can use the same approach to connect in active or passive mode, as you like.
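A session class for active mode might look like the sketch below. The class name ActiveFTPSession is hypothetical; set_pasv is a regular method of ftplib.FTP. Defining the class does not open a connection, so the FTPHost call is shown only as a comment.

```python
import ftplib


class ActiveFTPSession(ftplib.FTP):
    def __init__(self, host, userid, password):
        """Act like ftplib.FTP's constructor but use active mode."""
        ftplib.FTP.__init__(self)
        self.connect(host)
        self.login(userid, password)
        # False selects active mode; True would select passive mode.
        self.set_pasv(False)


# As with the port example, pass the class itself as the factory:
#
#   host = ftputil.FTPHost(host, userid, password,
#                          session_factory=ActiveFTPSession)
```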
Please see the previous tip.
You may find that ftputil uploads or downloads files unnecessarily, or not when it should. This can happen when the FTP server is in a different time zone than the client on which ftputil runs. Please see the section on setting the time shift. It may even be sufficient to call synchronize_times.
Please see the previous and the next tip.
Please send me an email with your question, and I'll see what I can do for you. :-) Probably a better way is to send your question to the mailing list at ftputil@codespeak.net; potentially more people might be able to help you.
If not overridden via installation options, the ftputil files reside in the ftputil package. The documentation (in reStructuredText and in HTML format) is in the same directory.
The files _test_*.py and _mock_ftplib.py are for unit-testing. If you only use ftputil (i. e. not modifying it), you can delete these files.
ftputil is written by Stefan Schwarzer <sschwarzer@sschwarzer.net>.
Feedback is appreciated. :-)