``ftputil`` - a high-level FTP client library
=============================================

:Version:   2.2.2
:Date:      2007-04-22
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer
:`Russian translation`__: Anton Stepanov

.. __: ftputil_ru.html

.. contents::

Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')  # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698

``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.).

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode)

Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated.

The exceptions are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError`` is the root of the exception hierarchy of the module.

- ``FTPOSError`` is derived from ``OSError``.
  This is for similarity between the os module and ``FTPHost``
  objects. Compare ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSErrors``. If you
  change the parameter list to ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can call the function also
  as ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for a local and remote file system. Another
  similarity between ``OSError`` and ``FTPOSError`` is that the latter
  holds the FTP server return code in the ``errno`` attribute of the
  exception object and the error text in ``strerror``.

- ``PermanentError`` is raised for 5xx return codes from the FTP
  server (again, that's similar but *not* identical to
  ``ftplib.error_perm``).

- ``TemporaryError`` is raised for FTP return codes from the 4xx
  category. This corresponds to ``ftplib.error_temp`` (though
  ``TemporaryError`` and ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError`` denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. (However, the error
  codes aren't the same.)

- ``InternalError`` subsumes exception classes for signaling errors due
  to limitations of the FTP protocol or the concrete implementation of
  ``ftputil``.
  - ``InaccessibleLoginDirError`` This exception is only raised if
    *both* of the following conditions are met:

    - The directory in which "you" are placed upon login is not
      accessible, i. e. a ``chdir`` call fails.

    - You try to access a path which contains whitespace.

  - ``ParserError`` is used for errors during the parsing of directory
    listings from the server. This exception is used by the ``FTPHost``
    methods ``stat``, ``lstat``, and ``listdir``.

  - ``RootDirError`` Because of the implementation of the ``lstat``
    method it is not possible to do a ``stat`` call on the root
    directory ``/``. If you know *any* way to do it, please let me
    know. :-)

    This problem does *not* affect stat calls on items *in* the root
    directory.

  - ``TimeShiftError`` is used to denote errors which relate to setting
    the `time shift`_, *for example* trying to set a value which is not
    a multiple of a full hour.

``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

``FTPHost`` instances may be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module. The keyword argument
``session_factory`` may be used to generate FTP connections with
factories other than the default ``ftplib.FTP``. For example, the
M2Crypto distribution uses a secure FTP class which is derived from
``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new
background session (which happens for every remote file that is opened;
see below).

This functionality of the constructor also allows you to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.
As an example, assume you want to connect to a port other than the
default, but ``ftplib.FTP`` only offers this by means of its
``connect`` method, not via its constructor. The solution is to
provide a wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # don't pass a `MySession()` instance as factory; pass the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug`_.

.. _`file a bug`: http://ftputil.sschwarzer.net/issuetrackernotes

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  ``curdir`` and ``pardir`` are strings which denote the current and
  the parent directory on the remote server. ``sep`` identifies the
  path separator. Though `RFC 959`_ (File Transfer Protocol) notes that
  these values may depend on the FTP server implementation, the Unix
  counterparts seem to work well in practice, even for non-Unix servers.

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

Remote file system navigation
`````````````````````````````

- ``getcwd()`` returns the absolute current directory on the remote
  host. This method acts similarly to ``os.getcwd``.

- ``chdir(directory)`` sets the current directory on the FTP server.
  This resembles ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')`` copies a local source file (given
  by a filename, i. e. a string) to the remote host under the name
  target.
  Both source and target may be absolute paths or relative to their
  corresponding current directory (on the local or the remote host,
  respectively). The mode may be "" or "a" for ASCII uploads or "b" for
  binary uploads. ASCII mode is the default (again, similar to regular
  local file objects).

- ``download(source, target, mode='')`` performs a download from the
  remote source to a target file. Both source and target are strings.
  Additionally, the description of the upload method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')`` is similar to the upload
  method. The only difference is that the upload is only invoked if the
  time of the last modification for the source file is more recent than
  that of the target file, or the target doesn't exist at all. If an
  upload actually happened, the return value is a true value, else a
  false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize a
  change in the transfer mode, e. g. ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time. There
  are (at least) two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then use
    ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')`` corresponds to
  ``upload_if_newer`` but performs a download from the server to the
  local host.
  Read the descriptions of download and ``upload_if_newer`` for more.
  If a download actually happened, the return value is a true value,
  else a false value.

  If it seems that a file is downloaded unnecessarily, read the
  subsection on `time shift`_ settings.

.. _`time shift`:

Time zone correction
````````````````````

.. _`set_time_shift`:

- ``set_time_shift(time_shift)`` sets the so-called time shift value
  (measured in seconds). The time shift is the difference between the
  local time of the server and the local time of the client at a given
  moment, i. e. by definition ::

    time_shift = server_time - client_time

  Setting this value is important if `upload_if_newer`_ and
  `download_if_newer`_ should work correctly even if the time zone of
  the FTP server differs from that of the client (where ``ftputil``
  runs). Note that the time shift value *can* be negative.

  If the time shift value is invalid, e. g. not a multiple of a full
  hour or its absolute (unsigned) value larger than 24 hours, a
  ``TimeShiftError`` is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()`` returns the currently set time shift value. See
  ``set_time_shift`` (above) for its definition.

.. _`synchronize_times`:

- ``synchronize_times()`` synchronizes the local times of the server
  and the client, so that `upload_if_newer`_ and `download_if_newer`_
  work as expected, even if the client and the server are in different
  time zones. For this to work, *all* of the following conditions must
  be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value manually with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't true results in
  a ``TimeShiftError`` exception.
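The definition ``time_shift = server_time - client_time`` and the validity rules above can be illustrated with a small stand-alone sketch. It does not use ``ftputil`` and is not its implementation: the function names are made up for this illustration, and ``ValueError`` stands in for ``TimeShiftError``.

```python
def round_to_full_hours(raw_shift):
    """Round a raw time difference (in seconds) to the nearest full hour."""
    return float(round(raw_shift / 3600.0) * 3600)


def check_time_shift(time_shift):
    """Return the shift if it satisfies the rules described above.

    Hypothetical helper: `ValueError` is used here where the text
    above speaks of a `TimeShiftError`.
    """
    if abs(time_shift) > 24 * 3600:
        raise ValueError("absolute time shift larger than 24 hours")
    if time_shift % 3600 != 0:
        raise ValueError("time shift is not a multiple of a full hour")
    return time_shift


def server_to_client_time(server_time, time_shift):
    """Express a server timestamp in client time.

    This follows directly from time_shift = server_time - client_time.
    """
    return server_time - time_shift


# a measured difference of 2 hours and 10 seconds counts as 2 hours
shift = check_time_shift(round_to_full_hours(7210))
print(server_to_client_time(10000.0, shift))
```

Comparing a local modification time with a server modification time only makes sense after one of them has been converted this way, which is why the conditional transfer methods depend on a correct time shift.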
Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])`` makes the given directory on the remote
  host. This doesn't construct "intermediate" directories which don't
  already exist. The ``mode`` parameter is ignored; this is for
  compatibility with ``os.mkdir`` if an ``FTPHost`` object is passed
  into a function instead of the os module (see the section on the
  exception hierarchy above for an explanation).

- ``makedirs(path, [mode])`` works similarly to ``mkdir`` (see above),
  but also makes intermediate directories, like ``os.makedirs``. The
  ``mode`` parameter is only there for compatibility with
  ``os.makedirs`` and is ignored.

- ``rmdir(path)`` removes the given remote directory. If it's not
  empty, a ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)`` removes the given
  remote, possibly non-empty, directory tree. The interface of this
  method is rather complex, in favor of compatibility with
  ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored. If
  ``ignore_errors`` is a false value *and* ``onerror`` isn't set, all
  exceptions occurring during the tree iteration and processing are
  raised. These exceptions are all of type ``PermanentError``.

  To distinguish between error situations, pass in a callable for
  ``onerror``. This callable must accept three arguments: ``func``,
  ``path`` and ``exc_info``. ``func`` is a bound method object, *for
  example* ``your_host_object.listdir``. ``path`` is the path that was
  the most recent argument of the respective method (``listdir``,
  ``remove``, ``rmdir``). ``exc_info`` is the exception info as
  returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module and
  adapted for ``ftputil``.

  **Note: I find this interface rather complicated and would like to
  simplify it without making error handling too difficult. Possible
  changes to ``rmtree`` will depend on the discussion between the
  versions 2.1b and 2.1.**

Removing files and links
````````````````````````

- ``remove(path)`` removes a file or link on the remote host (similar
  to ``os.remove``).

- ``unlink(path)`` is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)`` returns a list containing the names of the files
  and directories in the given path; similar to ``os.listdir``. The
  special names ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and others) rely on the directory
listing format used by the FTP server. When connecting to a host,
``FTPHost``'s constructor tries to guess the right format, which mostly
succeeds. However, if you get strange results or ``ParserError``
exceptions by a mere ``lstat`` call, please `file a bug`_.

If ``lstat`` or ``stat`` yield wrong modification dates or times, look
at the methods that deal with time zone differences (`time shift`_).

.. _`FTPHost.lstat`:

- ``lstat(path)`` returns an object similar to that from ``os.lstat``
  (a "tuple" with additional attributes; see the documentation of the
  ``os`` module for details). However, due to the nature of the
  application, there are some important aspects to keep in mind:

  - The result is derived by parsing the output of a ``DIR`` command on
    the server. Therefore, the result from ``FTPHost.lstat`` cannot
    contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is usually
    the case with Unix servers but may not be for other FTP server
    programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps older
    than a year, this usually means that the precision of the
    modification timestamp value is not better than days.
    For newer files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Items that can't be determined at all are set to ``None``.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In this
    case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat`` and I know of no approach which mends
    this problem.

  ..

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. Alternatively, you can `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)`` returns ``stat`` information also for files which are
  pointed to by a link. This method follows multiple links until a
  regular file or directory is found. If an infinite link chain is
  encountered, a ``PermanentError`` is raised.

.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc. would
require fetching a directory listing from the server, which can make
the program very slow.
This effect is more pronounced for operations which mostly scan the
file system rather than transferring file data.

For this reason, ``ftputil`` by default saves (caches) the results from
directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can - if necessary at all - be controlled via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example, after
calling ::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons for modifying cache parameters.

The first is when the number of possible entries is too low. You may
notice that when you are processing very large directories (e. g. above
1000 directories or files) and the program becomes much slower than
before. It's common for code to read a directory with ``listdir`` and
then process the found directories and files. For this application,
it's a good rule of thumb to set the cache size to somewhat more than
the number of directory entries fetched with ``listdir``. This is done
by the ``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximal number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. (Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.)

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which have not
been used for the longest time will be deleted to make room for newer
entries.

Caching is so effective because it reduces network accesses.
This can also be a disadvantage if the file system data on the remote
server changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first happens
when an ``FTPHost`` instance modifies a file path for which it has a
cache entry, e. g. by calling ``remove`` or ``rmdir``. Such changes are
handled transparently; the path will be deleted from the cache. Changes
unknown to the ``FTPHost`` object which reads its cache are a different
matter. Obvious examples are changes made by programs running on the
remote host. Cache inconsistencies can also occur if two ``FTPHost``
objects change a file system simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        # knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. After some thinking, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write accesses to a server, you have to handle them explicitly.

The most useful tool for this probably is the ``invalidate`` method.
In the example above, it could be used as::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        # knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.

By default, the cache entries are stored indefinitely, i. e. if you
start your Python process using ``ftputil`` and let it run for three
days a stat call may still access cache data that old. To avoid this,
you can set the ``max_age`` attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. This
means any entry older won't be retrieved from the cache but its data
instead fetched again from the remote host (and then again stored for
up to an hour). To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

If you are certain that the cache is in the way, you can disable and
later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get updated
with newer data during this period.
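The caching rules described in this subsection (a bounded number of entries, eviction of the entries unused for the longest time, an optional ``max_age``, and ``invalidate``) can be pictured with a small stand-alone model. This is only a sketch under assumptions: the class ``TinyStatCache`` and its ``clock`` parameter are hypothetical and not part of ``ftputil``, whose real cache may be implemented quite differently.

```python
import time
from collections import OrderedDict


class TinyStatCache(object):
    """Hypothetical model of a size-bounded stat cache (not ftputil code)."""

    def __init__(self, size=1000, max_age=None, clock=time.time):
        self.size = size
        self.max_age = max_age  # `None` means entries never expire
        self._clock = clock
        # maps absolute path -> (time stored, stat result); the least
        # recently used entry comes first
        self._entries = OrderedDict()

    def __setitem__(self, path, stat_result):
        self._entries.pop(path, None)
        self._entries[path] = (self._clock(), stat_result)
        # drop the entries which have not been used for the longest time
        while len(self._entries) > self.size:
            self._entries.popitem(last=False)

    def __getitem__(self, path):
        stored_at, stat_result = self._entries[path]  # may raise KeyError
        if self.max_age is not None and self._clock() - stored_at > self.max_age:
            # too old: behave as if the entry had never been cached
            del self._entries[path]
            raise KeyError(path)
        # re-insert to mark the entry as most recently used
        del self._entries[path]
        self._entries[path] = (stored_at, stat_result)
        return stat_result

    def invalidate(self, path):
        """Forget the entry for `path`, if there is one."""
        self._entries.pop(path, None)


cache = TinyStatCache(size=2)
cache["/home/schwa/a"] = "stat result for a"
cache["/home/schwa/b"] = "stat result for b"
cache["/home/schwa/a"]                        # "a" is now the most recent
cache["/home/schwa/c"] = "stat result for c"  # evicts "b"
```

A caller that gets a ``KeyError`` from such a cache would fetch the directory listing from the server again and store the fresh result.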
Note that even when the cache is disabled, the file system data in the
code can become inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!

Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)`` iterates over a directory
  tree, similar to `os.walk`_ in Python 2.3 and above. Actually,
  ``FTPHost.walk`` uses the code from Python with just the necessary
  modifications, so see the linked documentation.

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)`` Similar to ``os.path.walk``, the
  ``walk`` method in `FTPHost.path`_ can be used.

Other methods
`````````````

- ``close()`` closes the connection to the remote host. After this, no
  more interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)`` renames the source file (or directory) on
  the FTP server.

- ``copyfileobj(source, target, length=64*1024)`` copies the contents
  from the file-like object source to the file-like object target. The
  only difference to ``shutil.copyfileobj`` is the default buffer size.
  Note that arbitrary file-like objects can be used as arguments (e. g.
  local files, remote FTP files). See `File-like objects`_ for
  construction and use of remote file-like objects.

.. _`set_parser`:

- ``set_parser(parser)`` sets a custom parser for FTP directories. Note
  that you have to pass in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers. Possibly you
  are lucky and someone has already written a parser you can use.
  Please ask on the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_

File-like objects
-----------------

Construction
~~~~~~~~~~~~

``FTPFile`` objects are returned by a call to ``FTPHost.file`` (or
``FTPHost.open``).

- ``FTPHost.file(path, mode='r')`` returns a file-like object that
  refers to the path on the remote host. This path may be absolute or
  relative to the current directory on the remote host (this directory
  can be determined with the ``getcwd`` method). As with local file
  objects the default mode is "r", i. e. reading text files. Valid
  modes are "r", "rb", "w", and "wb".

- ``FTPHost.open(path, mode='r')`` is an alias for ``file`` (see
  above).

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods ::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is also
supported, i. e. you can use a loop to read a file line by line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

For more on file objects, see the section `File objects`_ in the
Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.

Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can plug in your own custom parser and have it used by
a single method call.

For this, you need to write a parser class by inheriting from the class
``Parser`` in the ``ftp_stat`` module.
Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            # something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        # doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like ::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in the
following table.
===== ========== ============ =============== =======================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== =======================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== =======================

If you can't extract all the desirable data from a line (for example,
the MS format doesn't contain any information about the owner of a
file), set the corresponding values in the ``StatResult`` instance to
``None``.

Parser classes can use several helper methods which are defined in the
class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns an
  appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number like the one returned by ``time.mktime``. Note
  that the method expects the timestamp string already split at
  whitespace.

Additionally, there's an attribute ``_month_numbers`` which maps
three-letter month abbreviations to integers.

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help to do it.
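To give an idea of what a helper like ``parse_unix_mode`` has to do, here is a stand-alone sketch that turns a string such as "drwxr-xr-x" into an ``st_mode`` integer with the constants from Python's ``stat`` module. It is not the code from ``ftp_stat``: the function name ``mode_string_to_int`` is made up, only three file types are recognized, and special bits (setuid, setgid, sticky) are deliberately ignored.

```python
import stat

# permission bits in the order in which they appear in "rwxrwxrwx"
_PERMISSION_BITS = [
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
]

# first character of the mode string -> file type bits
_FILE_TYPES = {
    "d": stat.S_IFDIR,
    "l": stat.S_IFLNK,
    "-": stat.S_IFREG,
}


def mode_string_to_int(mode_string):
    """Convert a string like "drwxr-xr-x" to an `st_mode` value.

    Hypothetical, simplified helper: any character other than "-"
    in a permission position is treated as "bit set".
    """
    if len(mode_string) != 10:
        raise ValueError("invalid mode string %r" % mode_string)
    if mode_string[0] not in _FILE_TYPES:
        raise ValueError("unsupported file type %r" % mode_string[0])
    st_mode = _FILE_TYPES[mode_string[0]]
    for char, bit in zip(mode_string[1:], _PERMISSION_BITS):
        if char != "-":
            st_mode |= bit
    return st_mode


directory_mode = mode_string_to_int("drwxr-xr-x")
print(stat.S_ISDIR(directory_mode))
print(oct(stat.S_IMODE(directory_mode)))
```

In a custom parser, a value computed like this would go into index 0 (``st_mode``) of the tuple passed to ``StatResult``.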
FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements on major updates will also be posted to the
newsgroup `comp.lang.python`_ .

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you already tried the `latest
version`_ of ``ftputil``. There the bug might have already been fixed.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you are
unsure if the behaviour you found is a bug or not, you can write to the
`ftputil mailing list`_. In *either* case you *must not* include
confidential information (user id, password, file names, etc.) in the
problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. However, you can use
M2Crypto_ (in the source code archive, look for the file
``M2Crypto/ftpslib.py``) which has a class derived from ``ftplib.FTP``
that supports SSL. You then can use a class (not an object of it)
similar to the following as a "session factory" in
``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            # no port argument is passed in, so use the default port
            self.connect(host)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the section
`FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # no port argument is passed in, so use the default port
            self.connect(host)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files unnecessarily,
or not when it should. This can happen when the FTP server is in a
different time zone than the client on which ``ftputil`` runs. Please
see the section on setting the `time shift`_. It may even be sufficient
to call `synchronize_times`_.

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument.
For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which converts anything that
looks like a line ending and thus corrupts binary files. Pass "b" as
the ``mode`` argument (see `Uploading and downloading files`_).

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)

Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads.

- ``FTPFile`` objects in text mode *may not* support charsets with
  more than one byte per character. Please email me your experiences
  (address above) if you work with multibyte text streams in FTP
  sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.
- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.

Files
-----

If not overwritten via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation (in
`reStructuredText`_ and in HTML format) is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil`` (i. e. *don't* modify it), you can
delete these files.

References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html

Authors
-------

``ftputil`` is written by Stefan Schwarzer, in part based on
suggestions from users.

The ``lrucache`` module is written by Evan Prodromou.

Feedback is appreciated. :-)