This post gives a brief overview of the File Transfer Protocol (FTP) and its basic theory of operation.
Facilities provided by FTP
- Reliability: The most important feature of FTP is reliable file transfer, irrespective of the nature of the underlying network transmission links. Because FTP runs over TCP, every byte of a source file sent by the sending computer reaches the receiving computer in order, without loss or duplication.
- Authentication: The FTP server authenticates the client with a simple username and password mechanism before allowing any file transfer. In plain FTP these credentials travel over the network unencrypted.
- Anonymous FTP: If a server wants to expose some files to the public, users do not need their own accounts on that server. They can log in with the username “anonymous” and, by convention, a password such as “guest” or their e-mail address.
- Bi-directional File Transfer: FTP supports transferring files from the client to the server and from the server to the client, even when the file naming conventions, file formats, directory structures, text and data representation formats, operating systems etc. differ between the two machines. FTP can also transfer files of any type (text, image, audio, video etc.).
- File Management Services: FTP supports viewing/traversing the directories of the server (e.g. “dir”, “ls” equivalents). It also supports renaming and deletion of files in the server machine.
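- Secured File Transfer: Plain FTP sends commands, credentials and file data unencrypted. FTPS is the secured variant of FTP, which protects the control and data connections with TLS. SFTP, despite the similar name, is not a version of FTP: it is a separate file transfer protocol that runs over SSH.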
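As a rough illustration of the facilities listed above (anonymous login, directory listing, transfer in both directions), here is a minimal sketch using Python's standard `ftplib` module. The function names `list_and_download` and `upload`, and the host `ftp.example.org` in the comment, are made up for this example; a real server may not allow anonymous uploads.

```python
from ftplib import FTP


def list_and_download(ftp: FTP, remote_name: str, local_path: str) -> list:
    """Log in anonymously, list the current directory, then fetch one file."""
    ftp.login()                     # no arguments: ftplib logs in as "anonymous"
    names = ftp.nlst()              # file management: names in the current directory
    with open(local_path, "wb") as out:
        # Server-to-client transfer; retrbinary switches to binary (Image) type.
        ftp.retrbinary(f"RETR {remote_name}", out.write)
    return names


def upload(ftp: FTP, local_path: str, remote_name: str) -> None:
    """Client-to-server transfer of one local file (needs write permission)."""
    with open(local_path, "rb") as src:
        ftp.storbinary(f"STOR {remote_name}", src)


# Hypothetical usage (ftp.example.org is a placeholder host):
#   with FTP("ftp.example.org") as ftp:
#       print(list_and_download(ftp, "README", "README.local"))
```

Both helpers take an already connected `FTP` object, so the connection setup (and the choice between plain FTP and `ftplib.FTP_TLS`) stays with the caller.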
Basic Theory of Operation of FTP
- FTP is an application layer, Client-Server, Request-Response based protocol and it uses TCP as the underlying Transport layer protocol.
- FTP relies completely on TCP to provide reliability across the underlying unreliable best effort IP based networks.
- FTP uses two separate sessions (TCP connections), one for control and another for data. Since control information is sent in a separate logical channel, FTP is an example of an out-of-band signalling protocol.
- During an FTP session between a client and a server there is just one control TCP connection, which stays open for the whole session. In the default stream mode, a separate TCP data connection is opened just before each file transfer (or directory listing) and closed as soon as that transfer completes.
- To take care of different file formats/representations at the client and server ends, FTP uses three attributes named file type, data structure and transmission mode. For each file transfer, the values of these three attributes are communicated by the client to the server through the control TCP connection, before the actual file is transferred.
- File Type attribute can take three values namely ASCII (default), EBCDIC and Image (used for binary file transmissions).
- Data Structure attribute describes how the data inside the file is organized and can be of three types, namely a) File Structure (the default: the file has no internal structure and is treated as a continuous stream of bytes). b) Record Structure (the file is made up of sequential records). c) Page Structure (the file is organized into independent, indexed pages).
- Transmission Mode attribute denotes the way in which FTP hands data to the underlying TCP layer. It can take three values, namely a) Stream Mode (the default: file data is delivered as a stream of bytes). b) Block Mode (FTP splits the file into blocks and prefixes each block with a header consisting of a descriptor byte and a 16-bit byte count). c) Compressed Mode (runs of repeated bytes are compressed with a simple run-length encoding before being given to TCP).
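To make the three attributes more concrete, the sketch below builds the control-connection commands that select them and frames one block-mode data block. The single-letter codes and the header layout (descriptor byte followed by a 16-bit byte count, with 64 marking end of file) follow RFC 959; the helper names `setup_commands` and `frame_block` are invented for this illustration, and only the end-of-file descriptor flag is handled.

```python
import struct

# Single-letter codes used on the control connection for each attribute.
TYPE_CODES = {"ascii": "A", "ebcdic": "E", "image": "I"}
STRU_CODES = {"file": "F", "record": "R", "page": "P"}
MODE_CODES = {"stream": "S", "block": "B", "compressed": "C"}

EOF_FLAG = 0x40  # descriptor value 64: this block is the last one of the file


def setup_commands(file_type="ascii", structure="file", mode="stream"):
    """Return the command lines sent before a transfer (defaults: A, F, S)."""
    return [
        f"TYPE {TYPE_CODES[file_type]}\r\n",
        f"STRU {STRU_CODES[structure]}\r\n",
        f"MODE {MODE_CODES[mode]}\r\n",
    ]


def frame_block(payload: bytes, last: bool = False) -> bytes:
    """Prefix payload with a block-mode header: descriptor byte + byte count."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload must fit in a 16-bit byte count")
    descriptor = EOF_FLAG if last else 0
    # "!BH" = network byte order, one unsigned byte, one unsigned 16-bit integer.
    return struct.pack("!BH", descriptor, len(payload)) + payload
```

For a binary transfer with default structure and mode, `setup_commands("image")` yields `TYPE I`, `STRU F` and `MODE S`, each terminated by CRLF. In practice most clients send only `TYPE`, since the other two defaults are rarely changed.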
Typical Organization of FTP implementations
FTP implementations typically consist of three software sub-layers, namely
- User Interface
- Control Process
- Data Transfer Process
As shown in the diagram above,
- The User Interface sub-layer at the client provides the necessary end user interface commands to connect to remote FTP servers and transfer files. At the server end, usually, it is only required to start the FTP server service without the need for additional user interface commands.
- The Control Process sub-layer is present at both the client and server ends. The client-end Control Process sends request commands for the different services over the control TCP connection, while the server-end Control Process handles each request and sends back the appropriate reply.
The Control Process sub-layers at both ends work together to handle the following services:
- Client user authentication, using commands like USER and PASS.
- Sending File Read and File Write Requests to the server by the client end using commands like RETR, STOR etc. and corresponding replies from the server end.
- Communicating the TCP endpoint (IP address and port) for data connections using the PORT command (active mode) or the PASV command (passive mode).
- Opening and closing a separate TCP data connection for each file transfer.
- Controlling the File type, Data Structure and Transmission Mode parameters through commands like TYPE, STRU and MODE, before the actual file transfer.
- File management commands like LIST (listing a directory’s contents), CWD (changing the working directory at the server end) and PWD (printing the current working directory).
- The Data Transfer Processes at the client and server ends (peers) take care of the processing required for the actual file transfers. They are used for getting a file from the server (reading operation) and also for putting a file onto the server (writing operation). The file data is split and carried inside TCP segments. The segment size is decided by TCP, typically based on the path MTU.
- While the control connection is always initiated by the FTP client, data connections can be initiated either by the server (active mode) or by the client (passive mode). In active mode the client announces a listening address with PORT and the server connects to it; in passive mode the client sends PASV, the server replies with an address it is listening on, and the client connects to that. The choice usually depends on firewall policies at the client and server ends.
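The PORT and PASV exchange described above encodes an IPv4 address and port as six comma-separated decimal numbers, where the last two are the high and low bytes of the port. The following is a minimal sketch of both directions; the function names are invented for this example, and the address in the usage note is only a sample value.

```python
import re


def parse_pasv_reply(reply: str):
    """Extract (host, port) from a 227 reply to PASV."""
    if not reply.startswith("227"):
        raise ValueError("not a 227 (Entering Passive Mode) reply")
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError("no h1,h2,h3,h4,p1,p2 field found")
    numbers = [int(n) for n in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("each field must fit in one byte")
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]   # p1 is the high byte, p2 the low byte
    return host, port


def format_port_command(host: str, port: int) -> str:
    """Build the PORT command a client sends in active mode."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields) + "\r\n"
```

For example, `parse_pasv_reply("227 Entering Passive Mode (192,168,1,2,19,137)")` gives `("192.168.1.2", 5001)`, because 19 × 256 + 137 = 5001, and `format_port_command("192.168.1.2", 5001)` produces the matching `PORT 192,168,1,2,19,137` line.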
Thus FTP is able to offer a reliable file transfer mechanism between remote computers using the above principles.