How to Download Files from an SFTP Server Using a Python Script

Recently, I was working on a project that involved fetching files from an SFTP server periodically.
The files were added daily, so the task was to log in to the SFTP server at a specific time (for example, 3 am) and download the previous day’s file.

So what is SFTP?

SFTP (SSH File Transfer Protocol, often expanded as Secure File Transfer Protocol) is a protocol for transferring and managing files on a remote computer. It runs over an SSH connection, so both the login and the transferred data are encrypted.

Why fetch files from SFTP using Python?

We are using Python for this task because it is a powerful yet simple programming language with rich modules that provide the functionality we need. This lets us get the job done faster.

Prerequisites

To follow along, make sure:

  • You can install Python packages on your system
  • You have basic knowledge of Python
  • You have basic knowledge of SFTP
  • You are comfortable with the terminal, since we will type some commands

Table of Contents

  1. Setting up python Env and installing pysftp
  2. Creating the python script
  3. Setting up a cron to run the script

1. Setting up python Env and installing pysftp

To set up and run the script, we need a package that already implements the logic to log in to an SFTP server and perform the necessary operations. We could install the package system-wide on our Linux machine, but the recommended way is to use a Python virtualenv.

A Python virtualenv gives your application an isolated environment, so you can install its dependencies without conflicting with the system packages.

Creating a virtual environment for our script

In Python 2, create a virtual environment using this command:

virtualenv sftpenv

Python 3 can run a module as a script with the -m flag. To create a virtualenv using the built-in venv module, type this command:

python3 -m venv sftpenv

The above command creates a virtual env in the current directory, but we still need to activate it. Use the source command to activate the virtual env we just created, as shown below:

source sftpenv/bin/activate

Now that the virtualenv is set up, install the pysftp dependency with this command:

pip install pysftp
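
To confirm that the environment is active and the package is visible to it, a small check like the one below can help. This is only a sketch for Python 3; the function names in_virtualenv and has_module are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # venv sets sys.base_prefix; older virtualenv releases set sys.real_prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


def has_module(name):
    """Return True when the top-level module `name` can be imported."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("pysftp importable:", has_module("pysftp"))
```

If either line prints False, activate the virtualenv again and repeat the pip install step.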

2. Creating the python script

Now that the environment is set up, let’s create the file that will hold our code.

vim get-files.py

First, let’s import some dependencies. Add these lines at the top of the file:

#!/usr/bin/env python

import pysftp
import time

The directive on line 1, #!/usr/bin/env python, tells the system to run the script with the python interpreter found on the PATH when we execute get-files.py directly. We then import pysftp for the SFTP functionality and the standard-library time module.

The next section creates the connection. We wrap it in a try/except block so that we can handle the error cleanly if the connection fails.

try:
    print("connecting to %s as %s" % (host, username))
    conn = pysftp.Connection(
        host=host,
        port=port,
        username=username,
        password=password,
    )
    print("connection established successfully: ", conn)
except Exception as err:
    # Stop here: without a connection the rest of the script cannot run.
    print('failed to establish connection to targeted server:', err)
    raise SystemExit(1)
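
The connection needs a host, port, username and password. One way to keep these values out of the script itself is to read them from environment variables. The sketch below assumes hypothetical variable names (SFTP_HOST, SFTP_PORT, SFTP_USER, SFTP_PASSWORD) and a hypothetical helper load_sftp_settings; neither is part of pysftp.

```python
import os


def load_sftp_settings(env=os.environ):
    """Build keyword arguments for pysftp.Connection from environment variables."""
    # The SFTP_* variable names are hypothetical; pick names that suit your setup.
    return {
        "host": env["SFTP_HOST"],
        "port": int(env.get("SFTP_PORT", "22")),
        "username": env["SFTP_USER"],
        "password": env["SFTP_PASSWORD"],
    }


# Usage inside the script (needs a reachable server, so left as a comment):
# conn = pysftp.Connection(**load_sftp_settings())
```

A missing variable raises KeyError right away, which is easier to diagnose than a failed login later on.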

With the connection established, we can get the current working directory and list the files and directories in it.

current_dir = conn.pwd
print('our current working directory is: ', current_dir)

print('available list of directories: ', conn.listdir())
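
If names alone are not enough, pysftp also offers listdir_attr, which returns one attribute object per entry. The sketch below uses it to collect the name, size and modification time of each entry; describe_entries is a made-up helper name, and the times are formatted as UTC.

```python
import time


def describe_entries(conn, remote_dir="."):
    """Return (name, size_in_bytes, modified_utc) tuples for a remote directory."""
    rows = []
    for attr in conn.listdir_attr(remote_dir):
        # st_mtime is a Unix timestamp; format it as a readable UTC string.
        modified = time.strftime("%Y-%m-%d %H:%M", time.gmtime(attr.st_mtime))
        rows.append((attr.filename, attr.st_size, modified))
    return rows


# Usage with the connection created earlier:
# for name, size, modified in describe_entries(conn):
#     print(name, size, modified)
```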

To get a single file when you know its path, use get. By default the file is saved in the current local working directory under the same name:

conn.get('/paymentfiles/09282021/TRXN_HIST_RPT_PARTNER-V0001.CSV')

We can also switch to the remote directory first, using the with statement and conn.cd:

with conn.cd('/paymentfiles/09282021/'):
    conn.get('TRXN_HIST_RPT_PARTNER-V0001.CSV')

If we do not know the file names, a better approach is to switch into the directory, list its files, and fetch each one:

with conn.cd('/paymentfiles/09282021/'):
    files = conn.listdir()
    for file in files:
        conn.get(file)
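
The loop above assumes every entry is a regular file and saves everything into the current local directory. A slightly more defensive variant, sketched below for Python 3, skips sub-directories using pysftp's isfile and writes into a chosen local folder through the localpath argument of get. The helper name download_all and the local folder in the usage comment are made up for illustration.

```python
import os


def download_all(conn, remote_dir, local_dir):
    """Download every regular file in remote_dir into local_dir."""
    os.makedirs(local_dir, exist_ok=True)
    downloaded = []
    with conn.cd(remote_dir):
        for name in conn.listdir():
            # Skip sub-directories; conn.get() expects a file.
            if not conn.isfile(name):
                continue
            conn.get(name, localpath=os.path.join(local_dir, name))
            downloaded.append(name)
    return downloaded


# Usage with the connection created earlier (local folder is an example):
# print(download_all(conn, '/paymentfiles/09282021/', 'downloads'))
```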

That is all!

Here is the full code:

#!/usr/bin/env python

import pysftp
import time

host = '10.2.11.50'
port = 22
username = 'citizix_user'
password = 'str0NgP45sword'

try:
    print("connecting to %s as %s" % (host, username))
    conn = pysftp.Connection(
        host=host,
        port=port,
        username=username,
        password=password,
    )
    print("connection established successfully: ", conn)
except Exception as err:
    # Stop here: without a connection the rest of the script cannot run.
    print('failed to establish connection to targeted server:', err)
    raise SystemExit(1)

current_dir = conn.pwd
print('our current working directory is: ', current_dir)

print('available list of directories: ', conn.listdir())

dlfiles = []
with conn.cd('/paymentfiles/09282021/'):
    files = conn.listdir()
    for file in files:
        conn.get(file)
        dlfiles.append(file)
        print(file, ' downloaded successfully ')

print("These files were downloaded ", dlfiles)
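
The script above hard-codes the directory 09282021, while the goal is to fetch the previous day's files on every run. Assuming the folder name is the date written as month, day, year (which is only inferred from the example path), the directory could be computed like this. The helper name previous_day_dir is made up for illustration.

```python
from datetime import date, timedelta


def previous_day_dir(today=None, base="/paymentfiles"):
    """Return the remote directory for the day before `today`."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    # Assumes the folder name is month, day, year (e.g. 09282021).
    return "%s/%s/" % (base, yesterday.strftime("%m%d%Y"))


# Usage in the script, replacing the hard-coded path:
# with conn.cd(previous_day_dir()):
#     ...
```

Passing today explicitly makes it easy to re-fetch an older day by hand.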

3. Setting up a cron to run the script

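Now that our script is done, we need to automate running it periodically.
Let’s create a cron job that runs every day at 3:17 am, fetching the files and logging to /var/logs/scripts/file-fetcher.log. The script must be executable (chmod +x /opt/scripts/get-files.py), and because cron does not activate the virtualenv, pysftp must be importable by the python interpreter that the script’s first line finds.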

To edit your crontab:

crontab -e

Then add this line in the editor that opens:

17 03 * * * /opt/scripts/get-files.py > /var/logs/scripts/file-fetcher.log 2>&1
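
Redirecting output from cron captures only what the script prints, without timestamps. As an optional alternative, the script could write its own log file with Python's logging module. This is a sketch only; the logger name file-fetcher and the helper setup_logging are made up for illustration.

```python
import logging


def setup_logging(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("file-fetcher")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)  # opens in append mode by default
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Usage in the script, in place of the print calls:
# log = setup_logging('/var/logs/scripts/file-fetcher.log')
# log.info('%s downloaded successfully', file)
```

Call setup_logging once per run; calling it repeatedly would attach a new handler each time and duplicate every line.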

That is it. We have fetched files from an SFTP server and scheduled the script to run with cron.
To learn more about pysftp, check out the docs here https://pysftp.readthedocs.io/en/release_0.2.9/
