Recently, I was working on a project that involved periodically fetching files from an SFTP server.
New files were added daily, so the task was to log in to the SFTP server at a specific time (for example 3 am) and download the previous day’s file.
So what is SFTP?
SFTP (SSH File Transfer Protocol, also known as Secure File Transfer Protocol) is a protocol for accessing, transferring and managing files on a remote computer. It runs over SSH, so the session is authenticated and encrypted.
Why fetch files from SFTP using Python?
We are using Python for this task because it is a powerful yet simple programming language with rich modules that already provide the functionality we need. This lets us get the job done faster.
Prerequisites
To follow along, make sure:
- You can install Python packages on your system
- You have basic knowledge of Python
- You have basic knowledge of SFTP
- You are comfortable with the terminal, since we will type some commands
Table of Contents
- Setting up Python Env and installing pysftp
- Creating the Python script
- Setting up a cron to run the script
1. Setting up Python Env and installing pysftp
To set up and run the script, we need to install a package that already implements the logic to log in to an SFTP server and perform the necessary operations. We could install the package system-wide on our Linux machine, but the recommended way is to use a Python virtualenv.
A Python virtualenv gives your application an isolated environment, so you can install its dependencies without conflicting with the system packages.
Creating a virtual environment for our script
In Python 2, create a virtual environment using this command:
virtualenv sftpenv
Python 3 ships with the venv module, which is run as a module using the -m flag. To create a virtualenv with it, type this command:
python3 -m venv sftpenv
The above command creates a virtualenv in the sftpenv directory, but we still need to activate it. Use the source command to activate the virtualenv we just created, as shown below:
source sftpenv/bin/activate
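A quick way to confirm that the activation worked is to ask the interpreter itself. The sketch below is an illustration rather than part of the original script; the helper name in_virtualenv is hypothetical. It relies on the fact that, inside a venv, sys.prefix points at the environment while sys.base_prefix points at the base installation.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv (Python 3.3+) keeps the base installation in sys.base_prefix.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
```

Running this with the python binary from sftpenv should report True, while the system interpreter should report False.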
Now that the virtualenv is set up, install the pysftp dependency with this command:
pip install pysftp
2. Creating the Python script
Now that the environment is set up, let’s create the file that will hold our code.
vim get-files.py
First, let’s import some dependencies. Add these lines to the file:
#!/usr/bin/env python
import pysftp
import time
The directive on line 1, #!/usr/bin/env python, tells the system to run the script with the python interpreter found on the PATH when we execute get-files.py directly. We then import pysftp for the SFTP functionality and time as a supporting dependency.
The next section creates the connection. We wrap it in a try/except block so that a failed connection is reported clearly. The host, port, username and password variables are defined at the top of the full script further down.
try:
    print("connecting to %s as %s" % (host, username))
    conn = pysftp.Connection(
        host=host,
        port=port,
        username=username,
        password=password,
    )
    print("connection established successfully: ", conn)
except Exception as err:
    # Stop here: without a connection the rest of the script cannot run
    print('failed to establish connection to targeted server: ', err)
    raise SystemExit(1)
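The connection snippet above expects host, port, username and password to be defined already. One way to supply them without writing the password into the script is to read them from environment variables. This is only a sketch: the helper name load_sftp_settings and the SFTP_HOST, SFTP_PORT, SFTP_USERNAME and SFTP_PASSWORD variable names are assumptions made for illustration, not something pysftp defines.

```python
import os


def load_sftp_settings(env=None):
    """Read SFTP connection settings from environment variables (hypothetical names)."""
    if env is None:
        env = os.environ
    required = ("SFTP_HOST", "SFTP_USERNAME", "SFTP_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["SFTP_HOST"],
        "port": int(env.get("SFTP_PORT", "22")),  # fall back to the default SSH port
        "username": env["SFTP_USERNAME"],
        "password": env["SFTP_PASSWORD"],
    }
```

The returned keys match the keyword arguments used in the connection call, so the result could be passed as pysftp.Connection(**load_sftp_settings()).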
With the connection established, we can get the remote working directory and list the files and directories in it.
current_dir = conn.pwd
print('our current working directory is: ', current_dir)
print('available list of directories: ', conn.listdir())
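Note that listdir() returns names only and does not say which entries are files and which are directories. A minimal sketch of separating the two is shown below. The helper name split_listing is hypothetical; it assumes a connection object offering the listdir, isdir and isfile methods that pysftp.Connection provides.

```python
def split_listing(conn, remote_dir="."):
    """Return (files, directories) found directly under remote_dir."""
    files, directories = [], []
    for name in conn.listdir(remote_dir):
        # Build the full remote path so the checks work from any working directory
        path = remote_dir.rstrip("/") + "/" + name
        if conn.isdir(path):
            directories.append(name)
        elif conn.isfile(path):
            files.append(name)
    return files, directories
```

With the connection from the script, this could be called as files, directories = split_listing(conn, '/paymentfiles').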
If you know the path of a single file, download it into the current local directory like this:
conn.get('/paymentfiles/09282021/TRXN_HIST_RPT_PARTNER-V0001.CSV')
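We can also use conn.cd() in a with statement, which changes into the remote directory for the duration of the block and switches back afterwards:
with conn.cd('/paymentfiles/09282021/'):
    conn.get('TRXN_HIST_RPT_PARTNER-V0001.CSV')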
If we do not know the file names in advance, a better approach is to switch into the directory, list its contents and fetch each file in turn.
with conn.cd('/paymentfiles/09282021/'):
    files = conn.listdir()
    for file in files:
        conn.get(file)
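The examples above hardcode the directory for one date, while the stated goal is to fetch the previous day’s files on every run. A minimal sketch of building that path is shown below. It assumes the directory names follow the MMDDYYYY pattern suggested by 09282021; the helper name previous_day_dir and its parameters are hypothetical.

```python
from datetime import date, timedelta


def previous_day_dir(today=None, base="/paymentfiles"):
    """Build the remote directory for the day before `today`."""
    if today is None:
        today = date.today()
    yesterday = today - timedelta(days=1)
    # Assumption: remote folders are named MMDDYYYY, e.g. 09282021
    return "%s/%s/" % (base, yesterday.strftime("%m%d%Y"))
```

For a run on 29 September 2021 this yields /paymentfiles/09282021/, which can then be passed to conn.cd() in place of the hardcoded string.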
That is all!
Here is the full code:
#!/usr/bin/env python
import pysftp
import time

# Connection settings for the SFTP server
host = '10.2.11.50'
port = 22
username = 'citizix_user'
password = 'str0NgP45sword'

try:
    print("connecting to %s as %s" % (host, username))
    conn = pysftp.Connection(
        host=host,
        port=port,
        username=username,
        password=password,
    )
    print("connection established successfully: ", conn)
except Exception as err:
    # Stop here: without a connection the rest of the script cannot run
    print('failed to establish connection to targeted server: ', err)
    raise SystemExit(1)

current_dir = conn.pwd
print('our current working directory is: ', current_dir)
print('available list of directories: ', conn.listdir())

# Download every entry in the remote directory and remember what was fetched
dlfiles = []
with conn.cd('/paymentfiles/09282021/'):
    files = conn.listdir()
    for file in files:
        conn.get(file)
        dlfiles.append(file)
        print(file, ' downloaded successfully ')

print("These files were downloaded ", dlfiles)
conn.close()
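Because the script will run unattended, it can help to download into a fixed local directory, skip sub-directories and skip files that an earlier run already fetched. The sketch below is one possible way to do that, not the original author's method. The helper name fetch_directory is hypothetical; it assumes Python 3 and a connection object with the cd, listdir, isfile and get methods that pysftp.Connection provides.

```python
import os


def fetch_directory(conn, remote_dir, local_dir):
    """Download regular files from remote_dir into local_dir and return their names."""
    os.makedirs(local_dir, exist_ok=True)
    downloaded = []
    with conn.cd(remote_dir):
        for name in conn.listdir():
            local_path = os.path.join(local_dir, name)
            # Skip sub-directories and anything fetched by an earlier run
            if not conn.isfile(name) or os.path.exists(local_path):
                continue
            conn.get(name, localpath=local_path)
            downloaded.append(name)
    return downloaded
```

In the full script, the download loop could then be replaced with a call such as dlfiles = fetch_directory(conn, '/paymentfiles/09282021/', 'downloads'), where the downloads directory name is only an example.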
3. Setting up a cron to run the script
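Now that our script is done, we need to automate running it periodically.
Let’s create a cron job that runs every day at 3:17 am (the 17 03 fields in the entry below), fetches the files and writes the script’s output to /var/logs/scripts/file-fetcher.log. The script must be executable (chmod +x /opt/scripts/get-files.py), and since cron does not activate the virtualenv, pysftp has to be importable by the interpreter that the shebang line resolves to.
To edit your crontab: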
crontab -e
Then in the window that appears:
17 03 * * * /opt/scripts/get-files.py > /var/logs/scripts/file-fetcher.log
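The > redirect in the cron entry captures only standard output and overwrites the log file on every run. If a timestamped history is preferred, the script can write its own log instead of relying on print. The following is a minimal sketch using the standard logging module; the helper name build_logger and the logger name file-fetcher are assumptions for illustration.

```python
import logging


def build_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("file-fetcher")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path)  # FileHandler appends by default
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Inside the script, a call such as build_logger('/var/logs/scripts/file-fetcher.log').info('%s downloaded successfully', file) would then take the place of the print statements.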
With that, we have a script that fetches files from an SFTP server and a cron job that runs it every day.
To learn more about pysftp, check out the docs here: https://pysftp.readthedocs.io/en/release_0.2.9/