Pylogsparser : visualizing ssh attacks in video

In this article we will show another possible application for the pylogsparser library. We will also discover a simple way to draw and use world maps with Python. You should read the previous article in this series if you haven’t done so, since we will use what we have done there as a starting point.

Here at Wallix, we have set up an SSH honeypot for testing and analysis purposes. It always amazes me how often this machine gets randomly attacked on this service, and how the brute-force attacks started mere minutes after the SSH server was up. In our previous article, we gained insight into the origins of attacks and the targeted accounts with a classic pie chart. This time, I would like a visual way to represent and understand a typical day of brute-force attempts. We could picture that as a world map where a country lights up when an attacker from that country tries to gain access to the honeypot. A world map will be drawn for every minute of the day, and the resulting pictures will then be aggregated into a timelapse animation.

Before we get to work, here are the elements we need :

  • obviously, an SSH log file ! It should span a full day for accuracy.
  • the pylogsparser library along with the GeoIP library. Since pylogsparser 0.3, the geoIP conversion has been included in the library so the countries are already tagged when available, if the GeoIP library is installed !
  • the matplotlib library, and more specifically the Basemap optional extension that can be found here : http://matplotlib.sourceforge.net/basemap/doc/html/ Follow the installation instructions there, as unfortunately this extension is not always packaged for easy deployment.
  • the numpy library, but it is optional as we will use it only with matplotlib’s color maps. It should be installed alongside matplotlib anyway.
  • a shapefile describing country borders (more on that later). For this article, I am using the one freely available at http://thematicmapping.org/downloads/world_borders.php. It is probably neither the most accurate nor the most up-to-date dataset, but it is more than enough for our project.
  • python libraries for manipulating shapefile datasets. In this article, we will use pyshapelib : http://ftp.intevation.de/users/bh/ but there are many other libraries available, as pyshapelib hasn’t been maintained in a little while. See this article’s comments for details : http://www.geophysique.be/2011/01/27/matplotlib-basemap-tutorial-07-shapefiles-unleached/ (incidentally, this article was the inspiration for this work)
  • ffmpeg, or anything that can make a timelapse animation out of still pictures.

Now that we’ve got everything, let’s get to work !

First, we will parse our log file and keep only the logs where the action tag is set to “fail”. We will then extract the time of day as a four-digit string, “1345” for 13:45 for example, and use it as a key in a dictionary. This key will be associated with another dictionary, where the keys are the countries the attacks occurring during that minute come from, and the associated values are the number of attacks from each country during that minute. This will look a lot like what we’ve done in the previous article :

from logsparser.lognormalizer import LogNormalizer as LN

normalizer = LN('/usr/share/normalizers')

dataset = {}
# the with block closes the log file once parsing is done
with open('/var/log/auth.log', 'r') as auth_logs:
    for log in auth_logs:
        l = {'raw' : log.rstrip('\n')} # remove the ending \n, if any
        normalizer.normalize(l)
        if l.get('action') == 'fail':
            # zero-padded "HHMM" key, e.g. "0905"
            key = '%02d%02d' % (l['date'].hour, l['date'].minute)
            # add the key if not already present
            dataset[key] = dataset.get(key, {})
            # add the country if not already present. If geoIP failed, replace by Unknown
            country = l.get('source_country', 'Unknown')
            dataset[key][country] = dataset[key].get(country, 0) + 1
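
To make the nested structure concrete, here is a small self-contained sketch that builds the same kind of dictionary from a few hand-written records instead of real normalized logs. The records and the build_dataset helper are made up for illustration; only the "HHMM" key format and the per-country counting follow the code above.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical stand-ins for normalized log entries: (timestamp, country code).
records = [
    (datetime(2012, 2, 1, 13, 45, 2), "CN"),
    (datetime(2012, 2, 1, 13, 45, 40), "CN"),
    (datetime(2012, 2, 1, 13, 45, 59), "US"),
    (datetime(2012, 2, 1, 9, 5, 0), None),   # geoIP lookup failed
]

def build_dataset(entries):
    """Group failed attempts by minute of the day, then by country."""
    dataset = defaultdict(Counter)
    for date, country in entries:
        key = date.strftime("%H%M")          # zero-padded, e.g. "0905"
        dataset[key][country or "Unknown"] += 1
    return dataset

dataset = build_dataset(records)
print(sorted(dataset))          # the minutes that saw at least one attack
print(dict(dataset["1345"]))    # per-country counts for 13:45
```

Each top-level key is one minute of the day, so a full day of logs yields at most 1440 keys.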

Next, we need to draw a world map. It is made very straightforward with Basemap. We will go for a classic “Mercator” representation, but there are lots of other possibilities available.

The code below is self-explanatory for the most part. The initialization options for the Basemap object are the map projection, the coordinates of the lower left and upper right corners of the area to draw, the latitude of true scale (lat_ts, the latitude where the projection is the most accurate; this is specific to Mercator), and the drawing resolution.

 
from mpl_toolkits.basemap import Basemap
 
def makemap():
    m = Basemap(projection="merc",
                llcrnrlat=-70,
                urcrnrlat=78,
                llcrnrlon=-180,
                urcrnrlon=180,
                lat_ts=20,
                resolution='c')
    m.drawcoastlines(color="white")
    m.drawmapboundary(fill_color="black")
    m.drawcountries(linewidth = 0.3, color = "gray")
    return m

The Basemap object has an interesting property : when called as a function and passed a sequence of longitudes and a sequence of latitudes as arguments, it returns the corresponding x and y coordinates in the map projection, ready to be plotted on the current subplot. It will be used in the next code snippet.
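
To give an idea of what that call computes, here is a sketch of the textbook spherical Mercator formulas in plain Python. It is not Basemap's implementation: Basemap relies on a projection library and, as far as I understand, also shifts the origin to the lower left corner of the map and takes lat_ts into account, so the actual numbers will differ. The function names and the radius constant are my own choices.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an assumption for this sketch

def mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project one longitude/latitude pair (degrees) to spherical Mercator x, y (metres)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

def project(lons, lats):
    """Mimic the calling convention used with Basemap: two sequences in, two lists out."""
    points = [mercator(lon, lat) for lon, lat in zip(lons, lats)]
    xs, ys = zip(*points)
    return list(xs), list(ys)

xs, ys = project([2.35, -74.0], [48.85, 40.7])   # roughly Paris and New York
print(xs, ys)
```

The vertical stretching grows quickly with latitude, which is why the map above is cut at -70 and 78 degrees rather than at the poles.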

Now all we need is a way to draw a specific country on our map. Unfortunately, while Basemap allows you to draw all the boundaries on the map, it cannot be used to draw a specific country. This is why we need a shapefile defining world borders; the shapefile contains lists of vertex coordinates, each list making a polygon that covers a specific country or a part of it. The accompanying description file in the dataset stores the ISO 3166-1 Alpha-2 Country Code for each record, which is the country code used by the geoIP library, so we are in luck. Let’s code it !

import numpy as np
from shapelib import ShapeFile
import dbflib
# color palette, used to fill the countries
from matplotlib import cm
from matplotlib.collections import LineCollection
 
class CountryDrawer:
    def __init__(self,
                 shpfile = "worldmap/TM_WORLD_BORDERS-0.3.shp",
                 dbffile = "worldmap/TM_WORLD_BORDERS-0.3.dbf"):
        shp = ShapeFile(shpfile)
        dbf = dbflib.open(dbffile)
        self.countries = {}
        for i in range(shp.info()[0]):
            # we already know where to find the info we need, otherwise some
            # introspection would have been needed.
            c = dbf.read_record(i)['ISO2']
            poly = shp.read_object(i)
            self.countries[c] = poly.vertices()
 
    def drawcountry(self,
                    ax,
                    base_map,
                    iso2,
                    color,
                    alpha = 1):
        if iso2 not in self.countries:
            raise ValueError("Where is that country ?")
        vertices = self.countries[iso2]
        shape = []
        for vertex in vertices:
            longs, lats = zip(*vertex)
            # conversion to plot coordinates
            x,y = base_map(longs, lats)
            # build a real list, so it can be read more than once
            shape.append(list(zip(x, y)))
        lines = LineCollection(shape,antialiaseds=(1,))
        lines.set_facecolors(cm.hot(np.array([color,])))
        lines.set_edgecolors('white')
        lines.set_linewidth(0.5)
        lines.set_alpha(alpha)
        ax.add_collection(lines)
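
The loop in drawcountry can be tried without any map library. The sketch below uses made-up data, one imaginary country made of two rings, and a trivial stand-in for the Basemap call, just to show how each ring is split into longitudes and latitudes, converted, and zipped back into (x, y) pairs.

```python
# Made-up data: one "country" with two rings (say, a mainland and an island),
# each ring being a list of (longitude, latitude) pairs.
rings = [
    [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 0.0)],
    [(3.0, 0.5), (3.5, 0.5), (3.5, 1.0), (3.0, 0.5)],
]

def fake_projection(longs, lats, scale=100.0):
    """Stand-in for the Basemap call: two sequences in, two sequences out."""
    return [lon * scale for lon in longs], [lat * scale for lat in lats]

shape = []
for ring in rings:
    longs, lats = zip(*ring)            # split the pairs into two tuples
    x, y = fake_projection(longs, lats)
    shape.append(list(zip(x, y)))       # back to a list of (x, y) pairs

print(shape[1][0])   # → (300.0, 50.0)
```

The resulting list of rings has the shape that is handed to LineCollection in the class above, one entry per polygon.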

Let’s put everything together and generate a picture for each minute in a day (which is 1440 in case you are wondering). The countries will be colored according to the proportion they represent of the ongoing attacks at a given timestamp. There is also a light fading effect when nothing is happening for some time. Finally, there will be a timestamp in the lower left corner.

 
import numpy as np
import matplotlib.pyplot as plt
# color palette
from matplotlib import cm
 
# insert previous code snippets here ...
 
if __name__ == "__main__":
    cd = CountryDrawer()
    currentkey = "0000"
    alpha = 1
    i = 0
    for hour in range(0,24):
        for minute in range(60):
            key = str(hour).rjust(2,'0') + str(minute).rjust(2,'0')
            fig = plt.figure(figsize=(6.2,3.6))
            plt.subplots_adjust(left=0,right=1,top=1,bottom=0)
            ax = plt.subplot(111)
            m = makemap()
            if key in dataset:
                currentkey = key
                alpha = 1
            else:
                alpha *= 0.7
            if currentkey in dataset:
                data = dataset[currentkey]
                total_attacks = float(sum(data.values()))
                for c in data:
                    if c != 'Unknown':
                        cd.drawcountry(ax, m, c, 0.6*data[c]/total_attacks, alpha )
            plt.text(50,50,str(hour).rjust(2,'0') +":"+ str(minute).rjust(2,'0'), color = 'white', size=20)
            plt.savefig('rendering/plot%s.png' % str(i).rjust(5,'0'), dpi=200)
            # release the figure, otherwise all 1440 of them pile up in memory
            plt.close(fig)
            i += 1
        print("Hour %i is done !" % hour)
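
The coloring and fading rules are easier to check once they are separated from the drawing. The following sketch reproduces them as a generator that only computes, for each minute, the opacity and the position of each country on the color map. The frame_states name and its parameters are hypothetical, and the sample counts are made up.

```python
def frame_states(dataset, minutes=1440, fade=0.7, max_intensity=0.6):
    """Yield (key, alpha, colors) for each minute of the day.

    colors maps a country code to its position on the color map,
    i.e. max_intensity times that country's share of the attacks.
    """
    current_key = "0000"
    alpha = 1.0
    for n in range(minutes):
        key = "%02d%02d" % divmod(n, 60)     # (hour, minute) -> "HHMM"
        if key in dataset:
            current_key, alpha = key, 1.0    # fresh data: full opacity
        else:
            alpha *= fade                    # nothing new: fade the last picture
        data = dataset.get(current_key, {})
        total = float(sum(data.values()))    # "Unknown" counts toward the total
        colors = {c: max_intensity * count / total
                  for c, count in data.items() if c != "Unknown"}
        yield key, alpha, colors

# Made-up counts for a single minute of activity.
sample = {"0001": {"CN": 3, "US": 1, "Unknown": 2}}
states = list(frame_states(sample, minutes=4))
for key, alpha, colors in states:
    print(key, alpha, colors)
```

With a factor of 0.7, a country drops below 5% opacity after nine quiet minutes, which is what gives the light fading effect.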

Here’s a resulting sample:

[image: world map]

If you run this code yourself, be aware that you might need a lot of processing power and memory. It is probably best to split the rendering process in two.
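
One possible way to split the work, sketched below, is to cut the 1440 frame numbers into contiguous ranges and render each range in a separate run. The frame_ranges helper is hypothetical. Keep in mind that the fading effect depends on the previous frames, so a run that starts in the middle of the day would still need the right starting values for currentkey and alpha.

```python
def frame_ranges(total_frames=1440, parts=2):
    """Split frame numbers 0..total_frames-1 into `parts` contiguous ranges."""
    size, extra = divmod(total_frames, parts)
    ranges = []
    start = 0
    for p in range(parts):
        # the first `extra` ranges get one more frame when the split is uneven
        stop = start + size + (1 if p < extra else 0)
        ranges.append(range(start, stop))
        start = stop
    return ranges

for r in frame_ranges():
    print("frames %05d to %05d" % (r[0], r[-1]))
```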

Once we have all our pictures, let’s make the timelapse animation. At 15 frames per second, the animation will still be smooth and last about 96 seconds (1440 frames divided by 15).

  ffmpeg -r 15 -i rendering/plot%05d.png -b:v 3000k sshd.mp4
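
Before encoding, it can be worth checking that no frame is missing, since a numbered image sequence is, as far as I know, read only up to the first gap. The helpers below are a hypothetical sketch that assumes the plot%05d.png naming used above; the demonstration runs in a temporary directory rather than in rendering/.

```python
import os
import tempfile

def missing_frames(directory, count=1440, pattern="plot%05d.png"):
    """Return the expected frame file names that are absent from `directory`."""
    present = set(os.listdir(directory))
    return [pattern % i for i in range(count) if pattern % i not in present]

def duration_seconds(count=1440, fps=15):
    """Length of the resulting animation in seconds."""
    return count / float(fps)

# Small demonstration in a temporary directory with frame 2 left out.
demo_dir = tempfile.mkdtemp()
for i in (0, 1, 3):
    open(os.path.join(demo_dir, "plot%05d.png" % i), "w").close()

print(missing_frames(demo_dir, count=4))   # → ['plot00002.png']
print(duration_seconds())                  # → 96.0
```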

And now let’s watch the world light up as our server gets tirelessly assaulted !

Article contributed by Matthieu Huin, R&D engineer in Wallix LogBox development team.
