<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Sebek: kernel based data collection</title>
</head>
<body>
<h1>Sebek: kernel based data collection</h1>
<hr>
<table width=100% >
<tr> <td>Current version:</td> <td>$Name: sebek-0-4 $</td> </tr>
</table>
<hr>
<h2>Introduction</h2>
About a year ago I set up my first honeynet. It was mostly Linux honeypots. After a number of weeks of nothing but lame MS virus stuff, we had some action one Sunday morning. I was as excited as can be; however, my joy was dampened by one thing: the intruder immediately installed his own version of SSH and then secured his communication channel. This meant I was only doing a black-box analysis of the whole situation, examining just the network traffic that emanated from the honeypot. There were no keystroke logs, and if he had used ssh to copy the rootkit, I wouldn't even have recovered that. As it turns out, this is a common situation, and one that a number of members of the community have been working on.
<p>
One result of this labor is a system called <b>sebek</b>. Sebek allows one to collect data from kernel space and export it to another host on the LAN, where it can be processed to extract keystroke logs and files copied to and from the honeypot. The intruder can install his or her own version of SSH, or shell; it doesn't matter. The end result is the same: we capture the data we want.
<h2>Design</h2>
Sebek has 3 basic components: a kernel module to collect data passing through the <i>read()</i> system call, an application to export this data onto the LAN, and a few tools used to collect the data off the LAN and process it.
<p>
The kernel module is a modified version of the <b>Adore</b> rootkit. By basing the collection components on a rootkit, we are able to hide the presence of these files on the system, the process that exports the data onto the LAN, and even the fact that the kernel module is installed.
No rootkit is 100% effective; however, by using this approach we can minimize the amount of suspicion caused by an installed instance of the tool. The kernel module itself provides the log data to user space through a device that is essentially a ring buffer. Reads are destructive: once a user-land process has read data from the device, that data is no longer available from it.
<p>
<b>sdm</b> is the sebek device monitor. It reads from the sebek device and then sends the data onto the LAN. This tool employs a number of techniques to make the presence of the data on the LAN less alarming to an intruder on a honeypot. First, the payload is encrypted. Second, based on user input, the Ethernet, IP and UDP header data is falsified. This includes IP addresses, MAC addresses, and UDP port numbers. Third, we introduce a variable amount of delay between each packet we transmit. Last, when there is no legitimate data to send on the network, we transmit decoys. When using sebek, it is rare to execute sdm directly on the honeypot. Included is a shell script called <b>sebek.sh</b>, which provides a central point of configuration and execution control for the sebek components on the honeypot.
<p>
Once the data is on the LAN, we collect it on a remote host using an application called <b>sebeksniff</b>. This is a fairly straightforward tool that collects the sebek export data, decrypts it, and stores it in a log file based on the IP address of the exporting honeypot. This application is capable of supporting multiple honeypots simultaneously. <b>sbdump</b> is an application provided to parse these log files and extract keystroke logs or SCP file transfers.
<hr>
<h2>Building and Installation</h2>
Building and installing the sebek package is still somewhat crude; however, the process is not too difficult. The simple demo process is as follows:
<ol>
<li>Make sure you are on a Linux box with the kernel source installed.
<li>Untar the src distribution tarball.
<li>cd into the directory.
<li>Type "make"; this will build the 2 installation tarballs: one for the honeypots, called <i>sebek.tar</i>, and one for the collector, called <i>sebek_collector.tar</i>.
<li>Copy the sebek.tar file to the honeypots.
<li>Untar sebek.tar into /tmp.
<li>Edit /tmp/sebek/sebek.sh as needed and make it executable.
<li>Execute /tmp/sebek/sebek.sh start.
<li>Copy the sebek_collector.tar file to the collector.
<li>Untar this file where you like.
<li>Execute ./sebeksniff -d eth0 -m 7777 -s testtesttest
</ol>
At this point, if you do something on the honeypot you will see this on the collector:
<font color="red">
<pre>collector# ./sebeksniff -d eth0 -m 7777 -s testtesttest
Device: vmnet1
Magic: 7777
Sebek monitoring enabled
Sebek symmetric key: testtesttest
write 10.0.0.1: 30 bytes
write 10.0.0.1: 360 bytes
write 10.0.0.1: 36 bytes
write 10.0.0.1: 72 bytes
write 10.0.0.1: 36 bytes
write 10.0.0.1: 144 bytes
write 10.0.0.1: 108 bytes
write 10.0.0.1: 36 bytes
</pre>
</font>
If the IP address that sebeksniff reports it is recording for does not correspond to one of your honeypots, then either some normal traffic is being recorded inadvertently or the symmetric key is incorrect.
<p>
At this point you have a working sebek system, though one that is probably not adjusted how you would like. See the configuration sections below for details on how to configure the various components of the system.
<hr>
<h2>Configuring Sebek on the honeypot: sebek.sh</h2>
Sebek is configured on the honeypot by editing the config section of the sebek.sh file.
Within sebek.sh, there are a number of options that need to be configured at the top of the script:
<p>
<font color="green">
<pre>#!/bin/sh
#---------------------------------------------------------------------
#----- $Header: /home/cvsroot/sebek/sebek.html,v 1.3 2002/09/11 16:21:03 cvs Exp $
#---------------------------------------------------------------------
#-----Sebek configuration --------------------------------------------
#---------------------------------------------------------------------
#--- DIR: directory holding the sebek goodies
DIR="/tmp/sebek"
#--- LOG: the device or file that sdm should read from
LOG="/dev/sebek"
#--- PASSWD: the password to use
PASSWD="testtesttest"
#----- DST and SRC networks:
#----- This controls the IP addresses that are given to the Source and
#----- Destination of the packets transmitted by sebek onto the LAN
DST_NET="10.0.1.1/32"
SRC_NET="10.0.0.0/24"
#----- UDP port data:
#----- This controls the UDP ports assigned to the sebek packets.
#----- if you specify both MAGIC and DST, the SRC port will be set
#----- to MAGIC - DST.
#DST_PORT="123"
MAGIC_NO="7777"
#----- Inter-packet Delay:
#----- Controls the maximum inter-packet delay, expressed in
#----- microseconds
#PKT_DELAY="500000"
#----------------------------------------------------------------------
</pre>
</font>
The majority of the configuration options currently available control what the sebek packets look like when transmitted on the LAN. Recall that the sdm application is responsible for reading the sebek device and then transmitting this data onto the LAN. This application is started by the <b>sebek.sh</b> script on the honeypot. By editing the sebek.sh config we control what the sebek packets look like on the network. Below is a list of the sdm options used to exert this control.
<p>
<b>Controlling IP packet header data</b>
<dl>
<dt>DST_NET
<dd>Specifies the destination IP information to use, expressed in CIDR notation.
If you specify 10.0.0.1/32, then all packets will have a destination IP address of 10.0.0.1. If you specify 10.0.0.0/24, then all destination IP addresses will be between 10.0.0.1 and 10.0.0.254. This is a required option.
<dt>SRC_NET
<dd>This is the same as DST_NET, but for the source IP address.
</dl>
<p>
<b>What about the MAC addresses?</b>
<p>
The source and destination MAC addresses are generated automatically from the matching IP address and a known vendor ID; right now the Intel ID is used. The algorithm isn't really anything special: we rotate all the bits twice to the left, rolling over the lower 2 to the top. Then we set the last 3 octets in the MAC address to correspond to the last 3 octets in the rotated address. Yes, it's weak.
<p>
<b>Controlling UDP port data</b>
<dl>
<dt>DST_PORT
<dd>If the destination port number is defined, then the packet's destination port will be set accordingly; if not, it will be randomly selected. This is optional if MAGIC_NO is defined.
<dt>MAGIC_NO
<dd>The magic number has two uses. If DST_PORT is configured, then we calculate the UDP source port number by subtracting DST_PORT from MAGIC_NO. If DST_PORT is not defined, then both the destination and source ports will be pseudo-random, and the sum of the source and destination ports will equal MAGIC_NO. This is required if DST_PORT is not defined. </dd>
</dl>
<p>
<b>Controlling Inter-Packet Timing</b>
<dl>
<dt>PKT_DELAY</dt>
<dd>This specifies the maximum inter-packet delay, expressed in microseconds. Sdm will select a random number between 0 and PKT_DELAY to usleep between packet transmissions. Setting this too high will lower the maximum transfer rate of sebek.
This is optional.</dd>
</dl>
<p>
<b>Other Configuration Options</b>
<dl>
<dt>DIR</dt>
<dd>This defines the install directory of sebek on the honeypot.</dd>
<dt>LOG</dt>
<dd>This defines where to look for the sebek device.</dd>
<dt>PASSWD</dt>
<dd>Defines the password with which to encrypt the payload.</dd>
</dl>
<hr>
<h2>Configuring Sebek on the Collector: sebeksniff</h2>
The configuration of sebeksniff depends on how sebek.sh was configured. All options are passed on the command line. The destination port, magic number, and password need to be configured as they are on the honeypot. Also, currently all honeypots on the same LAN need to use the same password and dest_port/magic_no configuration if only one instance of sebeksniff is running.
<font color="red">
<pre>collector# ./sebeksniff -h
Sebek Sniffer
 -d device
 -p dest port to look for
 -m MAGIC number
 -s symmetric key
 -h This screen
collector# ./sebeksniff -d eth0 -m 7777 -s testtesttest
</pre>
</font>
<hr>
<h2>Examining Logs using sbdump</h2>
The log files generated by sebeksniff are based on the IP address inside the sebek packet payload, not the IP addresses configured in the IP header. To examine the log for a specific honeypot, use the IP address as configured on that honeypot as the second argument to sbdump. There are two modes in which to run sbdump: the first dumps all keystroke logs and other interactive data; the second extracts all files copied to or from the honeypot using SCP.
<h3>Keystroke Logging Example</h3>
Keystroke logging is fairly straightforward. Data that comes from interactive sessions is recorded, byte by byte. Because multiple interactive sessions can occur concurrently, the kernel module just records each byte along with the TTY, PID, UID, etc. that the byte is associated with. This data is exported onto the network, where sebeksniff picks it up. When sbdump processes the log data, it rebuilds the sessions into the commands issued in each session.
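<p>
The rebuilding step described above can be illustrated with a minimal sketch in Python. This is not sbdump's code: the record layout (one tuple of UID, PID, TTY and byte value per captured byte) and the rule that a carriage return or newline ends a command are assumptions made only for this illustration.

```python
from collections import defaultdict

def rebuild_sessions(records):
    """Group per-byte records by session and split each stream into commands.

    records: iterable of (uid, pid, tty, byte) tuples in capture order.
    This tuple layout is hypothetical; it is not the sebek log format.
    """
    buffers = defaultdict(bytearray)   # partially typed command per session
    commands = defaultdict(list)       # completed commands per session
    for uid, pid, tty, byte in records:
        key = (uid, pid, tty)
        if byte in (0x0A, 0x0D):       # assume Enter ends the current command
            if buffers[key]:
                commands[key].append(buffers[key].decode("ascii", "replace"))
                buffers[key] = bytearray()
        else:
            buffers[key].append(byte)
    return dict(commands)

# Two sessions typing at the same time, bytes interleaved in capture order.
demo = [
    (0, 1884, "pts", ord("w")), (0, 1094, "vc/1", ord("l")),
    (0, 1884, "pts", ord("h")), (0, 1094, "vc/1", ord("s")),
    (0, 1884, "pts", ord("o")), (0, 1094, "vc/1", 0x0A),
    (0, 1884, "pts", 0x0A),
]
for session, cmds in sorted(rebuild_sessions(demo).items()):
    print(session, cmds)
```

<p>
Keying the buffers on the session identifiers is what lets interleaved bytes from concurrent sessions be separated again on the collector.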
<p>
Example output from sbdump:
<font color="red">
<pre>collector# ./sbdump.pl -c 10.0.0.1
10:46:24-2002/09/06 [0:bash:1884:pts:0]who
10:47:00-2002/09/06 [0:bash:1884:pts:0]last
10:47:32-2002/09/06 [0:bash:1884:pts:0]exit
</pre>
</font>
Each line has the following fields:
<ul>
<li>Timestamp
<li>User ID
<li>Process Name
<li>Process ID
<li>TTY
<li>File Descriptor
<li>Data
</ul>
<h3>SCP file transfer recovery</h3>
For this example we have a honeypot 10.0.0.1, a remote box 10.0.0.3 and a collector on the same LAN. We have 2 examples. The first is when a user on the remote box SCPs a file to the honeypot. The second is when an intruder on the honeypot, logged in as UID 0, SCPs a file from the remote box.
<p>
First example:
<font color="red">
<pre>collector# ./sbdump.pl -s 10.0.0.1
12:12:22-2002/09/06 SCP (remote)-&gt;local spyvsspy.jpg 19197 bytes
</pre>
</font>
<p>
When extracting SCP files, sbdump reports as much information as possible; however, how much is available depends on the direction of the copy and where it was initiated. In this case, because the copy was remotely initiated, we have less data to go on.
<p>
Second example:
<font color="red">
<pre>collector# ./sbdump.pl -c 10.0.0.1
12:17:07-2002/09/06 [0:bash:1094:vc/1:0]ls
12:17:09-2002/09/06 [0:bash:1094:vc/1:0]cd /tmp
12:17:12-2002/09/06 [0:bash:1094:vc/1:0]scp foobar@10.0.0.2:/home/edb/images/svs_thinking.gif .
collector# ./sbdump.pl -s 10.0.0.1
12:17:50-2002/09/06 SCP (local)&lt;-remote svs_thinking.gif 194614 bytes
12:17:50-2002/09/06 SCP: passwd S00pErD0operPassweerd
</pre></font>
<p>
In this example, not only do we extract the file, but we also determine the password that was used to authenticate. Using keystroke extraction we can also determine the remote host and the user ID. With the combination of the two we have a great deal of power at our disposal.
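<p>
For further processing, the keystroke lines printed by sbdump in the examples above can be split into the fields listed earlier. The following is a minimal sketch in Python and is not part of sebek. It assumes that the bracketed part holds the user ID, process name, process ID, TTY and file descriptor in that order, separated by colons, which is inferred from the sample output rather than from a format specification.

```python
import re

# Timestamp, then [uid:process:pid:tty:fd], then the data, as in the samples.
# The field order inside the brackets is an assumption based on those samples.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}-\d{4}/\d{2}/\d{2}) "
    r"\[(\d+):([^:\]]+):(\d+):([^:\]]+):(\d+)\](.*)$"
)

def parse_keystroke_line(line):
    """Return a dict of fields for one keystroke line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp, uid, process, pid, tty, fd, data = match.groups()
    return {
        "timestamp": timestamp,
        "uid": int(uid),
        "process": process,
        "pid": int(pid),
        "tty": tty,
        "fd": int(fd),
        "data": data,
    }

sample = "12:17:09-2002/09/06 [0:bash:1094:vc/1:0]cd /tmp"
print(parse_keystroke_line(sample))
```

<p>
Lines in other formats, such as the SCP summary lines, do not match the pattern and yield None, so they can be filtered out or handled separately.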
<hr> <address><a href="mailto:ebalas@indiana.edu">Edward Balas</a></address><!-- Created: Mon Sep 9 10:10:44 EST 2002 --><!-- hhmts start -->Last modified: Wed Sep 11 10:39:21 EST 2002<!-- hhmts end --> </body></html>