P0f

P0f is an advanced passive OS/network fingerprinting utility for use in IDS environments, honeypot environments, firewalls and servers, in addition to penetration testing and ethical hacking.

 

                                  --=--
                                  p0f 2
                                  --=--

                    "Dr. Jekyll had something to Hyde"
		    
                      passive OS fingerprinting tool
                              version 2.0.4

     (C) Copyright 2000 - 2004 by Michal Zalewski 

                Various ports (C) Copyright 2003 - 2004 by:

                   Michael A. Davis 
                      Kirby Kuehl 
                     Kevin Currie 

      Portions contributed by numerous good people - see CREDITS file.

                  http://lcamtuf.coredump.cx/p0f.shtml

  *********************************************************************
  **** HELP WITH P0F DATABASE: http://lcamtuf.coredump.cx/p0f-help ****
  *********************************************************************

-----------
0. Contents
-----------

  This document describes the concept and history of p0f, its
  command-line options and extensions, and goes into some detail about
  its operation, integration with existing solutions, and so on.

  Table of contents:

   1) What's this, anyway?
   2) Why would I want to use it?
   3) What's new then?
   4) Command-line
   5) Active service integration
   6) SQL database integration
   7) Masquerade detection
   8) Fingerprinting accuracy and precision
   9) Adding signatures
  10) Security
  11) Limitations
  12) Is it better than other software?
  13) Program no work!
  14) Appendix A: Exact output format
  15) Appendix B: Links to OS fingerprinting resources

-----------------------
1. What's this, anyway?
-----------------------

  The passive OS fingerprinting technique is based on analyzing the
  information sent by a remote host while performing usual communication
  tasks - such as whenever a remote party visits your webpage, connects to 
  your MTA - or whenever you connect to a remote system while browsing the
  web or performing other routine tasks. In contrast to active 
  fingerprinting (with tools such as NMAP or Queso), the process of passive 
  fingerprinting does not generate any additional or unusual traffic,
  and thus cannot be detected.

  Captured packets contain enough information to identify the remote OS,
  thanks to subtle differences between TCP/IP stacks, and sometimes certain
  implementation flaws that, although harmless, make certain systems quite
  unique. Some additional metrics can be used to gather information about 
  the configuration of a remote system or even its ISP and network setup.
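
  To make the idea more concrete, here is a minimal sketch (in Python,
  not taken from p0f itself) that pulls a few of the commonly discussed
  metrics - TTL, the DF bit, window size, MSS and the order of TCP
  options - out of a raw IPv4 + TCP packet. The function name
  syn_metrics and the sample packet are made up for illustration; p0f
  examines many more characteristics than this.

```python
import struct

def syn_metrics(packet: bytes) -> dict:
    """Pull a few fingerprint-relevant fields out of a raw IPv4 + TCP packet."""
    # IPv4: version/IHL, ToS, total length, ID, flags+fragment offset, TTL, protocol
    ver_ihl, _tos, _length, _ip_id, frag, ttl, proto = struct.unpack_from(
        "!BBHHHBB", packet, 0)
    if ver_ihl >> 4 != 4 or proto != 6:
        raise ValueError("not an IPv4 TCP packet")
    ip_hlen = (ver_ihl & 0x0F) * 4

    # TCP: ports, SEQ, ACK, data offset, flags, window size
    _sport, _dport, _seq, _ack, offset, _flags, window = struct.unpack_from(
        "!HHIIBBH", packet, ip_hlen)
    tcp_hlen = (offset >> 4) * 4
    options = packet[ip_hlen + 20:ip_hlen + tcp_hlen]

    kinds, mss, i = [], None, 0
    while i < len(options):
        kind = options[i]
        kinds.append(kind)
        if kind == 0:            # EOL ends the option list
            break
        if kind == 1:            # NOP is a single byte
            i += 1
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break                # truncated or malformed option
        if kind == 2 and options[i + 1] == 4 and i + 4 <= len(options):
            mss = struct.unpack_from("!H", options, i + 2)[0]
        i += options[i + 1]

    return {"ttl": ttl, "df": bool(frag & 0x4000), "window": window,
            "mss": mss, "option_kinds": kinds}

# Hand-built sample: a SYN from 10.0.0.1 to 10.0.0.2 port 80, with a
# single MSS option (checksums left at zero, since they are not examined).
SAMPLE_SYN = bytes.fromhex(
    "4500002c 12344000 40060000 0a000001 0a000002"
    "30390050 00000001 00000000 600216d0 00000000 020405b4")

metrics = syn_metrics(SAMPLE_SYN)
```

  Different TCP/IP stacks tend to pick different combinations of these
  values, which is what makes matching them against a signature
  database possible.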

  The name of the fingerprinting technique might be somewhat misleading - 
  although the act of discovery is indeed passive, p0f can be used for
  active testing. It is just that you are not required to send any unusual
  or undesirable traffic, and can rely on what you would be getting from
  the remote party anyway. 

  To accomplish the job, p0f equips you with three different detection
  modes:

    - Incoming connection fingerprinting (SYN mode, default) - whenever
      you want to know what the guy or gal who connects to you runs,

    - Outgoing connection (remote party) fingerprinting (SYN+ACK mode) -
      to fingerprint systems you or your users connect to,

    - Outgoing connection refused (remote party) fingerprinting (RST+ mode) 
      - to fingerprint systems that reject your traffic.

  P0f was the first (and I believe remains the best) fully-fledged 
  implementation of the passive fingerprinting technique. The current version 
  uses a number of detailed metrics, often invented specifically for p0f, 
  and achieves a very high level of accuracy and detail, is designed for 
  hands-free operation over an extended period of time, and has a number 
  of features to make it easy to integrate it with other solutions.

  Portions of this code are used in several IDS systems, some sniffer  
  software; p0f is also shipped with several operating systems and 
  incorporated into an interesting OpenBSD pf hack by Mike Frantzen, that 
  allows you to filter out or redirect traffic based on the source OS. 
  There is also a beta patch for Linux netfilter, courtesy of Evgeniy 
  Polyakov. In short, p0f is rather well-established software at this 
  point. 

------------------------------
2. Why would I want to use it?
------------------------------

  Oh, a number of uses come to mind:

    - Profiling / espionage - run on a server, firewall, proxy or router, 
      p0f can be used to silently gather statistical and profiling information 
      about your visitors, users, or competitors. P0f also gathers netlink
      and distance information suitable for determining remote network
      topology.

    - Active response / policy enforcement - integrated with your server
      or firewall, p0f can be used to handle specific OSes in the most
      suitable manner and serve most appropriate content; you may also enforce 
      a specific corporate OS policy, restrict SMTP connections to a set of 
      systems, etc; with masquerade detection capabilities, p0f can be used
      to detect illegal network hook-ups and TOS violations.

    - PEN-TEST - in the SYN+ACK or RST+ mode, or when a returning connection
      can be triggered on a remote system (HTML-enabled mail with images,
      ftp data connection, mail bounce, identd connection, IRC DCC connection,
      etc), p0f is an invaluable tool for silent probing of a subject of
      such a test.
      
    - Network troubleshooting - RST+ mode can be used to debug network
      connectivity problems you or your visitors encounter.

    - Bypassing a firewall - p0f can "see thru" most NAT devices, packet
      firewalls, etc. In SYN+ACK mode, it can be used for fingerprinting
      over a connection allowed by the firewall, even if other types of
      packets are dropped; as such, p0f is the solution when NMAP and
      other active tools fail.

    - Amusement value is also pretty important. Want to know what this
      guy runs? Does he have a DSL, X.25 WAN hookup, or a shoddy SLIP 
      connection? What's Google crawlbot's uptime?

  Of course, "a successful [software] tool is one that was used to do 
  something undreamed of by its author" ;-)

-------------------
3. What's new then?
-------------------

  The original version of p0f was written somewhere in 2000 by Michal 
  Zalewski (that be me), and later taken over by William Stearns (circa 2001). 
  The original author still contributes to the code from time to time, and 
  the version you're holding right now is his sole fault - although I'd like 
  William to take over further maintenance, if he's interested.

  Version 2 is a complete rewrite of the original v1 code. The main reason 
  for this is to make signatures more flexible, and to implement certain 
  additional checks for very subtle packet characteristics to improve 
  fingerprint accuracy. Changes include:

  NEW CORE CHECKS:

    - Option layout and count check,
    - EOL presence and trailing data [*],
    - Unrecognized options handling (TTCP, etc),
    - WSS to MSS/MTU correlation checks [*],
    - Zero timestamp check,
    - Non-zero ACK in initial SYN [*],
    - Non-zero "unused" TCP fields [*],
    - Non-zero urgent pointer in SYN [*],
    - Non-zero second timestamp [*],
    - Zero IP ID in initial packet,
    - Unusual auxiliary flags,
    - Data payload in control packets [*],
    - SEQ number equal to ACK number [*],
    - Zero SEQ number [*],
    - Non-empty IP options.

    [*] denotes metrics "invented" for p0f, as far as I know. Other 
    metrics were discussed by certain researchers before, although usually 
    not implemented anywhere. A detailed discussion of all checks performed
    by p0f can be found in the introductory comments in p0f.fp, p0fa.fp
    and p0fr.fp.

    As a matter of fact, some of the metrics were so precise I managed
    to find several previously unknown TCP/IP stack bugs :-) See
    doc/win-memleak.txt and p0fr.fp for more information.

  IMPROVEMENTS:

    - Major performance boost - no more runtime signature parsing, added 
      BPF pre-filtering, signature hash lookups - to make p0f suitable for 
      running on high-throughput devices,

    - Advanced masquerade detection for policy enforcement (ISPs,
      corporate networks),

    - Modulo and wildcard operators for certain TCP/IP parameters to make
      it easier to come up with generic last chance signatures for
      systems that tweak settings notoriously (think Windows),

    - Auto-detection of DF-zeroing firewalls,

    - Auto-detection of MSS-tweaking NAT and router devices,

    - Media type detection based on MSS, with a database of common
      link types,
      
    - Origin network detection based on unusual ToS / precedence bits,

    - Ability to detect and skip ECN option when examining flags,

    - Better fingerprint file structure and contents - all fingerprints
      are rigorously reviewed before being added.

    - Generic last-chance signatures to cover general OS characteristics,

    - Query mode to enable easy integration with third party software -
      p0f caches recent fingerprints and answers queries for src-dst
      combinations on a local stream socket in an easy-to-parse
      form,

    - Usability features: greppable output option, daemon mode, host
      name resolution option, promiscuous mode switch, built-in signature 
      collision detector, ToS reporting, full packet dumps, pcap dump
      output, etc,

    - Brand new SYN+ACK and RST+ fingerprinting modes for silent
      identifications of systems you connect to the usual way (web
      browser, MTA), or even systems you cannot connect to at all;
      now also with RST+ACK flag and value validator.

    - Fixed WSCALE handling in general, and WSS passing on little-endian,
      many other bug-fixes and improvements of the packet parser
      (including some sanity checks).

    - Fuzzy checks option when no precise matches are found (limited).

  Sadly, this will break all compatibility with v1 signatures, but it's
  well worth it.

---------------
4. Command-line
---------------

  P0f is rather easy to use. There's a number of options, but you don't
  need to know most of them for normal operation:

  p0f [ -f file ] [ -i device ] [ -s file ] [ -o file ] [ -Q socket ]
      [ -w file ] [ -u user ] [ -c size ] [ -T nn ] [ -FNDVUKAXMqxtpdlRL ] 
      [ 'filter rule' ]

  -f file   - read fingerprints from file; by default, p0f reads
              signatures from ./p0f.fp or /etc/p0f/p0f.fp (the latter on
              Unix systems only). You can use this to load custom 
              fingerprint data. Specifying multiple -f values will NOT
              combine several signature files together.

  -i device - listen on this device; p0f defaults to whatever device
              libpcap considers to be the best. On some newer systems you
              might be able to specify 'any' to listen on all devices,
              but don't rely on this.

  -s file   - read packets from tcpdump snapshot; this is an alternate
              mode of operation, in which p0f reads packets from a pcap
              data capture file instead of a live network. Useful for
              forensics (this will parse tcpdump -w output, for example). 

  -w file   - writes matching packets to a tcpdump snapshot; useful when you 
              need to save the traffic in case it has to be verified or
              reviewed later on. Also useful if you encounter any parser bugs 
              - data is being written prior to parsing.

  -o file   - write to this logfile. This option is required for -d
              and implies -t.

  -Q socket - listen on a specified local stream socket (a filesystem
              object, for example /var/run/p0f-sock) for queries. You can
              later send a packet to this socket with p0f_query structure
              from p0f-query.h, and wait for p0f_response. This is a
              method of integrating p0f with active services (web server
              or web scripts, etc). P0f will still continue to report
              events the usual way, but you can use -qKU to suppress any
              text output. Also see -c notes.

              From a shell script, you can query p0f using the p0fq tool
              provided in test/ subdirectory.

              NOTE: The socket will be created with permissions corresponding
              to your current umask. If you want to restrict access to
              this interface, use caution.

              This option is currently Unix-only.

  -c size   - cache size for -Q and -M options. The default is 128, which
              is sane for a system with a moderate load (under 10 connections
              per second or such). Setting it too high will slow down p0f
              and may result in some -M false positives for dial-up nodes,
              dual-boot systems, etc. Setting it too low will result in
              cache misses for -Q option. To choose the right value,
              use the number of connections on average per the interval
              of time you want to cache, then pass it to p0f with -c.

              P0f, when run without -q, also reports average packet ratio
              on exit. You can use this to determine the optimal -c
              setting.

              This option has no effect if you use neither -Q nor -M.

  -u user   - chroot to this user's home directory after reading 
              configuration data and binding to sockets, then switch to his 
              UID, GID and supplementary groups.
              
              This is a security feature for the paranoid - when running 
              p0f in daemon mode, you might want to create a new 
              unprivileged user with an empty home directory, and limit the 
              exposure when p0f is compromised. That said, should such a 
              compromise occur, the attacker will still have a socket he can 
              use for sniffing some network traffic (better than rm -rf /).

              This option is Unix-only.

  -N        - do not report distances and link media. This option 
              logs only source IP and OS data.

  -F        - deploy fuzzy matching algorithm if no precise matches are
              found (currently applies to TTL only). This option is not
              recommended for RST+ modes.

  -D        - do not report OS details (just genre). This option is useful
              if you don't want p0f to elaborate on OS versions and such.

  -U        - do not display unknown signatures. Use this option if
              you want to keep your log file clean and are not interested
              in hosts that are not recognized.

  -K        - do not display known signatures. This option is only useful
              for fingerprint gathering.

  -q        - be quiet - do not display banners.

  -p        - switch card to promiscuous mode; by default, p0f listens 
              only to packets addressed or routed thru the machine it
              runs on. This setting might decrease performance, depending
              on your network design and load. On switched networks,
              this usually has little or no effect.

              Note that promiscuous mode on IP-enabled interfaces can be 
              detected remotely, and is sometimes not welcome by network 
              administrators.

  -t        - add human-readable timestamps to every entry (use
              multiple times to change date format, a la tcpdump).

  -d        - go into daemon mode (detach from current terminal and 
              fork into background). Requires -o.

  -l        - outputs data in line-per-record style (easier to grep).

  -A        - a semi-supported option for SYN+ACK mode. This option
              will fingerprint systems you connect to, as opposed to
              systems that connect to you (default). With this option,
              p0f will look for p0fa.fp file instead of the usual
              p0f.fp. The usual config is NOT SUITABLE for this mode.

              The SYN+ACK signature database is sort of small at the
              moment, and still looks for a maintainer.

  -R        - go into RST+ACK/RST mode. This option will fingerprint
              several different types of traffic, most importantly
              "connection refused" and "timeout" messages.

              It is similar to SYN+ACK mode, except that the program
              will now look for p0fr.fp. The mode is also called RST+.
              Please refer to p0fr.fp before using it.

  -r        - resolve host names; this mode is MUCH slower and poses some
              security risk. Do not use except for interactive runs or
              low traffic situations. NOTE: the option ONLY resolves
              IP address into a name, and does not perform any checks to
              verify this revDNS result. Do not rely on the name alone.

  -C        - perform collision check on signatures prior to running. This
              is an essential option whenever you add new signatures to
              the p0f.fp file, but is not necessary otherwise. 

  -L        - list all network interfaces. This option is Windows-only.

  -x        - dump full packet contents; this option is not compatible with
              -l and is intended for debugging and packet comparison only.

  -X        - display packet payload; rarely, control packets we examine
              may carry a payload. This is a bug for the default (SYN)
              and -A (SYN+ACK) modes, but is (sometimes) acceptable in
              -R (RST+) mode.

  -M        - deploy masquerade detection algorithm. The algorithm looks over 
              recent (cached) hits and looks for indications of multiple 
              systems being behind a single gateway. This is useful on routers 
              and such to detect policy violations. Note that this mode is 
              somewhat slower due to caching and lookups. 

  -T nn     - masquerade detection threshold; only meaningful with -M,
              sets the threshold for masquerade reporting.

  -V        - use verbose masquerade detection reporting. This option
              describes the status of all indicators, not only an overall
              value.
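
  The -c sizing advice above boils down to simple arithmetic: average
  connections per second times the number of seconds you want cached
  entries to stay around. A minimal sketch in Python (the helper name
  suggest_cache_size is made up for illustration and is not part of
  p0f):

```python
import math

def suggest_cache_size(conn_per_sec: float, window_sec: float) -> int:
    """Entries needed to keep `window_sec` seconds of connections cached."""
    # Round up so a fractional connection still gets a slot; never go below 1.
    return max(1, math.ceil(conn_per_sec * window_sec))

# Example: about 10 connections per second, cached for 30 seconds.
size = suggest_cache_size(10, 30)
```

  With these numbers the result is 300, so you would run p0f with
  -c 300. Keep in mind the trade-off described under -c: a larger cache
  slows p0f down and may produce -M false positives.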

  The last part, 'filter rule', is a bpf-style filter expression for
  incoming packets. It is very useful for excluding or including certain
  networks, hosts, or specific packets, in the logfile. See man tcpdump for 
  more information. A few examples:

     'src port ftp-data'
     'not dst net 10.0.0.0 mask 255.0.0.0'
     'dst port 80 and ( src host 195.117.3.59 or src host 217.8.32.51 )'

  The baseline rule is to select only TCP packets with SYN set, no RST, no 
  ACK, no FIN (SYN, ACK, no RST, no FIN for -A mode; RST, no FIN, no SYN
  for -R mode). You cannot make the rule any broader; the optional filter 
  expression can only narrow it down.
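
  The baseline rule can be restated as a check on the TCP flags byte.
  The sketch below (Python) is only an illustration of the rule as
  described here, not p0f's actual filter code; the mode labels "syn",
  "synack" and "rst" are made-up names for the default, -A and -R modes.

```python
# Standard TCP flag bits.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10

def baseline_match(flags: int, mode: str = "syn") -> bool:
    """Return True if a packet with these TCP flags passes the baseline rule."""
    if mode == "syn":        # default: SYN set, no RST, no ACK, no FIN
        return bool(flags & SYN) and not flags & (RST | ACK | FIN)
    if mode == "synack":     # -A: SYN and ACK set, no RST, no FIN
        return flags & (SYN | ACK) == (SYN | ACK) and not flags & (RST | FIN)
    if mode == "rst":        # -R: RST set, no FIN, no SYN
        return bool(flags & RST) and not flags & (FIN | SYN)
    raise ValueError("unknown mode: " + mode)
```

  Any user-supplied filter expression is applied on top of this check,
  so it can only reduce the set of packets p0f looks at.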

  You also can use a companion log report utility for p0f. Simply 
  run 'p0frep' for help.

-----------------------------
5. Active service integration
-----------------------------

  In some cases, you want to feed the p0f output to a specific
  application to take certain active measures based on the operating system
  (handle specific visitors differently, block some unwanted OSes,
  optimize the content served).

  As mentioned earlier, OpenBSD users can simply use the pf OS fingerprinting 
  implementation, a cool functionality coded by Mike Frantzen and based on 
  p0f methodology and signature database. This software allows them to 
  redirect or block OSes any way they want. Linux netfilter users can also 
  check out patches by Evgeniy Polyakov.

  In other cases, you want to use the -Q option, and then query p0f
  by connecting to a specific local stream socket and sending a single
  packet with p0f_query struct (p0f-query.h), and receiving
  p0f_response. P0f, when running in -Q mode, will cache a number of
  last OS matches, and when queried for a specified host and port
  combination, will return what it detected. Check test/p0fq.c for
  a clean example.

  The query structure (p0f_query) has the following fields (all
  values, addresses and port numbers are in the machine's native byte order):

    magic	- must be set to QUERY_MAGIC,
    id		- query ID, copied literally to the response,
    src_ad	- source address,
    dst_ad	- destination address,
    src_port	- source port,
    dst_port	- destination port.

  The response (p0f_response) is as follows:

    magic	- must be set to QUERY_MAGIC,
    id		- copied from the query,
    type	- RESP_OK, RESP_BADQUERY (error), RESP_NOMATCH (cache miss),
    genre[20]	- OS genre, zero length if no match,
    detail[40]	- OS version, zero length if no match,
    dist	- distance, -1 if unknown,
    link[30]	- link type description, zero length if unknown,
    tos[30]	- ToS information, zero length if unknown,
    fw,nat	- firewall and NAT flags, if spotted,
    real	- "real" OS versus userland stack,
    score	- masquerade score (or NO_SCORE), see next section,
    mflags	- exact masquerade flags (D_*), see next section.

  The connection is one-shot. Always send the query and recv the
  response immediately after connect - p0f handles the connection in
  a single thread, and you are blocking other applications until the
  timeout expires (the timeout is defined as two seconds in config.h).
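
  For those who would rather not write C, the exchange can be sketched
  in Python. Treat this as a rough illustration under stated
  assumptions, not a reference client: the field widths, the
  QUERY_MAGIC value and the RESP_* codes below are guesses based on the
  field lists above, and must be checked against p0f-query.h for your
  version (test/p0fq.c remains the authoritative example). Addresses
  are passed in as 32-bit integers already in whatever form p0f
  expects; the sketch does not attempt the conversion.

```python
import socket
import struct

# ASSUMPTIONS - verify every one of these against p0f-query.h:
QUERY_MAGIC = 0x0DEFACED                       # placeholder magic value
RESP_OK, RESP_BADQUERY, RESP_NOMATCH = 0, 1, 2 # placeholder type codes
QUERY_FMT = "@IIIIHH"                  # magic, id, src_ad, dst_ad, ports
RESPONSE_FMT = "@IIB20s40sb30s30sBBBhH"  # fields in the order listed above

def build_query(query_id, src_ad, dst_ad, src_port, dst_port):
    """Pack a p0f_query using native byte order and alignment."""
    return struct.pack(QUERY_FMT, QUERY_MAGIC, query_id,
                       src_ad, dst_ad, src_port, dst_port)

def _text(raw):
    """Decode a fixed-size, NUL-terminated char array."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")

def parse_response(data):
    """Unpack the leading fields of a p0f_response; extra bytes are ignored."""
    (magic, rid, rtype, genre, detail, dist, link, tos,
     fw, nat, real, score, mflags) = struct.unpack_from(RESPONSE_FMT, data)
    if magic != QUERY_MAGIC:
        raise ValueError("unexpected magic in response")
    return {"id": rid, "type": rtype, "genre": _text(genre),
            "detail": _text(detail), "dist": dist, "link": _text(link),
            "tos": _text(tos), "fw": fw, "nat": nat, "real": real,
            "score": score, "mflags": mflags}

def query_p0f(sock_path, query_id, src_ad, dst_ad, src_port, dst_port):
    """One-shot query: connect, send, read the reply, disconnect."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(2.0)   # client-side timeout, chosen arbitrarily
        s.connect(sock_path)
        s.sendall(build_query(query_id, src_ad, dst_ad, src_port, dst_port))
        data = s.recv(1024)  # a short read makes parse_response raise
    return parse_response(data)
```

  A caller would check the returned "type" against RESP_OK before
  trusting the other fields, and should keep the socket open no longer
  than needed, for the reasons given above.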

---------------------------
6. SQL database integration
---------------------------

  At the moment, p0f does not feature built-in database connectivity,
  although I am looking for a willing contributor to take care of it.
  In the meantime, however, you may use p0f_db utility authored by
  Nerijus Krukauskas:

  http://nk.puslapiai.lt/projects/p0f_db/

  Jonas Eckerman has some tools to make it easier to move p0f output
  from one system to another, and then to run basic visualization:

  http://whatever.frukt.org/p0f-stats.shtml