wmap v2.0


wmap started as a proof-of-concept program written in Java. It was recently rewritten in C++ (using STL, cURL and getopt), allowing it to reach a broader audience. At present, it is only available in source-code form, but it is very easy to compile.


The current version of wmap is 2.0, and it can be found at:

The older (Java) version is still available at:


See INSTALL.txt for details.


In order to explain wmap, we must first look at another computer security tool, nmap. Security professionals: please excuse the over-simplification. I will be explaining it just enough to draw parallels between nmap and wmap without confusing less experienced readers.

As we all know, servers on the internet provide services to users. Those services come in the form of websites, FTP sites, databases, email, and countless others. Those services also run on fairly standard port numbers so that people wanting to use them know where to find them. FTP is on port 21, email is on port 25, simple logins are on port 23, web servers are on port 80, etc. If another system needs to deliver email, it connects on port 25 and sends it, without having to guess what port the program is running on.

Nmap takes advantage of this piece of knowledge. Since there are dozens of pretty standard services running on pretty standard port numbers, you can easily tell what kinds of software a machine is running by which ports are listening for connections. If you can successfully connect to port 80, you know there is bound to be a website. If you can successfully connect on port 25, you know that it accepts inbound email.
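To make the "can I connect?" idea concrete, here is a minimal Python sketch. It is not how nmap itself is implemented (nmap has many scan techniques); it only shows the plain connect test described above. The names port_is_open, guess_services and COMMON_PORTS are made up for this illustration.

```python
import socket

# Well-known ports mentioned in the text. The labels are illustrative.
COMMON_PORTS = {21: "FTP", 23: "telnet login", 25: "email (SMTP)", 80: "web"}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name lookup failed.
        return False


def guess_services(host, ports=COMMON_PORTS):
    """Return the labels of every port in `ports` that accepts a connection."""
    return [name for port, name in ports.items() if port_is_open(host, port)]
```

Calling guess_services on a host you are allowed to scan returns the labels of whichever of those ports answered. A closed or filtered port simply counts as "not there".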

In the web world, web servers send you the files you request. Your browser requests a page because you happen to know the URL: either it was linked from another page, a search engine, an email, etc., or you guessed the name. That page then likely contains references out to other pages as well as media files: images, sounds, applets, etc.

You can request files because you know they are there. Of course, there could be files and directories that exist, but are not explicitly linked from anywhere. A good number of websites have a “/logs” directory that is web-accessible (often with a password, but sometimes without) but not actually linked from anywhere: you just have to know it is there. Many personal sites have a folder called “/stuff” or “/junk” where the owners put random files to share amongst their friends, but which is not generally for public consumption. Most sites have an “/images” folder to hold graphical assets, but a good amount of the time that folder has no default page and allows “directory browsing” so you can see a list of every image the site employs.


This is where wmap comes in. Wmap has a list of common folder names. When you point it at a base URL, it appends each of the folder names, requests the page (actually just a “HEAD” request, for the techies that want to know), and takes note of the response. If the response is a 200-series code, there might be something there worth paying attention to. If the response is a “403 Forbidden,” you know something is there, but you will be unable to get a listing: you might have to chalk it up as not available unless you want to guess filenames. If the response is a different 400-series code, there is probably nothing of interest (i.e. it doesn’t exist).
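For the techies, the request-and-classify step can be sketched in a few lines of Python. This is an illustration of the idea only, not wmap's actual C++/cURL source. The function names and the labels "found", "forbidden", "absent" and "other" are my own assumptions for this sketch.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0):
    """Send a HEAD request and return the HTTP status code as an int."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this is the status of the final URL.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return error.code


def classify(status):
    """Map a status code onto the three cases described above."""
    if 200 <= status < 300:
        return "found"      # something is there and it loads
    if status == 403:
        return "forbidden"  # something is there, but no listing
    if 400 <= status < 500:
        return "absent"     # probably nothing of interest
    return "other"          # anything else (e.g. server errors)
```

A HEAD request asks the server for the headers only, so the status code comes back without the page body being transferred.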

I am one of those people who learns best by example, so let us cut to an example:

% ./wmap --auto --delay=3 http://netninja.com
404 http://netninja.com/default.asp
404 http://netninja.com/default.htm
404 http://netninja.com/thumbnails/
404 http://netninja.com/gallery/
404 http://netninja.com/_img/
404 http://netninja.com/pics/
404 http://netninja.com/img/
404 http://netninja.com/image/
200 http://netninja.com/images/ <<<<<<<<<<
404 http://netninja.com/log/
404 http://netninja.com/journal/
404 http://netninja.com/blog/
404 http://netninja.com/weblog/
404 http://netninja.com/MP3S/
404 http://netninja.com/MP3/
404 http://netninja.com/mp3s/
404 http://netninja.com/mp3/
404 http://netninja.com/music/
404 http://netninja.com/flash/
404 http://netninja.com/MP3s/
404 http://netninja.com/media/
404 http://netninja.com/assets/
404 http://netninja.com/classes/
404 http://netninja.com/logs/
200 http://netninja.com/files/ <<<<<<<<<<
404 http://netninja.com/db/
404 http://netninja.com/default.html
404 http://netninja.com/sql/
404 http://netninja.com/index.jsp
404 http://netninja.com/data/
404 http://netninja.com/index.asp
404 http://netninja.com/index.pl
404 http://netninja.com/archives/
404 http://netninja.com/index.phps
404 http://netninja.com/documents/
404 http://netninja.com/index.php3
404 http://netninja.com/support/
200 http://netninja.com/index.php <<<<<<<<<<
404 http://netninja.com/index.htm
404 http://netninja.com/index.html
404 http://netninja.com/backup/

EXISTS, LOADS         http://netninja.com/images
EXISTS, LOADS         http://netninja.com/files
EXISTS, LOADS         http://netninja.com/index.php

You can see that a number of files and directories were tried. Three (the ones with “<<<<<<<<<<” arrows) returned results we might be interested in. After everything has been tried, those three results are presented in summary form.

Keep in mind that sending repeated requests can pound a server and use a lot of resources (bandwidth, CPU, etc.). It can also get you banned from their network for a while if they have a good firewall running. Take advantage of the “--delay” parameter, which waits a number of seconds between requests.
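The effect of the delay is easy to picture as a loop that pauses between probes. The Python sketch below is a hypothetical illustration, not wmap's real code. The probe argument stands in for whatever function performs the request and returns a status code, and the scan name and its parameters are assumptions of mine.

```python
import time


def scan(base_url, paths, probe, delay=3.0, sleep=time.sleep):
    """Probe base_url + each path, pausing `delay` seconds between requests.

    `probe` is any callable that takes a URL and returns a status code.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    base = base_url.rstrip("/")
    results = []
    for index, path in enumerate(paths):
        if index > 0:
            sleep(delay)  # be polite: wait before every request but the first
        url = base + "/" + path
        results.append((probe(url), url))
    return results
```

With a three-second delay, a list of about 44 names takes a little over two minutes to get through, which is a small price for not hammering someone's server.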


Sometimes, you might want a little more control over what pages are being requested, based on a site’s content. You might want to run an automatic scan, but follow it with a few manual requests because you think something is there. This is where the manual scan comes in.

In manual mode, you give wmap a base URL, then it interactively asks you for words. It will then attempt to locate interesting files/folders on the web server based on the word you give. For example, if you give it the word “swordfish,” it will attempt to locate a “/swordfish/” folder, then a “swordfish.html” web page, then a “swordfish.htm” (three-letter Windows extension) web page, then a “swordfish.php” script, etc.

Once again, I think a little example is in order:

% ./wmap --manual http://netninja.com
Enter directory/file name to search for at base URL (enter to quit)
> ninja
404 http://netninja.com/ninja/
404 http://netninja.com/ninja.html
404 http://netninja.com/ninja.htm
404 http://netninja.com/ninja.php
404 http://netninja.com/ninja.php3
404 http://netninja.com/ninja.phps
404 http://netninja.com/ninja.asp
404 http://netninja.com/ninja.pl
404 http://netninja.com/ninja.jsp
404 http://netninja.com/ninja.txt
404 http://netninja.com/ninja.jpg
404 http://netninja.com/ninja.gif
404 http://netninja.com/ninja.png
Enter directory/file name to search for at base URL (enter to quit)
> netninja
404 http://netninja.com/netninja/
404 http://netninja.com/netninja.html
404 http://netninja.com/netninja.htm
200 http://netninja.com/netninja.php <<<<<<<<<<
404 http://netninja.com/netninja.php3
404 http://netninja.com/netninja.phps
404 http://netninja.com/netninja.asp
404 http://netninja.com/netninja.pl
404 http://netninja.com/netninja.jsp
404 http://netninja.com/netninja.txt
404 http://netninja.com/netninja.jpg
404 http://netninja.com/netninja.gif
404 http://netninja.com/netninja.png
Enter directory/file name to search for at base URL (enter to quit)
> projects
404 http://netninja.com/projects/
404 http://netninja.com/projects.html
404 http://netninja.com/projects.htm
200 http://netninja.com/projects.php <<<<<<<<<<
404 http://netninja.com/projects.php3
404 http://netninja.com/projects.phps
404 http://netninja.com/projects.asp
404 http://netninja.com/projects.pl
404 http://netninja.com/projects.jsp
404 http://netninja.com/projects.txt
404 http://netninja.com/projects.jpg
404 http://netninja.com/projects.gif
404 http://netninja.com/projects.png
Enter directory/file name to search for at base URL (enter to quit)

As you can see, “ninja” did not turn up anything, but “netninja” and “projects” turned up some interesting pages. Of course, on the netninja web site, these are explicitly linked from the front page, so nothing “secret” was discovered, but you should get the point.
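The way one word fans out into a batch of requests can be sketched in Python. The suffix list and its order are copied from the sample session above. Everything else (the names SUFFIXES and candidate_urls) is made up for this illustration and is not taken from wmap's source.

```python
# Suffixes in the order they appear in the sample manual-mode session.
SUFFIXES = [
    "/", ".html", ".htm", ".php", ".php3", ".phps",
    ".asp", ".pl", ".jsp", ".txt", ".jpg", ".gif", ".png",
]


def candidate_urls(base_url, word):
    """Return every URL that would be tried for `word` under `base_url`."""
    base = base_url.rstrip("/")
    return [f"{base}/{word}{suffix}" for suffix in SUFFIXES]
```

For the word “ninja” and the base URL http://netninja.com, this yields the same thirteen URLs as the first block of the session, starting with the folder and ending with the .png file.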


Wmap is started with the Java command “java -jar wmap.jar,” but wrapper scripts for Windows (wmap.bat) and Unix (wmap.sh) have been provided. The wrapper scripts just need the jar file to be in the current directory. Feel free to modify them if you want to use them system-wide on your computer. Unix users might have to “chmod +x wmap.sh” since the zip file this is distributed in does not store permission bits.

There are two main flags available:
--auto (or -a) starts automatic mode
--manual (or -m) starts manual mode

If you are using automatic mode, then you are advised to use:
--delay={seconds} (or -d{seconds}) to set a delay

All modes require a base URL to be supplied on the command line. It should be a standard “http://” formatted URL.
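For anyone who wants to script around the same interface, here is a Python argparse sketch that mirrors the flags listed above. It is an imitation for illustration, not wmap's own option handling (the C++ version uses getopt). Treating --auto and --manual as mutually exclusive, parsing the delay as a whole number of seconds, and defaulting it to 0 are all assumptions on my part.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags described above."""
    parser = argparse.ArgumentParser(prog="wmap")
    # Assumption: exactly one of the two modes must be chosen.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-a", "--auto", action="store_true",
                      help="start automatic mode")
    mode.add_argument("-m", "--manual", action="store_true",
                      help="start manual mode")
    # Assumption: whole seconds, with no delay unless one is requested.
    parser.add_argument("-d", "--delay", type=int, default=0,
                        metavar="SECONDS",
                        help="seconds to wait between requests")
    parser.add_argument("base_url", help="base URL, e.g. http://example.com")
    return parser
```

Both spellings from the text parse the same way here: “--delay=3” and “-d3” each set the delay to 3.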


At present, wmap looks for about 44 different “standard” folders. These are all things I have personally run across in the past, as either standard convention, folders auto-generated by web tools, or lazy people using simple folder names. My personal experience is nothing compared to the entire internet. Take a look at the file auto.properties (it can be found in the “src” folder or within the JAR file). If you know of something that should be there, drop me a note at the email address listed at the top of this file.

6 thoughts on “wmap”

  1. I get this when compiling

    wmap-2.0$ make
    cd src; g++ -Wall -c autolist.cc
    autolist.cc: In member function ‘int AutoList::parseLine(std::string, AutoTouple*)’:
    autolist.cc:70:75: error: ‘strlen’ was not declared in this scope
    make: *** [src/autolist.o] Error 1

  2. Add #include <cstring> to autolist.cc (for strlen)
    Add #include <cstdlib> to main.c (for atoi)

    and after that curl library link missing

    $ make
    g++ -Wall `curl-config --libs` -o wmap src/*.o
    src/tryurl.o: In function ‘TryUrl::TryUrl()’:
    tryurl.cc:(.text+0x12): undefined reference to ‘curl_global_init’
    tryurl.cc:(.text+0x17): undefined reference to ‘curl_easy_init’
    tryurl.cc:(.text+0x3f): undefined reference to ‘curl_easy_setopt’
    tryurl.cc:(.text+0x5d): undefined reference to ‘curl_easy_setopt’
    tryurl.cc:(.text+0x7b): undefined reference to ‘curl_easy_setopt’
    src/tryurl.o: In function ‘TryUrl::~TryUrl()’:
    tryurl.cc:(.text+0x8f): undefined reference to ‘curl_global_cleanup’
    tryurl.cc:(.text+0x9e): undefined reference to ‘curl_easy_cleanup’
    src/tryurl.o: In function ‘TryUrl::tryUrl(std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*)’:
    tryurl.cc:(.text+0x512): undefined reference to ‘curl_easy_setopt’
    tryurl.cc:(.text+0x540): undefined reference to ‘curl_easy_setopt’
    tryurl.cc:(.text+0x552): undefined reference to ‘curl_easy_perform’
    tryurl.cc:(.text+0x5b6): undefined reference to ‘curl_easy_getinfo’
    collect2: ld returned 1 exit status
    make: *** [all] Error 1
