Merge pull request #94 from epi052/93-fix-progress-bar-counting

fixed progress bar being incremented too little
2026-05-22 20:31:13 -03:00 · 2020-10-24 12:34:03 -05:00 · 2020-10-24 12:32:51 -05:00 · 2020-10-24 09:26:54 -05:00 · 2020-10-24 09:20:34 -05:00 · 2020-10-24 09:19:42 -05:00
16 changed files with 1435 additions and 184 deletions
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -41,6 +41,9 @@ jobs:
          use-cross: true
          command: build
          args: --release --target=${{ matrix.target }}
+      - name: Strip symbols from binary
+        run: |
+          strip -s ${{ matrix.path }}
      - name: Build tar.gz for homebrew installs
        if: matrix.type == 'ubuntu-x64'
        run: |
@@ -83,6 +86,9 @@ jobs:
          use-cross: true
          command: build
          args: --release --target=x86_64-apple-darwin
+      - name: Strip symbols from binary
+        run: |
+          strip -u -r target/x86_64-apple-darwin/release/feroxbuster
      - name: Build tar.gz for homebrew installs
        run: |
          tar czf x86_64-macos-feroxbuster.tar.gz -C target/x86_64-apple-darwin/release feroxbuster
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "feroxbuster"
-version = "1.0.5"
+version = "1.1.1"
 authors = ["Ben 'epi' Risher <epibar052@gmail.com>"]
 license = "MIT"
 edition = "2018"
@@ -25,11 +25,13 @@ clap = "2"
 lazy_static = "1.4"
 toml = "0.5"
 serde = { version = "1.0", features = ["derive"] }
+serde_json = "1.0"
 uuid = { version = "0.8", features = ["v4"] }
 indicatif = "0.15"
 console = "0.12"
 openssl = { version = "0.10", features = ["vendored"] }
 dirs = "3.0"
+regex = "1"

 [dev-dependencies]
 tempfile = "3.1"
--- a/README.md
+++ b/README.md
@@ -59,19 +59,20 @@ This attack is also known as Predictable Resource Location, File Enumeration, Di

 📖 Table of Contents
 -----------------
- [Downloads](#-downloads)
 - [Installation](#-installation)
    - [Download a Release](#download-a-release)
    - [Homebrew on MacOS and Linux](#homebrew-on-macos-and-linux)
    - [Cargo Install](#cargo-install)
    - [apt Install](#apt-install)
+    - [AUR Install](#aur-install)
    - [Docker Install](#docker-install)
- [Configuration](#-configuration)
+- [Configuration](#%EF%B8%8F-configuration)
    - [Default Values](#default-values)
    - [ferox-config.toml](#ferox-configtoml)
    - [Command Line Parsing](#command-line-parsing)
 - [Example Usage](#-example-usage)
    - [Multiple Values](#multiple-values)
+    - [Extract Links from Response Body (new in `v1.1.0`)](#extract-links-from-response-body-new-in-v110)
    - [Include Headers](#include-headers)
    - [IPv6, Non-recursive scan with INFO logging enabled](#ipv6-non-recursive-scan-with-info-level-logging-enabled)
    - [Read urls from STDIN; pipe only resulting urls out to another tool](#read-urls-from-stdin-pipe-only-resulting-urls-out-to-another-tool)
@@ -79,6 +80,8 @@ This attack is also known as Predictable Resource Location, File Enumeration, Di
    - [Proxy traffic through a SOCKS proxy](#proxy-traffic-through-a-socks-proxy)
    - [Pass auth token via query parameter](#pass-auth-token-via-query-parameter)
 - [Comparison w/ Similar Tools](#-comparison-w-similar-tools)
+- [Common Problems/Issues (FAQ)](#-common-problemsissues-faq)
+    - [No file descriptors available](#no-file-descriptors-available)

 ## 💿 Installation

@@ -162,6 +165,14 @@ unzip feroxbuster_amd64.deb.zip
 sudo apt install ./feroxbuster_amd64.deb
 ```

+### AUR Install
+
+Install `feroxbuster-git` on Arch Linux with your AUR helper of choice:
+
+```
+yay -S feroxbuster-git
+```
+
 ### Docker Install

 > The following steps assume you have docker installed / setup
@@ -292,6 +303,7 @@ A pre-made configuration file with examples of all available settings can be fou
 # addslash = true
 # stdin = true
 # dontfilter = true
+# extract_links = true
 # depth = 1
 # sizefilters = [5174]
 # queries = [["name","value"], ["rick", "astley"]]
@@ -318,16 +330,18 @@ USAGE:
    feroxbuster [FLAGS] [OPTIONS] --url <URL>...

 FLAGS:
-    -f, --addslash       Append / to each request
-    -D, --dontfilter     Don't auto-filter wildcard responses
-    -h, --help           Prints help information
-    -k, --insecure       Disables TLS certificate validation
-    -n, --norecursion    Do not scan recursively
-    -q, --quiet          Only print URLs; Don't print status codes, response size, running config, etc...
-    -r, --redirects      Follow redirects
-        --stdin          Read url(s) from STDIN
-    -V, --version        Prints version information
-    -v, --verbosity      Increase verbosity level (use -vv or more for greater effect)
+    -f, --addslash         Append / to each request
+    -D, --dontfilter       Don't auto-filter wildcard responses
+    -e, --extract-links    Extract links from response body (html, javascript, etc...); make new requests based on
+                           findings (default: false)
+    -h, --help             Prints help information
+    -k, --insecure         Disables TLS certificate validation
+    -n, --norecursion      Do not scan recursively
+    -q, --quiet            Only print URLs; Don't print status codes, response size, running config, etc...
+    -r, --redirects        Follow redirects
+        --stdin            Read url(s) from STDIN
+    -V, --version          Prints version information
+    -v, --verbosity        Increase verbosity level (use -vv or more for greater effect)

 OPTIONS:
    -d, --depth <RECURSION_DEPTH>           Maximum recursion depth, a depth of 0 is infinite recursion (default: 4)
@@ -365,6 +379,26 @@ All of the methods above (multiple flags, space separated, comma separated, etc.
 ./feroxbuster -u http://127.1 -H Accept:application/json "Authorization: Bearer {token}"
 ```

+### Extract Links from Response Body (New in `v1.1.0`) 
+
+Search through the body of valid responses (html, javascript, etc...) for additional endpoints to scan. This turns
+`feroxbuster` into a hybrid that looks for both linked and unlinked content. 
+
+Example request/response with `--extract-links` enabled:
+- Make request to `http://example.com/index.html`
+- Receive, and read in, the `body` of the response
+- Search the `body` for absolute and relative links (e.g. `homepage/assets/img/icons/handshake.svg`)
+- Add the following directories for recursive scanning:
+    - `http://example.com/homepage`
+    - `http://example.com/homepage/assets`
+    - `http://example.com/homepage/assets/img`
+    - `http://example.com/homepage/assets/img/icons`
+- Make a single request to `http://example.com/homepage/assets/img/icons/handshake.svg`
+
+```
+./feroxbuster -u http://127.1 --extract-links
+```
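The walk-through above can be sketched in a few lines of Rust. This is only an illustration of the expansion step, not feroxbuster's actual extractor: `expand_link` is a hypothetical helper, it uses plain string handling rather than a URL parser, and it assumes the link is relative to the site root.

```rust
/// Hypothetical helper (not part of feroxbuster): given a base url and a
/// relative link found in a response body, return the intermediate
/// directories to queue for recursive scanning, plus the full url that
/// gets a single request.
fn expand_link(base: &str, link: &str) -> (Vec<String>, String) {
    let base = base.trim_end_matches('/');
    let parts: Vec<&str> = link.split('/').filter(|p| !p.is_empty()).collect();

    let mut dirs = Vec::new();
    let mut current = base.to_string();

    // every path segment except the last one is treated as a directory
    for part in parts.iter().take(parts.len().saturating_sub(1)) {
        current.push('/');
        current.push_str(part);
        dirs.push(current.clone());
    }

    // the final segment is the linked resource itself
    let full = match parts.last() {
        Some(last) => format!("{}/{}", current, last),
        None => current,
    };

    (dirs, full)
}

fn main() {
    let (dirs, full) = expand_link(
        "http://example.com",
        "homepage/assets/img/icons/handshake.svg",
    );

    for dir in &dirs {
        println!("queue for recursion: {}", dir);
    }
    println!("single request: {}", full);
}
```

With the example link from the list above, the sketch queues the four `homepage...` directories and requests the `.svg` once.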
+
 ### IPv6, non-recursive scan with INFO-level logging enabled

 ```
@@ -410,29 +444,102 @@ a few of the use-cases in which feroxbuster may be a better fit:
 - You want to be able to run your content discovery as part of some crazy 12 command unix **pipeline extravaganza**
 - You want to scan through a **SOCKS** proxy
 - You want **auto-filtering** of Wildcard responses by default
+- You want an integrated **link extractor** to increase discovered endpoints
 - You want **recursion** along with some other thing mentioned above (ffuf also does recursion)
 - You want a **configuration file** option for overriding built-in default values for your scans

-|                                                     | feroxbuster | gobuster | ffuf |
-|-----------------------------------------------------|---|---|---|
-| fast                                                | ✔ | ✔ | ✔ |
-| easy to use                                         | ✔ | ✔ |   |
-| blacklist status codes (in addition to whitelist)   |   | ✔ | ✔ |
-| allows recursion                                    | ✔ |   | ✔ |
-| can specify query parameters                        | ✔ |   | ✔ |
-| SOCKS proxy support                                 | ✔ |   |   |
-| multiple target scan (via stdin or multiple -u)     | ✔ |   | ✔ |
-| configuration file for default value override       | ✔ |   | ✔ |
-| can accept urls via STDIN as part of a pipeline     | ✔ |   | ✔ |
-| can accept wordlists via STDIN                      |   | ✔ | ✔ |
-| filter by response size                             | ✔ |   | ✔ |
-| auto-filter wildcard responses                      | ✔ |   | ✔ |
-| performs other scans (vhost, dns, etc)              |   | ✔ | ✔ |
-| time delay / rate limiting                          |   | ✔ | ✔ |
-| **huge** number of other options                    |   |   | ✔ |
+|                                                                  | feroxbuster | gobuster | ffuf |
+|------------------------------------------------------------------|---|---|---|
+| fast                                                             | ✔ | ✔ | ✔ |
+| easy to use                                                      | ✔ | ✔ |   |
+| blacklist status codes (in addition to whitelist)                |   | ✔ | ✔ |
+| allows recursion                                                 | ✔ |   | ✔ |
+| can specify query parameters                                     | ✔ |   | ✔ |
+| SOCKS proxy support                                              | ✔ |   |   |
+| extracts links from response body to increase scan coverage      | ✔ |   |   |
+| multiple target scan (via stdin or multiple -u)                  | ✔ |   | ✔ |
+| configuration file for default value override                    | ✔ |   | ✔ |
+| can accept urls via STDIN as part of a pipeline                  | ✔ |   | ✔ |
+| can accept wordlists via STDIN                                   |   | ✔ | ✔ |
+| filter by response size                                          | ✔ |   | ✔ |
+| auto-filter wildcard responses                                   | ✔ |   | ✔ |
+| performs other scans (vhost, dns, etc)                           |   | ✔ | ✔ |
+| time delay / rate limiting                                       |   | ✔ | ✔ |
+| **huge** number of other options                                 |   |   | ✔ |

 Of note, there's another written-in-rust content discovery tool, [rustbuster](https://github.com/phra/rustbuster). I 
 came across rustbuster when I was naming my tool (😢). I don't have any experience using it, but it appears to 
 be able to do POST requests with an HTTP body, has SOCKS support, and has an 8.3 shortname scanner (in addition to vhost
 dns, directory, etc...).  In short, it definitely looks interesting and may be what you're looking for as it has some 
 capability I haven't seen in similar tools.  
+
+## 🤯 Common Problems/Issues (FAQ)
+
+### No file descriptors available
+
+Why do I get a bunch of `No file descriptors available (os error 24)` errors?
+
+---
+
+There are a few potential causes of this error.  The simplest is that your operating system sets an open file limit that is aggressively low.  Through personal testing, I've found that `4096` is a reasonable open file limit (this will vary based on your exact setup).
+
+There are quite a few options to solve this particular problem, of which a handful are shown below.  
+
+#### Increase the Number of Open Files
+
+We'll start by increasing the number of open files the OS allows. On my Kali install, the default was `1024`, and I know some MacOS installs use `256` 😕.
+
+##### Edit `/etc/security/limits.conf`
+
+One option to raise the limit is to edit `/etc/security/limits.conf` so that it includes the two lines below.  
+
+- `*` represents all users
+- `hard` and `soft` indicate the hard and soft limits for the OS 
+- `nofile` is the option that controls the number of open files
+
+```
+/etc/security/limits.conf
+-------------------------
+...
+*        soft nofile 4096
+*        hard nofile 8192
+...
+```
+
+##### Use `ulimit` directly
+
+A faster option, which is **not** persistent, is to use the `ulimit` command to change the setting for the current shell session.
+
+```
+ulimit -n 4096
+```
+
+#### Additional Tweaks (may not be needed)
+
+If you still find yourself hitting the file limit with the above changes, there are a few additional tweaks that may help.  
+
+> This section was shamelessly stolen from this [stackoverflow answer](https://stackoverflow.com/a/3923785).  More information is included in that post and is recommended reading if you end up needing to use this section.
+
+✨ Special thanks to HTB user [@sparkla](https://www.hackthebox.eu/home/users/profile/221599) for their help with identifying these additional tweaks ✨
+
+##### Increase the ephemeral port range, and decrease the tcp_fin_timeout.
+
+The ephemeral port range defines the maximum number of outbound sockets a host can create from a particular IP address. The `fin_timeout` defines the minimum time these sockets will stay in the `TIME_WAIT` state (unusable after being used once). Usual system defaults are:
+
+- `net.ipv4.ip_local_port_range = 32768   61000`
+- `net.ipv4.tcp_fin_timeout = 60`
+
+This basically means your system cannot consistently guarantee more than `(61000 - 32768) / 60 = 470` sockets per second.
+
+```
+sudo sysctl net.ipv4.ip_local_port_range="15000 61000"
+sudo sysctl net.ipv4.tcp_fin_timeout=30
+```
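The sockets-per-second estimate above is simple integer arithmetic, so it is easy to check what the two `sysctl` changes buy. The sketch below is a back-of-the-envelope calculation only: `sockets_per_second` is a made-up helper name, and the formula is the rough bound quoted in this section, not a measurement.

```rust
/// Rough upper bound on sustained new outbound sockets per second:
/// the size of the ephemeral port range divided by the fin timeout in
/// seconds. Assumes `high_port > low_port` and a non-zero timeout.
fn sockets_per_second(low_port: u32, high_port: u32, fin_timeout_secs: u32) -> u32 {
    (high_port - low_port) / fin_timeout_secs
}

fn main() {
    // usual defaults: 32768-61000 with a 60 second timeout
    let defaults = sockets_per_second(32768, 61000, 60);

    // values set by the two sysctl commands: 15000-61000 with a 30 second timeout
    let tuned = sockets_per_second(15000, 61000, 30);

    println!("defaults: ~{} sockets/sec", defaults); // ~470
    println!("tuned:    ~{} sockets/sec", tuned); // ~1533
}
```

Under that estimate, widening the port range and halving the timeout roughly triples the sustainable rate.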
+
+##### Allow socket reuse while in a `TIME_WAIT` status
+
+This allows sockets in the `TIME_WAIT` state to be cycled quickly and reused. Make sure to read the post [Coping with the TCP TIME-WAIT](https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux) by Vincent Bernat to understand the implications.
+
+```
+sudo sysctl net.ipv4.tcp_tw_reuse=1 
+```
--- a/ferox-config.toml.example
+++ b/ferox-config.toml.example
@@ -23,6 +23,7 @@
 # addslash = true
 # stdin = true
 # dontfilter = true
+# extract_links = true
 # depth = 1
 # sizefilters = [5174]
 # queries = [["name","value"], ["rick", "astley"]]
--- a/src/banner.rs
+++ b/src/banner.rs
@@ -1,4 +1,8 @@
-use crate::{config::Configuration, utils::status_colorizer, VERSION};
+use crate::config::{Configuration, CONFIGURATION};
+use crate::utils::{make_request, status_colorizer};
+use reqwest::{Client, Url};
+use serde_json::Value;
+use std::io::Write;

 /// macro helper to abstract away repetitive string formatting
 macro_rules! format_banner_entry_helper {
@@ -40,31 +44,119 @@ macro_rules! format_banner_entry {
    };
 }

+/// Url used to query github's api; specifically used to look for the latest tagged release name
+const UPDATE_URL: &str = "https://api.github.com/repos/epi052/feroxbuster/releases/latest";
+
+/// Simple enum to hold three different update states
+#[derive(Debug)]
+enum UpdateStatus {
+    /// this version and latest release are the same
+    UpToDate,
+
+    /// this version and latest release are not the same
+    OutOfDate,
+
+    /// some error occurred during version check
+    Unknown,
+}
+
+/// Makes a request to the given url, expecting to receive a JSON response that contains a field
+/// named `tag_name` that holds a value representing the latest tagged release of this tool.
+///
+/// ex: v1.1.0
+///
+/// Returns `UpdateStatus`
+async fn needs_update(client: &Client, url: &str, bin_version: &str) -> UpdateStatus {
+    log::trace!(
+        "enter: needs_update({:?}, {}, {})",
+        client,
+        url,
+        bin_version
+    );
+
+    let unknown = UpdateStatus::Unknown;
+
+    let api_url = match Url::parse(url) {
+        Ok(url) => url,
+        Err(e) => {
+            log::error!("{}", e);
+            log::trace!("exit: needs_update -> {:?}", unknown);
+            return unknown;
+        }
+    };
+
+    if let Ok(response) = make_request(&client, &api_url).await {
+        let body = response.text().await.unwrap_or_default();
+
+        let json_response: Value = serde_json::from_str(&body).unwrap_or_default();
+
+        if json_response.is_null() {
+            // unwrap_or_default above should result in a null value for the json_response variable
+            log::error!("Could not parse JSON from response body");
+            log::trace!("exit: needs_update -> {:?}", unknown);
+            return unknown;
+        }
+
+        let latest_version = match json_response["tag_name"].as_str() {
+            Some(tag) => tag.trim_start_matches('v'),
+            None => {
+                log::error!("Could not get version field from JSON response");
+                log::debug!("{}", json_response);
+                log::trace!("exit: needs_update -> {:?}", unknown);
+                return unknown;
+            }
+        };
+
+        // if we've gotten this far, we have a string in the form of X.X.X where X is a number
+        // all that's left is to compare the current version with the version found above
+
+        return if latest_version == bin_version {
+            // there are really only two possible outcomes if we accept that the tag conforms to
+            // the X.X.X pattern:
+            //   1. the version strings match, meaning we're up to date
+            //   2. the version strings do not match, meaning we're out of date
+            //
+            // except for developers working on this code, nobody should ever be in a situation
+            // where they have a version greater than the latest tagged release
+            log::trace!("exit: needs_update -> UpdateStatus::UpToDate");
+            UpdateStatus::UpToDate
+        } else {
+            log::trace!("exit: needs_update -> UpdateStatus::OutOfDate");
+            UpdateStatus::OutOfDate
+        };
+    }
+
+    log::trace!("exit: needs_update -> {:?}", unknown);
+    unknown
+}
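Stripped of the HTTP request and JSON parsing, the decision made by `needs_update` above comes down to a three-way match. The sketch below is a standalone approximation, not the code from this PR: `compare_versions` is a hypothetical name, and its `Option<&str>` argument stands in for the `tag_name` field, with `None` covering every failure path (bad url, request error, unparsable body, missing field).

```rust
/// Mirrors the three states used by the banner's update check.
#[derive(Debug, PartialEq)]
enum UpdateStatus {
    UpToDate,
    OutOfDate,
    Unknown,
}

/// Hypothetical stand-in for the comparison at the end of `needs_update`.
/// `tag` models the `tag_name` value from the release JSON (ex: v1.1.0).
fn compare_versions(tag: Option<&str>, bin_version: &str) -> UpdateStatus {
    match tag {
        // strip the leading 'v' so that "v1.1.0" compares equal to "1.1.0"
        Some(tag) if tag.trim_start_matches('v') == bin_version => UpdateStatus::UpToDate,
        // any other tag is treated as a newer release
        Some(_) => UpdateStatus::OutOfDate,
        // no tag at all: the check could not be completed
        None => UpdateStatus::Unknown,
    }
}

fn main() {
    println!("{:?}", compare_versions(Some("v1.1.0"), "1.1.0")); // UpToDate
    println!("{:?}", compare_versions(Some("v1.1.0"), "1.0.1")); // OutOfDate
    println!("{:?}", compare_versions(None, "1.0.1")); // Unknown
}
```

Because this is plain string equality, a development build that is ahead of the latest tag also reports `OutOfDate`, which matches the caveat in the code comments above.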
+
 /// Prints the banner to stdout.
 ///
 /// Only prints those settings which are either always present, or passed in by the user.
-pub fn initialize(targets: &[String], config: &Configuration) {
+pub async fn initialize<W>(targets: &[String], config: &Configuration, version: &str, mut writer: W)
+where
+    W: Write,
+{
    let artwork = format!(
        r#"
 ___  ___  __   __     __      __         __   ___
 |__  |__  |__) |__) | /  `    /  \ \_/ | |  \ |__
 |    |___ |  \ |  \ | \__,    \__/ / \ | |__/ |___
 by Ben "epi" Risher {}                  ver: {}"#,
-        '\u{1F913}', VERSION
+        '\u{1F913}', version
    );

+    let status = needs_update(&CONFIGURATION.client, UPDATE_URL, version).await;
+
    let top = "───────────────────────────┬──────────────────────";
    let bottom = "───────────────────────────┴──────────────────────";

-    eprintln!("{}", artwork);
-    eprintln!("{}", top);
+    writeln!(&mut writer, "{}", artwork).unwrap_or_default();
+    writeln!(&mut writer, "{}", top).unwrap_or_default();

    // begin with always printed items
    for target in targets {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1F3af}", "Target Url", target)
-        ); // 🎯
+        )
+        .unwrap_or_default(); // 🎯
    }

    let mut codes = vec![];
@@ -73,206 +165,419 @@ by Ben "epi" Risher {}                  ver: {}"#,
        codes.push(status_colorizer(&code.to_string()))
    }

-    eprintln!(
+    writeln!(
+        &mut writer,
        "{}",
        format_banner_entry!("\u{1F680}", "Threads", config.threads)
-    ); // 🚀
-    eprintln!(
+    )
+    .unwrap_or_default(); // 🚀
+
+    writeln!(
+        &mut writer,
        "{}",
        format_banner_entry!("\u{1f4d6}", "Wordlist", config.wordlist)
-    ); // 📖
-    eprintln!(
+    )
+    .unwrap_or_default(); // 📖
+
+    writeln!(
+        &mut writer,
        "{}",
        format_banner_entry!(
            "\u{1F197}",
            "Status Codes",
            format!("[{}]", codes.join(", "))
        )
-    ); // 🆗
-    eprintln!(
+    )
+    .unwrap_or_default(); // 🆗
+
+    writeln!(
+        &mut writer,
        "{}",
        format_banner_entry!("\u{1f4a5}", "Timeout (secs)", config.timeout)
-    ); // 💥
-    eprintln!(
+    )
+    .unwrap_or_default(); // 💥
+
+    writeln!(
+        &mut writer,
        "{}",
        format_banner_entry!("\u{1F9a1}", "User-Agent", config.useragent)
-    ); // 🦡
+    )
+    .unwrap_or_default(); // 🦡

    // followed by the maybe printed or variably displayed values
    if !config.config.is_empty() {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f489}", "Config File", config.config)
-        ); // 💉
+        )
+        .unwrap_or_default(); // 💉
    }

    if !config.proxy.is_empty() {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f48e}", "Proxy", config.proxy)
-        ); // 💎
+        )
+        .unwrap_or_default(); // 💎
    }

    if !config.headers.is_empty() {
        for (name, value) in &config.headers {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f92f}", "Header", name, value)
-            ); // 🤯
+            )
+            .unwrap_or_default(); // 🤯
        }
    }

    if !config.sizefilters.is_empty() {
        for filter in &config.sizefilters {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f4a2}", "Size Filter", filter)
-            ); // 💢
+            )
+            .unwrap_or_default(); // 💢
        }
    }

+    if config.extract_links {
+        writeln!(
+            &mut writer,
+            "{}",
+            format_banner_entry!("\u{1F50E}", "Extract Links", config.extract_links)
+        )
+        .unwrap_or_default(); // 🔎
+    }
+
    if !config.queries.is_empty() {
        for query in &config.queries {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!(
                    "\u{1f914}",
                    "Query Parameter",
                    format!("{}={}", query.0, query.1)
                )
-            ); // 🤔
+            )
+            .unwrap_or_default(); // 🤔
        }
    }

    if !config.output.is_empty() {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f4be}", "Output File", config.output)
-        ); // 💾
+        )
+        .unwrap_or_default(); // 💾
    }

    if !config.extensions.is_empty() {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!(
                "\u{1f4b2}",
                "Extensions",
                format!("[{}]", config.extensions.join(", "))
            )
-        ); // 💲
+        )
+        .unwrap_or_default(); // 💲
    }

    if config.insecure {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f513}", "Insecure", config.insecure)
-        ); // 🔓
+        )
+        .unwrap_or_default(); // 🔓
    }

    if config.redirects {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f4cd}", "Follow Redirects", config.redirects)
-        ); // 📍
+        )
+        .unwrap_or_default(); // 📍
    }

    if config.dontfilter {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f92a}", "Filter Wildcards", !config.dontfilter)
-        ); // 🤪
+        )
+        .unwrap_or_default(); // 🤪
    }

    match config.verbosity {
        //speaker medium volume (increasing with verbosity to loudspeaker)
        1 => {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f508}", "Verbosity", config.verbosity)
-            ); // 🔈
+            )
+            .unwrap_or_default(); // 🔈
        }
        2 => {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f509}", "Verbosity", config.verbosity)
-            ); // 🔉
+            )
+            .unwrap_or_default(); // 🔉
        }
        3 => {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f50a}", "Verbosity", config.verbosity)
-            ); // 🔊
+            )
+            .unwrap_or_default(); // 🔊
        }
        4 => {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f4e2}", "Verbosity", config.verbosity)
-            ); // 📢
+            )
+            .unwrap_or_default(); // 📢
        }
        _ => {}
    }

    if config.addslash {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1fa93}", "Add Slash", config.addslash)
-        ); // 🪓
+        )
+        .unwrap_or_default(); // 🪓
    }

    if !config.norecursion {
        if config.depth == 0 {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f503}", "Recursion Depth", "INFINITE")
-            ); // 🔃
+            )
+            .unwrap_or_default(); // 🔃
        } else {
-            eprintln!(
+            writeln!(
+                &mut writer,
                "{}",
                format_banner_entry!("\u{1f503}", "Recursion Depth", config.depth)
-            ); // 🔃
+            )
+            .unwrap_or_default(); // 🔃
        }
    } else {
-        eprintln!(
+        writeln!(
+            &mut writer,
            "{}",
            format_banner_entry!("\u{1f6ab}", "Do Not Recurse", config.norecursion)
-        ); // 🚫
+        )
+        .unwrap_or_default(); // 🚫
    }

-    eprintln!("{}", bottom);
+    if matches!(status, UpdateStatus::OutOfDate) {
+        writeln!(
+            &mut writer,
+            "{}",
+            format_banner_entry!(
+                "\u{1f389}",
+                "New Version Available",
+                "https://github.com/epi052/feroxbuster/releases/latest"
+            )
+        )
+        .unwrap_or_default(); // 🎉
+    }
+
+    writeln!(&mut writer, "{}", bottom).unwrap_or_default();
 }

 #[cfg(test)]
 mod tests {
    use super::*;
+    use crate::VERSION;
+    use httpmock::Method::GET;
+    use httpmock::{Mock, MockServer};
+    use std::fs::read_to_string;
+    use std::io::stderr;
+    use std::time::Duration;
+    use tempfile::NamedTempFile;

-    #[test]
+    #[tokio::test(core_threads = 1)]
    /// test to hit no execution of targets for loop in banner
-    fn banner_without_targets() {
+    async fn banner_initialize_without_targets() {
        let config = Configuration::default();
-        initialize(&[], &config);
+        initialize(&[], &config, VERSION, stderr()).await;
    }

-    #[test]
+    #[tokio::test(core_threads = 1)]
    /// test to hit no execution of statuscode for loop in banner
-    fn banner_without_status_codes() {
+    async fn banner_initialize_without_status_codes() {
        let mut config = Configuration::default();
        config.statuscodes = vec![];
-        initialize(&[String::from("http://localhost")], &config);
+        initialize(
+            &[String::from("http://localhost")],
+            &config,
+            VERSION,
+            stderr(),
+        )
+        .await;
    }

-    #[test]
+    #[tokio::test(core_threads = 1)]
    /// test to hit an empty config file
-    fn banner_without_config_file() {
+    async fn banner_initialize_without_config_file() {
        let mut config = Configuration::default();
        config.config = String::new();
-        initialize(&[String::from("http://localhost")], &config);
+        initialize(
+            &[String::from("http://localhost")],
+            &config,
+            VERSION,
+            stderr(),
+        )
+        .await;
    }

-    #[test]
+    #[tokio::test(core_threads = 1)]
    /// test to hit an empty config file
-    fn banner_without_queries() {
+    async fn banner_initialize_without_queries() {
        let mut config = Configuration::default();
        config.queries = vec![(String::new(), String::new())];
-        initialize(&[String::from("http://localhost")], &config);
+        initialize(
+            &[String::from("http://localhost")],
+            &config,
+            VERSION,
+            stderr(),
+        )
+        .await;
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test to show that a new version is available for download
+    async fn banner_initialize_with_mismatched_version() {
+        let config = Configuration::default();
+        let file = NamedTempFile::new().unwrap();
+        initialize(
+            &[String::from("http://localhost")],
+            &config,
+            "mismatched-version",
+            &file,
+        )
+        .await;
+        let contents = read_to_string(file.path()).unwrap();
+        println!("contents: {}", contents);
+        assert!(contents.contains("New Version Available"));
+        assert!(contents.contains("https://github.com/epi052/feroxbuster/releases/latest"));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test that needs_update returns Unknown when given a url that cannot be parsed
+    async fn banner_needs_update_returns_unknown_with_bad_url() {
+        let result = needs_update(&CONFIGURATION.client, &"", VERSION).await;
+        assert!(matches!(result, UpdateStatus::Unknown));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test return value of good url to needs_update
+    async fn banner_needs_update_returns_up_to_date() {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/latest")
+            .return_status(200)
+            .return_body("{\"tag_name\":\"v1.1.0\"}")
+            .create_on(&srv);
+
+        let result = needs_update(&CONFIGURATION.client, &srv.url("/latest"), "1.1.0").await;
+
+        assert_eq!(mock.times_called(), 1);
+        assert!(matches!(result, UpdateStatus::UpToDate));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test return value of good url to needs_update that returns a newer version than current
+    async fn banner_needs_update_returns_out_of_date() {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/latest")
+            .return_status(200)
+            .return_body("{\"tag_name\":\"v1.1.0\"}")
+            .create_on(&srv);
+
+        let result = needs_update(&CONFIGURATION.client, &srv.url("/latest"), "1.0.1").await;
+
+        assert_eq!(mock.times_called(), 1);
+        assert!(matches!(result, UpdateStatus::OutOfDate));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test return value of good url that times out
+    async fn banner_needs_update_returns_unknown_on_timeout() {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/latest")
+            .return_status(200)
+            .return_body("{\"tag_name\":\"v1.1.0\"}")
+            .return_with_delay(Duration::from_secs(8))
+            .create_on(&srv);
+
+        let result = needs_update(&CONFIGURATION.client, &srv.url("/latest"), "1.0.1").await;
+
+        assert_eq!(mock.times_called(), 1);
+        assert!(matches!(result, UpdateStatus::Unknown));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test return value of good url with bad json response
+    async fn banner_needs_update_returns_unknown_on_bad_json_response() {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/latest")
+            .return_status(200)
+            .return_body("not json")
+            .create_on(&srv);
+
+        let result = needs_update(&CONFIGURATION.client, &srv.url("/latest"), "1.0.1").await;
+
+        assert_eq!(mock.times_called(), 1);
+        assert!(matches!(result, UpdateStatus::Unknown));
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// test return value of good url with json response that lacks the tag_name field
+    async fn banner_needs_update_returns_unknown_on_json_without_correct_tag() {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/latest")
+            .return_status(200)
+            .return_body("{\"no tag_name\": \"doesn't exist\"}")
+            .create_on(&srv);
+
+        let result = needs_update(&CONFIGURATION.client, &srv.url("/latest"), "1.0.1").await;
+
+        assert_eq!(mock.times_called(), 1);
+        assert!(matches!(result, UpdateStatus::Unknown));
    }
 }
--- a/src/config.rs
+++ b/src/config.rs
@@ -107,6 +107,10 @@ pub struct Configuration {
    #[serde(default)]
    pub norecursion: bool,

+    /// Extract links from html/javascript
+    #[serde(default)]
+    pub extract_links: bool,
+
    /// Append / to each request
    #[serde(default)]
    pub addslash: bool,
@@ -182,8 +186,9 @@ impl Default for Configuration {
            verbosity: 0,
            addslash: false,
            insecure: false,
-            norecursion: false,
            redirects: false,
+            norecursion: false,
+            extract_links: false,
            proxy: String::new(),
            config: String::new(),
            output: String::new(),
@@ -206,6 +211,7 @@ impl Configuration {
    ///
    /// - **timeout**: `5` seconds
    /// - **redirects**: `false`
+    /// - **extract-links**: `false`
    /// - **wordlist**: [`DEFAULT_WORDLIST`](constant.DEFAULT_WORDLIST.html)
    /// - **config**: `None`
    /// - **threads**: `50`
@@ -390,6 +396,10 @@ impl Configuration {
            config.addslash = args.is_present("addslash");
        }

+        if args.is_present("extract_links") {
+            config.extract_links = args.is_present("extract_links");
+        }
+
        if args.is_present("stdin") {
            config.stdin = args.is_present("stdin");
        } else {
@@ -514,6 +524,7 @@ impl Configuration {
        settings.useragent = settings_to_merge.useragent;
        settings.redirects = settings_to_merge.redirects;
        settings.insecure = settings_to_merge.insecure;
+        settings.extract_links = settings_to_merge.extract_links;
        settings.extensions = settings_to_merge.extensions;
        settings.headers = settings_to_merge.headers;
        settings.queries = settings_to_merge.queries;
@@ -574,6 +585,7 @@ mod tests {
            addslash = true
            stdin = true
            dontfilter = true
+            extract_links = true
            depth = 1
            sizefilters = [4120]
        "#;
@@ -602,6 +614,7 @@ mod tests {
        assert_eq!(config.stdin, false);
        assert_eq!(config.addslash, false);
        assert_eq!(config.redirects, false);
+        assert_eq!(config.extract_links, false);
        assert_eq!(config.insecure, false);
        assert_eq!(config.queries, Vec::new());
        assert_eq!(config.extensions, Vec::<String>::new());
@@ -714,6 +727,13 @@ mod tests {
        assert_eq!(config.addslash, true);
    }

+    #[test]
+    /// parse the test config and see that the value parsed is correct
+    fn config_reads_extract_links() {
+        let config = setup_config_test();
+        assert_eq!(config.extract_links, true);
+    }
+
    #[test]
    /// parse the test config and see that the value parsed is correct
    fn config_reads_extensions() {
--- a/src/extractor.rs
+++ b/src/extractor.rs
@@ -0,0 +1,269 @@
+use crate::FeroxResponse;
+use lazy_static::lazy_static;
+use regex::Regex;
+use reqwest::Url;
+use std::collections::HashSet;
+
+/// Regular expression used in [LinkFinder](https://github.com/GerbenJavado/LinkFinder)
+///
+/// Incorporates change from this [Pull Request](https://github.com/GerbenJavado/LinkFinder/pull/66/files)
+const LINKFINDER_REGEX: &str = r#"(?:"|')(((?:[a-zA-Z]{1,10}://|//)[^"'/]{1,}\.[a-zA-Z]{2,}[^"']{0,})|((?:/|\.\./|\./)[^"'><,;| *()(%%$^/\\\[\]][^"'><,;|()]{1,})|([a-zA-Z0-9_\-/]{1,}/[a-zA-Z0-9_\-/]{1,}\.(?:[a-zA-Z]{1,4}|action)(?:[\?|#][^"|']{0,}|))|([a-zA-Z0-9_\-/]{1,}/[a-zA-Z0-9_\-/]{3,}(?:[\?|#][^"|']{0,}|))|([a-zA-Z0-9_\-.]{1,}\.(?:php|asp|aspx|jsp|json|action|html|js|txt|xml)(?:[\?|#][^"|']{0,}|)))(?:"|')"#;
+
+lazy_static! {
+    /// `LINKFINDER_REGEX` as a regex::Regex type
+    static ref REGEX: Regex = Regex::new(LINKFINDER_REGEX).unwrap();
+}
+
+/// Iterate over a given path, return a list of every sub-path found
+///
+/// example: `path` contains a link fragment `homepage/assets/img/icons/handshake.svg`
+/// the following fragments would be returned:
+///   - homepage/assets/img/icons/handshake.svg
+///   - homepage/assets/img/icons
+///   - homepage/assets/img
+///   - homepage/assets
+///   - homepage
+fn get_sub_paths_from_path(path: &str) -> Vec<String> {
+    log::trace!("enter: get_sub_paths_from_path({})", path);
+    let mut paths = vec![];
+
+    // filter out any empty strings caused by .split
+    let mut parts: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
+
+    let length = parts.len();
+
+    for _ in 0..length {
+        // iterate over all parts of the path
+        if parts.is_empty() {
+            // pop left us with an empty vector, we're done
+            break;
+        }
+
+        let possible_path = parts.join("/");
+
+        if possible_path.is_empty() {
+            // .join can result in an empty string, which we don't need, ignore
+            continue;
+        }
+
+        paths.push(possible_path); // good sub-path found
+        parts.pop(); // use .pop() to remove the last part of the path and continue iteration
+    }
+
+    log::trace!("exit: get_sub_paths_from_path -> {:?}", paths);
+    paths
+}
+
+/// simple helper to stay DRY, tries to join a url + fragment and add it to the `links` HashSet
+fn add_link_to_set_of_links(link: &str, url: &Url, links: &mut HashSet<String>) {
+    log::trace!(
+        "enter: add_link_to_set_of_links({}, {}, {:?})",
+        link,
+        url.to_string(),
+        links
+    );
+    match url.join(&link) {
+        Ok(new_url) => {
+            links.insert(new_url.to_string());
+        }
+        Err(e) => {
+            log::error!("Could not join given url to the base url: {}", e);
+        }
+    }
+    log::trace!("exit: add_link_to_set_of_links");
+}
+
+/// Given a `reqwest::Response`, perform the following actions
+///   - parse the response's text for links using the linkfinder regex
+///   - for every link found take its url path and parse each sub-path
+///     - example: Response contains a link fragment `homepage/assets/img/icons/handshake.svg`
+///       with a base url of http://localhost, the following urls would be returned:
+///         - http://localhost/homepage/assets/img/icons/handshake.svg
+///         - http://localhost/homepage/assets/img/icons
+///         - http://localhost/homepage/assets/img
+///         - http://localhost/homepage/assets
+///         - http://localhost/homepage
+pub async fn get_links(response: &FeroxResponse) -> HashSet<String> {
+    log::trace!("enter: get_links({})", response.url().as_str());
+
+    let mut links = HashSet::<String>::new();
+
+    let body = response.text();
+
+    for capture in REGEX.captures_iter(&body) {
+        // remove single & double quotes from both ends of the capture
+        // capture[0] is the entire match, additional capture groups start at [1]
+        let link = capture[0].trim_matches(|c| c == '\'' || c == '"');
+
+        match Url::parse(link) {
+            Ok(absolute) => {
+                if absolute.domain() != response.url().domain()
+                    || absolute.host() != response.url().host()
+                {
+                    // domains/ips are not the same, don't scan things that aren't part of the original
+                    // target url
+                    continue;
+                }
+
+                for sub_path in get_sub_paths_from_path(absolute.path()) {
+                    // take a url fragment like homepage/assets/img/icons/handshake.svg and
+                    // incrementally add
+                    //     - homepage/assets/img/icons
+                    //     - homepage/assets/img
+                    //     - homepage/assets
+                    //     - homepage
+                    log::debug!("Adding {} to {:?}", sub_path, links);
+                    add_link_to_set_of_links(&sub_path, &response.url(), &mut links);
+                }
+            }
+            Err(e) => {
+                // this is the expected error that happens when we try to parse a url fragment
+                //     ex: Url::parse("/login") -> Err("relative URL without a base")
+                // while this is technically an error, these are good results for us
+                if e.to_string().contains("relative URL without a base") {
+                    for sub_path in get_sub_paths_from_path(link) {
+                        // incrementally save all sub-paths that led to the relative url's resource
+                        log::debug!("Adding {} to {:?}", sub_path, links);
+                        add_link_to_set_of_links(&sub_path, &response.url(), &mut links);
+                    }
+                } else {
+                    // unexpected error has occurred
+                    log::error!("Could not parse given url: {}", e);
+                }
+            }
+        }
+    }
+
+    log::trace!("exit: get_links -> {:?}", links);
+    links
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::utils::make_request;
+    use httpmock::Method::GET;
+    use httpmock::{Mock, MockServer};
+    use reqwest::Client;
+
+    #[test]
+    /// extract sub paths from the given url fragment; expect 5 sub paths and that all are
+    /// in the expected array
+    fn extractor_get_sub_paths_from_path_with_multiple_paths() {
+        let path = "homepage/assets/img/icons/handshake.svg";
+        let paths = get_sub_paths_from_path(&path);
+        let expected = vec![
+            "homepage",
+            "homepage/assets",
+            "homepage/assets/img",
+            "homepage/assets/img/icons",
+            "homepage/assets/img/icons/handshake.svg",
+        ];
+
+        assert_eq!(paths.len(), expected.len());
+        for expected_path in expected {
+            assert_eq!(paths.contains(&expected_path.to_string()), true);
+        }
+    }
+
+    #[test]
+    /// extract sub paths from the given url fragment; expect 2 sub paths and that all are
+    /// in the expected array. the fragment is wrapped in slashes to ensure no empty strings are
+    /// returned
+    fn extractor_get_sub_paths_from_path_with_enclosing_slashes() {
+        let path = "/homepage/assets/";
+        let paths = get_sub_paths_from_path(&path);
+        let expected = vec!["homepage", "homepage/assets"];
+
+        assert_eq!(paths.len(), expected.len());
+        for expected_path in expected {
+            assert_eq!(paths.contains(&expected_path.to_string()), true);
+        }
+    }
+
+    #[test]
+    /// extract sub paths from the given url fragment; expect 1 sub path, no forward slashes are
+    /// included
+    fn extractor_get_sub_paths_from_path_with_only_a_word() {
+        let path = "homepage";
+        let paths = get_sub_paths_from_path(&path);
+        let expected = vec!["homepage"];
+
+        assert_eq!(paths.len(), expected.len());
+        for expected_path in expected {
+            assert_eq!(paths.contains(&expected_path.to_string()), true);
+        }
+    }
+
+    #[test]
+    /// extract sub paths from the given url fragment; expect 1 sub path, forward slash removed
+    fn extractor_get_sub_paths_from_path_with_an_absolute_word() {
+        let path = "/homepage";
+        let paths = get_sub_paths_from_path(&path);
+        let expected = vec!["homepage"];
+
+        assert_eq!(paths.len(), expected.len());
+        for expected_path in expected {
+            assert_eq!(paths.contains(&expected_path.to_string()), true);
+        }
+    }
+
+    #[test]
+    /// test that a full url and fragment are joined correctly, then added to the given list
+    /// i.e. the happy path
+    fn extractor_add_link_to_set_of_links_happy_path() {
+        let url = Url::parse("https://localhost").unwrap();
+        let mut links = HashSet::<String>::new();
+        let link = "admin";
+
+        assert_eq!(links.len(), 0);
+        add_link_to_set_of_links(link, &url, &mut links);
+
+        assert_eq!(links.len(), 1);
+        assert!(links.contains("https://localhost/admin"));
+    }
+
+    #[test]
+    /// test that an invalid path fragment doesn't add anything to the set of links
+    fn extractor_add_link_to_set_of_links_with_non_base_url() {
+        let url = Url::parse("https://localhost").unwrap();
+        let mut links = HashSet::<String>::new();
+        let link = "\\\\\\\\";
+
+        assert_eq!(links.len(), 0);
+        add_link_to_set_of_links(link, &url, &mut links);
+
+        assert_eq!(links.len(), 0);
+        assert!(links.is_empty());
+    }
+
+    #[tokio::test(core_threads = 1)]
+    /// use make_request to generate a Response, and use the Response to test get_links;
+    /// the response will contain an absolute path to a domain that is not part of the scanned
+    /// domain; expect an empty set returned
+    async fn extractor_get_links_with_absolute_url_that_differs_from_target_domain(
+    ) -> Result<(), Box<dyn std::error::Error>> {
+        let srv = MockServer::start();
+
+        let mock = Mock::new()
+            .expect_method(GET)
+            .expect_path("/some-path")
+            .return_status(200)
+            .return_body("\"http://defintely.not.a.thing.probably.com/homepage/assets/img/icons/handshake.svg\"")
+            .create_on(&srv);
+
+        let client = Client::new();
+        let url = Url::parse(&srv.url("/some-path")).unwrap();
+
+        let response = make_request(&client, &url).await.unwrap();
+
+        let ferox_response = FeroxResponse::from(response, true).await;
+
+        let links = get_links(&ferox_response).await;
+
+        assert!(links.is_empty());
+
+        assert_eq!(mock.times_called(), 1);
+        Ok(())
+    }
+}
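The sub-path expansion added in `src/extractor.rs` can be tried on its own. The following is a minimal standard-library sketch, not code from this PR: `sub_paths` is a hypothetical name, and it builds the same longest-first list from slices of the segment vector instead of the loop-and-`pop` approach used in `get_sub_paths_from_path`.

```rust
/// Hypothetical helper (not part of the diff): returns every sub-path of `path`,
/// longest first, by joining progressively shorter prefixes of its segments.
fn sub_paths(path: &str) -> Vec<String> {
    // drop empty segments produced by leading, trailing, or doubled slashes
    let parts: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    // parts[..n], parts[..n-1], ..., parts[..1]
    (1..=parts.len())
        .rev()
        .map(|end| parts[..end].join("/"))
        .collect()
}

fn main() {
    for p in sub_paths("/homepage/assets/img/") {
        println!("{}", p);
    }
}
```

Note that none of the returned fragments carries a leading or trailing slash, which matches what the unit tests in the diff expect.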
--- a/src/heuristics.rs
+++ b/src/heuristics.rs
@@ -1,4 +1,5 @@
 use crate::config::{CONFIGURATION, PROGRESS_PRINTER};
+use crate::scanner::should_filter_response;
 use crate::utils::{
    ferox_print, format_url, get_url_path_length, make_request, module_colorizer, status_colorizer,
 };
@@ -20,7 +21,7 @@ const UUID_LENGTH: u64 = 32;
 ///
 /// `size` is size of the response that should be included with filters passed via runtime
 /// configuration and any static wildcard lengths.
-#[derive(Default, Debug)]
+#[derive(Default, Debug, PartialEq, Copy, Clone)]
 pub struct WildcardFilter {
    /// size of the response that will later be combined with the length of the path of the url
    /// requested
@@ -99,11 +100,15 @@ pub async fn wildcard_test(
                // reflected in the response along with some static content; aka custom 404
                let url_len = get_url_path_length(&resp_one.url());

-                if !CONFIGURATION.quiet {
+                wildcard.dynamic = wc_length - url_len;
+
+                if !CONFIGURATION.quiet
+                    && !should_filter_response(&wildcard.dynamic, &resp_one.url())
+                {
                    let msg = format!(
                            "{} {:>10} Wildcard response is dynamic; {} ({} + url length) responses; toggle this behavior by using {}\n",
                            status_colorizer("WLD"),
-                            wc_length - url_len,
+                            wildcard.dynamic,
                            style("auto-filtering").yellow(),
                            style(wc_length - url_len).cyan(),
                            style("--dontfilter").yellow()
@@ -117,10 +122,11 @@ pub async fn wildcard_test(
                        !CONFIGURATION.output.is_empty(),
                    );
                }
-
-                wildcard.dynamic = wc_length - url_len;
            } else if wc_length == wc2_length {
-                if !CONFIGURATION.quiet {
+                wildcard.size = wc_length;
+
+                if !CONFIGURATION.quiet && !should_filter_response(&wildcard.size, &resp_one.url())
+                {
                    let msg = format!(
                        "{} {:>10} Wildcard response is static; {} {} responses; toggle this behavior by using {}\n",
                        status_colorizer("WLD"),
@@ -138,7 +144,6 @@ pub async fn wildcard_test(
                        !CONFIGURATION.output.is_empty(),
                    );
                }
-                wildcard.size = wc_length;
            }
        } else {
            bar.inc(2);
@@ -199,7 +204,7 @@ async fn make_wildcard_request(
                let url_len = get_url_path_length(&response.url());
                let content_len = response.content_length().unwrap_or(0);

-                if !CONFIGURATION.quiet {
+                if !CONFIGURATION.quiet && !should_filter_response(&content_len, &response.url()) {
                    let msg = format!(
                        "{} {:>10} Got {} for {} (url length: {})\n",
                        wildcard,
@@ -221,31 +226,16 @@ async fn make_wildcard_request(
                if response.status().is_redirection() {
                    // show where it goes, if possible
                    if let Some(next_loc) = response.headers().get("Location") {
-                        if let Ok(next_loc_str) = next_loc.to_str() {
-                            if !CONFIGURATION.quiet {
-                                let msg = format!(
-                                    "{} {:>10} {} redirects to => {}\n",
-                                    wildcard,
-                                    content_len,
-                                    response.url(),
-                                    next_loc_str
-                                );
-
-                                ferox_print(&msg, &PROGRESS_PRINTER);
-
-                                try_send_message_to_file(
-                                    &msg,
-                                    tx_file.clone(),
-                                    !CONFIGURATION.output.is_empty(),
-                                );
-                            }
-                        } else if !CONFIGURATION.quiet {
+                        let next_loc_str = next_loc.to_str().unwrap_or("Unknown");
+                        if !CONFIGURATION.quiet
+                            && !should_filter_response(&content_len, &response.url())
+                        {
                            let msg = format!(
-                                "{} {:>10} {} redirects to => {:?}\n",
+                                "{} {:>10} {} redirects to => {}\n",
                                wildcard,
                                content_len,
                                response.url(),
-                                next_loc
+                                next_loc_str
                            );

                            ferox_print(&msg, &PROGRESS_PRINTER);
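The wildcard bookkeeping touched in `src/heuristics.rs` can be illustrated in isolation. This is a minimal sketch under stated assumptions: the url path length is passed in directly (the diff obtains it from `get_url_path_length`), the names `WildcardSketch`, `dynamic_offset`, and `matches_wildcard` are hypothetical, and the `checked_sub` guard is an addition for the sketch rather than something the diff does. The comparison itself mirrors the static and dynamic checks in `should_filter_response`.

```rust
/// Hypothetical stand-in for the two fields of `WildcardFilter` used here.
#[derive(Default, Debug, PartialEq, Clone, Copy)]
struct WildcardSketch {
    size: u64,    // static wildcard response length, 0 when unset
    dynamic: u64, // response length minus url path length, 0 when unset
}

/// Offset stored for a dynamic wildcard: response length minus url path length.
/// `checked_sub` is an assumption added for this sketch to avoid u64 underflow.
fn dynamic_offset(wc_length: u64, url_len: u64) -> Option<u64> {
    wc_length.checked_sub(url_len)
}

/// True when a response of `content_len` bytes looks like a known wildcard response.
fn matches_wildcard(filter: &WildcardSketch, url_len: u64, content_len: u64) -> bool {
    let is_static = filter.size > 0 && filter.size == content_len;
    let is_dynamic = filter.dynamic > 0 && url_len + filter.dynamic == content_len;
    is_static || is_dynamic
}

fn main() {
    // a custom 404 page that reflects a 32-character path inside 4120 bytes of output
    let offset = dynamic_offset(4120, 32).unwrap_or(0);
    let filter = WildcardSketch { size: 0, dynamic: offset };

    // a later request with a 10-character path and a 4098-byte body matches the offset
    println!("{}", matches_wildcard(&filter, 10, 4098));
}
```

Storing the offset before the `quiet` check, as the hunks above do, means the filter value is available to the message-suppression test as well as to later scans.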
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,6 +1,7 @@
 pub mod banner;
 pub mod client;
 pub mod config;
+pub mod extractor;
 pub mod heuristics;
 pub mod logger;
 pub mod parser;
@@ -9,7 +10,8 @@ pub mod reporter;
 pub mod scanner;
 pub mod utils;

-use reqwest::StatusCode;
+use reqwest::header::HeaderMap;
+use reqwest::{Response, StatusCode, Url};
 use tokio::sync::mpsc::{UnboundedReceiver, UnboundedSender};

 /// Generic Result type to ease error handling in async contexts
@@ -58,6 +60,118 @@ pub const DEFAULT_STATUS_CODES: [StatusCode; 9] = [
 /// Expected location is in the same directory as the feroxbuster binary.
 pub const DEFAULT_CONFIG_NAME: &str = "ferox-config.toml";

+/// A `FeroxResponse`, derived from a `Response` to a submitted `Request`
+#[derive(Debug)]
+pub struct FeroxResponse {
+    /// The final `Url` of this `FeroxResponse`
+    url: Url,
+
+    /// The `StatusCode` of this `FeroxResponse`
+    status: StatusCode,
+
+    /// The full response text
+    text: String,
+
+    /// The content-length of this response, or 0 if unknown
+    content_length: u64,
+
+    /// The `Headers` of this `FeroxResponse`
+    headers: HeaderMap,
+}
+
+/// `FeroxResponse` implementation
+impl FeroxResponse {
+    /// Get the `StatusCode` of this `FeroxResponse`
+    pub fn status(&self) -> &StatusCode {
+        &self.status
+    }
+
+    /// Get the final `Url` of this `FeroxResponse`.
+    pub fn url(&self) -> &Url {
+        &self.url
+    }
+
+    /// Get the full response text
+    pub fn text(&self) -> &str {
+        &self.text
+    }
+
+    /// Get the `Headers` of this `FeroxResponse`
+    pub fn headers(&self) -> &HeaderMap {
+        &self.headers
+    }
+
+    /// Get the content-length of this response, or 0 if unknown
+    pub fn content_length(&self) -> u64 {
+        self.content_length
+    }
+
+    /// Set `FeroxResponse`'s `url` attribute, has no effect if an error occurs
+    pub fn set_url(&mut self, url: &str) {
+        match Url::parse(&url) {
+            Ok(url) => {
+                self.url = url;
+            }
+            Err(e) => {
+                log::error!("Could not parse {} into a Url: {}", url, e);
+            }
+        };
+    }
+
+    /// Make a reasonable guess at whether the response is a file or not
+    ///
+    /// Examines the last part of a path to determine if it has an obvious extension
+    /// i.e. http://localhost/some/path/stuff.js where stuff.js indicates a file
+    ///
+    /// Additionally, inspects query parameters, as they're also often indicative of a file
+    pub fn is_file(&self) -> bool {
+        let has_extension = match self.url.path_segments() {
+            Some(path) => {
+                if let Some(last) = path.last() {
+                    last.contains('.') // last segment has some sort of extension, probably
+                } else {
+                    false
+                }
+            }
+            None => false,
+        };
+
+        self.url.query_pairs().count() > 0 || has_extension
+    }
+
+    /// Create a new `FeroxResponse` from the given `Response`
+    pub async fn from(response: Response, read_body: bool) -> Self {
+        let url = response.url().clone();
+        let status = response.status();
+        let headers = response.headers().clone();
+        let content_length = response.content_length().unwrap_or(0);
+
+        let text = if read_body {
+            // .text() consumes the response, must be called last
+            // additionally, --extract-links is currently the only place we use the body of the
+            // response, so we forego the processing if not performing extraction
+            match response.text().await {
+                // await the response's body
+                Ok(text) => text,
+                Err(e) => {
+                    log::error!("Could not parse body from response: {}", e);
+                    String::new()
+                }
+            }
+        } else {
+            String::new()
+        };
+
+        FeroxResponse {
+            url,
+            status,
+            content_length,
+            text,
+            headers,
+        }
+    }
+}
+
 #[cfg(test)]
 mod tests {
    use super::*;
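The `is_file` heuristic added to `FeroxResponse` can be approximated without the `reqwest::Url` type. The sketch below is an illustration only: `looks_like_file` is a hypothetical function that takes the path and optional query string as plain `&str` values, whereas the diff works on `Url::path_segments` and `Url::query_pairs`.

```rust
/// Hypothetical approximation of `FeroxResponse::is_file` on plain strings.
/// A url is treated as a file when its last path segment contains a dot
/// or when it carries a non-empty query string.
fn looks_like_file(path: &str, query: Option<&str>) -> bool {
    // the text after the final '/' is the last segment; it is empty for "/dir/"
    let has_extension = path
        .rsplit('/')
        .next()
        .map_or(false, |last| last.contains('.'));

    let has_query = query.map_or(false, |q| !q.is_empty());

    has_extension || has_query
}

fn main() {
    println!("{}", looks_like_file("/some/path/stuff.js", None));
    println!("{}", looks_like_file("/some/path/", None));
    println!("{}", looks_like_file("/search", Some("q=ferox")));
}
```

One consequence of this kind of check is that a directory whose name contains a dot, such as `/releases/v1.2`, is also classified as a file; that is inherent to a "reasonable guess" based on the last segment.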
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,12 +1,11 @@
 use feroxbuster::config::{CONFIGURATION, PROGRESS_PRINTER};
 use feroxbuster::scanner::scan_url;
 use feroxbuster::utils::{ferox_print, get_current_depth, module_colorizer, status_colorizer};
-use feroxbuster::{banner, heuristics, logger, reporter, FeroxResult};
+use feroxbuster::{banner, heuristics, logger, reporter, FeroxResponse, FeroxResult, VERSION};
 use futures::StreamExt;
-use reqwest::Response;
 use std::collections::HashSet;
 use std::fs::File;
-use std::io::{BufRead, BufReader};
+use std::io::{stderr, BufRead, BufReader};
 use std::process;
 use std::sync::Arc;
 use tokio::io;
@@ -58,7 +57,7 @@ fn get_unique_words_from_wordlist(path: &str) -> FeroxResult<Arc<HashSet<String>
 /// Determine whether it's a single url scan or urls are coming from stdin, then scan as needed
 async fn scan(
    targets: Vec<String>,
-    tx_term: UnboundedSender<Response>,
+    tx_term: UnboundedSender<FeroxResponse>,
    tx_file: UnboundedSender<String>,
 ) -> FeroxResult<()> {
    log::trace!("enter: scan({:?}, {:?}, {:?})", targets, tx_term, tx_file);
@@ -159,7 +158,8 @@ async fn main() {

    if !CONFIGURATION.quiet {
        // only print banner if -q isn't used
-        banner::initialize(&targets, &CONFIGURATION);
+        let std_stderr = stderr(); // std::io::stderr
+        banner::initialize(&targets, &CONFIGURATION, &VERSION, std_stderr).await;
    }

    // discard non-responsive targets
--- a/src/parser.rs
+++ b/src/parser.rs
@@ -195,6 +195,13 @@ pub fn initialize() -> App<'static, 'static> {
                    "Filter out messages of a particular size (ex: -S 5120 -S 4927,1970)",
                ),
        )
+        .arg(
+            Arg::with_name("extract_links")
+                .short("e")
+                .long("extract-links")
+                .takes_value(false)
+                .help("Extract links from response body (html, javascript, etc...); make new requests based on findings (default: false)")
+        )

        .after_help(r#"NOTE:
    Options that take multiple values are very flexible.  Consider the following ways of specifying
@@ -225,6 +232,9 @@ EXAMPLES:
    Pass auth token via query parameter
        ./feroxbuster -u http://127.1 --query token=0123456789ABCDEF

+    Find links in javascript/html and make additional requests based on results
+        ./feroxbuster -u http://127.1 --extract-links
+
    Ludicrous speed... go!
        ./feroxbuster -u http://127.1 -t 200
    "#)
--- a/src/reporter.rs
+++ b/src/reporter.rs
@@ -1,8 +1,7 @@
 use crate::config::{CONFIGURATION, PROGRESS_PRINTER};
 use crate::utils::{ferox_print, status_colorizer};
-use crate::FeroxChannel;
+use crate::{FeroxChannel, FeroxResponse};
 use console::strip_ansi_codes;
-use reqwest::Response;
 use std::io::Write;
 use std::sync::{Arc, Once, RwLock};
 use std::{fs, io};
@@ -41,14 +40,14 @@ pub fn initialize(
    output_file: &str,
    save_output: bool,
 ) -> (
-    UnboundedSender<Response>,
+    UnboundedSender<FeroxResponse>,
    UnboundedSender<String>,
    JoinHandle<()>,
    Option<JoinHandle<()>>,
 ) {
    log::trace!("enter: initialize({}, {})", output_file, save_output);

-    let (tx_rpt, rx_rpt): FeroxChannel<Response> = mpsc::unbounded_channel();
+    let (tx_rpt, rx_rpt): FeroxChannel<FeroxResponse> = mpsc::unbounded_channel();
    let (tx_file, rx_file): FeroxChannel<String> = mpsc::unbounded_channel();

    let file_clone = tx_file.clone();
@@ -81,7 +80,7 @@ pub fn initialize(
 /// The consumer simply receives responses and prints them if they meet the given
 /// reporting criteria
 async fn spawn_terminal_reporter(
-    mut resp_chan: UnboundedReceiver<Response>,
+    mut resp_chan: UnboundedReceiver<FeroxResponse>,
    file_chan: UnboundedSender<String>,
    save_output: bool,
 ) {
@@ -107,7 +106,7 @@ async fn spawn_terminal_reporter(
                    // 200       3280 https://localhost.com/FAQ
                    "{} {:>10} {}\n",
                    status,
-                    resp.content_length().unwrap_or(0),
+                    resp.content_length(),
                    resp.url()
                )
            };
--- a/src/scanner.rs
+++ b/src/scanner.rs
@@ -1,11 +1,12 @@
 use crate::config::{CONFIGURATION, PROGRESS_BAR};
+use crate::extractor::get_links;
 use crate::heuristics::WildcardFilter;
 use crate::utils::{format_url, get_current_depth, get_url_path_length, make_request};
-use crate::{heuristics, progress, FeroxChannel};
+use crate::{heuristics, progress, FeroxChannel, FeroxResponse};
 use futures::future::{BoxFuture, FutureExt};
 use futures::{stream, StreamExt};
 use lazy_static::lazy_static;
-use reqwest::{Response, Url};
+use reqwest::Url;
 use std::collections::HashSet;
 use std::convert::TryInto;
 use std::ops::Deref;
@@ -20,6 +21,9 @@ static CALL_COUNT: AtomicUsize = AtomicUsize::new(0);
 lazy_static! {
    /// Set of urls that have been sent to [scan_url](fn.scan_url.html), used for deduplication
    static ref SCANNED_URLS: RwLock<HashSet<String>> = RwLock::new(HashSet::new());
+
+    /// Vector of WildcardFilters that have been ID'd through heuristics
+    static ref WILDCARD_FILTERS: Arc<RwLock<Vec<Arc<WildcardFilter>>>> = Arc::new(RwLock::new(Vec::<Arc<WildcardFilter>>::new()));
 }

 /// Adds the given url to `SCANNED_URLS`
@@ -59,6 +63,42 @@ fn add_url_to_list_of_scanned_urls(resp: &str, scanned_urls: &RwLock<HashSet<Str
    }
 }

+/// Adds the given WildcardFilter to `WILDCARD_FILTERS`
+///
+/// If `WILDCARD_FILTERS` did not already contain the filter, return true; otherwise return false
+fn add_filter_to_list_of_wildcard_filters(
+    filter: Arc<WildcardFilter>,
+    wildcard_filters: Arc<RwLock<Vec<Arc<WildcardFilter>>>>,
+) -> bool {
+    log::trace!(
+        "enter: add_filter_to_list_of_wildcard_filters({:?}, {:?})",
+        filter,
+        wildcard_filters
+    );
+
+    match wildcard_filters.write() {
+        Ok(mut filters) => {
+            // If the vector already contains the filter, return false without adding it.
+            // Otherwise push the filter onto the vector and return true.
+            if filters.contains(&filter) {
+                log::trace!("exit: add_filter_to_list_of_wildcard_filters -> false");
+                return false;
+            }
+
+            filters.push(filter);
+
+            log::trace!("exit: add_filter_to_list_of_wildcard_filters -> true");
+            true
+        }
+        Err(e) => {
+            // poisoned lock
+            log::error!("Set of wildcard filters poisoned: {}", e);
+            log::trace!("exit: add_filter_to_list_of_wildcard_filters -> false");
+            false
+        }
+    }
+}
+
 /// Spawn a single consumer task (sc side of mpsc)
 ///
 /// The consumer simply receives Urls and scans them
@@ -66,7 +106,7 @@ fn spawn_recursion_handler(
    mut recursion_channel: UnboundedReceiver<String>,
    wordlist: Arc<HashSet<String>>,
    base_depth: usize,
-    tx_term: UnboundedSender<Response>,
+    tx_term: UnboundedSender<FeroxResponse>,
    tx_file: UnboundedSender<String>,
 ) -> BoxFuture<'static, Vec<JoinHandle<()>>> {
    log::trace!(
@@ -160,7 +200,7 @@ fn create_urls(target_url: &str, word: &str, extensions: &[String]) -> Vec<Url>
 ///
 /// handles 2xx and 3xx responses by either checking if the url ends with a / (2xx)
 /// or if the Location header is present and matches the base url + / (3xx)
-fn response_is_directory(response: &Response) -> bool {
+fn response_is_directory(response: &FeroxResponse) -> bool {
    log::trace!("enter: is_directory({:?})", response);

    if response.status().is_redirection() {
@@ -240,7 +280,7 @@ fn reached_max_depth(url: &Url, base_depth: usize, max_depth: usize) -> bool {
 ///
 /// When a recursion opportunity is found, the new url is sent across the recursion channel
 async fn try_recursion(
-    response: &Response,
+    response: &FeroxResponse,
    base_depth: usize,
    transmitter: UnboundedSender<String>,
 ) {
@@ -290,6 +330,54 @@ async fn try_recursion(
    log::trace!("exit: try_recursion");
 }

+/// Simple helper to stay DRY; determines whether a response with the given content length and
+/// url should be filtered out, i.e. not reported to the user. Returns true when filtered.
+pub fn should_filter_response(content_len: &u64, url: &Url) -> bool {
+    if CONFIGURATION.sizefilters.contains(content_len) {
+        // filtered value from --sizefilters, move on to the next url
+        log::debug!("size filter: filtered out {}", url);
+        return true;
+    }
+
+    match WILDCARD_FILTERS.read() {
+        Ok(filters) => {
+            for filter in filters.iter() {
+                if CONFIGURATION.dontfilter {
+                    // quick return if dontfilter is set
+                    return false;
+                }
+
+                if filter.size > 0 && filter.size == *content_len {
+                    // static wildcard size found during testing
+                    // size isn't default, size equals response length, and auto-filter is on
+                    log::debug!("static wildcard: filtered out {}", url);
+                    return true;
+                }
+
+                if filter.dynamic > 0 {
+                    // dynamic wildcard offset found during testing
+
+                    // I'm about to manually split this url path instead of using reqwest::Url's
+                    // builtin parsing. The reason is that they call .split() on the url path
+                    // except that I don't want an empty string taking up the last index in the
+                    // event that the url ends with a forward slash.  It's ugly enough to be split
+                    // into its own function for readability.
+                    let url_len = get_url_path_length(&url);
+
+                    if url_len + filter.dynamic == *content_len {
+                        log::debug!("dynamic wildcard: filtered out {}", url);
+                        return true;
+                    }
+                }
+            }
+        }
+        Err(e) => {
+            log::error!("{}", e);
+        }
+    }
+    false
+}
+
 /// Wrapper for [make_request](fn.make_request.html)
 ///
 /// Handles making multiple requests based on the presence of extensions
@@ -299,9 +387,8 @@ async fn make_requests(
    target_url: &str,
    word: &str,
    base_depth: usize,
-    filter: Arc<WildcardFilter>,
    dir_chan: UnboundedSender<String>,
-    report_chan: UnboundedSender<Response>,
+    report_chan: UnboundedSender<FeroxResponse>,
 ) {
    log::trace!(
        "enter: make_requests({}, {}, {}, {:?}, {:?})",
@@ -316,61 +403,117 @@ async fn make_requests(

    for url in urls {
        if let Ok(response) = make_request(&CONFIGURATION.client, &url).await {
-            // response came back without error
+            // response came back without error, convert it to FeroxResponse
+            let ferox_response = FeroxResponse::from(response, CONFIGURATION.extract_links).await;

            // do recursion if appropriate
-            if !CONFIGURATION.norecursion && response_is_directory(&response) {
-                try_recursion(&response, base_depth, dir_chan.clone()).await;
+            if !CONFIGURATION.norecursion {
+                try_recursion(&ferox_response, base_depth, dir_chan.clone()).await;
            }

            // purposefully doing recursion before filtering. the thought process is that
            // even though this particular url is filtered, subsequent urls may not

-            let content_len = &response.content_length().unwrap_or(0);
+            let content_len = &ferox_response.content_length();

-            if CONFIGURATION.sizefilters.contains(content_len) {
-                // filtered value from --sizefilters, move on to the next url
-                log::debug!("size filter: filtered out {}", response.url());
+            if should_filter_response(content_len, &ferox_response.url()) {
                continue;
            }

-            if filter.size > 0 && filter.size == *content_len && !CONFIGURATION.dontfilter {
-                // static wildcard size found during testing
-                // size isn't default, size equals response length, and auto-filter is on
-                log::debug!("static wildcard: filtered out {}", response.url());
-                continue;
-            }
+            if CONFIGURATION.extract_links && !ferox_response.status().is_redirection() {
+                let new_links = get_links(&ferox_response).await;

-            if filter.dynamic > 0 && !CONFIGURATION.dontfilter {
-                // dynamic wildcard offset found during testing
+                for new_link in new_links {
+                    let unknown = add_url_to_list_of_scanned_urls(&new_link, &SCANNED_URLS);

-                // I'm about to manually split this url path instead of using reqwest::Url's
-                // builtin parsing. The reason is that they call .split() on the url path
-                // except that I don't want an empty string taking up the last index in the
-                // event that the url ends with a forward slash.  It's ugly enough to be split
-                // into its own function for readability.
-                let url_len = get_url_path_length(&response.url());
+                    if !unknown {
+                        // not unknown, i.e. we've seen the url before and don't need to scan again
+                        continue;
+                    }

-                if url_len + filter.dynamic == *content_len {
-                    log::debug!("dynamic wildcard: filtered out {}", response.url());
-                    continue;
+                    // create a url based on the given command line options, continue on error
+                    let new_url = match format_url(
+                        &new_link,
+                        &"",
+                        CONFIGURATION.addslash,
+                        &CONFIGURATION.queries,
+                        None,
+                    ) {
+                        Ok(url) => url,
+                        Err(_) => continue,
+                    };
+
+                    // make the request and store the response
+                    let new_response = match make_request(&CONFIGURATION.client, &new_url).await {
+                        Ok(resp) => resp,
+                        Err(_) => continue,
+                    };
+
+                    let mut new_ferox_response =
+                        FeroxResponse::from(new_response, CONFIGURATION.extract_links).await;
+
+                    // filter if necessary
+                    let new_content_len = &new_ferox_response.content_length();
+                    if should_filter_response(new_content_len, &new_ferox_response.url()) {
+                        continue;
+                    }
+
+                    if new_ferox_response.is_file() {
+                        // very likely a file, simply request and report
+                        log::debug!(
+                            "Singular extraction: {} ({})",
+                            new_ferox_response.url(),
+                            new_ferox_response.status().as_str(),
+                        );
+
+                        send_report(report_chan.clone(), new_ferox_response);
+
+                        continue;
+                    }
+
+                    if !CONFIGURATION.norecursion {
+                        log::debug!(
+                            "Recursive extraction: {} ({})",
+                            new_ferox_response.url(),
+                            new_ferox_response.status().as_str()
+                        );
+
+                        if new_ferox_response.status().is_success()
+                            && !new_ferox_response.url().as_str().ends_with('/')
+                        {
+                            // since all of these are 2xx, recursion is only attempted if the
+                            // url ends in a /. I am actually ok with adding the slash and not
+                            // adding it, as both have merit.  Leaving it in for now to see how
+                            // things turn out (current as of: v1.1.0)
+                            new_ferox_response.set_url(&format!("{}/", new_ferox_response.url()));
+                        }
+
+                        try_recursion(&new_ferox_response, base_depth, dir_chan.clone()).await;
+                    }
                }
            }

            // everything else should be reported
-            match report_chan.send(response) {
-                Ok(_) => {
-                    log::debug!("sent {}/{} over reporting channel", &target_url, &word);
-                }
-                Err(e) => {
-                    log::error!("wtf: {}", e);
-                }
-            }
+            send_report(report_chan.clone(), ferox_response);
        }
    }
    log::trace!("exit: make_requests");
 }

+/// Simple helper to send a `FeroxResponse` over the tx side of an `mpsc::unbounded_channel`
+fn send_report(report_sender: UnboundedSender<FeroxResponse>, response: FeroxResponse) {
+    log::trace!("enter: send_report({:?}, {:?})", report_sender, response);
+
+    match report_sender.send(response) {
+        Ok(_) => {}
+        Err(e) => {
+            log::error!("{}", e);
+        }
+    }
+
+    log::trace!("exit: send_report");
+}
+
 /// Scan a given url using a given wordlist
 ///
 /// This is the primary entrypoint for the scanner
@@ -378,7 +521,7 @@ pub async fn scan_url(
    target_url: &str,
    wordlist: Arc<HashSet<String>>,
    base_depth: usize,
-    tx_term: UnboundedSender<Response>,
+    tx_term: UnboundedSender<FeroxResponse>,
    tx_file: UnboundedSender<String>,
 ) {
    log::trace!(
@@ -439,25 +582,24 @@ pub async fn scan_url(
            None => Arc::new(WildcardFilter::default()),
        };

+    add_filter_to_list_of_wildcard_filters(filter.clone(), WILDCARD_FILTERS.clone());
+
    // producer tasks (mp of mpsc); responsible for making requests
    let producers = stream::iter(looping_words.deref().to_owned())
        .map(|word| {
-            let wc_filter = filter.clone();
            let txd = tx_dir.clone();
            let txr = tx_term.clone();
            let pb = progress_bar.clone(); // progress bar is an Arc around internal state
            let tgt = target_url.to_string(); // done to satisfy 'static lifetime below
            (
-                tokio::spawn(async move {
-                    make_requests(&tgt, &word, base_depth, wc_filter, txd, txr).await
-                }),
+                tokio::spawn(async move { make_requests(&tgt, &word, base_depth, txd, txr).await }),
                pb,
            )
        })
        .for_each_concurrent(CONFIGURATION.threads, |(resp, bar)| async move {
            match resp.await {
                Ok(_) => {
-                    bar.inc(1);
+                    bar.inc((CONFIGURATION.extensions.len() + 1) as u64);
                }
                Err(e) => {
                    log::error!("error awaiting a response: {}", e);
@@ -616,4 +758,30 @@ mod tests {

        assert_eq!(add_url_to_list_of_scanned_urls(url, &urls), false);
    }
+
+    #[test]
+    /// add a wildcard filter with the `size` attribute set to WILDCARD_FILTERS and ensure that
+    /// should_filter_response correctly returns true
+    fn should_filter_response_filters_wildcard_size() {
+        let mut filter = WildcardFilter::default();
+        let url = Url::parse("http://localhost").unwrap();
+        filter.size = 18;
+        let filter = Arc::new(filter);
+        add_filter_to_list_of_wildcard_filters(filter, WILDCARD_FILTERS.clone());
+        let result = should_filter_response(&18, &url);
+        assert!(result);
+    }
+
+    #[test]
+    /// add a wildcard filter with the `dynamic` attribute set to WILDCARD_FILTERS and ensure that
+    /// should_filter_response correctly returns true
+    fn should_filter_response_filters_wildcard_dynamic() {
+        let mut filter = WildcardFilter::default();
+        let url = Url::parse("http://localhost/some-path").unwrap();
+        filter.dynamic = 9;
+        let filter = Arc::new(filter);
+        add_filter_to_list_of_wildcard_filters(filter, WILDCARD_FILTERS.clone());
+        let result = should_filter_response(&18, &url);
+        assert!(result);
+    }
 }
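The filtering order in `should_filter_response` above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the crate's implementation: `WildcardFilter` here is a hypothetical stand-in with only the two fields the diff touches, the global `CONFIGURATION` and `WILDCARD_FILTERS` are replaced by plain parameters, and the url path length is passed in rather than computed, since `get_url_path_length` is not shown in this diff.

```rust
/// Hypothetical stand-in for the crate's `WildcardFilter`; only the two
/// fields used by the diff are modelled here.
#[derive(Debug, Default, Clone)]
pub struct WildcardFilter {
    /// static wildcard response size; 0 means "not set"
    pub size: u64,
    /// dynamic wildcard offset; 0 means "not set"
    pub dynamic: u64,
}

/// Returns true when a response should be dropped instead of reported.
///
/// `url_path_len` is assumed to be whatever the scanner's url-length helper
/// returns for the response's url; it is a parameter so the sketch stays
/// self-contained.
pub fn should_filter(
    content_len: u64,
    url_path_len: u64,
    size_filters: &[u64],
    wildcard_filters: &[WildcardFilter],
    dont_filter: bool,
) -> bool {
    // user-supplied size filters always apply
    if size_filters.contains(&content_len) {
        return true;
    }

    // dont_filter only disables the wildcard checks; the diff tests it inside
    // the loop, which has the same effect as this early return
    if dont_filter {
        return false;
    }

    wildcard_filters.iter().any(|filter| {
        let static_hit = filter.size > 0 && filter.size == content_len;
        let dynamic_hit =
            filter.dynamic > 0 && url_path_len + filter.dynamic == content_len;
        static_hit || dynamic_hit
    })
}
```

With a filter whose `dynamic` is 9 and a url path length of 9, a content length of 18 is filtered, which mirrors the `should_filter_response_filters_wildcard_dynamic` test above.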
--- a/src/utils.rs
+++ b/src/utils.rs
@@ -160,7 +160,11 @@ pub fn format_url(
    //
    // the transforms that occur here will need to keep this in mind, i.e. add a slash to preserve
    // the current directory sent as part of the url
-    let url = if !url.ends_with('/') {
+    let url = if word.is_empty() {
+        // v1.0.6: added during --extract-links feature implementation to support creating urls
+        // that were extracted from response bodies, i.e. http://localhost/some/path/js/main.js
+        url.to_string()
+    } else if !url.ends_with('/') {
        format!("{}/", url)
    } else {
        url.to_string()
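The new branch in `format_url` can be illustrated on its own. The helper below is hypothetical (`normalize_base` is not a name from the crate) and mirrors only the three-way branch shown in this hunk; the rest of `format_url` (joining the word, `addslash`, queries) is left out.

```rust
/// Hypothetical helper that mirrors only the branch shown in the hunk above.
///
/// An empty `word` means the url was extracted from a response body and is
/// already complete, so it is returned untouched. Otherwise a trailing slash
/// is added when missing, so a later join keeps the current directory.
fn normalize_base(url: &str, word: &str) -> String {
    if word.is_empty() {
        // extracted link, e.g. http://localhost/some/path/js/main.js
        url.to_string()
    } else if !url.ends_with('/') {
        format!("{}/", url)
    } else {
        url.to_string()
    }
}
```

Without the empty-word branch, an extracted file url such as `http://localhost/some/path/js/main.js` would gain a trailing slash and be requested as a directory.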
--- a/tests/test_banner.rs
+++ b/tests/test_banner.rs
@@ -536,3 +536,30 @@ fn banner_doesnt_print() -> Result<(), Box<dyn std::error::Error>> {
        ));
    Ok(())
 }
+
+#[test]
+/// test allows non-existent wordlist to trigger the banner printing to stderr
+/// expect to see all mandatory prints + extract-links
+fn banner_prints_extract_links() -> Result<(), Box<dyn std::error::Error>> {
+    Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg("http://localhost")
+        .arg("-e")
+        .assert()
+        .failure()
+        .stderr(
+            predicate::str::contains("─┬─")
+                .and(predicate::str::contains("Target Url"))
+                .and(predicate::str::contains("http://localhost"))
+                .and(predicate::str::contains("Threads"))
+                .and(predicate::str::contains("Wordlist"))
+                .and(predicate::str::contains("Status Codes"))
+                .and(predicate::str::contains("Timeout (secs)"))
+                .and(predicate::str::contains("User-Agent"))
+                .and(predicate::str::contains("Extract Links"))
+                .and(predicate::str::contains("true"))
+                .and(predicate::str::contains("─┴─")),
+        );
+    Ok(())
+}
--- a/tests/test_extractor.rs
+++ b/tests/test_extractor.rs
@@ -0,0 +1,229 @@
+mod utils;
+use assert_cmd::prelude::*;
+use httpmock::Method::GET;
+use httpmock::{Mock, MockServer};
+use predicates::prelude::*;
+use std::process::Command;
+use utils::{setup_tmp_directory, teardown_tmp_directory};
+
+#[test]
+/// send a request to a page that contains a relative link, --extract-links should find the link
+/// and make a request to the new link
+fn extractor_finds_absolute_url() -> Result<(), Box<dyn std::error::Error>> {
+    let srv = MockServer::start();
+    let (tmp_dir, file) = setup_tmp_directory(&["LICENSE".to_string()], "wordlist")?;
+
+    let mock = Mock::new()
+        .expect_method(GET)
+        .expect_path("/LICENSE")
+        .return_status(200)
+        .return_body(&srv.url("'/homepage/assets/img/icons/handshake.svg'"))
+        .create_on(&srv);
+
+    let mock_two = Mock::new()
+        .expect_method(GET)
+        .expect_path("/homepage/assets/img/icons/handshake.svg")
+        .return_status(200)
+        .create_on(&srv);
+
+    let cmd = Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg(srv.url("/"))
+        .arg("--wordlist")
+        .arg(file.as_os_str())
+        .arg("--extract-links")
+        .unwrap();
+
+    cmd.assert().success().stdout(
+        predicate::str::contains("/LICENSE")
+            .and(predicate::str::contains("200"))
+            .and(predicate::str::contains(
+                "/homepage/assets/img/icons/handshake.svg",
+            )),
+    );
+
+    assert_eq!(mock.times_called(), 1);
+    assert_eq!(mock_two.times_called(), 1);
+    teardown_tmp_directory(tmp_dir);
+    Ok(())
+}
+
+#[test]
+/// send a request to a page that contains an absolute link to another domain, scanner should not
+/// follow
+fn extractor_finds_absolute_url_to_different_domain() -> Result<(), Box<dyn std::error::Error>> {
+    let srv = MockServer::start();
+    let (tmp_dir, file) = setup_tmp_directory(&["LICENSE".to_string()], "wordlist")?;
+
+    let mock = Mock::new()
+        .expect_method(GET)
+        .expect_path("/LICENSE")
+        .return_status(200)
+        .return_body("\"http://localhost/homepage/assets/img/icons/handshake.svg\"")
+        .create_on(&srv);
+
+    let cmd = Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg(srv.url("/"))
+        .arg("--wordlist")
+        .arg(file.as_os_str())
+        .arg("--extract-links")
+        .unwrap();
+
+    cmd.assert().success().stdout(
+        predicate::str::contains("/LICENSE")
+            .and(predicate::str::contains("200"))
+            .and(predicate::str::contains(
+                "/homepage/assets/img/icons/handshake.svg",
+            ))
+            .not(),
+    );
+
+    assert_eq!(mock.times_called(), 1);
+    teardown_tmp_directory(tmp_dir);
+    Ok(())
+}
+
+#[test]
+/// send a request to a page that contains a relative link, should follow
+fn extractor_finds_relative_url() -> Result<(), Box<dyn std::error::Error>> {
+    let srv = MockServer::start();
+    let (tmp_dir, file) = setup_tmp_directory(&["LICENSE".to_string()], "wordlist")?;
+
+    let mock = Mock::new()
+        .expect_method(GET)
+        .expect_path("/LICENSE")
+        .return_status(200)
+        .return_body("\"/homepage/assets/img/icons/handshake.svg\"")
+        .create_on(&srv);
+
+    let mock_two = Mock::new()
+        .expect_method(GET)
+        .expect_path("/homepage/assets/img/icons/handshake.svg")
+        .return_status(200)
+        .create_on(&srv);
+
+    let cmd = Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg(srv.url("/"))
+        .arg("--wordlist")
+        .arg(file.as_os_str())
+        .arg("--extract-links")
+        .unwrap();
+
+    cmd.assert().success().stdout(
+        predicate::str::contains("/LICENSE")
+            .and(predicate::str::contains("200"))
+            .and(predicate::str::contains(
+                "/homepage/assets/img/icons/handshake.svg",
+            )),
+    );
+
+    assert_eq!(mock.times_called(), 1);
+    assert_eq!(mock_two.times_called(), 1);
+    teardown_tmp_directory(tmp_dir);
+    Ok(())
+}
+
+#[test]
+/// send a request to a page that contains a relative link, follow it, and find the same link again
+/// should follow then filter
+fn extractor_finds_same_relative_url_twice() -> Result<(), Box<dyn std::error::Error>> {
+    let srv = MockServer::start();
+    let (tmp_dir, file) =
+        setup_tmp_directory(&["LICENSE".to_string(), "README".to_string()], "wordlist")?;
+
+    let mock = Mock::new()
+        .expect_method(GET)
+        .expect_path("/LICENSE")
+        .return_status(200)
+        .return_body(&srv.url("\"/homepage/assets/img/icons/handshake.svg\""))
+        .create_on(&srv);
+
+    let mock_two = Mock::new()
+        .expect_method(GET)
+        .expect_path("/README")
+        .return_body(&srv.url("\"/homepage/assets/img/icons/handshake.svg\""))
+        .return_status(200)
+        .create_on(&srv);
+
+    let mock_three = Mock::new()
+        .expect_method(GET)
+        .expect_path("/homepage/assets/img/icons/handshake.svg")
+        .return_status(200)
+        .create_on(&srv);
+
+    let cmd = Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg(srv.url("/"))
+        .arg("--wordlist")
+        .arg(file.as_os_str())
+        .arg("--extract-links")
+        .unwrap();
+
+    cmd.assert().success().stdout(
+        predicate::str::contains("/LICENSE")
+            .and(predicate::str::contains("200"))
+            .and(predicate::str::contains(
+                "/homepage/assets/img/icons/handshake.svg",
+            )),
+    );
+
+    assert_eq!(mock.times_called(), 1);
+    assert_eq!(mock_two.times_called(), 1);
+    assert_eq!(mock_three.times_called(), 1);
+    teardown_tmp_directory(tmp_dir);
+    Ok(())
+}
+
+#[test]
+/// send a request to a page that contains an absolute link that leads to a page with a sizefilter
+/// that should filter it out, expect not to see the second response reported
+fn extractor_finds_filtered_content() -> Result<(), Box<dyn std::error::Error>> {
+    let srv = MockServer::start();
+    let (tmp_dir, file) =
+        setup_tmp_directory(&["LICENSE".to_string(), "README".to_string()], "wordlist")?;
+
+    let mock = Mock::new()
+        .expect_method(GET)
+        .expect_path("/LICENSE")
+        .return_status(200)
+        .return_body(&srv.url("\"/homepage/assets/img/icons/handshake.svg\""))
+        .create_on(&srv);
+
+    let mock_two = Mock::new()
+        .expect_method(GET)
+        .expect_path("/homepage/assets/img/icons/handshake.svg")
+        .return_body("im a little teapot")
+        .return_status(200)
+        .create_on(&srv);
+
+    let cmd = Command::cargo_bin("feroxbuster")
+        .unwrap()
+        .arg("--url")
+        .arg(srv.url("/"))
+        .arg("--wordlist")
+        .arg(file.as_os_str())
+        .arg("--extract-links")
+        .arg("--sizefilter")
+        .arg("18")
+        .unwrap();
+
+    cmd.assert().success().stdout(
+        predicate::str::contains("/LICENSE")
+            .and(predicate::str::contains("200"))
+            .and(predicate::str::contains(
+                "/homepage/assets/img/icons/handshake.svg",
+            ))
+            .not(),
+    );
+
+    assert_eq!(mock.times_called(), 1);
+    assert_eq!(mock_two.times_called(), 1);
+    teardown_tmp_directory(tmp_dir);
+    Ok(())
+}
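The progress bar change in `scan_url` (`bar.inc(1)` replaced by `bar.inc((CONFIGURATION.extensions.len() + 1) as u64)`) comes down to simple arithmetic, sketched below. Both function names are hypothetical, and the assumption that the bar's total is the wordlist length multiplied by the per-word request count is mine; the diff only shows the per-word increment.

```rust
/// Requests issued for one wordlist entry: the bare word plus one request
/// per configured extension. Hypothetical helper, named for illustration.
fn requests_per_word(extensions: &[String]) -> u64 {
    (extensions.len() + 1) as u64
}

/// Assumed bar total: every word in the wordlist yields `requests_per_word`
/// requests. Hypothetical helper; the real total is set elsewhere in the crate.
fn expected_total(wordlist_len: usize, extensions: &[String]) -> u64 {
    wordlist_len as u64 * requests_per_word(extensions)
}
```

Under that assumption, a 100-word list scanned with two extensions has a total of 300, so incrementing by 1 per completed word would leave the bar at a third of its length when the scan finishes.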
Author	SHA1	Message	Date
epi	962e22010f	Merge pull request #94 from epi052/93-fix-progress-bar-counting fixed progress bar being incremented too little	2020-10-24 12:34:03 -05:00
epi	fcc27f6770	fixed progress bar being incremented too little	2020-10-24 12:32:51 -05:00
epi	404b231c67	added FAQ section to README	2020-10-24 09:26:54 -05:00
epi	43e5ad14c9	added FAQ section to README	2020-10-24 09:20:34 -05:00
epi	52d05e613c	Update README.md	2020-10-24 09:19:42 -05:00
epi	b84ee91c2e	added FAQ section to README	2020-10-24 09:14:46 -05:00
epi	81456c7074	Merge pull request #91 from epi052/84-add-strip-to-cd-pipeline add strip to linux and macos binaries	2020-10-23 17:30:00 -05:00
epi	5d564c5f28	CD pipeline back to master only trigger	2020-10-23 17:28:56 -05:00
epi	21eb70bdfa	added strip to linux and macos binaries; test 2	2020-10-23 17:10:21 -05:00
epi	48b58664c7	added strip to linux and macos binaries; test 1	2020-10-23 17:07:27 -05:00
epi	c85cf21d4f	Merge pull request #90 from epi052/78-check-for-updates-on-startup feroxbuster now checks for updates on startup	2020-10-23 07:04:36 -05:00
epi	27f649d164	simplified .text() call to retrieve body	2020-10-23 06:45:51 -05:00
epi	4f53bc7b49	removed lint & added debug statement for api rate-limiting	2020-10-23 06:35:35 -05:00
epi	9fa963bb8c	updates checked for and reported on startup	2020-10-23 06:27:38 -05:00
epi	0d6ae79c46	initial PR commit	2020-10-22 06:18:40 -05:00
epi	952f44e798	Merge pull request #74 from epi052/FEATURE-add-link-extraction New feature: added link extraction	2020-10-22 06:12:11 -05:00
epi	6534040992	Merge branch 'FEATURE-add-link-extraction' of github.com:epi052/feroxbuster into FEATURE-add-link-extraction	2020-10-22 05:56:12 -05:00
epi	5db47bf85d	updated readme and exmaple config	2020-10-22 05:55:54 -05:00
epi	ba279079b6	Merge pull request #87 from epi052/FEATURE-add-link-extraction--integrate-get-links-into-scanner-v2 Integrate extractor::get_links into scanner v2	2020-10-21 20:19:28 -05:00
epi	61648394cc	simplified heuristics redirection printing	2020-10-21 06:39:32 -05:00
epi	6a0e27f67c	increased code coverage for scanner	2020-10-21 06:22:44 -05:00
epi	7e518b2921	increased code coverage for scanner	2020-10-21 06:22:25 -05:00
epi	62d4e794da	wildcard filters now shared across recursive scans	2020-10-21 05:39:10 -05:00
epi	280177e7e4	added a test for get_links	2020-10-20 06:38:14 -05:00
epi	090a556212	added integration tests for extractor	2020-10-19 20:46:41 -05:00
epi	e8c76e89ee	added integration tests for extractor	2020-10-19 20:46:24 -05:00
epi	74aa5e8047	even more cleanup; extraction looking mostly complete	2020-10-19 19:47:03 -05:00
epi	6fa542ecc5	lots of post-implementation cleanup done	2020-10-18 21:02:09 -05:00
epi	0ec4f90a09	Merge pull request #86 from spikecodes/patch-1 Update AUR Package Name	2020-10-18 15:21:05 -05:00
Spike	6c5337f6af	Update AUR Package Name	2020-10-18 11:39:15 -07:00
epi	bb57a148ff	added FeroxResponse, old Response channels replaced with FeroxResponse	2020-10-18 12:19:49 -05:00
epi	98619c1c3b	Merge branch 'master' into FEATURE-add-link-extraction	2020-10-18 09:56:25 -05:00
epi	eea5276c5f	Merge pull request #83 from spikecodes/patch-1 Publish to Arch User Repository	2020-10-17 20:22:23 -05:00
Spike	6272699370	Publish to AUR	2020-10-17 16:41:01 -07:00
epi	96ab0381e8	Merge pull request #75 from epi052/FEATURE-add-link-extraction--add-extractor-for-html Added extractor module, exposes `get_links` function	2020-10-16 06:00:20 -05:00
epi	5dff0ab571	removed unwrap from get_links	2020-10-16 05:48:50 -05:00
epi	2d076564b9	added unit tests for add_link_to_set_of_links	2020-10-16 05:17:08 -05:00
epi	f9da98be34	lint in tests	2020-10-15 20:50:53 -05:00
epi	7345d706ff	added unit tests for get_sub_paths_from_path	2020-10-15 20:50:08 -05:00
epi	6921ac03a9	extractor logic complete	2020-10-15 07:34:23 -05:00
epi	3c940b8e03	Merge pull request #72 from epi052/FEATURE-add-link-extraction--add-cli-option added -e|--extract-links to parser/banner/config 🕵	2020-10-12 19:44:23 -05:00
epi	1dbe99ea19	added banner integration test for extract-links	2020-10-12 17:23:08 -05:00
epi	8845a40510	added -e|--extract-links to parser/banner/config 🕵	2020-10-12 16:48:51 -05:00