Thursday, June 4, 2015

Quickie: Finding Web redirects with Curl

Today I was given a list of domain names and asked to determine whether they were real sites or simply redirects to another site.

Redirects return 301 or 302 HTTP status codes, so I was able to use curl for this.

I saved the URLs into a text file, sitelist.txt, one per line.
Then I ran this script:

while read -r i; do
  # Record the site being checked, then curl's status code and URL on the next line
  echo "$i" >> results.txt
  curl -s -w "%{http_code} %{url_effective}\n" "$i" -o /dev/null >> results.txt
done < sitelist.txt
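
A possible variation, offered as a sketch rather than what was actually run here: when curl is not told to follow redirects with -L, its %{redirect_url} write-out variable reports where a redirect would lead, which saves looking up the target separately. The helper name check_redirect is made up for illustration, and the loop only runs if a sitelist.txt file exists in the current directory.

```shell
# check_redirect is a hypothetical helper name, not part of curl.
# It prints "<status code> <redirect target>" for one URL;
# the target is empty when the response is not a redirect.
check_redirect() {
  curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Only loop when the list file is present, so the snippet is safe to source.
if [ -r sitelist.txt ]; then
  while read -r site; do
    # Put the site name and its result on the same line
    printf '%s ' "$site"
    check_redirect "$site"
  done < sitelist.txt
fi
```

Because the site name and the status code share a line in this version, the filter would need to look for the code in the second column rather than at the start of the line.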

Filtering the output to show only the lines that start with a 30x status code is easy with grep:
cat results.txt  | grep '^30'

To grep, the caret ^ matches the start of the line. That makes this command equivalent to "dump out the results.txt file and show me only the lines that start with 30."
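
One caveat worth sketching: because the site names are written to the same file, a name that happens to begin with "30" would also match '^30'. The example below uses made-up domains and status codes (nothing here comes from the real list) to compare the loose pattern with a slightly tighter one that requires a third digit and a space.

```shell
# Build a small, made-up results file; the domains and codes are illustrative only.
tmpfile=$(mktemp)
printf '%s\n' \
  'example.com' \
  '301 http://example.com/' \
  'example.org' \
  '200 http://example.org/' \
  '30days.example' \
  '302 http://30days.example/' > "$tmpfile"

# Loose pattern: counts every line starting with "30", including the site name.
loose=$(grep -c '^30' "$tmpfile")

# Tighter pattern: "30", one more digit, then a space, so only status lines match.
redirects=$(grep '^30[0-9] ' "$tmpfile")

echo "loose matches: $loose"
echo "$redirects"
rm -f "$tmpfile"
```

If none of the site names start with "30", both patterns give the same result, so the simpler one is fine.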

Magic!

(This script is Linux-specific, but if you have the Windows versions of grep and curl, the following completely untested batch file should work.)

REM Untested. In a batch file, blocks use parentheses and literal % signs are doubled.
for /F %%i in (sitelist.txt) do (
  echo %%i>>results.txt
  curl -s -w "%%{http_code} %%{url_effective}\n" "%%i" -o NUL >>results.txt
)

