Making an Athena table from the SecLists repo


If you’re into web security, you have hopefully heard of SecLists. It’s an amazing repository of keywords, indicators, payloads, passwords and more. It’s great not just for SecOps, but also for developers and QA who want to step up their security game.

As part of a project I’m working on, I wanted to be able to quickly compare strings in the Discovery/Web_Content files against logs I regularly sync to AWS S3 (specifically, ELB logs for my SaaS platform). I’d already created Athena tables to find interesting data in those logs, so I just needed a new table for this content. So I wrote a quick script that fetches the SecLists repo, copies it up to S3, then generates an Athena table.

This gist shows how to make the whole repo searchable, but it’s worth noting that there are README files and other content in there you don’t want to query (including GIFs and other binaries). So it’s a good idea to restrict your queries to subfolders using the $path metavariable, or CREATE your table using that subfolder in the LOCATION path. (For example, since I’m only interested in web content, I gave that full path in my CREATE TABLE statement.)
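To make the two options above concrete, here is a minimal sketch of both: a table whose LOCATION points at a single subfolder, and a query that filters on "$path". It is not the gist itself. The helper names (`seclists_ddl`, `seclists_path_query`, `run_athena`), the `seclists` table name, the `default` database, and any bucket paths are all assumptions made for illustration, and the RegexSerDe pattern is just one way to read each line of a file as a single string column.

```shell
#!/usr/bin/env bash
# Sketch only: the table name, database, and S3 locations are placeholders.

# Print DDL for a one-column table holding every text line under an S3 prefix.
# $1 = table name, $2 = S3 location (use a subfolder to skip READMEs and binaries).
seclists_ddl() {
  cat <<SQL
CREATE EXTERNAL TABLE IF NOT EXISTS $1 (line string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '^(.*)\$')
LOCATION '$2'
SQL
}

# Print a query that uses the "$path" metavariable to stay inside one subfolder.
# The quoted heredoc keeps the shell from touching "$path".
seclists_path_query() {
  cat <<'SQL'
SELECT line, "$path" AS source_file
FROM seclists
WHERE "$path" LIKE '%/Discovery/Web_Content/%'
LIMIT 20
SQL
}

# Submit a statement to Athena. $1 = SQL text, $2 = S3 location for query results.
run_athena() {
  aws athena start-query-execution \
    --query-string "$1" \
    --query-execution-context Database=default \
    --result-configuration "OutputLocation=$2"
}
```

With a hypothetical bucket, creating the table would look like `run_athena "$(seclists_ddl seclists s3://my-bucket/seclists/Discovery/Web_Content/)" s3://my-bucket/athena-results/`. Nothing here touches AWS until `run_athena` is called.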

What’s rad about this is that (a) it’s searchable using standard SQL, (b) I can compare strings to other data files using Athena, and (c) I only incur access/query charges when I run my queries, rather than having an always-on database instance.

Let me know on Twitter what you’re using Athena for!


Detecting public-read S3 buckets


I’m kind of surprised I couldn’t easily find something like this elsewhere. After all the recent news about unsecured (or very poorly secured) AWS S3 buckets, I wanted to find a quick and easy way of checking my own buckets. Between the several AWS accounts I manage, there are hundreds.

AWS sent out an email to account owners listing unsecured buckets a while back. Read more about it from A Cloud Guru, where they also discuss how to secure your buckets. But that doesn’t necessarily help with quick auditing. AWS provides some tools like Inspector to help find issues, but setting it up can take some time (though it’s totally worthwhile in the long run). I’m impatient, and I want to know stuff right now.

My solution was to write a quick script that scans my buckets for glaring issues. Namely, I want to know if any of my buckets have the READ permission set for “everyone” or “any AWS account”. If READ is allowed for “everyone”, anyone can list the files in that bucket (and download any file whose own ACL allows it). If it’s allowed for “any AWS account”, only a trivial barrier is set: a user just has to have an AWS account to review your bucket contents.

So here’s my script.

It requires the AWS CLI and jq, which is an awesome utility, and can be downloaded here. It’ll check top-level bucket ACLs for public-read settings, and just alert you to those bucket names. From there, I’ll leave it to you to secure your buckets.
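For reference, a check along these lines can be sketched as follows. This is a minimal sketch under assumptions, not the original script: the function names `public_read_grantees` and `scan_buckets` are made up for illustration, and it only looks at bucket-level ACLs (not bucket policies or object ACLs). “Everyone” and “any AWS account” correspond to the AllUsers and AuthenticatedUsers group URIs that appear in the ACL.

```shell
#!/usr/bin/env bash
# Sketch: flag buckets whose bucket-level ACL grants read access to a public group.

ALL_USERS='http://acs.amazonaws.com/groups/global/AllUsers'             # "everyone"
AUTH_USERS='http://acs.amazonaws.com/groups/global/AuthenticatedUsers'  # "any AWS account"

# Reads `aws s3api get-bucket-acl` JSON on stdin and prints the URI of each
# public group holding READ (FULL_CONTROL is included because it implies READ).
public_read_grantees() {
  jq -r --arg all "$ALL_USERS" --arg auth "$AUTH_USERS" '
    .Grants[]
    | select(.Permission == "READ" or .Permission == "FULL_CONTROL")
    | .Grantee.URI // empty
    | select(. == $all or . == $auth)'
}

# Walks every bucket the current credentials can list and reports the public ones.
scan_buckets() {
  aws s3api list-buckets --query 'Buckets[].Name' --output text | tr '\t' '\n' |
  while read -r bucket; do
    [ -n "$bucket" ] || continue
    hits=$(aws s3api get-bucket-acl --bucket "$bucket" --output json |
           public_read_grantees | sort -u | tr '\n' ' ')
    if [ -n "$hits" ]; then
      echo "PUBLIC READ: $bucket ($hits)"
    fi
  done
}
```

Calling `scan_buckets` with the credentials for each account prints one line per exposed bucket. Keeping the jq filter in its own function means it can be tried against a saved ACL document without touching AWS.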

If you just want to take the nuclear option and update your buckets to private-only, you can do that with this AWS CLI command:

aws s3api put-bucket-acl --bucket <bucket-name> --acl private

Just no one go breaking prod, please.

Randomly redistributing files with bash


This one was damn confusing, but the solution is absurdly obvious. I needed to relocate a large number of files from a single source into a collection of subfolders. (These subfolders were essentially worker queues, so I wanted a roughly even distribution every time new files appeared in the source folder.)

But I noticed that every time this executed, all my source files were ending up in the same queue folder (and not being evenly distributed). What gives?

Turns out, my call to $RANDOM was being expanded only once, by the parent shell, before any of the subshell commands ran. That single value was then baked in and used for all subsequent mv commands. The Eureka moment was when I realized that, because the command runs in a subshell, I needed to escape my dollar signs so that they’d be ignored by the parent shell and evaluated by the child.

I suddenly found all my files going to the correct folders. Yet another reminder to always keep scope in mind.
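Here is a small reproduction of the problem and the fix. It is a sketch under assumptions: the original command isn’t shown, so the `find -exec bash -c` form, the temporary directory layout, and the four queue folders are all made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch with made-up paths: 40 empty files spread across 4 queue folders.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/queue0" "$demo/queue1" "$demo/queue2" "$demo/queue3"
for i in $(seq 1 40); do : > "$demo/source/file$i"; done

# Broken: inside double quotes the parent shell expands $((RANDOM % 4)) once,
# before find even starts, so every mv receives the same queue number:
#   find "$demo/source" -type f -exec bash -c "mv \"\$1\" \"$demo/queue$((RANDOM % 4))/\"" _ {} \;

# Fixed: the escaped \$ reaches the child shell intact, so each child bash
# evaluates RANDOM for itself and picks its own queue.
find "$demo/source" -type f -exec bash -c "mv \"\$1\" \"$demo/queue\$((RANDOM % 4))/\"" _ {} \;

# Show how many files landed in each queue.
for q in "$demo"/queue*; do
  echo "$(basename "$q"): $(find "$q" -type f | wc -l)"
done
```

Single-quoting the whole `bash -c` string would also stop the parent from expanding anything, but then `$demo` would have to be passed in as an extra argument. Escaping only the dollar signs that belong to the child keeps the mix explicit.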

Recovering from Apache “400 Bad Request” errors


You never want to encounter errors in your production environment. But what do you do if you release code that generates fatal errors outside your application?

A recent release we deployed caused an issue with links our platform was generating. Instead of nice links like this:
we started seeing links like:

The source of the issue was easy enough to track down. We use “%%” to wrap replacement variables for our email templates, and when a variable can’t be identified, we re-wrap the name with “%?”. (It’s arguable we shouldn’t do that, but that’s a topic for a later discussion.) Rolling back the change fixed the issue for future generated emails, but what to do about the old emails? They’re already in inboxes world-wide, and we can’t fix the URI for those links.

When Apache receives a request in this format, where a percent sign is followed by anything other than two hexadecimal digits, it panics. It assumes the %-sign is there to URL-encode something, so when a %-sign is followed by a question mark, parsing fails and Apache throws a “400 Bad Request”. You can’t use mod_rewrite because the error occurs before rewrite rules are processed. But Apache does give you one opportunity to rewrite history (so to speak): ErrorDocument.

We were able to quickly deploy a new script to production, and set the path to that script as the ErrorDocument handler for HTTP 400 responses.

# In /etc/httpd/conf.d/my-app.conf
ErrorDocument 400 /error/400.php

Apache will redirect the request to that page, with the URI intact. It’ll even still load your other modules, including (in this case) the PHP processor. So in that 400.php script, we parse the URI, look for the bad string, and simply rewrite (or remove) it.


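<?php
// ErrorDocument handler for HTTP 400: repair links that contain a literal "%?".
if (strpos($_SERVER['REQUEST_URI'], '%?') !== false) {
    // Encode only the stray percent sign so valid escapes like %20 are left alone.
    $uri = str_replace('%?', '%25?', $_SERVER['REQUEST_URI']);
    header('Location: ' . $uri, true, 302);
    exit;
}

echo "This server has encountered an error.";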

Et voilà. It’s not the perfect user experience, since those values weren’t replaced in the email like they should have been. But at least the email recipients clicking on the links aren’t getting standard “400 Bad Request” error pages. It was a tidy little solution to what could have been a devastating problem.