Options for securing public-facing forms

reCAPTCHA and other "user challenges" are an easy but risky way to secure user-facing data entry points in WordPress applications, and there are alternatives.

For all our notable achievements, including the development of supercomputers and placing our own kind on extra-terrestrial bodies, the humble form remains the most common and reliable way to gather and validate user data. Forms offer a familiar UX, are relatively easy to complete and, from a developmental point of view, are also rather simple to build.

These are also the reasons that forms are fresh meat for rogue bots, which are developed by shadowy actors to scour the web and attack vulnerable entry points, such as login forms and APIs, in order to extract user data, create botnets or perhaps just for the hack of it!

The power of modern web building tools makes it easy to create web applications and build complex forms to gather user data, but naivety, which could be described as a lack of negative experiences, means that many such forms are poorly prepared for the attack vectors they may be subjected to, and some are simply open doors inviting bots to enter.

Recaptcha to De-capture

Once developers have been made aware of the inherent risk of user-facing forms, either by training or as a result of an attack, there is often a knee-jerk reaction to apply strong layers of protection capable of withstanding brute force attacks. These measures are not without reason, but the risks of such an approach are also high: they swing the responsibility of proving human-ness back to your genuine users, which is truly hard to justify.

Bad UX is the simple way to explain why nearly all Captcha-like solutions are not the right way to secure user-submitted data in WordPress applications. The longer answer is much more nuanced and deserves a deeper analysis.

It is a subject most people have a clear opinion on – either as end-users or as application developers – perhaps the only people who like Captchas are the security teams, who can see their obvious benefits in terms of reducing nefarious activity – but at what cost are those important gains made?

The central issues with Captcha-like solutions can be summarized as follows:

  • Biases against users, be it photo identification for the visually impaired or mathematical challenges for users with dyscalculia. Each solution presents new accessibility issues which alienate users and in some cases contravene laws designed to ensure universal access to web-based services.
  • Notable performance hits, which can be application-wide if scripts are included and instantiated globally. The scripts may also be render-blocking if not deferred.
  • On a human level, they add a layer of user frustration and additional friction to every form, simple or complex, lowering conversion rates. That is another way of saying that people are not able to use the tools you are developing, because the security layer is overwhelming.
  • And, while developing a single form with Captcha is simple enough, developing a complex application with multiple forms per view quickly becomes difficult to manage – often leading developers to reduce the complexity of the verification steps, which in turn, negates their core purpose.

This list is not great reading – so we should also point out how hugely successful Captcha security layers can be – they can block most spam submissions – they do what they say on the tin.

Spam, Spam, Spam..

Spam is annoying. It is data noise, evaluating it eats away valuable time and, while it helpfully exposes security cracks in our systems, these can be expensive to resolve in both time and money.

Captchas offer a quick fix. They come in many shapes and sizes, are often free, and most solutions offer good documentation and are simple to set up. Developers can breathe a sigh of relief: the integrity of their system is secure, the door is bolted shut…

When it’s cold outside and there is a crack in the window glass, it can be easier to board up the entire window than to take the time to replace the pane, but the benefit of fixing the problem correctly is that we do not obstruct our view of the outside world.

It is an interesting paradox. We build beautiful houses, guide users to the address and then slam the door shut when they arrive.

When we should be welcoming visitors inside, we instead waylay them with childish challenges and beguile them with indiscernible scribblings – are we leading them on some mystic journey or simply testing their humanness?

Alternatives

If you do decide that you would rather welcome your guests, but also wish to deter robotic visitors, there are many alternative options available. Here is a quick list:

  • No security – party-time, all are welcome – and you get to manually sieve through the junk on a daily basis. This gets old very quickly and you’ll be running back to Captcha before you can count to 3 + 7…
  • Invisible reCAPTCHA offers a lower-resistance route and also shifts the decision making to the developer, but it still presents user, performance and developmental problems and relies on behavioural analysis, which means it needs to be loaded globally across the application.
  • And then we have Honey Pots… great name, simple concept, fast loading – in short a reliable, but not bullet-proof solution which is simple to develop and maintain.

Honey Pots?

Like bees to honey, or more like flies to $h1t

The concept is pretty simple – place a trap that attracts a hungry visitor – in this case an input-filling bot with an insatiable appetite for adding data to every possible gap it finds – but which is hidden from “real” users, then discard all submissions ( on the server-side ) which have the trap filled.

But, as with all UX questions, it’s not quite so simple – we need to look back at our original objections to Captchas and make sure we have not created different traps for real users and also examine what other new accessibility issues we might be introducing.

Firstly, let’s take a look at a simple code example of a working Honey Pot and then we’ll delve into the detail:

Here is a complete input, which should be generated using JavaScript hooked to an event listener, for example DOMContentLoaded.

<input type="text" name="honey" class="honey" id="honey" data-form-honey tabindex="-1" autocomplete="false" data-form-required value="" />
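
A minimal sketch of how that injection might look, assuming the attributes from the input above. The function names createHoneypot and addHoneypots are illustrative only, the per-form id suffix is an assumption added to avoid duplicate ids when a view holds several forms, and the guard around document exists only so the snippet can also be loaded outside a browser.

```javascript
// Build the honeypot input with the same attributes as the markup above.
// `doc` is passed in ( normally the global `document` ) to keep this testable.
function createHoneypot(doc, name, id) {
  const input = doc.createElement('input');
  input.setAttribute('type', 'text');
  input.setAttribute('name', name);
  input.setAttribute('id', id);
  input.setAttribute('class', 'honey');
  input.setAttribute('data-form-honey', '');
  input.setAttribute('data-form-required', '');
  input.setAttribute('tabindex', '-1');
  input.setAttribute('autocomplete', 'false');
  input.setAttribute('value', '');
  return input;
}

// Append one honeypot to every form on the page and return how many were added.
function addHoneypots(doc, name) {
  const forms = doc.querySelectorAll('form');
  forms.forEach((form, index) => {
    form.appendChild(createHoneypot(doc, name, name + '-' + index));
  });
  return forms.length;
}

// In a browser, wait for the DOM to be parsed before injecting.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => addHoneypots(document, 'honey'));
}
```

Because the input only exists once this script has run, a submission that arrives without the field at all tells the server that JavaScript never executed.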

Now we’ll break this example down and show how to optimize it.

Hide the Element

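This can be achieved using CSS by adding a class selector, such as .honey. Using display: none removes the input from the visual flow of the page and from the accessibility tree, so it is ignored by screen readers. Using visibility: hidden also hides the input from screen readers, but on its own it leaves an empty gap in the layout.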

.honey {
    visibility: hidden !important;
    display: none !important;
}

This could also be achieved using inline CSS as follows:

<input type="text".. style="display:none" />

Keyboard Navigation

By assigning the attribute tabindex="-1" we prevent users or devices that navigate via keystrokes, such as Tab, from focusing the element.

Auto-complete

We are adding a text input, because this is more tempting for the bot, but we also add the HTML attribute autocomplete="false" to try to ensure that the browser does not attempt to autofill the field. false is actually an invalid value, as only on, off or a named autofill token are valid, but the aim of the unrecognised value is that the browser never adds a value to the input.

Obscure Purpose

Remember, we are trying to trick robots programmed by people. These bots have been programmed with attack patterns: find forms, fill them in and see what happens next. Login or registration pages have common fields ( username, email etc ), comment forms are nearly universally predictable and in some cases bots are programmed to pry away at specific high traffic targets using dedicated instructions.

We can add some complexity to our honey pot by changing certain attributes and by using less obvious naming for other parts ( remember that bots can also gather and return data to their programmers to enable them to be optimized ) – some examples include:

  • We did in the example, but don’t call the input honey or byebyebots or anything so obvious. The fewer clues we give the bot the better.
  • As the bots are capable of both recording and learning, a more secure model is to randomize the naming of the honey pot element. You can store the value in a transient which expires and is regenerated on a daily or weekly basis.
  • The form input has an empty value. This is important as it’s more tempting for the bots, which hungrily fill every input they find.
  • We have invented a data attribute data-form-required to try to add more sugar. Its effectiveness is unproven.
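
The randomized naming could be sketched as follows. This is an in-memory illustration only: createRotatingName, its parameters and the f_ prefix are hypothetical, Math.random is used for brevity and not for strength, and on a WordPress backend the value would more likely live in a transient ( for example via set_transient and get_transient ) so that the code rendering the form and the code validating it agree on the current name.

```javascript
// Returns a function that hands out a random field name and replaces it
// once `ttlMs` milliseconds have passed, similar to an expiring transient.
// `now` and `random` are injectable so the behaviour can be tested.
function createRotatingName(ttlMs, now = Date.now, random = Math.random) {
  let current = null; // { name, expiresAt }

  return function getName() {
    const time = now();
    if (current === null || time >= current.expiresAt) {
      // Eight base-36 characters; not cryptographically strong.
      const suffix = Math.floor(random() * 36 ** 8).toString(36).padStart(8, '0');
      current = { name: 'f_' + suffix, expiresAt: time + ttlMs };
    }
    return current.name;
  };
}

const ONE_DAY = 24 * 60 * 60 * 1000;
const getHoneypotName = createRotatingName(ONE_DAY);
```

getHoneypotName() would be called both when the form is prepared and when the submission is checked. A form loaded just before the name rotates would fail the check, so accepting the previous name for a short grace period may be worth considering.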

Backend Validation

The front-end provides the bait, but all validation happens on the backend – safely out of the reach of any bots.

We can do a very simple check for the honey key in the $_POST array. If it is either missing ( no JavaScript on the front-end ) or present with a value set ( a robot filled out the field ), we can take action, either returning the form with a warning or ramping up the protection.

if (
    // field missing: the honeypot was never injected ( no JavaScript )
    ! isset( $_POST['honey'] )
    // field filled: a bot completed the hidden input
    || '' !== $_POST['honey']
){

    // take avoiding action - bots ahead!

}

Note that all submissions made without JS will fail, as the honeypot is added programmatically ( the stats show a tiny proportion of users have JS disabled and all major crawlers are now JS capable, but it’s important to also consider the JS-less experience ).

If we do not discard all submissions which do not include the honeypot element, empty or not, then we are basically introducing a simple backdoor which negates the entire honey pot.

You can also add extra layers of POST validation, for example by creating and validating a nonce or by defining an action in each form and validating that it is set in the POST data.

Bonus Log

The only real-world way to validate a technical solution is by tracking its usage and reviewing data – in this case we can log data for each security check failure to see how many ( for sure there will be some ) false-positives we have bounced back for repeat submissions.

We can simply log the reason for the failure and the posted data, being careful to encrypt or omit sensitive data, for example from login or registration forms, which might contain passwords. We can also capture IP and user-agent data and whatever unique identifiers we add to each form to ensure we know the source clearly.

This data should be regularly audited and tweaks made to the system to attempt to reduce the number of false-positives and to block additional bots.
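
One possible shape for such a log entry is sketched below. The field names, the list of sensitive keys and the buildLogEntry helper are all assumptions, and the sketch redacts sensitive values outright instead of encrypting them, which is a simpler substitute for the encryption mentioned above.

```javascript
// Keys whose values should never reach the log in readable form ( assumed list ).
const SENSITIVE_KEYS = ['password', 'pass', 'pwd', 'user_pass'];

// Build a plain object describing one failed security check.
function buildLogEntry(reason, postData, meta, now = new Date()) {
  const data = {};
  for (const [key, value] of Object.entries(postData)) {
    data[key] = SENSITIVE_KEYS.includes(key.toLowerCase()) ? '[redacted]' : value;
  }
  return {
    time: now.toISOString(),
    reason,                 // e.g. 'honeypot_missing' or 'honeypot_filled'
    formId: meta.formId,    // whatever unique identifier the form carries
    ip: meta.ip,
    userAgent: meta.userAgent,
    data,
  };
}
```

Keeping the reason as a short fixed string makes it easier to count false-positives per form when the log is audited.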

Next Level

Once we start to think about more flexible security solutions, instead of over-engineered quick fixes – we also start to consider the users more and the flows and steps they will need to take to play an active part in your application.

Some further suggestions for tweaks to user-facing forms might include:

  • Leveraging specific tools for each requirement, rather than trying to make one system fit all use-cases, for example Akismet can help to protect against comment spam, while adding email verification and holding new users in low-capability “pending” roles are very effective against registration attacks.
  • Incrementally increasing the challenge on each subsequent submission attempt, from testing the water with a lure for robots to presenting Turing tests of the highest complexity, gradually…
  • Introducing some of the more traditional Captcha style systems once we feel more confident that we are blocking a bot and not a genuine user.
  • Adding a time delay to each submission, for example by requiring the user to take 15 seconds longer on each submission. Bots probably don’t get frustrated or tired, so they will keep firing back submissions at the same rate.
  • Blocking IP addresses for repeat violations ( add time and action parameters to your algorithm for more fine-grain control ).
  • There are many advanced bot mitigation processes which can be applied ( most notably behavioural-based approaches ). These are normally complex and expensive, but would be easily justified on high value projects or when dealing with very sensitive data.
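
The escalation, delay and blocking ideas above could be combined roughly as follows. This is a sketch under assumed thresholds: createFailureTracker, the one-hour window, the limits of 2 and 5 and the action names are all hypothetical, and the 15 second step simply mirrors the figure suggested in the list.

```javascript
// Track failed checks per IP and suggest a response that escalates gradually.
// `now` is injectable so the time window can be tested.
function createFailureTracker({ windowMs, challengeAfter, blockAfter, delayStepMs }, now = Date.now) {
  const failures = new Map(); // ip -> array of failure timestamps

  // Keep only the failures that are still inside the time window.
  function recent(ip) {
    const cutoff = now() - windowMs;
    const kept = (failures.get(ip) || []).filter((time) => time > cutoff);
    failures.set(ip, kept);
    return kept;
  }

  return {
    recordFailure(ip) {
      recent(ip).push(now());
    },
    // Decide what to do with the next submission from this IP.
    nextAction(ip) {
      const count = recent(ip).length;
      if (count >= blockAfter) return { action: 'block', delayMs: 0 };
      if (count >= challengeAfter) return { action: 'challenge', delayMs: count * delayStepMs };
      return { action: 'allow', delayMs: count * delayStepMs };
    },
  };
}

const tracker = createFailureTracker({
  windowMs: 60 * 60 * 1000, // forget failures after one hour
  challengeAfter: 2,        // show a Captcha-style challenge from the 2nd failure
  blockAfter: 5,            // block the IP from the 5th failure
  delayStepMs: 15 * 1000,   // require 15 seconds more per earlier failure
});
```

A genuine user who trips the honey pot once only sees a short delay, while a bot that keeps failing is moved on to a challenge and then a block. The in-memory Map is for illustration; a real application would need persistent storage shared between requests.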

Honey Pots are not a new concept, but they remain a viable solution for reducing spam, while also returning control over what happens when a suspected malicious submission is made to the developer, who can escalate or terminate the process or take no action at all and simply log and review the data. At least we are aware of what is happening and back in control of our own applications.

Add your thoughts below and happy coding 🙂
