Save your community from spammers and bots
by Jase June 28, 2018

Let's discuss a topic that's near to us: preventing fake accounts, spammers and trolls in your online community. Having run multiple "community" type sites with hundreds of thousands of users over the last 20 years, we find it a topic that is interesting, ever-changing and contentious.

What's the problem?

If you run a site or platform that enables users to sign up and publish content, you're giving them a tremendous amount of power. This impacts many kinds of sites, but in particular free-to-register sites such as online communities, discussion forums and social media.

There are interesting and destructive ways to wreak havoc on an online community. Trolling is equal parts art and science. Usually, bad actors are motivated to damage your community for financial or ideological gain. More recently, creating chaos is also done for fun, or 'for the Lulz'.

Examples:

Mass spam. A user can use disposable emails or the Gmail "+" trick to create many accounts on a system. These accounts flood the site with posts and comments linking to sites unrelated to the platform, e.g. downloads of pirated videos.

Mass messaging. A company (potentially a competitor) uses a social platform's commenting and messaging features to mass-contact people and invite them to its service. The bad actor is intent on harvesting your users for their own benefit.

Mass trolling. Similar to mass spam, mass trolling involves creating many accounts on a system and flooding the site with fake or inflammatory content, sometimes to take the site down, other times just for fun.

Mass voting. If you run a site that ranks content from users, this is a problem. Some users will resort to using bots to mass-create accounts that upvote their content or downvote an opinion or product they don't support.

Who this article is for

The content is aimed at those who are:

1. Worried about the implications of bad actors in their community.
2. On the front lines of building and managing an online community with limited resources.

We want to run great sites and online businesses, without having to spend a lot of time and money on moderation or banning thousands of accounts.

Our experience comes from sites that have grown to just under a million registered users; the largest site we've run to date is ArtStation.com. The content is written from the perspective of those who need to deal with these bad actors with a more limited resource set, i.e. not the Facebook moderation team.

Prevention and Treatment

Prevention is stopping the bad actors at sign-up. Treatment is dealing with the issues after they've managed to get onto your platform. Typically you're going to need both to varying degrees.

Prevention


Prevention covers techniques that stop abuse before it can happen. For example, a site might require more than a simple email confirmation, say a second factor such as SMS verification (more on that later), or force first posts through moderation. Some platforms such as Dribbble are invite-only, which leads to a higher quality user base.

Signal-to-noise is the priority, and this comes at the potential expense of losing users. I have some notes later about why we think the additional friction at sign-up isn't as big an issue as the growth-focused people think.

Treatment


The majority of growth-focused companies opt to treat the problem after a user has registered. Their focus is on getting as many users as possible to register. All you need is an email address, and you're able to start publishing content. Some don’t even confirm that the email address is valid before allowing the user to enter content.

Treatments include shadow banning, outright banning, rules that limit content posting, and typically a large mod team, or at least a user base that cares enough to do the heavy lifting.

Companies who go this route believe that friction slows growth, and thus that verification is bad. They want the user experience to be as simple as possible so that the user can get to their goal quickly. And when the goal is building a platform or community, they're wrong.

That being said, there is a reason why it's done: pumping up numbers to look good for an investment or acquisition. New users grow your monthly active users number, show a large (allegedly) captive audience and suggest a strong business model. You know what's cool? A billion users!

Tools and Techniques


Confirm email address

This is the minimum. When you confirm an email address, you're checking that the email is valid and that the list of users you're building is real. And you'll gain trust with email providers. Many registration libraries for popular development frameworks can do this. You should typically allow each email address to register only one account per site or subsite (for multi-tenant platforms).

There is a problem with disposable emails. I recommend using an email validation API service such as Mailgun or Proofy.io that checks for high-risk email addresses. If you're on the cheap, there are several lists of disposable email providers on GitHub that you can check against during email validation, e.g. this one.
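For the cheap approach, the check itself is trivial. Here's a minimal sketch in Python, assuming you've saved one of those GitHub lists locally (the filename is hypothetical), with a normalization step for the Gmail "+" trick mentioned earlier:

```python
# Two cheap checks at registration: normalize Gmail-style aliases,
# then reject domains found on a disposable-email blocklist.
# Assumes disposable_domains.txt (hypothetical filename) contains one
# domain per line, taken from one of the public lists on GitHub.

def normalize(email: str) -> str:
    local, _, domain = email.lower().partition("@")
    local = local.split("+", 1)[0]              # drop "+tag" aliases
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")          # Gmail ignores dots
    return f"{local}@{domain}"

def load_blocklist(path: str = "disposable_domains.txt") -> set:
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

def is_disposable(email: str, blocklist: set) -> bool:
    return email.rsplit("@", 1)[-1] in blocklist

blocklist = load_blocklist()
email = normalize("Some.User+promo@gmail.com")  # -> "someuser@gmail.com"
print(is_disposable(email, blocklist))          # False; a mailinator.com address would hit
```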

Captcha

When a user submits a form, a captcha attempts to check that the user is a human. Google's reCAPTCHA is pretty robust now and can be invisible to the user. It's a must-have.
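Whichever captcha you choose, verify the token server-side; the widget on the page proves nothing by itself. A sketch against Google reCAPTCHA's verification endpoint (the secret key comes from your reCAPTCHA admin console, and how you capture the client token depends on your frontend):

```python
import requests

RECAPTCHA_SECRET = "your-secret-key"  # from the reCAPTCHA admin console

def captcha_passed(client_token: str, remote_ip: str = None) -> bool:
    """Verify a reCAPTCHA token server-side; True if Google accepts it."""
    payload = {"secret": RECAPTCHA_SECRET, "response": client_token}
    if remote_ip:
        payload["remoteip"] = remote_ip       # optional extra signal
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data=payload,
        timeout=5,
    )
    return resp.json().get("success", False)
```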

Flood protection

Flood protection prevents a single host from posting content to your community many times in a short period. While a captcha will cover this for public-facing forms, other endpoints such as a "new post" form should have some kind of flood protection, e.g. Rack Attack, to prevent a bad actor from setting up a script to flood that endpoint.
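Rack Attack handles this for Rails apps. If you're rolling your own, the core idea is just a counter per host per time window. A minimal in-memory sketch (a production version would typically back this with Redis so it holds across servers):

```python
import time
from collections import defaultdict, deque

WINDOW = 60   # seconds
LIMIT = 5     # max submissions per window per IP

_hits = defaultdict(deque)  # ip -> timestamps of recent submissions

def allow_request(ip: str) -> bool:
    """Sliding-window throttle: at most LIMIT requests per WINDOW per IP."""
    now = time.time()
    hits = _hits[ip]
    while hits and now - hits[0] > WINDOW:  # forget hits outside the window
        hits.popleft()
    if len(hits) >= LIMIT:
        return False                        # flooded: reject or captcha-challenge
    hits.append(now)
    return True
```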

SMS and two-factor authentication

SMS verification requires the user to provide a phone number; the site sends a code by SMS that the user must enter to verify.
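The flow is simple to sketch. The actual SMS delivery is elided here (`send_sms` is a hypothetical stand-in for your gateway), but the issue-and-check logic looks like this:

```python
import secrets
import time

CODE_TTL = 300  # seconds a code stays valid

_pending = {}   # phone_number -> (code, issued_at)

def issue_code(phone_number: str) -> None:
    code = f"{secrets.randbelow(1_000_000):06d}"   # random 6-digit code
    _pending[phone_number] = (code, time.time())
    # send_sms(phone_number, f"Your verification code is {code}")  # hypothetical gateway call

def verify_code(phone_number: str, submitted: str) -> bool:
    entry = _pending.get(phone_number)
    if not entry:
        return False
    code, issued_at = entry
    if time.time() - issued_at > CODE_TTL:         # expired
        del _pending[phone_number]
        return False
    if secrets.compare_digest(code, submitted):    # constant-time comparison
        del _pending[phone_number]                 # codes are one-time use
        return True
    return False
```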

There is a problem with SMS verification: people can use free receive-SMS-online services that provide disposable numbers. The way to prevent this is to use a service that can identify the risk associated with a number, such as Telesign. We use Telesign on ArtStation and it is a godsend. It does a risk assessment on the number to check whether it comes from a disposable SMS site, a prepaid burner phone, etc.

SMS verification is a heavy approach. Use it when your community is your product and the quality of the interactions and content is why people join. If you're creating a branded community, the above options are sufficient.

Another option, which is becoming the standard, is to ask the user to authenticate with either Facebook or LinkedIn. Users might see it as a smaller ask and be more willing to do so.

Setting limits

Setting limits is another way to avoid being flooded with content. For example, if you allow users to post content, limit the amount that can be posted within a given time period. You can start small, limiting posts to one every 60 seconds or intervening after 5 posts within a short period of time. Limit private messages between users, a common vector for spammers who will invite others to buy $9.99 Ray-Ban sunglasses or pills that shrink you in some areas and expand you in others. Likewise, limit voting on content (e.g. Instagram/Reddit); Instagram currently has this problem with automated scripts that like anything within a specified hashtag.

It sounds far-fetched, but it does happen. The stuff we've seen...
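Per-account limits are the application-layer cousin of the per-IP throttle above. A sketch of a post limit combining a minimum interval with a burst cap (the numbers and the `recent_post_times` helper are illustrative, not prescriptive):

```python
from datetime import datetime, timedelta

MIN_INTERVAL = timedelta(seconds=60)   # at most one post per minute...
BURST_LIMIT = 5                        # ...and at most 5 posts...
BURST_WINDOW = timedelta(minutes=10)   # ...per 10 minutes

def can_post(user) -> bool:
    """Gate new posts on both a minimum interval and a burst cap.

    Assumes user.recent_post_times() returns the user's recent post
    timestamps, newest first (hypothetical helper).
    """
    now = datetime.utcnow()
    times = user.recent_post_times()
    if times and now - times[0] < MIN_INTERVAL:
        return False                   # posting too fast
    recent = [t for t in times if now - t < BURST_WINDOW]
    return len(recent) < BURST_LIMIT
```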

Moderation

You will need some way for users to report abuse. And unfortunately, yes, you will need someone on staff to look at the abuse reports and deal with them. There are ways to triage abuse reports automatically (e.g. if the same content receives multiple abuse reports, flag it and stop displaying it; if a user receives a lot of complaints, shadow ban them until a mod can take a look), but ultimately you still need human moderation. Some communities opt for the volunteer approach, and yes, it helps save costs and increases the quality of the content. But beware: volunteer moderators still need official supervision.
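The automatic triage described above can be as simple as report counters with thresholds. A minimal sketch, with hypothetical `reports`, `hidden` and `shadow_banned` fields and illustrative thresholds:

```python
HIDE_THRESHOLD = 3         # hide content after this many distinct reports
SHADOW_BAN_THRESHOLD = 10  # shadow ban an author with this many reports against them

def handle_report(content, reporter) -> None:
    """Record an abuse report and auto-triage pending human review."""
    content.reports.add(reporter.id)              # a set, so duplicates don't stack
    if len(content.reports) >= HIDE_THRESHOLD:
        content.hidden = True                     # stop displaying it until a mod looks
    author = content.author
    if author.total_reports() >= SHADOW_BAN_THRESHOLD:
        author.shadow_banned = True               # invisible to others until reviewed
```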

A lot of organizations start by combining this role with their Social Media Coordinator's, and as you grow you can dedicate more resources to your community.

Once your platform is operating at scale, you'll probably need to hire professional community managers. Yes, there are people who do this full time, and there are even companies that specialize in this service. It's said that Facebook has over 20,000 people working on safety and security.

CMX Hub is a fairly large organization that has job boards, events and training for community managers. It's far from the only one in the space; for some organizations, community management could become as large as brand management.

AI and new technology

There are more companies joining the space that provide AI services for detecting bots and bad actors. They don't replace human intervention, but they augment the efficiency of your teams, making sure that they're covering more ground and looking at the 'right' problems.

These technologies typically work by analyzing your community data (including pictures) to find patterns that represent abuse or mature content.

A few players in the space are:

- https://utopiaanalytics.com/
- https://besedo.com/
- https://www.twohat.com/community-sift/

Typically these tools will require a bit of work on your end. And they don't come for free, so make sure you do your due diligence before you decide on the tech and the vendor.

Discussion

Isn't adding friction for the user bad?


Not necessarily. Some product folks are really fixated on lowering friction for users, so taking any preventative approach (e.g. asking the user for their cellphone number) is considered bad. It’s not so black and white.

You're providing people a platform for free, and it costs you money for them to use your system. If they cannot take a few moments to verify information, then how invested will they ever become in your community?

Plus, you'll want to create a high quality community, especially if it's branded.

There are ways to gather information and verification progressively so that it feels more natural. For example, ask for email verification when the user tries to make their first post, or tries to use the search function.
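A sketch of that progressive gating: the verification check lives on the sensitive action rather than on registration itself (the guard and the user fields here are hypothetical):

```python
class VerificationRequired(Exception):
    """Raised to bounce the user into the verification flow."""

def require_verified_email(user) -> None:
    # Gate the action, not the sign-up form.
    if not user.email_verified:
        raise VerificationRequired("Please confirm your email to continue.")

def create_post(user, body):
    require_verified_email(user)   # the first post triggers the ask
    ...                            # save the post

def search(user, query):
    require_verified_email(user)   # same gate on search
    ...                            # run the search
```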

Still not convinced? Conduct an A/B test. Is SMS verification going to create a problem? What about only allowing Facebook or Google SSO? A/B tests put hard data behind assumptions. Sure, people will complain, but they’ll be happier in the long run when the community stays high value.

In our case on ArtStation, we opted for a preventative approach with both email confirmation and a second factor. Users must confirm their email, and then either provide us with an SMS number or authenticate against Facebook or LinkedIn. The drop-off rate of users who register but fail to verify their account is 15%. That seems high, but when we investigated that 15%, we found that the vast majority of those users were fake: their names were "asdf asdf", random keystrokes, random names, email addresses, etc.

Anecdote

The VC-funded photo site vs the bootstrapped art site

We'll share an anecdote because it illustrates both sides of the coin and why "one size doesn't fit all". About a year ago, our CEO Leo met with people from a popular photography community that boasted millions of registered users. Contrast that with our site, ArtStation, which has several hundred thousand. Despite the huge difference in registered users, ArtStation had more traffic.

This turned into a lively and friendly debate. The criticism was that their registered user numbers were inflated by fake accounts; their platform had been put on blast by users complaining that the community was being overrun by bots upvoting content.

They were very focused on growing key (vanity) metrics: they wanted to grow the number of registered users, so they kept sign-up friction very low. In contrast, our objective is quality and signal-to-noise, hence the additional steps we added to prevent bad actors from signing up. This results in a much lower number of registered and monthly active users, but we know that ours are real and engaged.

That company recently got acquired and (I assume) some people got rich as a result. That's why I believe it really depends on your objectives and strategy.

Recommendation - find a healthy balance


These are recommendations; nothing is black and white. You may start without any friction while the community is nascent, and add security features to prevent bad actors as you gain traction. You have to really think about what is best for your business and the community you're building.

If you care deeply about signal-to-noise and maintaining a high quality community, take a more preventative approach. If your sign-up flows are designed, tested and iterated on, odds are your customers won't find them cumbersome.

Balance is key.
