SQS Event Duplication When Autoscaling

Background

Our microservice had to scale from 1 instance to 3 instances, all listening to the same SQS events. During QA testing, the same events were consumed by different instances at the same time.

Issue

When we first set up SQS, the team had to choose an SQS Standard Queue instead of SQS FIFO because of company policy. Standard queues guarantee at-least-once delivery, so there is a chance of a message being delivered more than once. This was the first possible cause.

The second possible cause is the SQS Message Visibility Timeout, which defaults to 30 seconds. Previously, with only 1 instance, this never showed up as a problem: while the instance was busy with a message, it did not pick that message up again, even if processing had not finished within the timeout. The message went back to the queue, but no one consumed it. Once the instance finished processing, it deleted the message, so the message disappeared from the queue.

However, we now have 3 instances. If the instance that picked up a message cannot finish processing it within the 30-second visibility timeout, the message becomes visible in the queue again. Another instance then picks up and processes the same message, which results in duplicates.
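To make the timing concrete, here is a minimal sketch of the behaviour described above. It uses a toy in-memory queue with a logical clock, not real SQS; `ToyQueue`, `count_deliveries`, the 45-second processing time, and the 5-second polling interval are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not real SQS)."""
    visibility_timeout: int
    # Maps message body -> time (in seconds) at which it becomes visible again.
    _visible_at: dict = field(default_factory=dict)

    def send(self, body: str) -> None:
        self._visible_at[body] = 0  # visible immediately

    def receive(self, now: int) -> Optional[str]:
        for body, visible_at in self._visible_at.items():
            if now >= visible_at:
                # Hide the message from other consumers for the timeout window.
                self._visible_at[body] = now + self.visibility_timeout
                return body
        return None

    def delete(self, body: str) -> None:
        self._visible_at.pop(body, None)


def count_deliveries(visibility_timeout: int, processing_time: int) -> int:
    """Count how often one message is handed out while instance A is busy with it."""
    queue = ToyQueue(visibility_timeout)
    queue.send("order-1")
    deliveries = 0
    # Instance A receives the message at t=0 and needs `processing_time` seconds.
    if queue.receive(now=0) is not None:
        deliveries += 1
    # Instance B polls every 5 seconds while A is still processing.
    for now in range(5, processing_time, 5):
        if queue.receive(now=now) is not None:
            deliveries += 1
    queue.delete("order-1")  # A finishes and deletes the message
    return deliveries


print(count_deliveries(visibility_timeout=30, processing_time=45))  # 2
print(count_deliveries(visibility_timeout=60, processing_time=45))  # 1
```

With a 30-second timeout and 45 seconds of processing, the second instance receives the message at t=30, so it is delivered twice. With a timeout longer than the processing time, it is delivered once.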

Solution

We decided to change the SQS Message Visibility Timeout to see if that would fix the problem, and it did.
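This note does not record the new value or how the change was applied. As one possible sketch, the queue's default visibility timeout could be set through the SQS `SetQueueAttributes` API, here called via a boto3-style client passed in as a parameter. The helper name `set_visibility_timeout`, the 120-second value, and the queue URL placeholder are assumptions, not what the team actually used.

```python
from typing import Any

MAX_VISIBILITY_TIMEOUT = 43200  # SQS upper bound: 12 hours, in seconds


def set_visibility_timeout(sqs_client: Any, queue_url: str, seconds: int) -> None:
    """Set the queue's default visibility timeout via SetQueueAttributes."""
    if not 0 <= seconds <= MAX_VISIBILITY_TIMEOUT:
        raise ValueError("visibility timeout must be between 0 and 43200 seconds")
    sqs_client.set_queue_attributes(
        QueueUrl=queue_url,
        # SQS attribute values are passed as strings.
        Attributes={"VisibilityTimeout": str(seconds)},
    )


# Usage (requires boto3 and AWS credentials; URL and value are placeholders):
#   import boto3
#   set_visibility_timeout(
#       boto3.client("sqs"),
#       "https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>",
#       120,
#   )
```

The timeout should comfortably exceed the longest expected processing time for a single message; otherwise the message can still reappear while an instance is working on it.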

However, because our queue is an SQS Standard Queue, we will still run into duplicate deliveries at some point. Our application is already coded in a way that prevents duplicate processing.
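This note does not describe how the application guards against duplicates. One common approach, shown here only as a hypothetical sketch, is to make the handler idempotent by remembering which message IDs have already been processed. The name `make_idempotent_handler` and the in-memory set are assumptions; with several instances, the set of seen IDs would have to live in a store shared by all of them.

```python
from typing import Callable, List, Set


def make_idempotent_handler(handle: Callable[[str], None]) -> Callable[[str, str], bool]:
    """Wrap `handle` so that each message ID is processed at most once."""
    seen: Set[str] = set()  # assumption: in-memory only, not shared across instances

    def process(message_id: str, body: str) -> bool:
        if message_id in seen:
            return False  # duplicate delivery, skip it
        handle(body)
        # Record the ID only after the handler succeeded, so a failed
        # attempt can still be retried on redelivery.
        seen.add(message_id)
        return True

    return process


processed: List[str] = []
process = make_idempotent_handler(processed.append)
process("id-1", "create order")
process("id-1", "create order")  # same message delivered again
print(processed)  # ['create order']
```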