# Database Transactions: Preventing Lost Updates

[Jozef Cipa](https://www.strv.com/blog/authors/jozef)  
Backend Engineer

---

Web applications are often complex systems consisting of several parts such as UI (frontend), API (backend), a database and often other 3rd-party services that the application depends on. Designing the API service properly so it’s robust, secure and works as expected goes without saying — but sometimes other factors should be considered and handled adequately.

These factors include **concurrency and duplication**. It is important to know about them and be aware of the challenges they entail and how to deal with them. In this article, we will look at three situations in which our API would normally work just fine but which encounter issues once concurrent or duplicate requests come in.

## Where’s My Data?

Imagine you’re working as a copywriter at a company. As part of your job, you prepare posts for the company blog and also proofread posts written by employees. One day, an employee writes a new post and asks you to review it. As you start reading it, you notice a typo.

*The quick brown fox jmps over the lazy dog*

Naturally, you correct it and save the post, so it looks like this:

*The quick brown fox jumps over the lazy dog*

But at the same time, as you’re reviewing the article, that employee’s also reading it and realizes they forgot to add something important.

*The quick brown fox jmps over the lazy dog Sphinx of black quartz, judge my vow*

So they go ahead and add it, saving the article. Now, since the employee saves the article later, it looks like this:

*The quick brown fox jmps over the lazy dog Sphinx of black quartz, judge my vow*

As you notice the change, you see that a whole new part was added but the typo that you just fixed has reappeared again. This is known as the **lost update** problem (also referred to as “mid-air collisions”).

The reason for this is simple: Since both of you had a loaded version of the article, when you updated it, the employee’s version inherently became outdated. When they updated the article by adding the missing text, they only saved their changes while the rest remained as it was when the article was first loaded — rewriting all changes published by you in the meantime.

This problem can be avoided by using **ETag**, an HTTP [header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag?ref=strv.ghost.io) that identifies a specific version of the content. Its value is not strictly defined and can be anything the user prefers but, usually, it’s a document hash (e.g. 3da5415599), modification timestamp (e.g. 1672443352) or a simple revision number (e.g. 1, v1, v1.1).

The idea here is that every time a post is created or updated, the API generates an ETag string for it that represents the current revision. This value is then stored and returned as a header to the client whenever it asks for the given post.

Let’s write a simple Node.js API that simply returns some mocked data and sets the ETag HTTP header, so we can see how it works.

```javascript
import Koa from 'koa'
import Router from 'koa-router'
import { koaBody } from 'koa-body'
import { createHash } from 'crypto'

const app = new Koa()
const router = new Router()

function createVersionHash(data) {
  return createHash('md5').update(data).digest('hex')
}

const demoText = 'The quick brown fox jmps over the lazy dog'
const demoPost = {
  id: 1,
  text: demoText,
  etag: createVersionHash(demoText),
}

router.get('/api/posts/:id', ctx => {
  ctx.set('ETag', demoPost.etag)
  ctx.body = demoPost
})

app.use(koaBody())
app.use(router.routes())
app.listen(3000)
```

If we run the script and call the endpoint, we will see that the ETag header is being set and returned in the response.

```bash
$ curl -i http://localhost:3000/api/posts/1

HTTP/1.1 200 OK
ETag: 961248836f12bcd8fada83b5ac06a7de
Content-Type: application/json; charset=utf-8
Content-Length: 182

{
  "id": 1,
  "text": "The quick brown fox jmps over the lazy dog"
}
```

Now, when the client wants to update the post, it sends the updated content along with the ETag header (`If-Match: "<etag value>"`) (note, that according to the [specification](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Match#directives), the ETag value is placed between double quotes) to the API.

The server first checks the header and validates it against the value stored in the database. If it matches, the request can proceed. If a change has been made, the ETag is outdated — so the request gets rejected, usually with an HTTP error 412 Precondition Failed. The client needs to resolve this situation by fetching the post again, thus retrieving the latest version.

Let’s add another endpoint that will handle the updating of posts.

```javascript
// ...

router.put('/api/posts/:id', ctx => {
  const etagMatch = ctx.request.headers['if-match']?.replace('"', '')
  if (!etagMatch) {
    ctx.status = 400 // Bad Request
    ctx.body = { err: 'ETag is missing' }
    return
  }
  if (etagMatch !== demoPost.etag) {
    ctx.status = 412 // Precondition Failed
    ctx.body = { err: 'ETag mismatch' }
    return
  }
  // proceed to update
  demoPost.text = ctx.request.body.text
  demoPost.etag = createVersionHash(demoPost.text)
  ctx.body = demoPost
})
```

This example shows how a first update can go smoothly — but subsequent attempts with outdated ETags will be rejected, preventing lost updates due to concurrency issues.

Utilizing ETags is a way to implement **optimistic concurrency control**. Instead of relying on locks, each transaction, before committing, first verifies that no other transaction has modified the data it has read. This approach is great when conflicts are rare but may have performance implications if they happen often.

---

**Note:** ETags are also used for **caching resources**, where the client (browser) sends the ETag with the request via the `If-None-Match: <etag>` header. The server only returns the content if the version on the server doesn’t match the one sent.

## Don’t Charge Me Twice!

Imagine you’re traveling in a car and decide to buy something online. The connection drops right when you press “Buy”. After reconnecting, you press the button again, resulting in a double charge. How does that happen?

It’s because the server didn’t know the request was a duplicate. The solution here is **idempotency** — ensuring that repeating a request won’t cause unintended side effects.

Idempotency is crucial for operations like payments, where accidental double charges are unacceptable. For example, Stripe and Shopify describe this approach to prevent multiple executions of the same request.

### How Does Idempotency Work?

The key is marking a request with a **unique Idempotency-Key header**. The server stores this key along with the response. If the same request comes with the same key, the server returns the stored response instead of executing the operation again.

Example in code:

```javascript
import { randomBytes } from 'crypto'

function generatePaymentId() {
  return randomBytes(20).toString('hex')
}

const payments = []
const userAccounts = {
  'john.doe@example.org': {
    email: 'john.doe@example.org',
    balance: 200,
  }
}

router.post('/api/payment', (ctx) => {
  const idempotencyKey = ctx.request.headers['idempotency-key']
  const { sender, amount } = ctx.request.body
  const userAccount = userAccounts[sender]
  const paymentByIdempotencyKey = idempotencyKey && payments.find(p => p.idempotencyKey === idempotencyKey)
  if (paymentByIdempotencyKey) {
    ctx.set('Idempotent-Replayed', true)
    ctx.status = paymentByIdempotencyKey.code
    ctx.body = {
      payment: paymentByIdempotencyKey,
      userAccount,
    }
    return
  }
  const payment = {
    id: generatePaymentId(),
    sender,
    amount,
    idempotencyKey,
    status: null,
    code: null
  }
  if (userAccount?.balance >= amount) {
    userAccount.balance -= amount
    payment.status = 'OK'
    payment.code = 200
  } else {
    payment.status = 'NO_MONEY'
    payment.code = 400
  }
  // record payment
  payments.push(payment)
  ctx.status = payment.code
  ctx.body = {
    payment,
    userAccount,
  }
})
```

When we send the same request with the same Idempotency-Key, the server responds with the same result, preventing duplicate charges.

Testing:

```bash
$ curl -i -X POST http://localhost:3000/api/payment -d '{"sender": "john.doe@example.org", "amount": 100}'
# Response: payment created and account charged

$ curl -i -X POST http://localhost:3000/api/payment -H 'Idempotency-Key: 77e76f80-0466-4e83-95bf-bf754eefa37c' -d '{"sender": "john.doe@example.org", "amount": 100}'
# Response: same as above, reusing response if request retried
```

If a duplicate request is sent with the same key, the server responds with the original payment, avoiding double processing.

## Deduplication ID

Idempotency isn’t only for APIs. Message queues also implement similar concepts using **deduplication IDs**. Once a message with a specific deduplication ID is published, subsequent messages with the same ID within a certain interval (e.g., 5 minutes in AWS SQS) are ignored.

## Database Transactions

In cases where HTTP headers and idempotency aren’t enough, **SQL database transactions** provide a more robust solution.

Transactions ensure data integrity via the **ACID** principles:

- **Atomic:** All or nothing; if any query fails, the whole transaction rolls back.
- **Consistent:** Respects all constraints, maintaining valid state.
- **Isolated:** Runs independently without interference, using locks.
- **Durable:** Once committed, persists through crashes.

### Isolation Levels

Isolation levels configure how transaction concurrency is managed:

- **Read uncommitted:** Allows dirty reads (not supported by Postgres)
- **Read committed:** Default in Postgres; only read committed data
- **Repeatable read:** Prevents non-repeatable reads
- **Serializable:** Highest; transactions behave as if run serially

More details: [Check out this article](https://lchsk.com/database-transactions-concurrency-isolation-levels-and-postgresql?ref=strv.ghost.io).

### Locking

Locks can be applied on tables or rows, preventing simultaneous conflicting modifications. For example, `SELECT ... FOR UPDATE` acquires a lock on selected rows, preventing other transactions from modifying them until the lock is released.

Alternatively, `SELECT ... FOR UPDATE NOWAIT` attempts to acquire the lock and errors immediately if unavailable.

Example use case:

```javascript
await sequelize.transaction(async (trx) => {
  const user = await User.findOne({
    where: { id: webhook.userId },
    lock: transaction.LOCK.UPDATE, // SELECT ... FOR UPDATE
    transaction: trx,
  });
  if (user.accountStatus !== webhook.userStatus) {
    await user.update({ accountStatus: webhook.userStatus }, { transaction: trx });
    // send notification
  }
});
```

This ensures only one transaction updates the user at a time, avoiding duplicate notifications or inconsistent state.

---

## Summary

I’ve shown you some essential concepts that could occasionally come in handy:

- Use **ETag** to prevent accidental overwrites in concurrent edits.
- Implement **idempotency** to avoid duplicate operations like double charges.
- Understand how **database locks and transactions** protect data integrity.

Analyzing your project thoroughly and choosing appropriate methods for your specific scenario will save you trouble and bugs down the line.

---

*Don't miss anything.*