Read committed re-evaluation

Read committed. Default Isolation level for postgres based transactions. As Isolation level it prevents transactions to see uncommitted changes of other transactions(transaction still sees own uncommitted changes). If you are not familiar with this isolation level I highly encourage you to take a look at the Official documentation. However, some sentences from documentation were unclear fo me so I assume that I could clarify them for those who are as bad in Database theory as I am.

Tom Reads book
Tom & Jerry by William Hanna and Joseph Barbera

Re-evaluated

The most confusing sentence for me was the one that describes re-evaluation of WHERE clause during two concurrent updated, namely:

The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. In the case of SELECT FOR UPDATE and SELECT FOR SHARE, this means it is the updated version of the row that is locked and returned to the client.

Will the second updater see all the changes made by the first one? Let's see with our best friends Alice and Bob and the best designed table called account CREATE TABLE account(id int primary key,name text,amount int);

Now let's insert two values into the table
INSERT INTO account(id,name,amount) VALUES (1,'Bob',200),(2,'Bob',800); . Then open two parallel transactions using psql (BTW, do you know about pgcli? I use it daily instead of heavyweight pgAdmin). The first transaction will execute (and not commit) the following query
test> update account set amount = 300 where amount = 200; --1 .
And the second one will execute this query
update account set amount = 400 where amount = 200;--2.
So as the second transaction wants to update a row which has already updated and hasn't been committed yet it will be locked. The question is , will it update amount to 400 when the first transaction will commit it's changes. Let's see , commit first transaction and see if the second transaction will update amount from 300 to 400.
test> select amount from account order by name desc
800
300

So far so good, Postgres follows the documented behavior. The second transaction re-evaluated WHERE clause and as the first transaction committed changes then the second one didn't update any row cause the amount was changed from 200 to 300 and search condition in the second transaction skipped this row.

What about Subqueries?

And here is the most interesting part. Let's return to our previous statement when Bob had two accounts with 800 and 200 respectively.Open two parallel transactions.First one will execute the following query (remember transaction didn't commit this update)
UPDATE account set amount = 100 where amount = 200; -- 1 . And after that the second transaction will execute this query.
UPDATE account set amount = amount+50 --2
where name in
(select DISTINCT name from account group by
name having sum(amount)>=1000); -- 2

So the second transaction wants to add 50 to all accounts whose total amount is bigger or equals to 1000. As it wants to update the same row as the first transaction it will be locked. Bob's total amount is 1000. However the first transaction changed the amount of the first account from 200 to 100 so now Bob should have 900 as soon as the first transaction will be committed. Will the second transaction re-evaluate subquery and skip Bob's accounts? Let's see. Commit first transaction and then the second one. Execute select and account table will look like this.
SELECT * from account
-[ RECORD 1 ]
id | 1
name | Bob
amount | 850
-[ RECORD 2 ]
id | 2
name | Bob
amount | 150

Which means that the second transaction didn't re-evaluate subquery and updated Bob's accounts even if their total was 900 before changes from the second transaction. Why ? As far as I understand , the WHERE clause in the second query where name in... was reevaluated and it checked if name was changed. However the subquery has already been finished the execution and it saw the state when Bob had total amount equals to 1000. How to fix it? Well the easiest way would by to execute two transactions in REPEATABLE READ isolation level. In this case the second transaction will fail with could not serialize access due to concurrent update exception and your application code will have to add retry logic for this transaction.

Conclusion

For me it was unobvious that READ COMMITTED doesn't re-evaluate subqueries inside WHERE clause but this is how it works. Here is my StackOverflow question related to it.