Why you can't allocate a human-readable ID before the record exists
Posted by Gronax_au@reddit | programming | 28 comments
Author here. The non-obvious thing is that the idempotency key needs to gate the slot reservation, not the number assignment. The number has to stay inside Phase 2 or you get phantom allocations with no backing record. Happy to discuss the reaper design or the 2PC comparison if anyone's curious.
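For anyone who wants the shape of it, a rough sketch of the two phases (SQLite stand-ins and made-up names, not the actual cmdock code):

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")  # stand-in for the allocation DB
db.executescript("""
    CREATE TABLE slots (
        slot_id      INTEGER PRIMARY KEY AUTOINCREMENT,
        idem_key     TEXT UNIQUE NOT NULL,  -- idempotency key gates the reservation
        status       TEXT NOT NULL DEFAULT 'pending',
        task_uuid    TEXT,
        human_number INTEGER                -- only ever set when Phase 2 commits
    );
    CREATE TABLE counter (n INTEGER NOT NULL);
    INSERT INTO counter VALUES (0);
""")

def write_to_task_store(payload):
    # Stand-in for the TaskChampion write; returns the task UUID.
    return str(uuid.uuid4())

def create_task(idem_key, payload):
    # Phase 1: reserve a slot keyed by the idempotency key. A retry with the
    # same key hits the UNIQUE constraint and resumes instead of re-reserving.
    try:
        slot_id = db.execute("INSERT INTO slots (idem_key) VALUES (?)",
                             (idem_key,)).lastrowid
        db.commit()
    except sqlite3.IntegrityError:
        slot_id, status, number = db.execute(
            "SELECT slot_id, status, human_number FROM slots WHERE idem_key = ?",
            (idem_key,)).fetchone()
        if status == "committed":
            return f"SYN-{number}"  # already done; retry gets the same answer

    # Phase 2: write the task, then (and only then) assign the human number.
    # If the task-store write throws, the slot stays 'pending' for the reaper
    # and no SYN-n is ever consumed.
    task_uuid = write_to_task_store(payload)
    db.execute("UPDATE counter SET n = n + 1")
    number = db.execute("SELECT n FROM counter").fetchone()[0]
    db.execute("UPDATE slots SET status = 'committed', task_uuid = ?, human_number = ? "
               "WHERE slot_id = ?", (task_uuid, number, slot_id))
    db.commit()
    return f"SYN-{number}"

print(create_task("client-abc-1", {"description": "demo"}))  # -> SYN-1
print(create_task("client-abc-1", {"description": "demo"}))  # retry -> SYN-1 again
```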
rsclient@reddit
QQ about the post: why is burning an id such a bad thing? The primary value of the ID is to be able to say "I'm working on SYN-42". Does it matter that SYN-41 got burned? Does it even matter if a substantial number of the ids are burned?
As a (former) PM, though, a hint to anyone who does something like this: you can avoid manager freak-out by making sure they show up in the dashboard as "automation task" or something innocuous. Be sure they don't show up in the "overdue tasks" pie slice!
drcforbin@reddit
I worked on a system that tracked patient samples. They were entered into the system when they came in the door, and each was assigned a number that incremented through the day. If they went 260505-1002, 260505-1003, 260505-1005, we'd get a call every time to track down the missing sample, asking whether it was lost in the lab, someone deleted it, or the system had a bug.
Gronax_au@reddit (OP)
Yep, a burned ID isn't catastrophic and a gap in the sequence doesn't break the communication value. The concern is what frequent burning indicates about the system. If retry logic is misfiring and burning at a high rate you'd have no insight. The numbers just quietly increment with nothing behind them. Clean allocation is a health indicator as much as a UX feature. It also avoids the PM asking "Who's got SYN-41?"
In practice an occasional burned ID from a bad deployment or network failure is tolerable. The goal is keeping burns rare and avoiding the massive sequence gaps you get from a failure loop thrashing the allocator.
Gronax_au@reddit (OP)
Two reasons. Like you mentioned, it's the "who's got SYN-41?" problem. The other, which is more severe, is a sync client thrashing on retries and leaving large gaps.
nNaz@reddit
Good read, I agree with your point about allocating IDs at the end. To go even simpler than 2PC and Raft: can't you just make the entire thing atomic on the server side?
Single threaded actor with a queue that’s responsible for handing out human readable ids: client sends a single request, http handler routes to actor, actor completes or fails atomically. Because it’s an actor behind a queue you can support concurrent requests and because it’s atomic you don’t need the extra two step allocation. Does this sound like a ridiculous idea?
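Toy sketch of what I mean (Python stand-in with a plain thread and queue, not a real actor framework):

```python
import itertools
import queue
import threading

class IdActor:
    """All allocations funnel through one worker thread, so concurrent
    HTTP handlers never race on the counter."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._counter = itertools.count(1)
        threading.Thread(target=self._run, daemon=True).start()

    def allocate(self, payload):
        # Called from any handler thread; blocks until the actor replies.
        reply = queue.Queue(maxsize=1)
        self._inbox.put((payload, reply))
        result = reply.get()
        if isinstance(result, Exception):
            raise result
        return result

    def _run(self):
        while True:
            payload, reply = self._inbox.get()
            try:
                number = next(self._counter)
                # ... do the actual task-store write for `payload` here ...
                reply.put(f"SYN-{number}")
            except Exception as exc:  # the whole request fails as a unit
                reply.put(exc)

actor = IdActor()
print(actor.allocate({"description": "demo task"}))  # -> SYN-1
```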
Gronax_au@reddit (OP)
Yeah, not ridiculous at all. The actor-behind-a-queue pattern is exactly the right tool for the concurrency case.
The ordering constraint in the article is solving something different though. It's addressing idempotency across retries. Even with a single-threaded actor, if the HTTP connection drops and the client retries, the actor sees a fresh request. Without idempotency key tracking it creates a second task with a new ID.
There's also, in my case, a two-store problem. TaskChampion and the allocation DB are separate systems that can't share a transaction boundary. An actor serialises access to the counter but it can't make "write to TaskChampion and write to the allocation table" atomic. If the actor crashes between those two writes you're back in reaper territory.
The actor and the two-phase structure solve different things; you'd want both. The actor crash is low probability in practice, but Murphy's gonna get me in PROD. The two-phase structure isn't there because I expect it to fail often. It's there so that when it does fail, recovery is guaranteed rather than manual.
nNaz@reddit
I understand it a little better now with more context.
To get around the actor crashing problem you could use an externally backed queue like Redis with multiple actors work-stealing. Each request stores the client idempotency key as well. To solve the transaction part you can have a reversible saga with external state in Redis.
Actor would work like this (rough sketch after the failure list below):
1. Pop item from queue incl idempotency key
2. Do first write; if it fails either return to client or retry
3. When the first succeeds, write (client key, allocation id) to a separate 'started' state in Redis
4. Attempt second write. If it fails, retry.
5. Update state from (3) so it contains (client key, allocation id, second id)
6. Return to client
Points of failure:
- Client http failure. Resolved by looking for any started state by idempotency key. If fully finished, return to client immediately. If partly finished, actor resumes.
- Actor failure. Resolved by having a single actor that restarts on failure. On start it picks up any incomplete started states and completes them, then goes to consuming from the work queue.
- DB failure between the two DB writes. Handled by the actor retrying the second (assuming no actor failure).
- DB failure (when the actor also fails). Handled by actor restart behaviour: at start it looks for any incomplete 'started' states and completes them. Future client requests read the completed state as discussed above.
- Server/actor failure after the phase 2 write but before updating Redis. In your design you resolve this with a reaper. In the actor design it's resolved by the new actor picking up 'started' work, seeing P1 is complete and attempting to create a new P2 row, which fails due to uniqueness on the allocation id. The actor is aware this row exists, so it updates the 'started' state with the TC id and returns to the client.

Partially resolved failures:
- Actor failure between first write completion and updating Redis. This failure is also present in your design when the process fails between Phase 1 and starting Phase 2. In both cases it's partially resolved by the client retry kicking off a new phase 1: you still get the correct result but now have phantom rows in the slot reservation DB (for both our designs).
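Roughly, in code (a sketch assuming redis-py and a local Redis; `phase1_allocate` / `phase2_write` are made-up stand-ins for your two real writes):

```python
import itertools
import redis  # assumes redis-py; the key names below are made up

r = redis.Redis(decode_responses=True)
_alloc = itertools.count(1)

def phase1_allocate(payload):              # stand-in for write 1 (slot/allocation)
    return f"alloc-{next(_alloc)}"

def phase2_write(allocation_id, payload):  # stand-in for write 2 (TC row)
    return f"tc-{allocation_id}"

def handle(idem_key, payload):
    # Step 0-ish: if this key already finished, return the stored result.
    state = r.hgetall(f"started:{idem_key}")
    if state.get("tc_id"):
        return state["tc_id"]

    # Steps 2-3: first write, then persist (client key, allocation id) as
    # 'started' state before attempting the second write.
    if "allocation_id" in state:
        allocation_id = state["allocation_id"]        # resume a partial run
    else:
        allocation_id = phase1_allocate(payload)
        r.hset(f"started:{idem_key}", mapping={"allocation_id": allocation_id})

    # Steps 4-5: second write, then mark the saga complete.
    tc_id = phase2_write(allocation_id, payload)
    r.hset(f"started:{idem_key}", mapping={"tc_id": tc_id})
    return tc_id

def recover_on_start():
    # Actor restart: finish any saga that has an allocation but no tc_id yet.
    for key in r.scan_iter("started:*"):
        state = r.hgetall(key)
        if "allocation_id" in state and "tc_id" not in state:
            r.hset(key, mapping={"tc_id": phase2_write(state["allocation_id"], None)})
```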
Would this logic be a bad idea for your use case? I admit I still don’t fully understand all the context.
Or an even simpler design (but depends on your context):
- Client idempotency key, but only used for TC lookup, not request gating
- Actor with external queue
- State in Redis holding 'fresh allocations'

Actor performs:
0. Check TC to see if a row with the client idempotency key exists. If so return that row. Finish.
1. See if any unused Phase 1 allocations are in 'fresh allocations' in Redis. If so, skip step 2.
2. Phase 1 allocated and written to 'fresh allocations' in Redis
3. Phase 2 attempt. On failure, retry.
4. Remove allocation id from 'fresh allocations'
5. Return result to client
Because the allocations aren’t tied to any client we don’t need client state. Let’s say client 1 makes a request, allocation created, actor fails before TC created. Then client 2 makes a request, restarted actor will use the allocation originally created for client 1 to proceed for client 2. The result is a temporally sequential id. Client 1 retries, gets an ID later than client 2 (via a new allocation).
Failure modes:
- Client retries. Resolved by clients always seeing a new sequentially ordered ID or an error.
- Actor failure after work queue consumption but before P1. Resolved by the HTTP handler returning an error on actor failure. Client retries and adds the same work to the queue; the new actor picks it up.
- Actor failure after P1 but before writing to Redis. This is also a failure mode in your design. Leads to phantom rows in the allocation DB. Operations still work normally.
- Actor failure before P2. Client gets an error. The next request to the restarted actor (from any client) reuses the previous allocation (stored in 'fresh allocations').
- Actor failure after P2 but before 'fresh allocations' is cleared. Actor picks up the original P1, attempts P2 but gets a DB uniqueness error on the allocation id. Because it was never removed from 'fresh allocations' and was never returned to a client, we can reuse it for this request and remove the allocation id from 'fresh allocations'.
- Actor failure after removing from 'fresh allocations' but before returning to the client. Solved in the same way as your method: by looking up the client idempotency key in TC at step 0.

(Rough happy-path sketch below.)
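Happy-path sketch of this second design (same caveats, stand-in helpers):

```python
import itertools
import redis  # assumes redis-py; names are illustrative

r = redis.Redis(decode_responses=True)
_alloc = itertools.count(1)

def tc_lookup_by_idem_key(idem_key):                  # stand-in: step 0 lookup in TC
    return None

def phase1_allocate():                                # stand-in for write 1
    return f"alloc-{next(_alloc)}"

def phase2_write(allocation_id, idem_key, payload):   # stand-in for write 2
    return f"tc-{allocation_id}"

def handle(idem_key, payload):
    # 0. Already created? Return the existing row.
    existing = tc_lookup_by_idem_key(idem_key)
    if existing:
        return existing

    # 1-2. Reuse an orphaned fresh allocation if one exists, else make one.
    allocation_id = r.srandmember("fresh_allocations")
    if allocation_id is None:
        allocation_id = phase1_allocate()
        r.sadd("fresh_allocations", allocation_id)

    # 3-4. Second write, then retire the allocation from the pool.
    # (A uniqueness error here would mean P2 already committed for this id.)
    tc_id = phase2_write(allocation_id, idem_key, payload)
    r.srem("fresh_allocations", allocation_id)
    return tc_id
```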
This is a drastically simpler design with no reaper needed but I’m not sure if I fully understand your context. Would you be against sharing your full constraints & context for the problem?
Gronax_au@reddit (OP)
Wow that was a long analysis! The saga pattern is right for this problem and you are also right in that the phantom row problem exists in both designs.
To answer your questions:
Clients don't specify the human-readable ID; the server assigns it.
Allocation rows store the idempotency key, slot ID, status, and eventually the task UUID and human key once Phase 2 commits.
TC rows are standard TaskChampion format, with no server-specific state stored there, and they're outside the transaction scope of the allocation DB.
Currently it's "first successful commit time", with the number assigned when Phase 2 commits. Your second design makes this explicit.
Worth noting too: the assumption is that clients are only occasionally connected, so offline flows have to be supported.
The constraint your second design doesn't address is that cmdock is designed for self-hosted deployment, ideally a single binary with minimal dependencies and no other servers to run. The current design uses the existing allocation DB as the coordination layer, which is already a dependency, so there's no additional infra. Adding Redis would work but it changes the deployment model significantly. I didn't want users spinning up and maintaining a Redis server even if it was in a Docker compose file. It would be another moving part that wasn't under direct application control.
Your "fresh allocations" pool in Redis and the pending slot table in the allocation DB are doing the same job. Same structure but using Redis as another state store. I chose to keep it within the cmdock server for deployment simplicity and to minimise external dependencies. Other people may have chosen a different deployment shape.
dom_ding_dong@reddit
I'm still confused. Why do you need ordering? You have a queue, you pop one off and hand it to a client. A client either finishes and acks, which removes the token permanently, or it fails, times out, or rejects, and the token goes back on the queue.
You still get temporal ordering of sorts. If there is a dependency then that has to be handled separately.
Gronax_au@reddit (OP)
The ordering constraint in the article is about when the number gets assigned, either before or after the task definitely exists in both stores.
The slot ID in the article coordinates creation across the two data stores and isn't human readable. Once both writes have committed successfully the human-readable ID is assigned. If either write fails, the slot burns but no human ID is consumed. A crash between the two writes (startup thrashing, OOM kill, deployment restart) leaves a pending slot for the reaper to clean up, not a phantom human ID.
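For a rough idea of what the reaper does (toy sketch, not the actual cmdock code; assumes the slot table has a created_at column):

```python
import sqlite3
import time

GRACE_SECONDS = 300  # don't touch slots that might still be mid-request

def reap(db):
    # A pending slot older than the grace window has no committed task behind
    # it, so burn it: the slot is consumed but no human SYN-n ever was.
    # (A real reaper might first check the task store for an orphaned task.)
    cutoff = time.time() - GRACE_SECONDS
    cur = db.execute(
        "UPDATE slots SET status = 'burned' "
        "WHERE status = 'pending' AND created_at < ?",
        (cutoff,),
    )
    db.commit()
    return cur.rowcount

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE slots (slot_id INTEGER PRIMARY KEY, status TEXT, created_at REAL)")
db.execute("INSERT INTO slots VALUES (1, 'pending', ?)", (time.time() - 600,))
print(reap(db))  # -> 1
```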
dom_ding_dong@reddit
Ok but if the tasks are independent why does ordering matter?
You have one synchronized queue; any task can only get one number. If they use it, great. If they can't, it goes back in the pool. Even so, sometimes stuff will finish and die before the ack. The assumption is there is a vacuum task that reassigns a new number.
What I'm saying is I don't see the need for a two write system here unless I'm missing something
Gronax_au@reddit (OP)
The ordering constraint is about sequence within a single request. Does the human readable ID get assigned before or after the task definitely exists in the first datastore? Not about ordering between independent tasks.
The two-write system exists because the task store and the allocation DB are separate systems that don't (and can't) share a transaction scope. If they were the same datastore the queue approach works cleanly with one transaction: write the task and consume the token.
But the first datastore is an embedded task store with its own schema independent of the server's allocation DB. They can't share a transaction so when a request dies between the two writes you need to account for recovery. FYI the vacuum task you described is the reaper in the article. 😄
So the two phase approach here exists to make that recovery deterministic rather than ad hoc. This is a niche case but the technique is generally applicable when co-ordinating two or more datastores which is why I thought it was worth writing up.
dom_ding_dong@reddit
Ok that makes sense. Perhaps English is not your first language, but a more apt explanation would be that you have multiple queues of people doing things, each person is submitting a ticket for a job, and you have to make sure that within the queue it is synchronized. Which to be honest you could also just delay on the client side.
lucatk@reddit
please stop so obviously including AI slop text blocks into your replies.
- you're back in reaper territory?
- Murphy's gonna get me?!!
ooof
Jaded-Asparagus-2260@reddit
Great read, thanks! One note about your page layout: the images make the post hard to read on mobile, because you have to keep scrolling back and forth horizontally. I suggest scaling them down to viewport width and keeping pinch-to-zoom enabled so readers can zoom at their own discretion.
Gronax_au@reddit (OP)
Thanks. Obviously that didn’t get tested by me!
_xiphiaz@reddit
You can allocate a human readable identifier but it has to have equivalent entropy to a uuid, so it would probably be a whole paragraph. Also not sortable
zombiecalypse@reddit
Depends on your throughput. UUID is designed to be universally unique, but if you create 10 tickets per day, you can just use a counter and that's not ridiculously low for anything that needs a pet name. Even if you need to hand out allocated identifiers beforehand, that's not a lot of data to store in a list for each worker.
Blue_Moon_Lake@reddit
Also the odds of it being nicely insertable in an index, the way UUIDv7 is, are low
_xiphiaz@reddit
You actually could implement uuidv7 directly, as in rather than represent with base16 digits, represent it with base(length of dictionary) and use whole words represented by their index in a dictionary.
Sounds like a fun side project to implement, and might not be that bad with a short enough dictionary of unambiguous words like What3Words uses
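Something like this (toy sketch; WORDS is a tiny stand-in, a real dictionary would want a few thousand unambiguous words):

```python
import uuid

WORDS = ["apple", "brick", "cloud", "delta", "ember", "flint", "grape", "haze"]

def to_words(u):
    # Re-encode the UUID's 128-bit integer in base len(WORDS), one word per digit.
    n, base, out = u.int, len(WORDS), []
    while n:
        n, digit = divmod(n, base)
        out.append(WORDS[digit])
    return "-".join(reversed(out)) or WORDS[0]

def from_words(phrase):
    n = 0
    for word in phrase.split("-"):
        n = n * len(WORDS) + WORDS.index(word)
    return uuid.UUID(int=n)

u = uuid.uuid4()  # a UUIDv7 would encode the same way and keep its sortable prefix
print(to_words(u))
assert from_words(to_words(u)) == u
```

With only 8 words you need ~43 of them per ID; a 4096-word dictionary gets that down to about 11, which is still the "whole paragraph" the parent mentions.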
erocuda@reddit
Use a n-gram model and encode it in base-64 using the 64 most likely words to follow the previous word selected. Now it reads like a somewhat grammatical fever dream.
DrShocker@reddit
Correct me if I'm wrong, but wouldn't this still be best represented as a 128 bit number in the computer, and we're just constructing a "pretty printer" to make it seem legible. I guess we'd want to map bidirectionally though.
Gronax_au@reddit (OP)
My use case here is specifically sequential integers. The value is "I'm working on SYN-42" in a standup, which only works if the numbers are short and speakable. Dictionary-style identifiers are a great fit for spatial coordinates like What3Words, where you need high entropy and users are conditioned to it. But for task keys you want monotonic and short. Two different optimisation targets.
nardii@reddit
What if the human-readable identifier is a derived value? If you have an append-only workflow, you could simply read the identifiers as a rule: PREFIX-42 means taking the 42nd row after you filter by tag PREFIX and you sort by creation timestamp and unique ID. And then you can keep a lookup table that maps between unique ID and human readable ID on the side for easy joins :)
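A sketch of the read-time mapping (made-up schema):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (uuid TEXT PRIMARY KEY, tag TEXT, created_at REAL)")

def human_ids(db, prefix):
    # PREFIX-n is just "the n-th row tagged PREFIX, ordered by (created_at, uuid)";
    # nothing is ever allocated, the key falls out of the sort position.
    rows = db.execute(
        "SELECT uuid FROM tasks WHERE tag = ? ORDER BY created_at, uuid",
        (prefix,),
    ).fetchall()
    return {f"{prefix}-{n}": u for n, (u,) in enumerate(rows, start=1)}
```

The lookup table on the side is then just a materialisation of that dict.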
Gronax_au@reddit (OP)
The derived-ID approach is interesting, but it highlights the core tension in the article. Derived IDs like UUIDs and position-based keys are elegant because there's nothing to protect; the ID falls out of the record's properties. Allocated IDs like SYN-42 introduce a whole class of problems that derived ones don't have. The reason to pay that cost is the communication value for the user, who wants a short, stable, speakable ID in a standup. The derived-ID approach gets close, but the deletion and concurrent-client-write cases mean the key isn't stable, which puts you back in the same problem area. I remember running into this same problem years ago assigning pole numbers for the electrical distribution network, where you had multiple writers and a need for human IDs that are numeric.
nardii@reddit
Yes you're right, it only works if you don't have deletes. But if that is the case it can be pretty simple, and the nice thing about deferring identifier creation to read time is that by the time the identifier is needed (e.g. to display on a webpage) the writes that fall within one timestamp "bucket" will already have settled (and otherwise you can delay it 1ms without much trouble). So you end up with little complexity but still a stable result from the user's POV :)
Gronax_au@reddit (OP)
Yep, agreed for that case: append-only, no deletes, server-assigned timestamps. The derived IDs will settle cleanly and you skip the allocation machinery and complexity entirely.
The case where that doesn't fit is with the settling assumption. Clients can write offline. A task created last Tuesday in airplane mode syncs today with a past timestamp which shifts every position-based ID assigned after it. The user says "SYN-42" in Monday's standup, another client syncs Wednesday, and now SYN-42 is a different task. The allocation is the cost you pay to prevent that.
Gronax_au@reddit (OP)
The derived-value approach is interesting but has a couple of failure modes. First is deletion: if SYN-41 is deleted, every key after it shifts unless the dataset is genuinely append-only, which my system isn't. Second is concurrent creation: when there are multiple clients writing simultaneously, sort-by-timestamp order isn't stable until the writes settle, so the human key can't be assigned until you can confirm the sort position. That is essentially the same ordering constraint the article describes, just deferred to read time rather than solved at write time. The lookup table you describe is close to what the allocation table is doing, just computed lazily and asynchronously.