If you define points origin as bottom left, corner as top right,
and
< (a, b ) {(a.x < b.x) and (a.y < b.y)})
you can make it even simpler conceptually (but, same actual comparisons in the end):
(self.origin < other.corner) and other.origin < self.corner
Often in problems like this it’s worth it to consider symmetries. The two sets of cases – the ones where a.start < b.start, and the ones where b.start < a.start, collapse into one by sorting a and b first by their start values. Now we can assume that a.start < b.start w.l.o.g, leaving only two cases to consider:
This is a good post, in that it shows how to work through a problem to a solution. I think though it should mention Allen’s Interval Algebra somewhere because it’s sort of a standard approach to working with intervals.
Definitely! There's actually a pretty interesting way to think about it.
For the 2D case, you could actually represent a box as 2 intervals: one for the x axis and one for the y axis. Two boxes overlap if their x axis overlap and their y axis overlap.
This can be generalized to any dimension. An n dimension "hyperinterval" can be made with n intervals. Two "hyperinterval" overlap if all the pairs of interval of the same axis overlap.
I would say that something so simple and straightforward doesn't warrant a blog post.
But. In my career I think I've seen this test implemented by coworkers 4 times, and every one was incorrect - not an exaggeration (though likely some sampling bias).
And even when I pointed out the failing cases, the fixed versions were always the naive 4-clause solution. Hell, when Postgres got range type support I remember seeing a Powerpoint promoting the new feature that exclusively used the 4-clause version throughout.
I guess there's something about this problem that makes it really unintuitive to many people.
I agree that in a sense it is kind of basic as a topic. But on the other hand it is often implemented in odd ways, usually not as the minimal expression.
For me, the main goal is really the insight that changing the viewpoint for cases can make a problem simpler. Seeing it on a simple case makes it easier to rememvber the idea for more complicated cases. And also, I got to add some fun visuals :-)
It kind of reminds me of the birthday problem: how do you calculate the probability that at least two people share the same birthday in a group of n people?
If you try to find the formula directly, you'll get into way too many cases that are hard to reconcile: how do I take into account the fact that 3 people sharing the same birthday is also valid? What about having multiple pairs of birthday?
It's easier to ask the opposite question: what's the probability that nobody shares a birthday? It's easy: imagine each person entering the room, one at a time. The first one obviously shares no birthday with anyone in that empty room, so they have 365/365 chances of not sharing a birthday with anyone else in that room. The 2nd one has 364/365 chances of not sharing their birthday with the 1st person, the 3rd one has 363/365 chances of not sharing their birthday with the 2 first people, the 4th one has 362/365 with the 3 first people and so on. The final probability is the product of all these probability, which would be 365364363...(365-n+2)(365-n+1)/365^n, or 365!/((365-n)!365^n) if you want to simplify
Then, the result of the original problem is just the complement: 1- 365!/((365-n)!*365^n).
To me, this kind of supports how useful doing math can be to software engineering. The topics themselves might not really help, but the approaches that you learn to solve problems are very useful.
The intent is more to give guidance on how one can think about creating Boolean conditions. And I agree, there are lots of very interesting problems for intervals and for boxes.
In a list of intervals, I would probably do something sweep-based. Keep a list of interval start and interval end-events sorted on the position. Then a sweep along all these can keep a current set of active intervals, adding to it when a start is processed and removing from it when a stop is processed.
Rydier@reddit
If you define points origin as bottom left, corner as top right,
and
< (a, b ) {(a.x < b.x) and (a.y < b.y)})
you can make it even simpler conceptually (but, same actual comparisons in the end):
(self.origin < other.corner) and other.origin < self.corner
another_throawayACC@reddit
Very nice post! As a beginner backend developer i really enjoied it and it will help me in tackling challenges in opposite perspective!
Sharlinator@reddit
Often in problems like this it’s worth it to consider symmetries. The two sets of cases – the ones where a.start < b.start, and the ones where b.start < a.start, collapse into one by sorting a and b first by their start values. Now we can assume that a.start < b.start w.l.o.g, leaving only two cases to consider:
dobryak@reddit
This is a good post, in that it shows how to work through a problem to a solution. I think though it should mention Allen’s Interval Algebra somewhere because it’s sort of a standard approach to working with intervals.
hueuebi@reddit
I enjoyed the post, thanks. Very well written.
mzl@reddit (OP)
Thank you
ArminiusGermanicus@reddit
Is there a way to generalize this to n dimensions?
BaNyaaNyaa@reddit
Definitely! There's actually a pretty interesting way to think about it.
For the 2D case, you could actually represent a box as 2 intervals: one for the x axis and one for the y axis. Two boxes overlap if their x axis overlap and their y axis overlap.
This can be generalized to any dimension. An n dimension "hyperinterval" can be made with n intervals. Two "hyperinterval" overlap if all the pairs of interval of the same axis overlap.
therealgaxbo@reddit
I would say that something so simple and straightforward doesn't warrant a blog post.
But. In my career I think I've seen this test implemented by coworkers 4 times, and every one was incorrect - not an exaggeration (though likely some sampling bias).
And even when I pointed out the failing cases, the fixed versions were always the naive 4-clause solution. Hell, when Postgres got range type support I remember seeing a Powerpoint promoting the new feature that exclusively used the 4-clause version throughout.
I guess there's something about this problem that makes it really unintuitive to many people.
mzl@reddit (OP)
I agree that in a sense it is kind of basic as a topic. But on the other hand it is often implemented in odd ways, usually not as the minimal expression.
For me, the main goal is really the insight that changing the viewpoint for cases can make a problem simpler. Seeing it on a simple case makes it easier to rememvber the idea for more complicated cases. And also, I got to add some fun visuals :-)
BaNyaaNyaa@reddit
It kind of reminds me of the birthday problem: how do you calculate the probability that at least two people share the same birthday in a group of n people?
If you try to find the formula directly, you'll get into way too many cases that are hard to reconcile: how do I take into account the fact that 3 people sharing the same birthday is also valid? What about having multiple pairs of birthday?
It's easier to ask the opposite question: what's the probability that nobody shares a birthday? It's easy: imagine each person entering the room, one at a time. The first one obviously shares no birthday with anyone in that empty room, so they have 365/365 chances of not sharing a birthday with anyone else in that room. The 2nd one has 364/365 chances of not sharing their birthday with the 1st person, the 3rd one has 363/365 chances of not sharing their birthday with the 2 first people, the 4th one has 362/365 with the 3 first people and so on. The final probability is the product of all these probability, which would be 365364363...(365-n+2)(365-n+1)/365^n, or 365!/((365-n)!365^n) if you want to simplify
Then, the result of the original problem is just the complement: 1- 365!/((365-n)!*365^n).
To me, this kind of supports how useful doing math can be to software engineering. The topics themselves might not really help, but the approaches that you learn to solve problems are very useful.
mzl@reddit (OP)
That is a great example of a similar case!
ymgve@reddit
Thought this was gonna be about a smart way to detect overlaps in a list of intervals, which is a more interesting problem
mzl@reddit (OP)
The intent is more to give guidance on how one can think about creating Boolean conditions. And I agree, there are lots of very interesting problems for intervals and for boxes.
In a list of intervals, I would probably do something sweep-based. Keep a list of interval start and interval end-events sorted on the position. Then a sweep along all these can keep a current set of active intervals, adding to it when a start is processed and removing from it when a stop is processed.