Let me put the weekly summaries aside for a moment and talk about the TCO 2018 finals experience, while it's still relatively fresh. As the TCO finals is one of the most important contests of the year, naturally its problems often receive deeper analysis than usual. In this post, I'd like to dive into the easy problem. The correct solution is described in the editorial, so I will focus on the incorrect ones: namely, all solutions submitted during the contest :)

I started to have my suspicions during the contest itself, as I was implementing a rather complicated floating-point solution with quite a few subtractions of possibly close numbers. However, at that point I concluded that, since it was the first problem, the floating-point part most likely checked out.

Then I saw 3 out of 6 solutions fail the system test, which was completely unexpected. I then watched the recording of Scott's and Lewin's broadcast (which seems to be gone now :( does anyone have a link?), where Michal and another Michal mentioned the top-down solution and said that the bottom-up solution operating on formal piecewise-linear functions is very tricky to get right, yet that was exactly the solution that all contestants seemed to implement.

The final realization came when I was looking at Um_nik's code for the problem, trying to understand why it could fail the system tests. It all seemed to check out, but it had one big difference from, say, my own code: it used 64-bit integers instead of doubles to represent the x-coordinates of the junction points of the piecewise-linear function. All junctions do have integer x-coordinates, so the representation itself is valid, and this strongly pointed to the 64-bit integers simply overflowing, which was indeed the issue.

But then it occurred to me: if 64-bit integers overflow, and the answer can be very small, how come the double-based solutions have enough precision? Armed with the knowledge that super large x-coordinates are possible, I've taken another look at my solution, and the potentially problematic line immediately stood out:

return value[pos] + slope[pos] * (at - control[pos]);

Here I'm trying to evaluate a linear function when x=at. There are actually two subtractions in this line: one denoted with a minus, and one with a plus :)

The at - control[pos] subtraction is actually not an issue by itself: both at and control[pos] are integers, and the best answer is always found at relatively small values of at, so when control[pos] is also small the subtraction is exact, and when it's big we're no longer subtracting two close numbers.

However, the plus is actually an issue. Suppose at is small and the value of the linear function there is also small, but both control[pos] and value[pos] are big. Then we're subtracting one huge number (-slope[pos] * (at - control[pos])) from another (value[pos]) to get a tiny result, and this is where a small relative error turns into a huge absolute error.
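Here's a tiny illustration of this effect that one could drop into a main method; the numbers are made up, but of roughly the magnitude that will appear in the construction below:

double control = Math.pow(2, 62); // huge x-coordinate of a junction point
double value = control + 2;       // huge y-coordinate; the +2 is already lost to rounding
double slope = 1;
long at = 1;                      // we evaluate near zero, where the exact result is 3
System.out.println(value + slope * (at - control)); // prints 0.0 instead of 3.0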

In order to construct a concrete testcase, we need to understand how to get a segment in our piecewise-linear function such that one end has both coordinates small, and the other end has both coordinates big. We can notice that if we take any rooted tree and its corresponding piecewise-linear function, and add a new root which has two children, the old tree and a new leaf with value 0, then the x-coordinates of all junction points get multiplied by 2, and the y-coordinates of the junction points get the absolute value of their original x-coordinate added to them.

Let's start with a tree with a root and two leaves, with values -1 and 1. Its piecewise-linear function has junction points (-2,2) and (2,2). Now we grow it using the above trick n times, obtaining a caterpillar (see the picture on the right), with the piecewise-linear function defined by the junction points (-2^(n+1), 2^(n+1)), (0, 2) and (2^(n+1), 2^(n+1)). Everything so far is computed exactly, as powers of two are representable in double.

Now we do one last caterpillar expansion step, but the new leaf has a value of either 1 or -1, depending on which exact bug we're trying to exploit. This causes the solution to evaluate the above linear function at the point -1 or 1. If it uses the right boundary of the segment to do the interpolation, then having value 1 in the new leaf will cause it to subtract, roughly speaking, 2^(n+1)-3 from 2^(n+1), which is bound to return 0 instead of 3 when n is big enough. Similarly, if a solution interpolates using the left boundary, then -1 will do the trick.

And sure enough:
Petr: The method returned 0.0 when it should have returned 3.0
tourist: The method returned 1.57412216096E21 when it should have returned 3.0
ACRush: The method returned 0.0 when it should have returned 3.0

In other words, all submitted solutions were incorrect, and the judges were completely right in saying that the piecewise-linear approach is super tricky to get right. Luckily, even if the system tests had caught this, it would have had no bearing on the top 4 in the standings.

Having identified the problematic line, and having understood the mechanics of the huge testcases, it's now relatively easy to come up with a fix: we just need to make sure that the plus in that line is always an addition, and never a subtraction. This is achieved by always interpolating from the endpoint of the segment with the smaller y-coordinate, or, practically, by changing the if condition from

if (pos < control.length) {
    // interpolate from the junction to the right of the query point
    return value[pos] + slope[pos] * (at - control[pos]);
} else {
    // we're past the last junction, so extend from the last one
    return value[pos - 1] + slope[pos] * (at - control[pos - 1]);
}

into

if (slope[pos] < 0) {
    // decreasing segment: the right endpoint has the smaller y-coordinate
    return value[pos] + slope[pos] * (at - control[pos]);
} else {
    // non-decreasing segment: the left endpoint has the smaller y-coordinate
    return value[pos - 1] + slope[pos] * (at - control[pos - 1]);
}

in my solution. I'm reasonably certain the solution is correct after this fix, but feel free to prove me wrong and come up with another challenge case :)

Note that there's one more small thing my solution does right from the precision point of view: I'm keeping some redundant information in my representation of a piecewise-linear function, storing ~3n values for a function with ~2n degrees of freedom. This allows me to evaluate each segment independently, and thus not care about precision errors on huge values which have no bearing on the final answer, which is always achieved at a relatively small point. Had I, for example, kept the entire control and slope arrays but only the left-most element of value, and used running sums to get the rest, I would get precision issues even with the above fix, as the low y-coordinates would already be computed incorrectly. It is possible to have only ~2n values without precision issues (i.e. keep only value and control, and the slope only before the first and after the last control point), but choosing such a representation correctly requires some understanding of the precision issues, while having some redundancy allows one to avoid more potentially problematic computations.
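For reference, here's a rough sketch of what the whole evaluation could look like with this redundant representation; the array layout is reconstructed from the snippets above, and the boundary handling may differ from my actual solution:

// control[0..n-1]: junction x-coordinates, sorted; value[0..n-1]: y-coordinate at each junction;
// slope[0..n]: slope[i] is the slope of the segment ending at control[i],
// and slope[n] is the slope to the right of the last junction - about 3n values in total.
static double evaluate(double[] control, double[] value, double[] slope, double at) {
    int pos = 0;
    while (pos < control.length && control[pos] < at) {
        pos++; // index of the segment containing at (a binary search works just as well)
    }
    if (pos == control.length || (pos > 0 && slope[pos] >= 0)) {
        // at is past the last junction, or the segment is non-decreasing: use the left endpoint
        return value[pos - 1] + slope[pos] * (at - control[pos - 1]);
    } else {
        // decreasing segment: use the right endpoint, which has the smaller y-coordinate
        return value[pos] + slope[pos] * (at - control[pos]);
    }
}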

Thanks for reading, and check back for more weekly summaries as I still hope to catch up with 2018 in 2018!
AtCoder has returned with its Grand Contest 027 during the Sep 10 - Sep 16 week (problems, results, top 5 on the left, my screencast, analysis). There was a pretty tense fight for the second place with cospleermusora getting problem B accepted with less than a minute remaining; but tourist's victory was not really in doubt as he finished all problems with 25 minutes to spare. Congratulations to both!

I've really enjoyed solving problem D (the choice of constructive problems for this blog is becoming quite a pattern, isn't it?): you need to construct any 500 by 500 matrix of distinct positive integers up to 10^15, such that if we take any two vertically or horizontally adjacent numbers a, b in the matrix and compute max(a,b) mod min(a,b) we always get the same non-zero result.

The second Open Cup stage, the Grand Prix of Udmurtia, followed on Sunday (results, top 5 on the left). Team Havka-papstvo had dug themselves into a hole with a lot of incorrect attempts, but then marvelously escaped by solving the most difficult problem with just 8 minutes remaining. Congratulations on the victory!

The Grand Prix of Udmurtia was a pioneer of interactive problems in the past, and this incarnation had four of those, too. Problem E went like this: the judging program has a hidden string of 1000 digits, each either 0 or 1. In one query, you ask about a segment [l,r], and the judging program returns one of the two things, each with probability 50%:
  • the number u of 1s in the given segment
  • a uniformly chosen random integer between 0 and r-l+1 that is not equal to u.
In other words, with probability 50% the judging program gives an incorrect answer to your query. Your goal is to reconstruct the hidden string using at most 18000 queries, with one more important restriction: you are also not allowed to ask the same query twice.

In my previous summary, I have mentioned another problem with segment queries: there's a hidden integer between 1 and 10^18. You can make queries, and in one query you give a segment [l,r] and the judging program tells you whether the hidden number lies in that segment. In case it does, and your segment has length 1 (l=r), you win. After each query, the hidden number can change a bit — more precisely, by at most k=10. These changes do not depend on your queries — in each testcase the sequence of changes is always the same. You have at most 4500 queries to win.

If the hidden number did not change, we would do a binary search: by querying the segment [1,m] we can compare the hidden number with m, so if we knew that our number was within some segment [a,b] before our query, we would narrow it down to either [a,m] or [m+1,b] after this query, and determine our number after at most ceil(log2(10^18))=60 queries.

When the hidden number changes under the hood, we need to adapt this approach: now we go from [a,b] to either [a-k,m+k] or [m+1-k,b+k]. When the segment [a,b] is big, this still divides it roughly in half, so we can proceed as before. However, when it becomes small, it will actually stop decreasing, and will never converge to a segment of length 1.

So we will do the following: when the length b-a+1 of the current segment is bigger than some boundary d, we will divide it in two using the above approach. And when it's d or less, we will just pick a random number c within the segment, and send the query [c,c]. With probability at least 1/d, we will win in this case. In case we don't, our candidate segment grows from [a,b] to [a-k,b+k], and we continue as before.

It's important to pick the right value of d: if it's too big, the probability of winning with each attempt of the second kind would be too low, and we won't always finish under 4500 queries. And if it's too small, it will take too many queries of the first kind between two queries of the second kind to reduce the segment size, so we would have too few queries of the second kind and also won't finish under 4500 queries. It's probably possible to find the mathematically optimal value of d, or we can just take a guess (I used d=99 during the contest) and verify that it leads to a good enough probability of finishing under 4500 queries.
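To make the above concrete, here's a rough Java sketch of this strategy; the ask method is just a stand-in for the actual interaction with the judge, and the constants are the ones mentioned above:

import java.util.Random;

public class GuessWithDrift {
    static final long MAX = 1000000000000000000L; // the hidden number is between 1 and 10^18
    static final long K = 10;  // it can drift by at most 10 after every query
    static final long D = 99;  // segment length at which we switch to random probing
    static final Random random = new Random();

    // Stand-in for the actual interaction: returns whether the hidden number is in [l, r] right now.
    static boolean ask(long l, long r) {
        throw new UnsupportedOperationException("replace with the real query protocol");
    }

    static void solve() {
        long a = 1, b = MAX; // the hidden number is always inside [a, b]
        while (true) {
            if (b - a + 1 > D) {
                // Query of the first kind: compare with the midpoint and halve the segment.
                long m = a + (b - a) / 2;
                if (ask(a, m)) {
                    b = m;
                } else {
                    a = m + 1;
                }
            } else {
                // Query of the second kind: guess a single point, winning with probability at least 1/D.
                long c = a + random.nextInt((int) (b - a + 1));
                if (ask(c, c)) {
                    return; // found the hidden number
                }
            }
            // The hidden number may have drifted by up to K after the query.
            a = Math.max(1, a - K);
            b = Math.min(MAX, b + K);
        }
    }
}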
Thanks for reading, and check back for more!
The Sep 3 - Sep 9 week started with Codeforces Round 507 on Wednesday (problems, results, top 5 on the left, my screencast, analysis). Once again the hardest problem was not solved during the round, and thus it all came down to speed on the first four problems. Um_nik was considerably faster than the competition, finishing the four problems in under an hour, and got a well-deserved first place. Congratulations!

Problem B in this round added a nice twist to a standard setting. This is an interactive problem in which you need to find a hidden integer between 1 and 10^18. You can make queries, and in one query you give a segment [l, r] and the judging program tells you whether the hidden number lies in that segment. In case it does, and your segment has length 1 (l=r), you win. So far this is a classical binary search problem.

Here comes the twist: after each query, the hidden number can change a bit — more precisely, by at most 10. These changes do not depend on your queries — in each testcase the sequence of changes is always the same.

Can you see how to adapt the binary search for this setup? You have at most 4500 queries to win.

On Sunday, the new season of the Open Cup kicked off with the Grand Prix of Zhejiang (results, top 5 on the left). This will most likely be the most brutal contest of the year :) As I understand it, this was the first "big" contest set by Yuhao Du, and the scoreboard reminds me of my own first Petrozavodsk contest, or my second TopCoder SRM. Team japan02 chose the solvable problems well and earned the victory by quite some margin. Well done!
Thanks for reading, and check back for more!
The Aug 20 - Aug 26 week was very calm compared to the previous ones, with just the TopCoder Open 2018 Online Wildcard Round on Saturday (problems, results, top 5 on the left, parallel round results, analysis). ACRush, Egor and Stonefeang were the only participants to solve all three problems, but ACRush and Egor were almost twice as fast in solving the hardest one, thus qualifying for the TCO onsite with a 250+ point margin. Well done!

The easy problem in this round was cute. You are given a 10x10 grid and can place at most 150 unit cubes onto it. A cube can be placed onto a cell of the grid, or on top of another cube. Given a number s between 1 and 500, you need to place the cubes in such a way that the surface area of the resulting figure (the total number of sides of the cubes that do not touch the grid or other cubes) is equal to s, or report that it's impossible.
Thanks for reading, and check back for more!
TopCoder SRM 736 started the Aug 13 - Aug 19 week (problems, results, top 5 on the left, my screencast, analysis). This was the first round under the new system in which one can qualify for the last online round, and even for the onsite round, of TopCoder Open 2019 by placing well in all SRMs in a quarter. rng_58 claimed the early lead in that leaderboard by winning the SRM by less than one full point!

Codeforces then held two rounds based on VK Cup Finals problems, starting with Round 504 on Friday (problems, results, top 5 on the left, analysis). Just as in the official round, the hardest problem remained unsolved, but this time the winner was determined by challenges. ko_osaga found 9 challenge opportunities and got a clear first place as a result. Well done!

Facebook Hacker Cup Round 3 on Saturday selected the 25 lucky finalists going to Menlo Park (problems, results, top 5 on the left, analysis). Xiao was quite a bit faster than the rest of the pack, qualifying to the onsite finals in style. Congratulations to Xiao and to all finalists!

Finally, another VK Cup-based Codeforces Round 505 wrapped up the week (problems, results, top 5 on the left, analysis). Once again the hardest problem remained unsolved, and once again it took solving all remaining problems plus finding a few challenges to earn a victory — and Swistakk did just that.
Thanks for reading, and check back for more!