String Minimisation

Back to General discussions forum

gardengnome     2023-09-22 11:33:43
User avatar

Interesting and fun problem, thank you.

Original problem allowed strings up to half-million character length, which prohibits using inefficient logic (something O(N^3) is obvious). However in this task we can't easily handle such amounts of data and, what is worse, can't keep them secret from user (as was the case in the contest). Perhaps we may add advanced version of the problem later, if it may be interesting.

I believe O(n) is possible but it needs some careful handling of the cases. Here are some examples that I found useful:

aaaccbb
aaacbbc
aacacbb
aabaccb
aaccabb
acacbba
Rodion (admin)     2023-09-22 12:55:05
User avatar

I believe O(n) is possible

Yes, seems like this - I initially was confused by these cases and tried to come up with longer and clumsier solution, which was "approaching O(n)" but some cleverly handcrafted tests in the end were failing by time or memory limit. I got frustrated and thought to leave it unsolved, but then recollected Clive's wisdom about working thoroughly from smaller examples with pencil and paper (simple, obvious, but easily ignored advice). It was a bit tedious, again, due to "cases", but after I worked it out the solution appeared much shorter and worked quick even in Python (I resided to rewriting code in Go at some point, but it still failed somehow, perhaps due to system bug).

Unless I'm mistaken there are just two cases to regard - forward and backward transfer. Inside these cases processing could be unified.

I uploaded my original solution (as unsolved) adding comments here. I think I proved correctness of this logic on the paper, but can't easily repeat the proof without thinking again.

CSFPython     2023-09-22 14:32:30

I saw the problem this morning just as I was about to go out. It looked fairly easy so I wrote a quick solution and submitted this to get a wrong answer. I had no time to look into it further. Now that I have returned I see that the example given in the problem contains the special case which I failed to spot. My code is now twice as long but now works. It is mainly O(n) but is actually O(n ln n) because I used a sort. Rodion's two directions make sense. The backward direction is my special case. The algorithm is certainly fast enough for millions of characters but I'm not sure that it is worth the effort of presenting this as an extra problem.

gardengnome     2023-09-22 14:53:14
User avatar

Hi, have a look at the third example given above - the solution should be aaacbbc.

On a technical note, python is quite bad with regard to the time complexity of list manipulations; deleting and inserting elements from/into a list is often O(n) for a single operation (except when an element is added or removed at the end of a list). Thus we often increase O() by an additional factor of n if those operations are used.

Rodion (admin)     2023-09-22 15:14:27
User avatar

deleting and inserting elements into a list is often O(n)

probably most languages implement lists/arrays this way? Hm-m-m... well, in PHP and JavaScript of course they have hash-like behavior... but I have no idea of particulars of the implementation... and if the elements are reindexed, this anyway will be O(n), right?

gardengnome     2023-09-22 15:23:10
User avatar

Some other languages offer more choice of data structures to address such issues (I still like Python :) ). None of this makes any difference for the CodeAbbey version but I would guess this would be important for the Ozon version with very long strings.

Rodion (admin)     2023-09-22 15:34:19
User avatar

Their contest was conducted with rules typical for CodeForces/TopCoder - in other words, similar to ACM/ICPC. Only three languages were supported (Go, C# and Python) - and as usual in such format there were sets of tests in order of increasing difficulty (user never sees input or output, however).

I also was solving in Python, despite I probably know it worse than Go, but it allows much shorter coding (even though I'm poor at using string find/index functions). Ozon's "classic" rules however mean that people using faster language can sometimes successfully submit less efficient solution. We historically/intentionally use different approach, so I tend to agree with Clive - whoever is curious about complexity, will prefer to write clever solutions - and for others, let them use naive approach at pleasure :)

CSFPython     2023-09-22 16:23:10

Mathias, your example of aaacbbc is a good one. You caught me with one of my common mistakes. I had used > when I should have used >=. I have now made the correction in my code. I'm impressed that you came up with some really helpful examples.

zelevin     2023-09-22 18:37:17

My first idea was similar to the one implemented by ozturkm2004 (which makes for truly compact code!), but I had an inkling that O(N) was possible and didn't want to settle for O(N log(N)). Then I realized that, while one can explicitly account for both cases, one is the [SPOILER] of the other one, and the same method would suffice with a simple change of the [SPOILER].

sam_bandara     2023-09-27 23:52:38

Hello All,

I removed my comment :-)

Cheers, Sam.

Please login and solve 5 problems to be able to post at forum