General discussions: Issues with task 150 - Introducing Regexps

qwerty 2025-06-06 08:33:02

Hello everyone,

I think there are several issues with this task.

Issue #1 - broken button

In the problem statement, after example input and answer, there is a note:

Use the button "RegExp" below to try your solution against sample input.

However...

When I click "Run it!" in the "Select Language" button group (with RegExp chosen as the solution language), a pop-up window appears and states: "Sorry, no executor yet for chosen language :("
When I click the "RegExp" button in the "Code Running Tools" section, nothing happens, even though I copied the code from the example section (after the words "Solution may start like:") into the solution code textbox.

Issue #2 - ignoring plus and minus signs

Plus and minus signs are valid parts of a numeric literal, but they are not covered in this task.

The following code compiles just fine on my PC...

#include <stdio.h>

int main() {
    /* %ld assumes a 64-bit long; 0b literals are a compiler extension before C23 */
    printf("Octal literal: %d\n", -076543210);
    printf("Hexadecimal literal: %ld\n", +0x0123456789ABCDEF);
    printf("Binary literal: %d\n", -0b1000101);
    printf("Decimal literal: %ld\n", +9876543210);
    return 0;
}

...but I don't see literals like the ones above in the test input.

Issue #3 - incomplete coverage of radix suffixes

As noted in the assembly language reference, the possible radix suffixes are:

h    hexadecimal
q/o  octal
d    decimal
b    binary

However, the radix values q/o for octal and d for decimal are not covered in this task.

Please add variants to the test input like the following...

26d
42q
42o

...and, of course, there are also variants with capitalized letters D, Q, O.
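For illustration, here is a minimal Python sketch of how such suffix forms could be matched. The digit sets per radix and the names SUFFIX_PATTERNS and classify_suffix are assumptions made for this example; they are not taken from the task statement or its checker.

```python
import re

# Hypothetical patterns for suffix-style literals (assumed digit sets).
# re.IGNORECASE covers the capitalized suffixes H, Q, O, D, B as well.
SUFFIX_PATTERNS = {
    "hexadecimal": re.compile(r"[0-9a-f]+h", re.IGNORECASE),
    "octal": re.compile(r"[0-7]+[qo]", re.IGNORECASE),
    "decimal": re.compile(r"[0-9]+d", re.IGNORECASE),
    "binary": re.compile(r"[01]+b", re.IGNORECASE),
}


def classify_suffix(token):
    """Return the names of all radixes whose pattern matches the whole token."""
    return [name for name, pattern in SUFFIX_PATTERNS.items()
            if pattern.fullmatch(token)]


if __name__ == "__main__":
    for token in ["26d", "42q", "42O", "89q"]:
        print(token, classify_suffix(token))
```

Under these assumed patterns, 26d is decimal, 42q and 42O are octal, and 89q matches nothing because 8 and 9 are not octal digits.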

Issue #4 - important test case missing

It seems that the chance to get 0 (a single zero) among the test data is very small (at least I have never happened to get 0). But 0 is an important test case...

qwerty 2025-06-06 09:51:07

In addition to issue #4.

0x and 0b may be of interest too:

0x - not valid, x is invalid suffix.

0b - valid, this is not a prefix 0b, but a binary number 0 with suffix b.
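These two edge cases can be checked with a short Python sketch. It assumes that the 0x and 0b prefixes require at least one digit after them and that the h and b suffixes require at least one digit before them; the names HEX, BIN and kinds are hypothetical and only serve this example.

```python
import re

# Assumed rules: prefix forms need digits after the prefix,
# suffix forms need digits before the suffix.
HEX = re.compile(r"0x[0-9a-f]+|[0-9a-f]+h", re.IGNORECASE)
BIN = re.compile(r"0b[01]+|[01]+b", re.IGNORECASE)


def kinds(token):
    """List which of the two assumed formats match the whole token."""
    found = []
    if HEX.fullmatch(token):
        found.append("hex")
    if BIN.fullmatch(token):
        found.append("bin")
    return found


if __name__ == "__main__":
    for token in ["0x", "0b", "0b1", "0x1F"]:
        print(token, kinds(token))
```

With these patterns 0x matches neither format, while 0b is accepted only through the suffix alternative, as the digit 0 followed by the suffix b.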

gardengnome 2025-06-06 10:09:55

Yes, the button for RegExp seems to be broken.

But with regard to the other points, the problem is called "Introducing Regexps". Its focus is on introducing the basic principles of regular expressions, not on having the most complete coverage of what binary/octal/... numbers could look like in certain languages. In my opinion, the problem is just fine as it is.

qwerty 2025-06-06 11:08:04

Yeah, the most important issue is that the button is not working.

The second most important issue is the absence of 0 in the test cases; even your code does not handle this simple case correctly.

All other issues are not that important.

gardengnome 2025-06-06 11:39:59

My code handles the problem as set just fine. The problem statement does not specify how octal and decimal zero would be represented, and thus this case is not covered.

qwerty 2025-06-06 11:56:46

By simple logic, decimal zero is 0 and octal zero is 00.

gardengnome 2025-06-06 12:03:47

That is your interpretation. The problem states 'values consisting only of digits 0-9 are considered decimals (but not ones starting with zero ...'.

qwerty 2025-06-06 12:21:51

You are right. From the problem statement it is not clear whether 0 is decimal or not.

And, after some thought, my definition of octal zero is wrong: it is not only 00, but any sequence of two or more zeros.

Rodion (admin) 2025-06-06 13:48:31

Hi Friends!

Thanks for raising the question - hopefully the button is fixed now (you may need to log in again or clear the cache). There was an inconsistency between the regex and regexp tags in the page scripts, sorry :(

As far as I understand the problem checker code (and now I understand it poorly), only positive integers are generated.

There is definitely an issue with 0 not belonging to one single format but being shared among all of them.

I honestly don't feel inclined to cover the various exciting formats invented by programmers. For example, assembly also uses the h suffix for hexadecimal in some systems, and you may have a hard time when you discover the modern hexadecimal format for floating point numbers (they have a hexadecimal mantissa but a binary exponent expressed in decimal). There are actually many queer formats which are hardly used. And there are group separators (now popular in the most recent versions of many languages).

You rightly point out that some of the advanced forms lead to confusion and contradictions.

By simple logic, decimal zero is 0 and octal zero is 00.

Honestly, I'm not sure I see the logic here :) A single 0 also starts with 0 and thus conforms to the old C octal format. The point is that all this is about tokenizing the source code of a program, and for the tokenizer it is not important whether you mean octal or decimal zero (at least I don't know of languages where it makes a difference).

qwerty 2025-06-06 15:08:59

Honestly, I'm not sure I see logic here :)

My logic is that if we consider 0 an octal literal, then there will be no decimal representation for zero at all, because 0 is reserved as an octal literal. Is that not counter-intuitive?

Rodion (admin) 2025-06-06 18:45:27

because 0 is reserved as octal literal. Is it not counter-intuitive?

To me it looks like 0 is a valid octal and a valid decimal literal simultaneously.

Let's look at the CPP reference:

octal-literal is the digit zero (0) followed by zero or more octal digits (0, 1, 2, 3, 4, 5, 6, 7)
decimal-literal is a non-zero decimal digit (1, 2, 3, 4, 5, 6, 7, 8, 9), followed by zero or more decimal digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Funny, they are more radical, not including 0 in decimal literals! I think it is just about implementation, though.

UPD: I peeked into the Java language specification and found that they, on the contrary, declare that an octal literal always consists of more than one digit (i.e. 0 is decimal).
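The difference between the two definitions can be shown with a small Python sketch. The regular expressions below are a direct reading of the wording quoted above and ignore digit separators and type suffixes; the names CPP_OCTAL, JAVA_DECIMAL, classify and so on are made up for this example.

```python
import re

# C++ wording as quoted: octal is 0 followed by zero or more octal digits,
# decimal starts with a non-zero digit.
CPP_OCTAL = re.compile(r"0[0-7]*")
CPP_DECIMAL = re.compile(r"[1-9][0-9]*")

# Java reading as described: octal needs more than one digit,
# so a lone 0 falls under decimal.
JAVA_OCTAL = re.compile(r"0[0-7]+")
JAVA_DECIMAL = re.compile(r"0|[1-9][0-9]*")


def classify(token, octal, decimal):
    """Return 'octal', 'decimal' or None for the given pair of patterns."""
    if octal.fullmatch(token):
        return "octal"
    if decimal.fullmatch(token):
        return "decimal"
    return None


if __name__ == "__main__":
    for token in ["0", "00", "17", "08"]:
        print(token,
              classify(token, CPP_OCTAL, CPP_DECIMAL),
              classify(token, JAVA_OCTAL, JAVA_DECIMAL))
```

Only the single 0 is classified differently: octal under the C++ wording, decimal under the Java one. A token such as 08 fits neither grammar in either reading.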