Assert a Get Text with <strong> in the HTML markup

I am trying to assert the contents of a tag in a test case.

HTML fragment contains

<div id="result"><strong>42</strong> results</div>

In my test case, this assertion fails because there are some invisible characters before and after “42”.

Get Text	id=result	==	42 results

What is the way to do this assertion?

The markdown engine is eating my HTML fragment.

<div id="result"><strong>42</strong> results</div>

I remember I wrote this code a while ago; hope it helps.

Count_Test

    # Count every link on the page, then log each link's text with its index.
    ${alllinkscount}=    Get Element Count    xpath://a
    Log To Console    ${alllinkscount}
    FOR    ${i}    IN RANGE    1    ${alllinkscount}+1
        ${linktext}=    Get Text    xpath:(//a)[${i}]
        Log To Console    \r[${i}], ${linktext}
    END

Hi @northernHemisphere

I tried your HTML snippet, and the <strong> does not influence the result of Get Text.

What is sometimes an issue is a &nbsp; (no-break space), which is a different character from a normal space.
Web developers often use these to ensure that a space does not lead to a line break.
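To illustrate the no-break space point, here is a minimal Python sketch. The page text is an assumed example, not taken from the real page; it only shows that U+00A0 and an ordinary space compare as different characters, and that replacing one with the other makes a plain comparison pass.

```python
import unicodedata

NBSP = "\u00a0"  # the character that &nbsp; decodes to

# Assumed example: what an element's text might look like when it contains &nbsp;.
page_text = f"42{NBSP}results"
expected = "42 results"

print(unicodedata.name(NBSP))   # official Unicode name of the character
print(page_text == expected)    # the two strings look alike but differ

# Replace the no-break space with a normal space before comparing.
normalized = page_text.replace(NBSP, " ")
print(normalized == expected)
```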

Maybe you can post the error message you get?

PS: Three “Back Ticks” ``` before and after the preformatted code do the trick here in the forum.


Now maybe as a last resort:
you can log the “non printable” characters like this:


    ${text} =    Get Text    id=result
    ${escaped} =    Evaluate     [c if c in string.printable else r'\x{0:02x}'.format(ord(c)) for c in $text]   modules=string
    Log To Console    ${escaped}

That odd-looking Python expression escapes non-printable characters,
so you can then figure out what is actually in the string.
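For readers who want to try the same idea outside Robot Framework, here is a rough standalone Python sketch of it. The function name and the sample string are assumptions made for illustration, and it formats the code point as four hex digits with a `\u` prefix rather than the `\x` prefix used above.

```python
import string


def escape_non_printable(text: str) -> list:
    """Return one entry per character, escaping anything outside string.printable."""
    return [
        c if c in string.printable else "\\u{0:04x}".format(ord(c))
        for c in text
    ]


# Assumed sample: a number wrapped in two invisible characters.
sample = "\u206842\u2069 results"
print(escape_non_printable(sample))
```

Note that `string.printable` only covers ASCII, so any non-ASCII letter would be escaped as well; for this kind of debugging that is usually what you want.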

I thought I’d jump to last resort to get an idea of what is in the string…

['x2068', '1', '7', 'x2069', ' ', 'r', 'e', 's', 'u', 'l', 't', 's']

It looks like \u2068 is a “start of strong” character.

See ⁨ - First Strong Isolate: U+2068 - Unicode Character Table

But I doubt that this has anything to do with the HTML <strong>.
I would guess that the content additionally contains these two start and end characters.

So I tested a bit more, and you can use JavaScript to write these Unicode characters into the element like this.

Evaluate JavaScript    id=result   e => e.innerHTML = "<strong>\u206842\u2069</strong> result"

Without the <strong>, the text is not rendered as strong on the page.

You can however replace these special characters when reading it with

Get Text    id=result    validate    re.sub('[\u2068\u2069]', '', value) == '42 result'

Or just return it with

Get Text    id=result    evaluate    re.sub('[\u2068\u2069]', '', value)
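The `re.sub` expression used in those assertions can be checked on its own in plain Python. This is only a sketch; the helper name and the sample value are assumptions.

```python
import re

# Character class matching U+2068 and U+2069, the two characters found above.
BIDI_ISOLATES = re.compile("[\u2068\u2069]")


def strip_isolates(value: str) -> str:
    """Remove the two isolate characters and leave everything else untouched."""
    return BIDI_ISOLATES.sub("", value)


# Assumed sample, matching what the JavaScript snippet writes into the element.
raw = "\u206842\u2069 result"
print(strip_isolates(raw) == "42 result")
```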

Thanks Rene.

Well, you learn something every day. Our internationalization library inserts these Unicode characters by default to support the display of bi-directional text. The HTML fragment I am testing is generated through that library (in our case, to make sure the pluralization of the word “result” matches the number of results), so it contains these Unicode characters. This turns out to be great, as we will be working with right-to-left languages in the future.

FSI (U+2068) and PDI (U+2069) are Unicode characters that help with the correct display of bi-directional text. For the benefit of future readers of this chat, see Unicode Isolation · projectfluent/fluent.js Wiki · GitHub.
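If you want to confirm what the two code points are without leaving your test environment, Python's standard `unicodedata` module can look them up. A small sketch:

```python
import unicodedata

# Print the code point, official name, and bidirectional class of each character.
for ch in ("\u2068", "\u2069"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
```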

Now that I know what is going on, I can design my test cases.

Regards.


Because the FSI and PDI characters are there, I think I’ll write the test to explicitly check for them.

Get Text    id=result    ==    \u206842\u2069 results

The alternative would be to strip them out, which would also get the test passing.
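As a rough sketch of the first option, the expected string can be built with a small helper instead of being typed by hand. Both function names are hypothetical, and the singular/plural rule is a simplified assumption about what the internationalization library produces.

```python
FSI = "\u2068"  # First Strong Isolate
PDI = "\u2069"  # Pop Directional Isolate


def isolate(value) -> str:
    """Wrap a value in FSI/PDI, as the page text does around the number."""
    return f"{FSI}{value}{PDI}"


def expected_result_text(count: int) -> str:
    """Build the expected text; the pluralization rule here is assumed."""
    noun = "result" if count == 1 else "results"
    return f"{isolate(count)} {noun}"


print(expected_result_text(42) == "\u206842\u2069 results")
```

Checking for the characters explicitly also verifies that the isolation marks stay in place, which stripping them out would hide.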
