Converting characters to UTF-16BE

Converting characters to UTF-16BE

Richard Taubo
Hi!

(Lasso 8.5)

I need to convert text to Unicode (UTF-16BE) hex, so that e.g. the text:
        Øse
becomes: 00D800730065 (or split up per character: 00D8 0073 0065 for Ø, s, e)

Based on the Unicode code points (see e.g. http://unicode-table.com/en/):
        Ø = U+00D8
        s = U+0073
        e = U+0065

Is there a way to transform e.g. Øse to 00D800730065 in Lasso 8.5,
or do I need to create some sort of look-up map from the info at e.g. http://unicode-table.com/en/ ?


Thanks for input! :-)


Richard Taubo

#############################################################

This message is sent to you because you are subscribed to
  the mailing list Lasso [hidden email]
Official list archives available at http://www.lassotalk.com
To unsubscribe, E-mail to: <[hidden email]>
Send administrative queries to  <[hidden email]>
Re: Converting characters to UTF-16BE

stevepiercy
To convert between character sets, you could try shelling out.

     iconv -f UTF-8 -t UTF-16BE input.txt > output.txt

You'll probably need to tweak a few parameters.  See 'man iconv'.

Then you can try hex encoding the result.

If Lasso doesn't provide an easy solution, I usually find
something in the OS or in Python, which almost always does have
just what I need.
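
As a rough illustration of the Python route, here is a minimal sketch (not Lasso, and not tested against Lasso 8.5). The function name utf16be_hex and its sep parameter are made up for this example; it relies only on Python 3's built-in "utf-16-be" codec and bytes.hex().

```python
def utf16be_hex(text: str, sep: str = "") -> str:
    """Return the UTF-16BE encoding of `text` as uppercase hex.

    `sep` is placed between 16-bit code units (e.g. " " for "00D8 0073 0065").
    """
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    # Split the encoded bytes into 2-byte code units and hex-format each one.
    units = [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]
    return sep.join(units)


print(utf16be_hex("Øse"))       # 00D800730065
print(utf16be_hex("Øse", " "))  # 00D8 0073 0065
```

Because the codec does the work, no look-up map is needed. Characters outside the Basic Multilingual Plane come out as two code units (a surrogate pair), so the output is not always four hex digits per character.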

--steve


On 10/9/15 at 6:08 PM, [hidden email] (Richard Taubo) pronounced:

>Hi!
>
>(Lasso 8.5)
>
>I need to convert a text to Unicode (UTF-16BE) hex, so that e.g. the text:
>Øse
>Becomes: 00D800730065 (or split up per character: 00D8 0073 0065)
>
>Based on the unicode numbers (see e.g. http://unicode-table.com/en/):
>Ø   =   U+00D8
>s   =   U+0073
>e   =   U+0065
>
>Is there a way to transform e.g. Øse to: 00D800730065 in Lasso 8.5
>or do I need to create some sort of look-up map via the info
>from e.g. http://unicode-table.com/en/ ?
>
>
>Thanks for input! :-)
>
>
>Richard Taubo

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Steve Piercy              Website Builder              Soquel, CA
<[hidden email]>               <http://www.StevePiercy.com/>


Re: Converting characters to UTF-16BE

Bil Corry-3
I don't have Lasso 8 to test against anymore, but what does this return?

var('b') = bytes('Øse');
$b->get(1);'<br>';
$b->get(2);'<br>';
$b->get(3);'<br>';

I think it returns the decimal values of those characters, so all you'd
need to do is convert to hex (and possibly swap the bytes around depending
on BE/LE).

You might need to set the page encoding to UTF-16 first.
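
To make the "convert to hex and possibly swap the bytes" idea concrete, here is a small Python sketch (again not Lasso). It assumes nothing about what Lasso 8.5's bytes type actually returns, which is not confirmed in this thread; it only shows that the byte values depend on the encoding used. The helper names swap_pairs and to_hex are made up for this example.

```python
text = "Øse"

# The byte values depend on the encoding the string is serialized in.
utf8_bytes = list(text.encode("utf-8"))         # [195, 152, 115, 101]
utf16le_bytes = list(text.encode("utf-16-le"))  # [216, 0, 115, 0, 101, 0]


def swap_pairs(values):
    """Swap each pair of bytes (little-endian <-> big-endian 16-bit units)."""
    swapped = []
    for i in range(0, len(values), 2):
        swapped.extend([values[i + 1], values[i]])
    return swapped


def to_hex(values):
    """Format decimal byte values as two-digit uppercase hex."""
    return "".join(format(v, "02X") for v in values)


print(to_hex(swap_pairs(utf16le_bytes)))  # 00D800730065
print(to_hex(utf8_bytes))                 # C3987365
```

If the bytes turn out to be UTF-8 (Ø as the two bytes 195 and 152), hex-encoding them gives C3987365 rather than the wanted 00D800730065, so the text has to be converted to UTF-16 before the hex step.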


- Bil


On Fri, Oct 9, 2015 at 6:08 PM, Richard Taubo <[hidden email]> wrote:

> Hi!
>
> (Lasso 8.5)
>
> I need to convert a text to Unicode (UTF-16BE) hex, so that e.g. the text:
>         Øse
> Becomes: 00D800730065 (or split up per character: 00D8 0073 0065)
>
> Based on the unicode numbers (see e.g. http://unicode-table.com/en/):
>         Ø       =       U+00D8
>         s       =       U+0073
>         e       =       U+0065
>
> Is there a way to transform e.g. Øse to: 00D800730065 in Lasso 8.5
> or do I need to create some sort of look-up map via the info from e.g.
> http://unicode-table.com/en/ ?
>
>
> Thanks for input! :-)
>
>
> Richard Taubo