[futurebasic] Re: [FB] Weird character encoding


From: Bernie <fblist.bw@...>
Date: Tue, 19 Jun 2007 11:16:23 +0100
Thanks Robert.

That's really helpful. I can now convert the PString to a CFString  
using kCFStringEncodingMacRoman then convert it back to a PString  
(for printing to disk) using kCFStringEncodingUTF8, saving me a hell  
of a lot of find/replace code.
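
For anyone following along without FutureBasic, the same two-step conversion can be sketched in Python. This only illustrates the encoding logic, not the Core Foundation calls themselves: `macroman_to_utf8` is a hypothetical helper name, and Python's built-in `mac_roman` codec is assumed to stand in for kCFStringEncodingMacRoman.

```python
def macroman_to_utf8(raw: bytes) -> bytes:
    """Decode MacRoman bytes to text, then re-encode that text as UTF-8."""
    text = raw.decode("mac_roman")  # analogous to building the CFString
    return text.encode("utf-8")     # analogous to reading it back out as UTF-8


# MacRoman bytes for opt-space (0xCA), opt-5 (0xB0) and opt-; (0xC9)
utf8_bytes = macroman_to_utf8(bytes([0xCA, 0xB0, 0xC9]))
print(utf8_bytes.hex(" "))  # c2 a0 e2 88 9e e2 80 a6
```

One thing to watch: each MacRoman byte can grow to as many as three UTF-8 bytes, so a long string may no longer fit in a Str255 after conversion.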

Cheers

Bernie


Robert P wrote:

>
> Bernie wrote:
>
>> My app generates a document that can be opened in one of Apple's  
>> developer tools. The problem I'm having is Apple's app identifies  
>> special characters as a weird character sequence that I haven't  
>> seen before - although I suspect it may be standard because  
>> TextWrangler handles them correctly.
>> Examples:
>> 	opt-space	=	¬†
>> 	opt-5		=	‚àû
>> 	opt-;		=	‚Ä¶
>>
>> Does anyone have a complete list of these character encodings or,  
>> better still, an easy way to convert them?
>
>
> UTF-8  <http://en.wikipedia.org/wiki/UTF-8>
>
> opt-space  NO-BREAK-SPACE        UTF-8: C2 A0
> opt-5      INFINITY              UTF-8: E2 88 9E
> opt-;      HORIZONTAL ELLIPSIS   UTF-8: E2 80 A6
>
> Interpreting UTF-8 wrongly as MacRoman characters gives results  
> akin to those that you show above.
>
> dim as Str255 s1, s2, s3
> s1[0] = 2 : s1[1] = 0xC2 : s1[2] = 0xA0 // opt-space
> s2[0] = 3 : s2[1] = 0xE2 : s2[2] = 0x88 : s2[3] = 0x9E // opt-5
> s3[0] = 3 : s3[1] = 0xE2 : s3[2] = 0x80 : s3[3] = 0xA6 // opt-;
> // show as MacRoman
> stop s1 + chr$( 13 ) + s2 + chr$( 13 ) + s3
>
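
The misreading Robert describes can also be reproduced outside FutureBasic. A minimal Python sketch, assuming Python's `mac_roman` codec matches the MacRoman table the Mac app uses; the variable names are illustrative only:

```python
# UTF-8 byte sequences taken from the table above
samples = {
    "opt-space": bytes([0xC2, 0xA0]),
    "opt-5": bytes([0xE2, 0x88, 0x9E]),
    "opt-;": bytes([0xE2, 0x80, 0xA6]),
}

for key, raw in samples.items():
    correct = raw.decode("utf-8")      # the character the bytes really encode
    misread = raw.decode("mac_roman")  # what a MacRoman reader displays instead
    print(f"{key}: U+{ord(correct):04X} shown as {misread!r}")
```

Each UTF-8 sequence decodes to a single character, while the MacRoman reading yields one character per byte, which is why the garbled forms are two or three characters long.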