[futurebasic] RFC a text cleaning DCOD


From: jonathan <jonathan@...>
Date: Fri, 31 Jul 98 16:50:36 +0200
Dear all

[been a long time since I posted anything substantial...]

= == = This is a 'Request for Comments', read on = == =

Explanation:
Having coded a lot of web pages recently (when I
should have been coding FB), I have been using my tool
LeComposteur (on Staz's web site as 'Oops! Caps Lock'
in the shareware section), and I have decided to redo this
tool to fit my current needs.
While thinking (dangerous occupation) I have decided to
do it as a DCOD, so that anyone can just drop it into an
app and get the same text manipulation functions there.
This means that you don't need to duplicate work that already
exists. OK, it would be cooler to do this as a .lib, but I
don't feel up to doing that!

In the latest incarnation the tool just handles 256 chars and
does upper and lower case manipulations, but taking into
account accented chars and ligatures (ligatures are strange
glyphs that you find in European languages, like 'ae' or
'oe' in French, and 'ss' becoming ß in German, as well as
the 'fi' and 'fl' that you find in standard typefaces).
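
As a rough illustration of the ligature handling described above, here
is a small Python sketch of an uppercase pass that expands ligatures.
It is not the DCOD's code: the table and the name upper_expanding are
made up for the example, and it works on Unicode strings rather than
on the 256-char Mac text the tool would see.

```python
# Hypothetical expansion table; the post only names ae, oe, ss, fi and fl.
LIGATURES = {
    "æ": "AE", "Æ": "AE",
    "œ": "OE", "Œ": "OE",
    "ß": "SS",
    "ﬁ": "FI", "ﬂ": "FL",
}

def upper_expanding(text):
    """Uppercase text, expanding each ligature into its two letters."""
    # Accented letters go through the normal upper(); ligatures use the table.
    return "".join(LIGATURES.get(ch, ch.upper()) for ch in text)
```

For example, upper_expanding("straße") gives "STRASSE", which is one
character longer than the input: expanding a ligature makes the text
grow.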

Passing a handle means that it could handle up to 32K chars,
although I shall probably limit it to 16K, as some functions
will expand the length of the text, and I don't want to explode
through the limits of the text handle.
But anyway, even with a 16K limit, it is possible to process
text of any length; you would just need to cut it up into 16K
blocks and feed them to the DCOD.
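
The cutting-up step could look something like the Python sketch below.
The names blocks and clean_long_text are invented for the example, and
clean_block is only a stand-in for whatever actually calls the DCOD.

```python
BLOCK_LIMIT = 16 * 1024  # the 16K limit suggested in the post

def blocks(data, limit=BLOCK_LIMIT):
    """Yield successive slices of data, each at most `limit` bytes long."""
    for start in range(0, len(data), limit):
        yield data[start:start + limit]

def clean_long_text(data, clean_block):
    """Apply clean_block (a stand-in for the DCOD call) to each block in turn."""
    return b"".join(clean_block(block) for block in blocks(data))
```

One caveat with this simple version: a cut at a fixed offset can land
between two characters that belong together (a pair of CRs, or an 's'
followed by an 's'), so a real caller might prefer to cut at the last
paragraph break before the limit.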

Below are the proposed functions and syntax. I would like your
feedback as to:
1/ functions - is there something that you would like to see
   that I have not included?
2/ syntax - is there something unclear?

If you think that this is X-FB or a waste of bandwidth then
please answer privately.

TEXT CLEANING DCOD

The numbers are the numerical values; the proposed constants
would have to be added to your globals file.

proposed var codes
------------------
1   _stripCR   - takes out single CR chars, leaves double ones
                 as they would mark a paragraph
2   _french    - takes into account French specifics
4   _german    - takes into account German specifics
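
To make the _stripCR rule concrete, here is a Python sketch of one way
to read it. The post says single CRs are "taken out"; this version
assumes they are replaced by a space so that words on either side do
not run together, which may or may not be what the DCOD ends up doing.
The name strip_cr is invented for the example.

```python
import re

# A CR that has no CR immediately before or after it.
SINGLE_CR = re.compile(r"(?<!\r)\r(?!\r)")

def strip_cr(text):
    """Replace lone CRs with a space; runs of two or more CRs are kept."""
    return SINGLE_CR.sub(" ", text)
```

With this reading, "one\rtwo\r\rthree" becomes "one two\r\rthree": the
line break inside the paragraph goes, the paragraph break stays.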

proposed action codes
---------------------
8    _lowerCase     - change all text to lowercase
16   _lowerCasePlus - change all text to lowercase
                      detecting ligatures
     +_french       - also seeks oe and ae
     +_german       - also seeks ss
32   _upperCase     - change all text to uppercase,
                      expanding ligatures if any found
     +_french       - also seeks oe and ae
     +_german       - also seeks ss
48   _capitalise    - change all text to lowercase,
                      inserting uppercase if needed
     _capitalize    - alternate spelling for above
     +_french       - also seeks oe and ae
     +_german       - also seeks ss
64   _curlyQuote    - inserts curly quotes and corrects
                      typographical spaces
     +_french       - uses French style quotes and typographical spaces
     +_german       - inserts German style quotes
80   _HTMLclean     - removes "<TAG>" and expands &#nnn; to character
96   _HTMLaware     - changes extended ASCII to &#nnn; style
112  _mailUnquote   - removes ">" quote characters from text, cleans CRs
                      and checks for double spaces after cleaning*
128  _mailReady     - cleans out extended ASCII, substituting to 7 bit
                      ASCII, inserts a CR formatting lines to 60 chars
                      maximum*

*the var code _stripCR cannot be used with these actions.
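
Going by the values in the two tables, the var codes fit in the low
three bits (1, 2, 4) and every action code is a multiple of 8, so a
combined code can be taken apart again with a mask. The Python sketch
below shows that reading; it is an inference from the numbers, not
something the proposal spells out, and the names split_code and
is_valid are invented for the example.

```python
# Var codes (low three bits), values as listed in the post.
STRIP_CR, FRENCH, GERMAN = 1, 2, 4
VAR_MASK = 7

# Action codes, values as listed in the post.
LOWER_CASE, LOWER_CASE_PLUS, UPPER_CASE, CAPITALISE = 8, 16, 32, 48
CURLY_QUOTE, HTML_CLEAN, HTML_AWARE = 64, 80, 96
MAIL_UNQUOTE, MAIL_READY = 112, 128

def split_code(code):
    """Return (action, var_flags) for a combined code."""
    return code & ~VAR_MASK, code & VAR_MASK

def is_valid(code):
    """Reject _stripCR combined with the two starred mail actions."""
    action, flags = split_code(code)
    if action in (MAIL_UNQUOTE, MAIL_READY) and flags & STRIP_CR:
        return False
    return True
```

So split_code(UPPER_CASE + FRENCH + STRIP_CR) gives (32, 3). Note that
the action values are not independent bits (48 is 16 + 32), so with
these numbers only one action can be requested per call, plus any mix
of var codes.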

proposed syntax
---------------
CALL "DCOD", the_number_of_the_DCOD, err% (_actionCode_varCode, handle&)

Proposed use
------------
You would call this DCOD with the required action code, a var code
if needed, and a handle to the text that is to be changed.
You must lock the text handle before passing it and, presumably,
unlock it when you come back. You should also check the error code
before continuing, in case of problems. You would be responsible for
refreshing the screen if the text is visible, as the DCOD could not
know whether it was passed the handle to a selection or to all of a
visible field.


All comments most welcome.

July 31 1998
jonathan




-------------------------------------------------------------
! "format utile"  studio de graphisme/graphic design studio !
!      32 bd de Menilmontant, 75020 Paris, France           !
!    phone +33 1 43 49 02 04 +++ fax +33 1 43 49 16 51      !
-------------------------------------------------------------
           *** coming soon to a browser near you ***
          <http://www.cycbercities.com/formatutile>
-------------------------------------------------------------