6.9 Character Set

Builtin Class: <char-set>
Character set class. A character set object represents a set of characters. Gauche provides built-in support for creating character sets and a predicate that tests whether a character is in a set.

Further operations, such as set algebra, are defined in the SRFI-14 module (see section 10.8 srfi-14 - Character-set library).

Reader Syntax: #[char-set-spec]
You can write a literal character set in this syntax. char-set-spec is a sequence of characters to be included in the set. You can include the following special sequences:
x-y
Characters between x and y, inclusive. x must be smaller than y in the internal encoding.
^
If char-set-spec begins with caret, the actual character set is a complement of what the rest of char-set-spec indicates.
\xNN
A character whose internal code is a hexadecimal number NN.
\uNNNN
A character whose UCS-2 code is a 4-digit hexadecimal number NNNN.
\UNNNNNNNN
A character whose UCS-4 code is an 8-digit hexadecimal number NNNNNNNN.
\s
Whitespace characters.
\S
Complement of whitespace characters.
\d
Decimal digit characters.
\D
Complement of decimal digit characters.
\w
Alphanumeric characters.
\W
Complement of alphanumeric characters.
\\
A backslash character.
\-
A minus character.
\^
A caret character.
[:alnum:] ...
Character set a la POSIX. The following character set names are recognized: alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper and xdigit.

#[aeiou]     ; a character set consisting of vowels
#[a-zA-Z]    ; alphabet
#[[:alpha:]] ; alphabet (using POSIX notation)
#[\x0d\x0a]  ; carriage return and newline
#[\\\-]      ; backslash and minus
#[]          ; empty charset

Function: char-set? obj
[SRFI-14] Returns true if and only if obj is a character set object.

Function: char-set-contains? char-set char
[SRFI-14] Returns true if and only if a character set object char-set contains a character char.
(char-set-contains? #[a-z] #\y) => #t
(char-set-contains? #[a-z] #\3) => #f

(char-set-contains? #[^ABC] #\A) => #f
(char-set-contains? #[^ABC] #\D) => #t
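
The class escapes and POSIX names of the reader syntax can be checked the same way. The following is a small sketch, assuming it is evaluated in a Gauche REPL; the expected values in the comments follow from the descriptions of each escape given for the reader syntax.

```scheme
;; Illustrative membership tests against literal character sets.
(char-set-contains? #[\d] #\7)          ; => #t  (decimal digit)
(char-set-contains? #[\d] #\x)          ; => #f
(char-set-contains? #[\s] #\space)      ; => #t  (whitespace)
(char-set-contains? #[[:upper:]] #\Q)   ; => #t  (POSIX class)
(char-set-contains? #[\\\-] #\-)        ; => #t  (escaped minus)
(char-set-contains? #[] #\a)            ; => #f  (empty set)
```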

Function: char-set char ...
[SRFI-14] Creates a character set that contains char ....
(char-set #\a #\b #\c)   => #[a-c]
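
A set built with char-set can be used wherever a literal set is accepted. A minimal sketch, assuming a Gauche REPL; the name vowels is hypothetical and chosen only for this illustration.

```scheme
;; `vowels` is an illustrative name, not part of Gauche.
(define vowels (char-set #\a #\e #\i #\o #\u))

(char-set? vowels)                ; => #t
(char-set-contains? vowels #\e)   ; => #t
(char-set-contains? vowels #\z)   ; => #f
(char-set? "aeiou")               ; => #f  (a string is not a character set)
```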

Function: char-set-copy char-set
[SRFI-14] Copies a character set char-set.
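
As a brief sketch, assuming a Gauche REPL, the copy is itself a character set with the same members as the original; the names lower and lower-copy are hypothetical.

```scheme
;; `lower` and `lower-copy` are illustrative names.
(define lower #[a-z])
(define lower-copy (char-set-copy lower))

(char-set? lower-copy)                ; => #t
(char-set-contains? lower-copy #\m)   ; => #t
(char-set-contains? lower-copy #\M)   ; => #f
```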

