Monday, July 21, 2008

[BUGS] Problem loading ispell affix file with apostrophes

I'm having a problem with French dictionaries: loading an ispell affix
file that contains apostrophes does not work. The file comes from the
ifrench (French dictionary for ispell) Debian source package at
http://packages.debian.org/sid/ifrench

Here's the session excerpt:

------------------------------------------------------------------------
Welcome to psql 8.3.3, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

dockee=# select plainto_tsquery('custom_french', 'bug');
ERROR: syntax error at line 158 of affix file
"/usr/share/postgresql/8.3/tsearch_data/ispell_french.affix"

dockee=# show lc_ctype ;
lc_ctype
-------------
en_US.UTF-8
(1 row)

dockee=# show client_encoding;
client_encoding
-----------------
UTF8
(1 row)

dockee=# show server_version;
server_version
----------------
8.3.3
(1 row)
------------------------------------------------------------------------


The 'custom_french' text search configuration is defined as follows:

------------------------------------------------------------------------
CREATE TEXT SEARCH CONFIGURATION public.custom_french (
    COPY = pg_catalog.french
);

CREATE TEXT SEARCH DICTIONARY french_ispell (
    TEMPLATE = ispell,
    DictFile = ispell_french,
    AffFile = ispell_french
);

ALTER TEXT SEARCH CONFIGURATION custom_french
    ALTER MAPPING FOR
        asciiword,
        asciihword,
        hword_asciipart,
        word,
        hword,
        hword_part
    WITH french_ispell;

ALTER TEXT SEARCH CONFIGURATION custom_french
    DROP MAPPING FOR
        url, url_path, sfloat, float, file, int, version;
------------------------------------------------------------------------


Line 158 of ispell_french.affix is the first flag definition that
produces a prefix containing an apostrophe. It is the line just below
"flag *N":

------------------------------------------------------------------------
flag *D: # dé: défaire, dégrossir
. > dé

flag *N: # élision d'une négation
[aàâeèéêiîoôuh] > n' # je n'aime pas, il n'y a pas
------------------------------------------------------------------------

Maybe apostrophes in ispell affix files are simply not supported? I
can't find a mention of this limitation in the documentation at
http://www.postgresql.org/docs/8.3/static/textsearch-dictionaries.html

When I comment out the offending flag definitions, the affix file
loads successfully. Thanks in advance for helping me resolve this
problem.
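
To locate the other flag definitions that would need commenting out,
something like the following could help. It is only a minimal sketch:
the function name is hypothetical, and it assumes that an unescaped '#'
always starts a comment, as in the excerpt above, so apostrophes that
appear only in comments are ignored.

```python
def rules_with_apostrophes(lines):
    """Return (line_number, rule_text) pairs whose non-comment part contains an apostrophe."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Assumption: '#' starts a comment; everything after it is ignored.
        rule = line.split("#", 1)[0].rstrip()
        if "'" in rule:
            hits.append((number, rule))
    return hits


# The affix excerpt quoted above, used here as sample input.
sample = [
    "flag *D: # dé: défaire, dégrossir",
    ". > dé",
    "",
    "flag *N: # élision d'une négation",
    "[aàâeèéêiîoôuh] > n' # je n'aime pas, il n'y a pas",
]

for number, rule in rules_with_apostrophes(sample):
    print(number, rule)
```

On the real file, the lines could be read with open(path,
encoding="utf-8"), assuming the affix file is stored as UTF-8.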
--
Jean-Baptiste Quenot
http://jbq.caraldi.com/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
