Gendict - generate dictionary templates for contrib/tsearch2 module.
This utility helps people create dictionaries for the contrib/tsearch2
module. In particular, it has built-in support for Snowball stemmers.
The programming API for tsearch2 dictionaries is described in the tsearch2
documentation.
Prerequisites:
* PostgreSQL 7.3 or above
* tsearch2 module sources, already compiled
* Rights to install contrib modules
Usage:
    Run config.sh without parameters to see its options and arguments:
./config.sh -n DICTNAME ( [ -s [ -p PREFIX ] ] | [ -c CFILES ] [ -h HFILES ] [ -i ] ) [ -v ] [ -d DIR ] [ -C COMMENT ]
    -v - be verbose
    -d DIR - name of directory in PGSQL_SRC/contrib (default dict_DICTNAME)
    -C COMMENT - dictionary comment
Generate Snowball stemmer:
./config.sh -n DICTNAME -s [ -p PREFIX ] [ -v ] [ -d DIR ] [ -C COMMENT ]
    -s - generate Snowball wrapper
    -p PREFIX - prefix of the Snowball stemming function (default DICTNAME)
Generate template dictionary:
./config.sh -n DICTNAME [ -c CFILES ] [ -h HFILES ] [ -i ] [ -v ] [ -d DIR ] [ -C COMMENT ]
    -c CFILES - source files, must be placed in contrib/tsearch2/gendict directory.
                These files will be used in Makefile.
    -h HFILES - header files, must be placed in contrib/tsearch2/gendict directory.
                These files will be used in Makefile and subinclude.h
    -i - dictionary has init method
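
The synopsis above groups the options into two modes: Snowball wrapper
(-s, -p) or template dictionary (-c, -h, -i). The sketch below is a
hypothetical helper, not part of gendict. It assumes the two modes are
mutually exclusive, as the synopsis suggests, and checks an argument list
before it is handed to config.sh. The function name check_gendict_args is
made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of gendict): check that a set of config.sh
# options follows the synopsis. -n is required, and the Snowball options
# (-s, -p) are assumed not to mix with the template options (-c, -h, -i).
check_gendict_args() {
    name= snowball= template=
    OPTIND=1
    while getopts "n:sp:c:h:ivd:C:" opt "$@"; do
        case "$opt" in
            n) name=$OPTARG ;;
            s|p) snowball=1 ;;
            c|h|i) template=1 ;;
            v|d|C) ;;            # allowed in both modes
            *) return 2 ;;       # option not listed in the synopsis
        esac
    done
    if [ -z "$name" ]; then
        echo "missing -n DICTNAME" >&2
        return 1
    fi
    if [ -n "$snowball" ] && [ -n "$template" ]; then
        echo "-s/-p cannot be combined with -c, -h or -i" >&2
        return 1
    fi
    return 0
}
```

For instance, "check_gendict_args -n wow -s -i" returns a non-zero status,
while "check_gendict_args -n wow -i -c a.c -h a.h" returns zero.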
Example 1:
   Create Portuguese stemmer
 
   0. cd PGSQL_SRC/contrib/tsearch2/gendict
   1. Obtain stem.{c,h} files for Portuguese
      wget http://snowball.tartarus.org/portuguese/stem.c
      wget http://snowball.tartarus.org/portuguese/stem.h
   
   2. Create template files for Portuguese
      ./config.sh -n pt -s -p portuguese_ISO_8859_1 -v -C'Snowball stemmer for Portuguese'
      Note that the argument to the -p option must be *the same* as the name
      of the stemming function in stem.c (without the _stem suffix).
      A set of files will be generated and placed in the
      PGSQL_SRC/contrib/dict_pt directory.
   3. Compile and install dictionary
	cd PGSQL_SRC/contrib/dict_pt
	make
	make install
   4. Test it 
	Sample Portuguese words with their stemmed forms are available
	from http://snowball.tartarus.org/portuguese/stemmer.html
 	createdb testdict
	psql testdict < /usr/local/pgsql/share/contrib/tsearch2.sql
	psql testdict < /usr/local/pgsql/share/contrib/dict_pt.sql
	psql -d testdict -c "select lexize('pt','bobagem');"
	 lexize  
	---------
	 {bobag}
	(1 row)
	Here is the corresponding entry in the pg_ts_dict table:
	psql -d testdict -c "select * from pg_ts_dict where dict_name='pt';"
	 dict_name |     dict_init      | dict_initoption |              dict_lexize              |          dict_comment           
	-----------+--------------------+-----------------+---------------------------------------+---------------------------------
	 pt        | dinit_pt(internal) |                 | snb_lexize(internal,internal,integer) | Snowball stemmer for Portuguese
	(1 row)
 
	Note that the dictionary and the corresponding entry in the tsearch
	configuration are now installed, and you may modify them using
	plain SQL commands, for example, to specify stop words.
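
For step 2 of Example 1, the value passed to -p can be read from the
Snowball header instead of being typed by hand. The sketch below is a
hypothetical helper, not part of gendict. It assumes that stem.h declares
the stemming function as PREFIX_stem(...), as in the note under step 2, and
prints the first such PREFIX it finds. The function name snowball_prefix is
made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of gendict): print the -p PREFIX for a
# Snowball header or source file by finding the first identifier that ends
# in "_stem" and is followed by "(", then dropping the "_stem" suffix.
snowball_prefix() {
    # $1: path to stem.h (or stem.c)
    # The first expression pads each line with a leading space so that an
    # identifier at the very start of a line is also matched.
    sed -n -e 's/^/ /' \
        -e 's/.*[^A-Za-z0-9_]\([A-Za-z0-9_]*\)_stem[[:space:]]*(.*/\1/p' "$1" |
        head -n 1
}
```

A possible use, assuming stem.h is in the current directory:

	./config.sh -n pt -s -p "$(snowball_prefix stem.h)" -v -C'Snowball stemmer for Portuguese'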
Example 2:
      a) Simple template dictionary (with init method):
       ./config.sh -n wow -v -i -C WOW
      b) Simple template dictionary (without init method):
	./config.sh -n wow -v  -C WOW
        The same as above, but the dictionary will not have an init method.
       Dictionaries obtained in a) and b) are fully working and ready
       for use:
	  a) lowercases the input word and removes it if it is a stop word
	  b) recognizes any word
      c) Simple template dictionary with source files (with init method):
       ./config.sh -n wow -v -i -c a.c -h a.h -C WOW
        Source files (a.c) must be placed in the contrib/tsearch2/gendict
        directory. These files will be used in the Makefile.
        Header files (a.h) must be placed in the contrib/tsearch2/gendict
        directory. These files will be used in the Makefile and subinclude.h.
      d) Simple template dictionary with source files (without init method):
	./config.sh -n wow -v  -c a.c -h a.h -C WOW
	The same as above, but the dictionary will not have an init method.
       After that, the sources are in PGSQL_SRC/contrib/dict_wow and
       you may edit them to create the actual dictionary.
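
Variants c) and d) require the files named by -c and -h to be present in
the gendict directory. The sketch below is a hypothetical helper, not part
of gendict, that checks this before config.sh is run. The function name
check_gendict_files is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of gendict): check that every named file
# exists in the given directory. Reports each missing file on stderr and
# returns non-zero if any file is missing.
check_gendict_files() {
    dir=$1
    shift
    missing=0
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return $missing
}
```

A possible use, run from the gendict directory:

	check_gendict_files . a.c a.h && ./config.sh -n wow -v -i -c a.c -h a.h -C WOW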
  Please check the Tsearch2 home page (http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/)
  for additional information, including the "Gendict tutorial" and dictionaries.