PostgreSQL 8.2 with tsearch2 and Dutch Snowball stemmer
13 June 2007
A quick walk-through for compiling PostgreSQL 8.2 with the tsearch2 full-text search extension and the Dutch Snowball stemmer on Debian Etch:
-
sudo su -
-
cd /usr/src
-
apt-get build-dep postgresql-8.1
-
wget ftp://ftp4.nl.postgresql.org/postgresql.zeelandnet.nl/latest/postgresql-8.2.4.tar.bz2
-
tar jxvf postgresql-8.2.4.tar.bz2
-
cd postgresql-8.2.4
-
wget http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearch_snowball_82-20070504.gz
-
gunzip tsearch_snowball_82-20070504.gz
-
patch -b -p0 < tsearch_snowball_82-20070504
-
./configure
-
make
-
make install
-
cd contrib
-
make
-
make install
-
cd tsearch2/gendict
-
wget http://snowball.tartarus.org/algorithms/dutch/stem.c
-
wget http://snowball.tartarus.org/algorithms/dutch/stem.h
-
./config.sh -n nl -s -p dutch_ISO_8859_1 -v -C'Dutch Stemmer. Snowball'
-
cd ../../dict_nl
-
make
-
make install
-
wget http://snowball.tartarus.org/algorithms/dutch/stop.txt -O /usr/local/pgsql/share/contrib/dutch.stop
-
adduser postgres
-
mkdir /usr/local/pgsql/data
-
chown postgres /usr/local/pgsql/data
-
su - postgres
-
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
-
/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >/tmp/logfile 2>&1 &
-
/usr/local/pgsql/bin/createdb tsearch2
-
/usr/local/pgsql/bin/psql tsearch2 < /usr/local/pgsql/share/contrib/tsearch2.sql
-
/usr/local/pgsql/bin/psql tsearch2 < /usr/local/pgsql/share/contrib/dict_nl.sql
-
/usr/local/pgsql/bin/psql tsearch2
If everything went well, you are now at a PostgreSQL prompt.
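If createdb or psql reports a connection error instead, the server started in the background may simply not have been ready yet. A minimal sketch of a retry helper follows; `wait_until` is a hypothetical name, not something shipped with PostgreSQL, and the 30-attempt limit in the usage line is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): run a command repeatedly,
# pausing one second after each failed attempt, until it succeeds or the
# given number of attempts is used up.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example usage, as the postgres user, right after starting the server:
# wait_until 30 /usr/local/pgsql/bin/psql -l
```

Listing the databases with `psql -l` is used here only as a cheap connection test; any command that fails while the server is down would do.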
We will now update and insert some rows:
-
UPDATE pg_ts_dict SET dict_initoption='contrib/dutch.stop' WHERE dict_name='nl';
-
INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('dutch', 'default', 'nl_NL');
-
INSERT INTO pg_ts_cfgmap (SELECT 'dutch', tok_alias, dict_name FROM pg_ts_cfgmap WHERE ts_name='default');
-
UPDATE pg_ts_cfgmap SET dict_name='{nl}' WHERE ts_name='dutch' AND dict_name='{en_stem}';
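The four statements above follow a fixed pattern, so for another dictionary built with gendict they could be generated rather than typed. The sketch below assumes a hypothetical helper called `ts_config_sql`; it does no SQL quoting at all, so it only suits simple names that contain no quote characters.

```shell
#!/bin/sh
# Hypothetical helper: print the four catalog statements from this
# walk-through for a given configuration name, dictionary name, locale
# and stop-word file. Arguments are pasted in verbatim (no escaping).
ts_config_sql() {
    cfg=$1
    dict=$2
    locale=$3
    stopfile=$4
    cat <<EOF
UPDATE pg_ts_dict SET dict_initoption='$stopfile' WHERE dict_name='$dict';
INSERT INTO pg_ts_cfg (ts_name, prs_name, locale) VALUES ('$cfg', 'default', '$locale');
INSERT INTO pg_ts_cfgmap (SELECT '$cfg', tok_alias, dict_name FROM pg_ts_cfgmap WHERE ts_name='default');
UPDATE pg_ts_cfgmap SET dict_name='{$dict}' WHERE ts_name='$cfg' AND dict_name='{en_stem}';
EOF
}

# Example usage, reproducing the Dutch setup from a shell instead of the prompt:
# ts_config_sql dutch nl nl_NL contrib/dutch.stop | /usr/local/pgsql/bin/psql tsearch2
```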
select to_tsvector('dutch', 'ik ga naar school');
   to_tsvector
------------------
 'ga':2 'schol':4
(1 row)
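If stop words such as 'ik' and 'naar' still show up in your result, one possible cause is the format of the stop list: the Snowball stop.txt files carry comments introduced by a vertical bar. The following sketch, built around a hypothetical helper named `clean_stoplist`, reduces such a file to one bare word per line. Whether your tsearch2 build needs this is an assumption to verify, not a given.

```shell
#!/bin/sh
# Hypothetical helper: read a Snowball-style stop list on stdin and write
# one bare word per line on stdout.
# Assumption: comments start with '|' and run to the end of the line.
clean_stoplist() {
    # LC_ALL=C makes sed treat the input as plain bytes, so non-UTF-8
    # characters in comments cannot trip up the pattern matching.
    LC_ALL=C sed \
        -e 's/|.*$//' \
        -e 's/[[:space:]]*$//' \
        -e 's/^[[:space:]]*//' \
        -e '/^$/d'
}

# Example usage (as root; run from wherever the raw stop.txt was saved):
# clean_stoplist < stop.txt > /usr/local/pgsql/share/contrib/dutch.stop
```

The server reads the stop list when the dictionary is initialised, so a new session is probably needed before a changed file takes effect.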