PGroonga official site: https://pgroonga.github.io/
Description: PGroonga (píːzí:lúnɡά) is a PostgreSQL extension that uses Groonga as the index. PostgreSQL's built-in full text search supports only languages written with alphabetic characters and digits, so it does not support Japanese, Chinese, and similar languages. Installing PGroonga into PostgreSQL gives you fast full text search for all languages.
I. Install the required dependency packages
yum install wget curl tar gzip gcc gcc-c++ make zlib zlib-devel msgpack msgpack-devel mecab mecab-devel lz4 lz4-devel
II. Download and install Git
Note: the build requires Git 2.7.4 or later.
wget https://www.kernel.org/pub/software/scm/git/git-2.7.4.tar.gz --no-check-certificate
tar -vzxf git-2.7.4.tar.gz
cd git-2.7.4/
#Configure
./configure --with-openssl=/usr/local/openssl
# Build and install
make && make install
# Open the system-wide profile to edit the environment variables
vi /etc/profile
# Append the Git path setting at the bottom
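# configure defaults to the /usr/local prefix, so the new git binary is installed in /usr/local/bin; put it ahead of any older system git
export PATH=/usr/local/bin:$PATH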
# Save with :wq, then apply the change with source
source /etc/profile
# Check the Git version
git --version
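The version requirement can also be checked in a script rather than by eye. The following is a minimal sketch, assuming GNU sort (for the -V version sort); version_ge is a hypothetical helper name introduced here, not part of Git.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# `git --version` prints e.g. "git version 2.7.4"; take the third field.
installed="$(git --version 2>/dev/null | awk '{print $3}')"

if [ -n "$installed" ] && version_ge "$installed" "2.7.4"; then
  echo "git $installed is new enough"
else
  echo "git is missing or older than 2.7.4" >&2
fi
```

If the script reports an older version after the install, the shell is probably still resolving git from a directory that comes earlier in PATH.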
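III. Download, build, and install Groonga
Description: Groonga is an open-source fulltext search engine and column store. It lets you write high-performance applications that require fulltext search.
Official site: https://groonga.org/
Build and install guide: https://groonga.org/docs/install/centos.html#centos-7
Note: download the source tarball and build from source. Installing the groonga-release-latest.noarch.rpm package from the official site directly on the local machine can cause problems (files may be missing).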
wget https://packages.groonga.org/source/groonga/groonga-13.0.9.tar.gz --no-check-certificate
tar -xvzf groonga-13.0.9.tar.gz
cd groonga-13.0.9
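# Configure
./configure
# Build
make -j$(grep '^processor' /proc/cpuinfo | wc -l)
# Install
sudo make install
# After installing, list the libraries that pkg-config knows about
pkg-config --list-all
# Check whether groonga appears in the list. If it does not, the compiler cannot locate Groonga's header and library files, and PKG_CONFIG_PATH must be set
# Find the location of groonga.pc
find / -name groonga.pc
# Set PKG_CONFIG_PATH to the directory that contains groonga.pc
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig/
# Check whether groonga is now listed
pkg-config --list-all
# Expected output includes: groonga Groonga - An Embeddable Fulltext Search Engine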
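The export above replaces any PKG_CONFIG_PATH that was already set and lasts only for the current shell. As a hedged sketch, the helper below (add_pkg_config_dir is a hypothetical name, not a pkg-config feature) keeps existing entries and then asks pkg-config directly whether groonga.pc is visible. The directory is the one used above; substitute whatever path find reported.

```shell
#!/bin/sh
# add_pkg_config_dir DIR: print a PKG_CONFIG_PATH value that contains DIR,
# keeping entries that were already set and avoiding an exact duplicate.
add_pkg_config_dir() {
  dir="${1%/}"    # drop a trailing slash
  case ":${PKG_CONFIG_PATH:-}:" in
    *":$dir:"*) printf '%s\n' "$PKG_CONFIG_PATH" ;;   # already present
    *) printf '%s\n' "$dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" ;;
  esac
}

PKG_CONFIG_PATH="$(add_pkg_config_dir /usr/local/lib/pkgconfig)"
export PKG_CONFIG_PATH

# `pkg-config --exists` returns 0 only when groonga.pc can be found.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists groonga; then
  echo "groonga $(pkg-config --modversion groonga) found"
else
  echo "groonga.pc not found; check the directory reported by find" >&2
fi
```

To keep the setting across logins, the same export line can be appended to /etc/profile, as was done for PATH earlier.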
IV. Download and install xxHash
Description: xxHash is an extremely fast hash algorithm, processing at RAM speed limits. The code is highly portable and produces identical hashes across all platforms (little / big endian).
vcpkg manages C and C++ libraries on Windows, Linux, and macOS and greatly simplifies installing third-party libraries. It is open-sourced by Microsoft under the MIT license. The source is at https://github.com/Microsoft/vcpkg, and the latest release at the time of writing was 2023.04.15.
Building xxHash - Using vcpkg
You can download and install xxHash using the vcpkg dependency manager:
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install xxhash
V. Download, build, and install PGroonga
git clone --recursive https://github.com/pgroonga/pgroonga.git
cd pgroonga
make
make install
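If make fails at this step, a missing tool on PATH is a common cause. The sketch below is a preflight check under one assumption: that PGroonga's Makefile locates the database through pg_config, as PGXS-based extensions typically do. missing_tools is a hypothetical helper name; the two pg_config options are standard ones.

```shell
#!/bin/sh
# missing_tools NAME...: print each command name that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools git make gcc pg_config pkg-config)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "missing build tools:" $missing >&2
else
  # pg_config reports the directories of the PostgreSQL installation
  # that the extension would be installed into.
  echo "extension files go under: $(pg_config --sharedir)/extension"
  echo "shared library goes under: $(pg_config --pkglibdir)"
fi
```

When several PostgreSQL-compatible installations exist on the machine, the printed directories show which one the pg_config on PATH belongs to.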
VI. Log in to the database
psql highgo sysdba
highgo=# select * from pg_available_extensions where name like '%roon%';
name | default_version | installed_version | comment
-------------------+-----------------+-------------------+-------------------------------------------------------
pgroonga | 3.1.6 | 3.1.6 | Super fast and all languages supported full text search index based on Groonga
pgroonga_database | 3.1.6 | | PGroonga database management module
highgo=# create extension pgroonga;
ERROR: extension "pgroonga" already exists
highgo=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+-------+--------------------+-----------------------------------------------------------------------------------------------
hg_mac | 1.0 | information_schema | hgdb mandatory access control without using selinux
hg_permission | 1.0 | information_schema | hg permission
mysqlface | 1.0 | public | administrative functions for PostgreSQL
orafce | 3.9 | public | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
passwordcheck | 1.0 | information_schema | passwordcheck
pg_buffercache | 1.3 | public | examine the shared buffer cache
pg_stat_statements | 1.7 | public | track execution statistics of all SQL statements executed
pgroonga | 3.1.6 | public | Super fast and all languages supported full text search index based on Groonga
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
zhfts | 1.1 | public | RUM index access method
zhparser | 2.2 | public | a parser for full-text search of Chinese
(11 rows)
VII. Usage
1. Enable full text search on a text column
CREATE TABLE memos (
id integer,
content text
);
CREATE INDEX pgroonga_content_index ON memos USING pgroonga (content);
INSERT INTO memos VALUES (1, 'PostgreSQL is a relational database management system.');
INSERT INTO memos VALUES (2, 'Groonga is a fast full text search engine that supports all languages.');
INSERT INTO memos VALUES (3, 'PGroonga is a PostgreSQL extension that uses Groonga as index.');
INSERT INTO memos VALUES (4, 'There is groonga command.');
SET enable_seqscan = off;
The following operators perform full text search:
&@
&@~
LIKE
ILIKE
&@~ operator
Use the &@~ operator to perform full text search with query syntax such as keyword1 OR keyword2:
highgo=# SELECT * FROM memos WHERE content &@~ 'PGroonga OR PostgreSQL';
id | content
----+----------------------------------------------------------------
1 | PostgreSQL is a relational database management system.
3 | PGroonga is a PostgreSQL extension that uses Groonga as index.
(2 rows)
&@ operator
Use the &@ operator to perform full text search with a single keyword:
highgo=# SELECT * FROM memos WHERE content &@ 'engine';
id | content
----+------------------------------------------------------------------------
2 | Groonga is a fast full text search engine that supports all languages.
(1 row)
LIKE operator
PGroonga supports the LIKE operator, so you can get fast full text search from PGroonga without changing existing SQL.
column LIKE '%keyword%' is almost equivalent to column &@ 'keyword':
highgo=# SELECT * FROM memos WHERE content LIKE '%engine%';
id | content
----+------------------------------------------------------------------------
2 | Groonga is a fast full text search engine that supports all languages.
(1 row)
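The searches above are typed at the psql prompt. To run one from a script, one option is to pass the keyword as a psql variable so that psql quotes it as a SQL literal. This is a sketch only: search_sql and search_memos are hypothetical helper names, and the database and user are the highgo and sysdba values used earlier to log in.

```shell
#!/bin/sh
# search_sql: print a psql script that searches memos.content.
# :'query' is psql variable interpolation; psql quotes the value as a
# SQL literal, so keywords that contain quotes are handled safely.
search_sql() {
  cat <<'SQL'
SELECT id, content FROM memos WHERE content &@~ :'query';
SQL
}

# search_memos QUERY: feed the script to psql on standard input.
search_memos() {
  search_sql | psql -v query="$1" highgo sysdba
}

# Example: search_memos 'PGroonga OR PostgreSQL'
```

The script is sent on standard input rather than with -c because psql does not interpolate variables in -c command strings.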
2. Score (ranking by match precision)
You can use the pgroonga_score function to get the precision of a match as a number. The more precisely a record matches the search query, the higher its number.
You need to add the primary key column to the pgroonga index to use the pgroonga_score function. If you don't add the primary key column to the pgroonga index, pgroonga_score always returns 0.
Here is a sample schema that includes primary key into indexed columns:
CREATE TABLE score_memos (
id integer PRIMARY KEY,
content text
);
CREATE INDEX pgroonga_score_memos_content_index
ON score_memos
USING pgroonga (id, content);
INSERT INTO score_memos VALUES (1, 'PostgreSQL is a relational database management system.');
INSERT INTO score_memos VALUES (2, 'Groonga is a fast full text search engine that supports all languages.');
INSERT INTO score_memos VALUES (3, 'PGroonga is a PostgreSQL extension that uses Groonga as index.');
INSERT INTO score_memos VALUES (4, 'There is groonga command.');
SET enable_seqscan = off;
-- Run a full text search and get the score
SELECT *, pgroonga_score(tableoid, ctid) FROM score_memos WHERE content &@ 'PGroonga' OR content &@ 'PostgreSQL';
id | content | pgroonga_score
----+----------------------------------------------------------------+----------------
1 | PostgreSQL is a relational database management system. | 1
3 | PGroonga is a PostgreSQL extension that uses Groonga as index. | 2
(2 rows)
-- You can use the pgroonga_score function in the ORDER BY clause to sort matched records by precision in descending order:
highgo=# SELECT *, pgroonga_score(tableoid, ctid)
highgo-# FROM score_memos
highgo-# WHERE content &@ 'PGroonga' OR content &@ 'PostgreSQL'
highgo-# ORDER BY pgroonga_score(tableoid, ctid) DESC;
id | content | pgroonga_score
----+----------------------------------------------------------------+----------------
3 | PGroonga is a PostgreSQL extension that uses Groonga as index. | 2
1 | PostgreSQL is a relational database management system. | 1
(2 rows)
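Ranked results are usually cut off after the best few matches. The sketch below combines the score ordering with a LIMIT and runs it from a script. It assumes the score_memos table and index created above and the highgo database and sysdba user used earlier; top_matches_sql and top_matches are hypothetical helper names.

```shell
#!/bin/sh
# top_matches_sql: print a psql script that ranks matches by score.
# :'query' is quoted by psql as a SQL literal; :limit is inserted as-is,
# so it must be a plain number.
top_matches_sql() {
  cat <<'SQL'
SELECT id, content, pgroonga_score(tableoid, ctid) AS score
  FROM score_memos
 WHERE content &@~ :'query'
 ORDER BY score DESC
 LIMIT :limit;
SQL
}

# top_matches QUERY [LIMIT]: run the ranked search, 10 rows by default.
top_matches() {
  top_matches_sql | psql -v query="$1" -v limit="${2:-10}" highgo sysdba
}

# Example: top_matches 'PGroonga OR PostgreSQL' 5
```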
For more usage, see the official tutorial: https://pgroonga.github.io/tutorial/
Author: HGDB-QIURU