Oracle Regular Expressions

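Oracle 10g has built-in regular expression support that conforms to the IEEE POSIX (Portable Operating System Interface) standard. Used well, regular expressions let you write concise and powerful SQL statements.

Regular expressions have several advantages over the familiar LIKE operator and the INSTR, SUBSTR, and REPLACE functions. These traditional SQL functions are poorly suited to pattern matching. Only the LIKE operator offers wildcard matching, through the % and _ characters, but LIKE does not support repetition of an expression, complex alternation, character ranges, character lists, POSIX character classes, and so on.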
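As a rough illustration of the gap described above, the sketch below compares LIKE with REGEXP_LIKE on three made-up codes. The samples subquery and its values are assumptions for this example only. The LIKE pattern can only say "any two characters, a hyphen, any four characters", while the regular expression can require two uppercase letters followed by four digits.

```sql
-- Hypothetical sample values; DUAL is Oracle's built-in one-row table.
WITH samples AS (
  SELECT 'AB-1234' AS code FROM dual UNION ALL
  SELECT 'AB-12X4' AS code FROM dual UNION ALL
  SELECT 'ab-1234' AS code FROM dual
)
SELECT code,
       -- LIKE: _ matches any single character, so all three rows match
       CASE WHEN code LIKE '__-____' THEN 'Y' ELSE 'N' END AS like_match,
       -- REGEXP_LIKE: two uppercase letters, a hyphen, exactly four digits
       CASE WHEN REGEXP_LIKE(code, '^[[:upper:]]{2}-[[:digit:]]{4}$', 'c')
            THEN 'Y' ELSE 'N' END AS regexp_match
FROM   samples;
-- like_match is Y for every row; regexp_match is Y only for 'AB-1234'
```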

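Metacharacters (Meta Character)

SQL code

^ anchors the expression to the start of a line

$ anchors the expression to the end of a line

* matches 0 or more times

? matches 0 or 1 time

+ matches 1 or more times

{m} matches exactly m times

{m,} matches at least m times

{m,n} matches at least m times but no more than n times

[:alpha:] alphabetic characters

[:lower:] lowercase alphabetic characters

[:upper:] uppercase alphabetic characters

[:digit:] digits

[:alnum:] alphanumeric characters

[:space:] whitespace (non-printing) characters, such as carriage return, newline, vertical tab, and form feed

[:punct:] punctuation characters

[:cntrl:] control characters (non-printing)

[:print:] printable characters

| separates alternatives, usually used together with the grouping operator ()

( ) groups a subexpression into a unit for alternation, quantification, or backreferencing

[char] character list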
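To make the quantifiers above concrete, here is a small sketch that tests four made-up strings against *, ?, + and {m,n}. The column aliases and sample strings are assumptions chosen for the example.

```sql
-- Each column reports whether the string matches 'a', some number of 'b's, then 'c'.
SELECT s,
       CASE WHEN REGEXP_LIKE(s, '^ab*c$')     THEN 'Y' ELSE 'N' END AS star_q,   -- b zero or more times
       CASE WHEN REGEXP_LIKE(s, '^ab?c$')     THEN 'Y' ELSE 'N' END AS opt_q,    -- b zero or one time
       CASE WHEN REGEXP_LIKE(s, '^ab+c$')     THEN 'Y' ELSE 'N' END AS plus_q,   -- b one or more times
       CASE WHEN REGEXP_LIKE(s, '^ab{2,3}c$') THEN 'Y' ELSE 'N' END AS range_q   -- b two or three times
FROM  (SELECT 'ac'     AS s FROM dual UNION ALL
       SELECT 'abc'    AS s FROM dual UNION ALL
       SELECT 'abbc'   AS s FROM dual UNION ALL
       SELECT 'abbbbc' AS s FROM dual);
-- ac     -> Y Y N N
-- abc    -> Y Y Y N
-- abbc   -> Y N Y Y
-- abbbbc -> Y N Y N
```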

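Oracle 10g provides four regexp functions: REGEXP_LIKE, REGEXP_REPLACE, REGEXP_INSTR, and REGEXP_SUBSTR.

SQL code

REGEXP_LIKE: tests whether a string matches a regular expression

(srcstr, pattern [, match_option])

REGEXP_INSTR: searches a string for a regular expression and returns the position of the match

(srcstr, pattern [, position [, occurrence [, return_option [, match_option]]]])

REGEXP_SUBSTR: returns the substring that matches a regular expression

(srcstr, pattern [, position [, occurrence [, match_option]]])

REGEXP_REPLACE: searches for a regular expression and replaces the matches

(srcstr, pattern [, replacestr [, position [, occurrence [, match_option]]]])

The parameters have the following meanings: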

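SQL code

srcstr: the character data to search

pattern: the regular expression

occurrence: which occurrence of the match to use; the default is 1 (for REGEXP_REPLACE the default is 0, meaning every occurrence is replaced)

position: the position at which the search starts; the default is 1

return_option: the default value 0 returns the starting position of the match; the value 1 returns the position of the character immediately after the match

replacestr: the string that replaces the matched pattern

match_option: the matching mode; the default is c

c: case sensitive

i: case insensitive

n: the period (.) matches any character, including newline

m: when the string contains newlines, it is treated as multiple lines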
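Before moving to table data, a quick sketch of all four functions applied to one literal may help show how the signatures above are called. The string 'order-2024' and the column aliases are made up for the example.

```sql
SELECT CASE WHEN REGEXP_LIKE('order-2024', '[[:digit:]]+')
            THEN 'Y' ELSE 'N' END                       AS has_digits,  -- Y
       REGEXP_INSTR('order-2024', '[[:digit:]]+')        AS first_pos,   -- 7
       REGEXP_SUBSTR('order-2024', '[[:digit:]]+')       AS digits,      -- 2024
       REGEXP_REPLACE('order-2024', '[[:digit:]]+', '#') AS masked       -- order-#
FROM   dual;
```

All optional parameters are left at their defaults here: the search starts at position 1 and uses the first occurrence.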

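The examples below show how to use these four functions. First, create a test table:

SQL code

-- The column sizes and the e-mail and zip literals below are assumed sample values.

SQL> create table person (
first_name varchar(20),
last_name varchar(20),
email varchar(100),
zip varchar(6));

Table created.

SQL> insert into person values ('Steven', 'Chen', 'steven@example.com', '123456');

1 row created.

SQL> insert into person values ('James', 'Li', 'jamesli@example.com' || chr(10) || 'lijames@example.com', '1b3d5f');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from person;

FIRST_NAME  LAST_NAME  EMAIL                ZIP

Steven      Chen       steven@example.com   123456

James       Li         jamesli@example.com  1b3d5f
                       lijames@example.com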

REGEXP_LIKE

SQL code

SQL> select zip as invalid_zip from person where regexp_like(zip, '[^[:digit:]]');

INVALID_ZIP

1b3d5f

SQL> select first_name from person where regexp_like(first_name, '^S.*n$');

FIRST_NAME

Steven

SQL> select first_name from person where regexp_like(first_name, '^s.*n$');

no rows selected

SQL> select first_name from person where regexp_like(first_name, '^s.*n$', 'c');

no rows selected

SQL> select first_name from person where regexp_like(first_name, '^s.*n$', 'i');

FIRST_NAME

Steven

SQL> select email from person where regexp_like(email, '^james.*com$');

no rows selected

SQL> select email from person where regexp_like(email, '^james.*com$', 'n');

EMAIL

jamesli@example.com
lijames@example.com

SQL> select email from person where regexp_like(email, '^li.*com$');

no rows selected

SQL> select email from person where regexp_like(email, '^li.*com$', 'm');

EMAIL

jamesli@example.com
lijames@example.com
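The effect of the n and m options can also be checked without any table, as in the sketch below. The two-line literal built with CHR(10) and the column aliases are assumptions for the example.

```sql
-- The source string is 'ab', a newline, then 'cd'.
SELECT CASE WHEN REGEXP_LIKE('ab' || CHR(10) || 'cd', '^ab.cd$')
            THEN 'Y' ELSE 'N' END AS dot_default,       -- N: . does not match newline
       CASE WHEN REGEXP_LIKE('ab' || CHR(10) || 'cd', '^ab.cd$', 'n')
            THEN 'Y' ELSE 'N' END AS dot_newline,       -- Y: n lets . match newline
       CASE WHEN REGEXP_LIKE('ab' || CHR(10) || 'cd', '^cd$')
            THEN 'Y' ELSE 'N' END AS anchor_default,    -- N: ^ only matches at the start of the string
       CASE WHEN REGEXP_LIKE('ab' || CHR(10) || 'cd', '^cd$', 'm')
            THEN 'Y' ELSE 'N' END AS anchor_multiline   -- Y: m lets ^ match at the start of each line
FROM   dual;
```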

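REGEXP_INSTR

SQL code

Find the position of the first non-digit character in zip (0 means no match)

SQL> select regexp_instr(zip, '[^[:digit:]]') as position from person;

POSITION

0

2

Starting from the third character, find the position of the second non-digit character in zip

SQL> select regexp_instr(zip, '[^[:digit:]]', 3, 2) as position from person;

POSITION

0

6

Starting from the third character, find the position of the character that follows the second non-digit character in zip

SQL> select regexp_instr(zip, '[^[:digit:]]', 3, 2, 1) as position from person;

POSITION

0

7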
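The position, occurrence and return_option arguments are easier to follow on a short literal. The sketch below uses the made-up string 'a1b2c3', whose digits sit at positions 2, 4 and 6.

```sql
SELECT REGEXP_INSTR('a1b2c3', '[[:digit:]]')          AS first_digit,  -- 2
       REGEXP_INSTR('a1b2c3', '[[:digit:]]', 3)       AS from_pos_3,   -- 4: search starts at position 3
       REGEXP_INSTR('a1b2c3', '[[:digit:]]', 1, 3)    AS third_digit,  -- 6: third occurrence
       REGEXP_INSTR('a1b2c3', '[[:digit:]]', 1, 3, 1) AS after_third,  -- 7: position after the match
       REGEXP_INSTR('a1b2c3', '[[:digit:]]', 1, 4)    AS no_fourth     -- 0: there is no fourth digit
FROM   dual;
```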

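REGEXP_SUBSTR

SQL code

Return the first non-digit character in zip (rows with no match return NULL)

SQL> select regexp_substr(zip, '[^[:digit:]]') as zip from person;

ZIP

b

Starting from the third character, return the second non-digit character in zip

SQL> select regexp_substr(zip, '[^[:digit:]]', 3, 2) as zip from person;

ZIP

f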
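A common use of the occurrence argument is picking one item out of a delimited string. The following is a minimal sketch on a made-up comma-separated literal; the aliases are illustrative.

```sql
-- '[^,]+' matches a run of characters that are not commas, i.e. one list item.
SELECT REGEXP_SUBSTR('red,green,blue', '[^,]+', 1, 1) AS first_item,   -- red
       REGEXP_SUBSTR('red,green,blue', '[^,]+', 1, 2) AS second_item,  -- green
       REGEXP_SUBSTR('red,green,blue', '[^,]+', 1, 4) AS fourth_item   -- NULL: only three items exist
FROM   dual;
```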

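REGEXP_REPLACE

SQL code

Replace every non-digit character in zip with 0 (the replacement digit is an assumed value)

SQL> update person set zip=regexp_replace(zip, '[^[:digit:]]', '0')
where regexp_like(zip, '[^[:digit:]]');

1 row updated.

SQL> select zip from person;

ZIP

123456

103050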
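The optional arguments of REGEXP_REPLACE change how much is replaced. The sketch below, on made-up literals, shows the default (all occurrences), a single occurrence, and the case where replacestr is omitted so matches are simply removed.

```sql
SELECT REGEXP_REPLACE('1b3d5f', '[^[:digit:]]', '0')       AS all_replaced,   -- 103050
       REGEXP_REPLACE('1b3d5f', '[^[:digit:]]', '0', 1, 1) AS first_only,     -- 103d5f
       REGEXP_REPLACE('1b3d5f', '[^[:digit:]]')            AS removed,        -- 135
       REGEXP_REPLACE('a   b    c', ' {2,}', ' ')          AS single_spaced   -- a b c
FROM   dual;
```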

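Backreferences (backreference)

Backreferences are a very useful feature. A backreference saves the part of the string matched by a subexpression in a temporary buffer for later reuse. The buffers are numbered from left to right and are accessed with the \digit notation. A subexpression is delimited by a pair of parentheses. Backreferences make fairly complex replacements possible.

SQL code

SQL> select regexp_replace('Steven Chen', '(.*) (.*)', '\2 \1') as reversed_name from dual;

REVERSED_NAME

Chen Steven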
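The same technique works with more than two groups. As a sketch, the query below reorders a made-up ISO-style date into day/month/year by referring to three captured groups; the literal and alias are assumptions for the example.

```sql
-- \1 = year, \2 = month, \3 = day, numbered by the order of the opening parentheses.
SELECT REGEXP_REPLACE('2024-03-15',
                      '^([[:digit:]]{4})-([[:digit:]]{2})-([[:digit:]]{2})$',
                      '\3/\2/\1') AS dmy_date   -- 15/03/2024
FROM   dual;
```

If the input does not match the pattern, REGEXP_REPLACE returns it unchanged.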

Regular expressions can also be used in DDL, for example in constraints, indexes, and views.

SQL code

SQL> alter table person add constraint constraint_zip check (regexp_like(zip, '^[[:digit:]]+$'));

SQL> create index person_idx on person(regexp_substr(last_name, '^[[:upper:]]'));
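The text mentions views but shows no example, so here is a minimal sketch against the person table. The view name person_zip_status and the zip_status column are hypothetical. The second query repeats the indexed expression so that the optimizer is able to consider the function-based index person_idx; whether it actually uses it depends on statistics and settings.

```sql
-- Hypothetical view: flags each row's zip as valid (digits only) or invalid.
CREATE OR REPLACE VIEW person_zip_status AS
SELECT first_name,
       last_name,
       zip,
       CASE WHEN REGEXP_LIKE(zip, '^[[:digit:]]+$')
            THEN 'valid' ELSE 'invalid' END AS zip_status
FROM   person;

-- The predicate uses the same expression as the index definition.
SELECT first_name, last_name
FROM   person
WHERE  REGEXP_SUBSTR(last_name, '^[[:upper:]]') = 'C';
```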