I have been working for some time on the CL-UNIFICATION library. "Unification" and "parsing" are related, though not quite the same thing (I bet the literature on the topic is huge). One afternoon during the Easter vacation, I decided to take the easy way out of extending CL-UNIFICATION with some "parsing" functionality: I just added Edi Weitz's wonderful CL-PPCRE library. Edi's library has a very simple and intuitive interface, which made the integration a SMOP! As a result, you can now write the following (assuming all the packages are USEd):
(unify #T(regexp "a(b+)c(d+)" (?bs ?ds)) "abbbbcddd")
The call will produce an environment where ?BS is bound to "bbbb" and ?DS is bound to "ddd". Here is a full transcript.
CL-USER 3 > (in-package "UNIFY")
#<The CL.EXT.DACF.UNIFICATION package, 326/512 internal, 20/64 external>

UNIFY 4 > (unify #T(regexp "a(b+)c(d+)" (?bs ?ds)) "abbbbcddd")
#<UNIFY ENVIRONMENT: 1 frame 200BC147>

UNIFY 5 > (v? '?bs *)
"bbbb"
T

UNIFY 6 > (v? '?ds **)
"ddd"
T

Of course, the other matching operations work as expected.
UNIFY 9 > (match-case ("abbbbcdd")
            (#T(regexp "a(b+)c(d+)" (?bs ?ds))
             (concatenate 'string ?ds ?bs))
            (t "It did not work!"))
"ddbbbb"

I.e., there is an interface to get to the actual regexp groups (beyond those available directly in CL-PPCRE). Maybe the syntax of the regexp unification templates could be made even more CL-PPCRE-like by exploiting "named registers", but, for the time being, the above is the best way to use it.
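For comparison, here is a small sketch of how the same groups can be pulled out with CL-PPCRE alone, using its SCAN-TO-STRINGS and REGISTER-GROUPS-BIND operators. It only illustrates what the regexp template wraps; it is not a description of how CL-UNIFICATION is implemented internally. Loading through Quicklisp is an assumption, and the variable names BS and DS are just illustrative.

```lisp
;; Assumption: Quicklisp is available; any other way of loading CL-PPCRE works too.
(ql:quickload "cl-ppcre")

;; SCAN-TO-STRINGS returns the whole match and a vector of the register contents.
(multiple-value-bind (match registers)
    (cl-ppcre:scan-to-strings "a(b+)c(d+)" "abbbbcddd")
  (list match registers))
;; => ("abbbbcddd" #("bbbb" "ddd"))

;; REGISTER-GROUPS-BIND binds the registers to ordinary lexical variables,
;; much like ?BS and ?DS are bound in the unification environment.
(cl-ppcre:register-groups-bind (bs ds)
    ("a(b+)c(d+)" "abbbbcddd")
  (concatenate 'string ds bs))
;; => "dddbbbb"
```

The unification template gives the same bindings, but as part of an environment that the other matching operations (such as MATCH-CASE) can use uniformly.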
(cheers)
This doesn't seem to be in cvs yet.