Writing a Perl XS swig interface to the
- Slides: 28
Writing a Perl XS swig interface to the CLucene C++ text search engine Peter Edwards Perl XS and SWIG interface to CLucene C++ text search engine 1 9/26/2020
Introduction
- Peter Edwards ~ background
- Subject ~ writing a Perl XS SWIG interface to the CLucene C++ text search engine
Aims
- Give an idea of the process involved in selecting and using an external library from Perl
- Introduction to extending Perl using XS, SWIG, GNU autotools
- Entertainment
Audience: What is your background and interest?
Topics
- Understanding the Problem
- The Answer (at a high level)
- Technical Options
- Investigating Options
- Writing a Perl / C++ Interface
- Layers and Components
- Lessons Learned
Groupings shown on the slide: Process; Extending Perl
Terms
- Perl ~ Pathologically Eclectic Rubbish Lister
    $_ = "wftedskaebjgdpjgidbsmnjgc"; tr/a-z/oh, turtleneck Phrase Jar!/; print;
- Perl XS ~ eXternal Subroutine. Allows a Perl program to call a C language subroutine. XS is also the "glue" language specifying the calling interface. Contains complex "perlguts" stuff that will destroy your sanity
- SWIG ~ Simplified Wrapper and Interface Generator. Makes it easy to call a C/C++ library from many languages (Perl, Python, Ruby, PHP…)
- C++ ~ object-oriented version of the C programming language
- text search ~ boolean searching of stemmed words, wildcards
- CLucene ~ C++ text search engine based on Java Lucene
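The Perl one-liner on this slide relies on tr///, which replaces each character in the search list with the character at the same position in the replacement list. A minimal C sketch of that same mapping follows; the function name `translate_lower` is made up for illustration and the code assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* Replacement list from the slide: index 0 replaces 'a', index 25 replaces 'z'. */
static const char REPLACEMENT[] = "oh, turtleneck Phrase Jar!";

/* Hypothetical helper: rewrites s in place the way tr/a-z/.../ would.
 * Characters outside a-z are left untouched. */
void translate_lower(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] >= 'a' && s[i] <= 'z') {
            s[i] = REPLACEMENT[s[i] - 'a'];
        }
    }
}
```

Applied to the slide's string "wftedskaebjgdpjgidbsmnjgc", this yields "Just another Perl hacker,".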
Understanding the Problem
- Recruitment software written in Perl
- 20,000+ candidate Word CVs/resumes
- Boolean searching using words or partial words and wildcards, e.g. ("BA" or "MA") and "literature"
- Combined with SQL searching, e.g. geographic area, skill profile codes, pay rate
- Speed < 2 seconds
- Old system used dtSearch proprietary s/w
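To make the boolean requirement concrete, here is a small C sketch of what ("BA" or "MA") and "literature" means in terms of candidate numbers. It assumes each word has already been resolved to an ascending list of candidate numbers; the function names and the sample data are invented for illustration, and this is not a description of how dtSearch or CLucene evaluate queries internally.

```c
#include <stddef.h>

/* AND: intersection of two ascending arrays of candidate numbers.
 * out must have room for at least min(na, nb) entries. Returns the count. */
size_t cand_and(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {
            out[n++] = a[i];
            i++;
            j++;
        }
    }
    return n;
}

/* OR: union of two ascending arrays, without duplicates.
 * out must have room for at least na + nb entries. Returns the count. */
size_t cand_or(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            out[n++] = a[i++];
        } else if (a[i] > b[j]) {
            out[n++] = b[j++];
        } else {
            out[n++] = a[i];
            i++;
            j++;
        }
    }
    while (i < na) out[n++] = a[i++];
    while (j < nb) out[n++] = b[j++];
    return n;
}
```

With hypothetical lists BA = {3, 7, 12}, MA = {7, 9} and literature = {1, 9, 12, 20}, the OR step gives {3, 7, 9, 12} and the AND step then leaves candidates {9, 12}.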
The Answer (at a high level)
Load
- Convert candidate CVs from Word to text using wvWare (OpenOffice) converter
- Index text against candidate no.
Search
- Search text -> cand nos -> SQL temp table
- Normal SQL search on other criteria
Technical Options (at 2003/4)
Proprietary
- dtSearch ~ cost; hard to get cand nos out; Windows interface when Perl app is Web
Open Source
- Java Lucene ~ slow but good API and power
- C++ CLucene ~ alpha quality rewrite of Lucene in Visual C++ as degree project by Ben van Klinken
- Perl CPAN (PLucene etc.) below http://search.cpan.org/modlist/String_Language_Text_Processing
Investigating Perl Options
- Wrote test harness to load 1000 CVs then do some searches
- Tried about 5 CPAN modules
- PLucene search speed okay for small volumes but exponential increase in insert time: >60 seconds per insert
- Why? Tokenises doc, multi-lingual word stemming, adds doc id to reverse lookup index for each stem token
- Other modules faster but search options weak
- Need to look further
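The insert steps named above (tokenise the document, then add the doc id to a reverse lookup index for each token) can be sketched in a few lines of C. This is a toy under stated assumptions: fixed-size tables, lowercasing in place of real stemming, and a linear term search. All names are hypothetical, and it does not reproduce PLucene's implementation or explain its slowdown.

```c
#include <ctype.h>
#include <string.h>

#define MAX_TERMS 256
#define MAX_TERM_LEN 32
#define MAX_POSTINGS 64

/* One entry of the reverse lookup index: a term and the doc ids containing it. */
struct term_entry {
    char term[MAX_TERM_LEN];
    int  doc_ids[MAX_POSTINGS];
    int  n_docs;
};

/* Must be zero-initialised before first use. */
struct toy_index {
    struct term_entry terms[MAX_TERMS];
    int n_terms;
};

/* Find the entry for term; optionally create it. NULL if absent or table full. */
static struct term_entry *lookup(struct toy_index *ix, const char *term, int create)
{
    for (int i = 0; i < ix->n_terms; i++) {
        if (strcmp(ix->terms[i].term, term) == 0)
            return &ix->terms[i];
    }
    if (!create || ix->n_terms == MAX_TERMS)
        return NULL;
    struct term_entry *e = &ix->terms[ix->n_terms++];
    strcpy(e->term, term);   /* safe: tokens are capped at MAX_TERM_LEN - 1 chars */
    e->n_docs = 0;
    return e;
}

/* Split text on non-alphanumerics, lowercase each token, record doc_id against it. */
void toy_index_add(struct toy_index *ix, int doc_id, const char *text)
{
    char tok[MAX_TERM_LEN];
    size_t len = 0;
    for (const char *p = text; ; p++) {
        if (*p != '\0' && isalnum((unsigned char)*p)) {
            if (len < MAX_TERM_LEN - 1)
                tok[len++] = (char)tolower((unsigned char)*p);
            continue;
        }
        if (len > 0) {
            tok[len] = '\0';
            struct term_entry *e = lookup(ix, tok, 1);
            /* All tokens of one document are added together, so a repeat
             * within the document can only match the last stored id. */
            if (e && e->n_docs < MAX_POSTINGS &&
                (e->n_docs == 0 || e->doc_ids[e->n_docs - 1] != doc_id))
                e->doc_ids[e->n_docs++] = doc_id;
            len = 0;
        }
        if (*p == '\0')
            break;
    }
}

/* Number of documents containing term (already lowercased); 0 if unknown. */
int toy_index_count(struct toy_index *ix, const char *term)
{
    struct term_entry *e = lookup(ix, term, 0);
    return e ? e->n_docs : 0;
}
```

Even in this toy, every token costs a lookup and a posting update, which gives a feel for why insert time is dominated by per-token work once stemming and on-disk index maintenance are added.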
Investigating CLucene
- Wrote similar C++ test harness
- Speed good: search 20,000 CVs <1 second; load 3 CVs per sec (mostly Word->text)
- Code written as VC++ degree project and registered at SourceForge
- Jimmy Pritts changed layout and added GNU autoconf files configure.ac, Makefile.in to let it build cross-platform on Windows, cygwin, Linux
- Had C DLL interface used by PHP wrapper
- Decided to write Perl wrapper
Interfacing Perl to C++
- When I wrote this wrapper, Perl to C++ interfacing via XS or SWIG was tricky and, despite the optimism expressed at http://www.johnkeiser.com/perl-xs-c++.html, I had difficulties mapping the CLucene API to XS
- Reasons: C++ namespace mangling; object and method mapping; C++ memory garbage collection
- So I decided to go via the C DLL wrapper to hide this complexity
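One common way a flat C layer hides objects from a scripting language is to hand out small integer handles instead of pointers, which fits the `resource` integers seen in the CL_* calls. The sketch below shows that general pattern only. The `demo_*` names, the slot table and the struct are assumptions for illustration and are not taken from the CLucene DLL source.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_RESOURCES 16
#define MAX_PATH_LEN 256

/* Stand-in for an object the caller never sees directly (hypothetical). */
struct index_state {
    char path[MAX_PATH_LEN];
    int  doc_count;
};

/* Slot 0 stays unused so that 0 can mean "failed". */
static struct index_state *slots[MAX_RESOURCES];

/* Returns a small positive integer handle, or 0 on failure. */
int demo_open(const char *path)
{
    if (path == NULL || strlen(path) >= MAX_PATH_LEN)
        return 0;
    for (int h = 1; h < MAX_RESOURCES; h++) {
        if (slots[h] == NULL) {
            slots[h] = calloc(1, sizeof *slots[h]);
            if (slots[h] == NULL)
                return 0;
            strcpy(slots[h]->path, path);
            return h;
        }
    }
    return 0; /* table full */
}

/* Returns the document count, or -1 for an invalid handle. */
int demo_doc_count(int handle)
{
    if (handle <= 0 || handle >= MAX_RESOURCES || slots[handle] == NULL)
        return -1;
    return slots[handle]->doc_count;
}

/* Releases the object behind the handle. Returns 1 on success, 0 otherwise. */
int demo_close(int handle)
{
    if (handle <= 0 || handle >= MAX_RESOURCES || slots[handle] == NULL)
        return 0;
    free(slots[handle]);
    slots[handle] = NULL;
    return 1;
}
```

Because only integers and C strings cross the boundary, a wrapper generator needs no knowledge of classes, name mangling or object lifetimes; the cost is that the C layer must validate every handle itself.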
Perl XS
- Always start with h2xs utility
- Code is C with macro extensions
- Write C code (XSUBs)
- Call internal Perl routines (perlguts) to create variables, allocate arrays… newSViv(IV), sv_setiv(SV*, IV) ~ scalar integer variable
- Complicated
- Nyarlathotep / "Crawling Chaos"
Enter SWIG
- Creates XS for you from a .i definition file
- Parses C/C++ .h header files to get types and function prototypes
- Allows for inline C/XS code
Swig XS Sample
From argv.i:
    // Creates a new Perl array and places a NULL-terminated char ** into it
    %typemap(out) char ** {
        AV *myav;
        SV **svs;
        int i = 0, len = 0;
        /* Figure out how many elements we have */
        while ($1[len])
            len++;
        svs = (SV **) malloc(len*sizeof(SV *));
        for (i = 0; i < len ; i++) {
            svs[i] = sv_newmortal();
            sv_setpv((SV*)svs[i], $1[i]);
        };
        myav = av_make(len, svs);
        free(svs);
        $result = newRV((SV*)myav);
        sv_2mortal($result);
        argvi++;
    }
Diagram of Layers
- Perl OO Wrapper ~ CLucene.pm
- Low Level Perl ~ CLuceneWrap.pm
- SWIG XS C Code ~ clucene_wrap.c
- C DLL Interface ~ clucene_dll.o
- CLucene C++ Library ~ clucene.so
SWIG generated: CLuceneWrap.pm, clucene_wrap.c
CLucene C++ Interface
src/CLucene/search/SearchHeader.h:
    #include "CLucene/StdHeader.h"
    #ifndef _lucene_search_SearchHeader_
    #define _lucene_search_SearchHeader_
    #include "CLucene/index/IndexReader.h"
    …
    using namespace lucene::index;
    namespace lucene{ namespace search{
        //predefine classes
        class Searcher;
        class Query;
        class Hits;
        class HitDoc {
        public:
            float_t score;
            int_t id;
            lucene::document::Document* doc;
            HitDoc* next;
            HitDoc* prev; // in doubly-linked cache
            HitDoc(const float_t s, const int_t i);
            ~HitDoc();
        };
CLucene C DLL Interface
src/wrappers/dll/clucene_dll.h:
    #ifndef _DLL_CLUCENE
    #define _DLL_CLUCENE
    #include "CLucene/CLConfig.h"
    …
    #ifdef _UNICODE
    //unicode methods
    # define CL_UNLOCK CL_U_Unlock
    # define CL_OPEN CL_U_Open
    # define CL_DOCUMENT_INFO CL_U_Document_Info
    # define CL_ADD_FILE CL_U_Add_File
    …
    CLUCENEDLL_API int CL_U_Unlock(const wchar_t* dir);
    CLUCENEDLL_API int CL_U_Delete(const int resource, const wchar_t* query, const wchar_t* field);
    CLUCENEDLL_API int CL_U_Add_Field(const int resource, const wchar_t* field, const wchar_t* value, const int value_length, const int store, const index, const int token);
    …
SWIG Definition File
clucene.i:
    %module "FulltextSearch::CLuceneWrap"
    %{
    #include "clucene_dllp.h"
    %}
    // our definitions for CLucene variables and functions
    %include "clucene_perl.h"
    //%include "clucene_dll.h" // could use this but then would need to call CL_N_Search not CL_SEARCH etc.
    %include typemaps.i
    %include argv.i
    // helper functions where pointers to result buffers are expected
    // would be better done with a %typemap(out) if I knew enough about perlguts
    %inline %{
    int val_len;
    char * val;
    int CL_GetField1(int resource, char * field) {
        return CL_GETFIELD(resource, field, &val_len);
    }
    …
    %}
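The helper above illustrates a workaround for library calls that return results through pointer arguments: wrap them in a function that takes only simple arguments and parks the results in global variables, which the wrapper generator can then expose to the scripting side. The plain C sketch below shows the pattern in isolation. `lib_get_field` is a made-up stand-in, not the real CL_GETFIELD signature, and `get_field1` is likewise hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a library call that returns results through pointers
 * (hypothetical; NOT the real CL_GETFIELD). Returns 1 on success, 0 otherwise. */
static int lib_get_field(const char *field, const char **value, int *value_len)
{
    if (strcmp(field, "title") != 0)
        return 0;
    *value = "Curriculum Vitae";
    *value_len = (int)strlen(*value);
    return 1;
}

/* Globals that the scripting side reads after each call. */
const char *val = NULL;
int val_len = 0;

/* Helper with a simple signature: the result lands in val / val_len. */
int get_field1(const char *field)
{
    val = NULL;      /* clear stale results from the previous call */
    val_len = 0;
    return lib_get_field(field, &val, &val_len);
}
```

The trade-off is shared state: the globals are overwritten by every call, so the caller has to read them immediately and the approach is not re-entrant. A typemap, as the slide's comment suggests, would avoid that.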
SWIG-Generated XS CLuceneWrap.pm
    # This file was automatically generated by SWIG
    package FulltextSearch::CLuceneWrap;
    require Exporter;
    require DynaLoader;
    @ISA = qw(Exporter DynaLoader);
    package FulltextSearch::CLuceneWrapc;
    bootstrap FulltextSearch::CLuceneWrap;
    package FulltextSearch::CLuceneWrap;
    @EXPORT = qw( );
    # ----- BASE METHODS ------
    package FulltextSearch::CLuceneWrap;
    sub TIEHASH {
        my ($classname, $obj) = @_;
        return bless $obj, $classname;
    }
    sub CLEAR { }
    …
    # ------- FUNCTION WRAPPERS --------
    package FulltextSearch::CLuceneWrap;
    *CL_OPEN = *FulltextSearch::CLuceneWrapc::CL_OPEN;
    *CL_CLOSE = *FulltextSearch::CLuceneWrapc::CL_CLOSE;
    …
    # ------- VARIABLE STUBS --------
    package FulltextSearch::CLuceneWrap;
    *clucene_perl = *FulltextSearch::CLuceneWrapc::clucene_perl;
    *NULL = *FulltextSearch::CLuceneWrapc::NULL;
    *val_len = *FulltextSearch::CLuceneWrapc::val_len;
    *val = *FulltextSearch::CLuceneWrapc::val;
    *errstr = *FulltextSearch::CLuceneWrapc::errstr;
    …
SWIG-Generated XS clucene_wrap.c
    #ifdef __cplusplus
    extern "C" {
    #endif
    XS(_wrap_CL_OPEN) {
        {
            char *arg1 ;
            int arg2 = (int) 1 ;
            int result;
            int argvi = 0;
            dXSARGS;
            if ((items < 1) || (items > 2)) {
                SWIG_croak("Usage: CL_OPEN(path, create);");
            }
            if (!SvOK((SV*) ST(0))) arg1 = 0;
            else arg1 = (char *) SvPV(ST(0), PL_na);
            if (items > 1) {
                arg2 = (int) SvIV(ST(1));
            }
            result = (int)CL_OPEN(arg1, arg2);
            ST(argvi) = sv_newmortal();
            sv_setiv(ST(argvi++), (IV) result);
            XSRETURN(argvi);
            fail:
            ;
        }
        croak(Nullch);
    }
CLucene.pm Perl OO Wrapper
- Back into the realms of sanity
- Normal OO package with methods
- Calls XS wrapper functions
    sub open {
        my $this = shift;
        my %arg = @_;
        my $path = $arg{path} || $this->{path} || confess "path undefined";
        my $create = anyof ( $arg{create}, $this->{create}, 0 );
        $this->{resource} = FulltextSearch::CLuceneWrap::CL_OPEN ( $path, $create )
            or confess "Failed to CL_OPEN $this->{path} create $create errstr ".$this->errstrglobal();
        $this->{path} = $path;
        $this;
    }
Build Environment
Uses GNU autotools and m4 macro processor
Definition files
- configure.ac ~ top level build definitions
- Makefile.am ~ makefile flags definitions
Programs
- libtool ~ generalised library building
- aclocal ~ builds aclocal.m4 from configure.ac
- autoconf ~ reads configure.ac to create configure script
- autoheader ~ creates C header defines for configure
- automake ~ creates Makefile.in from Makefile.am
- autoreconf ~ manually remake whole tree of GNU build files
Bootstrap shell script
    #!/bin/sh
    # Bootstrap the CLucene installation.
    mkdir -p ./build/gcc/config
    set -x
    libtoolize --force --copy --ltdl --automake
    aclocal
    autoconf
    autoheader
    automake -a --copy --foreign
Autoconf configure.ac file
    dnl Process this file with autoconf to produce a configure script.
    dnl Written by Jimmy Pritts.
    dnl initialize autoconf and automake
    AC_INIT([clucene], [1])
    AC_PREREQ([2.54])
    AC_CONFIG_SRCDIR([src/CLucene.h])
    AC_CONFIG_AUX_DIR([./build/gcc/config])
    AC_CONFIG_HEADERS([config.h])
    AM_INIT_AUTOMAKE
    dnl Check for existence of a C and C++ compilers.
    AC_PROG_CC
    AC_PROG_CXX
    dnl Check for headers
    AC_HEADER_DIRENT
    dnl Configure libtool.
    AC_PROG_LIBTOOL
    dnl option to use UTF-8 as internal 8-bit charset to support characters in Unicode™
    AC_ARG_ENABLE(utf8,
        AC_HELP_STRING([--enable-utf8],
            [UTF-8 as internal 8-bit charset to support characters in Unicode™ (default=no)]),
        [AC_DEFINE([UTF8], [use UTF-8 as internal 8-bit charset to support characters in Unicode™])],
        enable_utf8=no)
    AM_CONDITIONAL(USEUTF8, test x$enable_utf8 = xyes)
    AC_CONFIG_FILES([Makefile src/Makefile examples/demo/Makefile examples/tests/Makefile examples/util/Makefile wrappers/dll/Makefile wrappers/dlltest/Makefile])
    AC_OUTPUT
Makefile.am files
src/Makefile.am:
    AUTOMAKE_OPTIONS = 1.6
./Makefile.am:
    ## Makefile.am -- Process this file with automake to produce Makefile.in
    include_HEADERS = CLucene.h
    INCLUDES = -I$(top_srcdir)
    lsrcdir = $(top_srcdir)/src/CLucene
    SUBDIRS = src wrappers examples.
    lib_LTLIBRARIES = libclucene.la
    libclucene_la_SOURCES =
    include CLucene/analysis/Makefile.am
    include CLucene/analysis/standard/Makefile.am
    include CLucene/debug/Makefile.am
    include CLucene/document/Makefile.am
    include CLucene/index/Makefile.am
    include CLucene/queryParser/Makefile.am
    include CLucene/search/Makefile.am
    include CLucene/store/Makefile.am
    include CLucene/util/Makefile.am
    include CLucene/Makefile.am
src/CLucene/document/Makefile.am:
    documentdir = $(lsrcdir)/document
    dochdir = $(includedir)/CLucene/document
    libclucene_la_SOURCES += $(documentdir)/DateField.cpp
    libclucene_la_SOURCES += $(documentdir)/Document.cpp
    libclucene_la_SOURCES += $(documentdir)/Field.cpp
    doch_HEADERS = $(documentdir)/*.h
Recap
- We saw how and why I selected an external library to use from Perl
- We looked at GNU autotools to provide a cross-platform build environment
- We investigated the layers of code needed to interface Perl to a C++ library ~ SWIG, C, XS inline helpers, low and high level Perl modules
Lessons Learned
- Start off a new external library using GNU autotools and keeping in mind that the API should be easy to use through SWIG
- Use SWIG not XS to wrap a C/C++ library
- Always use h2xs to start a Perl extension
- Open Source feedback and testing are more valuable than you expect (2 emails this week alone)
Where to Get More Information
- Perl XS: http://en.wikipedia.org/wiki/XS_%28Perl%29 and http://www.perl.com/doc/manual/html/pod/perlguts.html
- C++ / XS: http://www.johnkeiser.com/perl-xs-c++.html
- SWIG: http://en.wikipedia.org/wiki/SWIG and http://www.swig.org/
- Lucene: http://en.wikipedia.org/wiki/Lucene
- CLucene: http://sourceforge.net/projects/clucene/
- Autoconf: http://www.gnu.org/software/autoconf/
- Book: "Extending and Embedding Perl", Jenness & Couzens (Manning, 2002)
Any Questions
These slides are at http://perl.dragonstaff.com/