Example of SearchTool with Shoebox
Command:

    search -Itestdata.dat -TSbxTiers.lis -PQuery1.sbx

-Itestdata.dat
Inputfile with SHOEBOX data: "testdata.dat":

    \ref mi1 006
    \_no 00006
    \tx 6a de 6i Fami:lya;
    \mb de 6i ke7x -i =k
    \egl to PREP A3(POSS) Family que.
    \fts i lo que tiene;
    \fte
    \com
    \wo vi:0

    \ref mi1 007
    \_no 00007
    \tx 7a de 7i fami:lya;
    \mb 7a de 7i fami:lya
    \sgl a de A3(POSS) familia
    \egl to PREP A3(POSS) family
    \fts ?a de su familia?;
    \fte
    \com

    \ref mi1 008
    \_no 00008
    \tx cha7k 7it@pa7;
    \mb chaj =ak 7it -@ -pa -a7
    \sgl cual =AN EXIST-INV-INCI.I-RLTVZR
    \egl which=AN EXIST-INV-INCI.I-RLTVZR
    \fts lo que tiene;
    \fte
    \com
    \wo vtl:2v;2(p)

-TSbxTiers.lis
The tierfile "SbxTiers.lis" can look like:

    FILETYPE='SHOEBOX'
    >\ref
    >\tx
    >\egl
    >\fts
    >\wo

-PQuery1.sbx
The patternfile "Query1.sbx" can look like:

    #1 '\egl' 'family' I
    #2 '\tx' 'fami:lya'
    #3 '\fts' '^LO.*TIENE' I
    3+1*2

The result on the screen will be:

    ********************************************************************************
    *** Command: Search -Itestdata.dat -TSbxTiers.lis -PQuery1.sbx
    ********************************************************************************
    *** Inputfiles are of type SHOEBOX.
    *** Blocks start with tiername: \ref.
    *** Selected tiernames for output:
    *** Tiername: \ref
    *** Tiername: \tx
    *** Tiername: \egl
    *** Tiername: \fts
    *** Tiername: \wo
    ********************************************************************************
    *** Pattern: #1 '\egl' 'family' I
    *** Pattern: #2 '\tx' 'fami:lya'
    *** Pattern: #3 '\fts' '^LO.*TIENE' I
    *** Combination: 3+1*2
    ********************************************************************************
    *** FILE: testdata.dat
    *** Block: 2
    \ref mi1 007
    \tx 7a de 7i fami:lya;
    \egl to PREP A3(POSS) family
    \fts ?a de su familia?;
    *** Block: 3
    \ref mi1 008
    \tx cha7k 7it@pa7;
    \egl which=AN EXIST-INV-INCI.I-RLTVZR
    \fts lo que tiene;
    \wo vtl:2v;2(p)
    ***
    *** Blocks in file testdata.dat: 3
    *** Block matches in file testdata.dat: 2
    ********************************************************************************
    ********************************************************************************
    *** Total files read: 1
    *** Total blocks read: 3
    *** Total block matches: 2
    ********************************************************************************

Explanation of result:

    Pattern: #1 '\egl' 'family' I      : matches block 1 (Family) and block 2 (family)
    Pattern: #2 '\tx' 'fami:lya'       : matches block 2 (fami:lya)
    Pattern: #3 '\fts' '^LO.*TIENE' I  : matches block 3 (lo que tiene)

Note:
- I : Ignore case
- ^ (in ^LO.*TIENE) : matches from the beginning of the line. This is why block 1, whose \fts tier reads "i lo que tiene;", does not match pattern #3.
- ^ \ $ . [ ] | ( ) * + ? have a special meaning within a regular expression.

Combination: 3+1*2 is evaluated as 3+(1*2), because * (= AND) has a higher priority than + (= OR). So:

    1*2    matches block 2
    3      matches block 3
    3+1*2  matches block 2,3

Combination: 3+1*2 : matches block 2,3

4. New features Version 3.0
===========================

A. Support of Shoebox column structure
B. Support of comment lines in tierfile, patternfile, listfile and vectorfile
C. New tierfile keywords:
   - TIERNAMES='ALL'
   - FILETYPE='CHAT'
   - OUTPUTTYPE='KWAL'
D. Support of Chat files
E. Support of block context
F. Vectors

4.A Support of Shoebox column structure
=======================================

Shoebox files have an aligned column structure, e.g.:

    \ref mi2 003
    \_no 00003
    \tx 7i ka: jatpa ta tuni m@7ki;
    \mb 7i ka: jat -pa ta tun -i m@:k7 -i
    \sgl y NEG poder -INCI.I C3(ERG) hacer-INCD hacer_tamales-NMZR
    \egl and NEG be_able-INCI.I C3(ERG) do -INCD prepare_tamal-NMZR

With search it is possible to specify a column context for two patterns.
- You can specify that 2 patterns should match in the same column (column context = 0).
- You can also specify that if pattern #1 matches in column X, then pattern #2 should match in column:
  - X-1 or X or X+1 (column context = 1)
  - X-2 or X-1 or X or X+1 or X+2 (column context = 2)
  - ....

E.g. a patternfile can now look like:

    #1 '\mb' 'tun'
    #2 '\sgl' 'hacer'
    C(1|0|2)

The program will return all blocks where pattern #1 and pattern #2 match in the same column, where C(1|0|2) is:

    C : Column expression
    (
    1 : PatternNo
    | : Divider
    0 : Column context
    | : Divider
    2 : PatternNo
    )
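The column context rule can be illustrated with a small Python sketch. This is not the search program's own code; the function names are hypothetical, and it assumes each tier has already been split into a list of column cells. The cell lists below are split by hand for illustration, since the original character alignment of the example is not shown here. Two patterns match with column context N when one matches in column X and the other in some column between X-N and X+N.

```python
import re


def parse_column_expr(expr):
    """Split 'C(1|0|2)' into (pattern no., column context, pattern no.)."""
    m = re.fullmatch(r"C\((\d+)\|(\d+)\|(\d+)\)", expr.strip())
    if m is None:
        raise ValueError(f"not a column expression: {expr!r}")
    first, context, second = (int(g) for g in m.groups())
    return first, context, second


def matching_columns(cells, pattern):
    """Return the indices of the column cells in which the pattern matches."""
    return [i for i, cell in enumerate(cells) if re.search(pattern, cell)]


def column_match(cells_a, pattern_a, cells_b, pattern_b, context):
    """True if pattern_a matches in some column X and pattern_b matches
    in some column Y with |X - Y| <= context."""
    cols_a = matching_columns(cells_a, pattern_a)
    cols_b = matching_columns(cells_b, pattern_b)
    return any(abs(x - y) <= context for x in cols_a for y in cols_b)


# Hand-split cells (an assumption, one cell per aligned column).
MB = ["7i", "ka:", "jat", "-pa", "ta", "tun", "-i", "m@:k7", "-i"]
SGL = ["y", "NEG", "poder", "-INCI.I", "C3(ERG)", "hacer", "-INCD",
       "hacer_tamales", "-NMZR"]

first, context, second = parse_column_expr("C(1|0|2)")
same_column = column_match(MB, "tun", SGL, "hacer", context)
```

With these assumed cells, 'tun' and 'hacer' share a column, so `same_column` is true for column context 0. A pattern such as 'ta' on the \mb tier sits one column to the left of 'hacer', so it would only pair with it at column context 1 or higher.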