TGrep2

In this tutorial, you’ll walk through some simple tgrep2 commands to become familiar with the basic options and basic tree search patterns.

Basics

General usage:

tgrep2 [options] <pattern>...

Search for all instances of the word “some.”

tgrep2 "some"

Return the full sentence for each match of “some” (-w prints the whole sentence instead of only the matched subtree).

tgrep2 -w "some"

Return only the terminal nodes (i.e., the words), without parses.

tgrep2 -wt "some"

Pipe the output into less so we can scroll up and down.

tgrep2 -wt "some" | less

How many matches are there?

tgrep2 -wt "some" | wc -l

Report every matching subtree in a sentence rather than just the first one (-a), but report each subtree only once (-f). Use -af by default; it makes the counts more accurate.

tgrep2 -afwt "some" | wc -l
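To see what -af changes, it can help to put the two counts side by side. The following is a minimal sketch, not part of tgrep2 itself: count_matches is a hypothetical helper name, and it assumes that each match is printed on its own line when -t is used, so that wc -l counts matches.

```shell
# Hypothetical helper: count the matches for a pattern under a given set of flags.
# Assumes one output line per match (which is what -t gives us).
count_matches() {
    flags=$1
    pattern=$2
    # tr strips the padding that some versions of wc put around the number
    tgrep2 "$flags" "$pattern" | wc -l | tr -d ' '
}

# Only run the comparison when tgrep2 is actually installed.
if command -v tgrep2 >/dev/null 2>&1; then
    echo "without -af: $(count_matches -wt 'some')"
    echo "with -af:    $(count_matches -afwt 'some')"
fi
```

If the two numbers differ, the difference comes from sentences in which the pattern matches more than once.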

Once we’re happy with our results, we can save them to a file.

tgrep2 -afwt "some" > "some.txt"
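If you expect to save results for more than one word, the same redirect can be wrapped in a small function. This is only a sketch: save_matches and the optional output directory argument are made-up names for illustration, and the word is passed to tgrep2 unchanged as the pattern, so it should not contain characters that are special in tgrep2 patterns.

```shell
# Hypothetical helper: save every sentence matching WORD to OUTDIR/WORD.txt.
# OUTDIR defaults to the current directory.
save_matches() {
    word=$1
    outdir=${2:-.}
    tgrep2 -afwt "$word" > "$outdir/$word.txt"
}

# Only run when tgrep2 is actually installed.
if command -v tgrep2 >/dev/null 2>&1; then
    save_matches some
fi
```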

By default, tgrep2 searches the Switchboard corpus (or whichever corpus the TGREP2_CORPUS environment variable points to). Let’s search a different corpus instead, using the -c option. To see the available corpora:

ls $TGREP2ABLE

Let’s search the BNC.

tgrep2 -afwtc /afs/ir/data/linguistic-data/Treebank/tgrep2able/bnc-charniak-parses.t2c.gz "some" | wc -l
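The same count can be repeated for every corpus in the directory. The sketch below makes two assumptions that you should check on your own system: that TGREP2ABLE names the directory listed above, and that the corpus files in it end in .t2c.gz, like the BNC file does. count_in_corpus is a hypothetical helper name.

```shell
# Hypothetical helper: count matches of PATTERN in one corpus file.
count_in_corpus() {
    corpus=$1
    pattern=$2
    tgrep2 -afwt -c "$corpus" "$pattern" | wc -l | tr -d ' '
}

# Only run when tgrep2 is installed and TGREP2ABLE is set.
if command -v tgrep2 >/dev/null 2>&1 && [ -n "${TGREP2ABLE:-}" ]; then
    # Assumption: corpus files in this directory end in .t2c.gz
    for corpus in "$TGREP2ABLE"/*.t2c.gz; do
        [ -e "$corpus" ] || continue
        printf '%s\t%s\n' "$(basename "$corpus")" "$(count_in_corpus "$corpus" 'some')"
    done
fi
```

Searching a large corpus can take a while, so it may be worth trying this on one or two files first.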

Tree search

Let’s go back to the Switchboard and find all the “some”-NPs. Start by looking at the parses with the -l option.

tgrep2 -aflw "some" | less

Find the NP that dominates “some.”

tgrep2 -aflw "some >> NP" | less

Oops, this still gets us just the word “some.” TGrep2 prints the node matched by the first node in the pattern (the head), which here is “some.” Let’s turn the pattern around.

tgrep2 -afl "NP << some" | less
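A quick way to convince yourself that the order of the pattern decides what gets printed is to look at the words of the first few matches for each ordering. This is a rough sketch; peek is a hypothetical helper name, and the default of five lines is arbitrary.

```shell
# Hypothetical helper: print the words of the first N matched nodes (default 5).
peek() {
    pattern=$1
    n=${2:-5}
    tgrep2 -aft "$pattern" | head -n "$n"
}

# Only run when tgrep2 is actually installed.
if command -v tgrep2 >/dev/null 2>&1; then
    echo '--- some >> NP (the printed node is the word)'
    peek 'some >> NP'
    echo '--- NP << some (the printed node is the NP)'
    peek 'NP << some'
fi
```

The first search should print only the word itself on every line, while the second should print the words of the whole NP.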

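The problem is that this finds any NP with “some” in it. Let’s get the closest one, i.e., the NP that does not itself dominate another NP. In TGrep2, ! negates a relation. The pattern is in single quotes so that an interactive shell does not treat the ! as history expansion.

tgrep2 -afl 'NP !<< NP << some' | less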

To find the cases of partitive “some.”

tgrep2 -afl "NP << some << (of , some)" | less

To find the cases of non-partitive “some.”

tgrep2 -afl 'NP << (some !. of)' | less

To find the cases of “some”-NPs at the start of sentences.

tgrep2 -afl 'NP << some !, *' | less

To specify that the NP must contain a head noun.

tgrep2 -afl "NP=np << (some .. (/NN|NNS/ >> =np))" | less
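Once you have a few patterns, it is handy to see how many NPs each one picks out. The following sketch prints one line per pattern with a label and a count. report_count and the labels are made up for illustration. It uses -t instead of -l so that each matched NP takes up one line, which is an assumption about the output that the count depends on.

```shell
# Hypothetical helper: print "LABEL<TAB>COUNT" for one pattern.
# Uses -t so that each matched NP is printed on a single line.
report_count() {
    label=$1
    pattern=$2
    count=$(tgrep2 -aft "$pattern" | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$label" "$count"
}

# Only run when tgrep2 is actually installed.
if command -v tgrep2 >/dev/null 2>&1; then
    report_count 'any NP with some' 'NP << some'
    report_count 'partitive'        'NP << some << (of , some)'
    report_count 'with head noun'   'NP=np << (some .. (/NN|NNS/ >> =np))'
fi
```

The patterns are in single quotes so that the shell passes them to tgrep2 unchanged.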

Thus far, we have only output the results of a single search on the command line, or saved the output of a single search to a file. Ideally, we would add different types of information about each instance of “some” to a database directly – e.g., the full sentence, whether “some” occurs in the partitive, and whether “some” occurs at the start of a sentence. To do this easily, use TDTlite.
