GATE is, among other things, an application framework, and it was used to prepare all ANC data and annotations. GATE is therefore a natural companion to the ANC for many types of processing tasks. Several plugins are provided that allow GATE to work with ANC documents and to load and save standoff annotations.
The ANC GATE tools include plugins for reading and writing data and annotations in ISO-GrAF, the format used for all ANC corpora, and GRATE, a Groovy DSL for scripting GATE tasks.
Download the GATE tools:
To install the plugins, extract the contents of the GateTools archive into GATE’s plugins folder, launch GATE and add the plugins with GATE’s “Manage Plugins” dialog. See GATE’s documentation for further information on adding custom plugins.
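The extraction step can also be scripted. The following is a minimal sketch, not part of the ANC tools: the function name `install_gate_tools` is hypothetical, and it assumes the GateTools archive is in a format Python's `shutil.unpack_archive` recognises from the file extension (for example `.zip` or `.tar.gz`). The plugins still have to be added afterwards through GATE's "Manage Plugins" dialog.

```python
import shutil
from pathlib import Path


def install_gate_tools(archive, gate_plugins_dir):
    """Unpack the GateTools archive into GATE's plugins folder.

    Hypothetical helper: returns the names of the top-level entries that
    are present in the plugins folder after extraction.
    """
    plugins = Path(gate_plugins_dir)
    if not plugins.is_dir():
        raise FileNotFoundError(f"GATE plugins folder not found: {plugins}")
    # unpack_archive chooses the format (zip, tar, ...) from the extension.
    shutil.unpack_archive(str(archive), str(plugins))
    return sorted(p.name for p in plugins.iterdir())
```

A call might look like `install_gate_tools("GateTools.zip", "/opt/gate/plugins")`, where both paths are placeholders for your own download location and GATE installation.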
After installation you should find the following items available in GATE:
GrAF Document (1.0)
The GrAF Document resource accepts the same properties as the GATE Document class as well as the following properties:
|loadStandoff||:||A boolean indicating whether any standoff annotations should be loaded. Defaults to true.|
|standoffASName||:||The name of the annotation set the standoff annotations will be added to.|
|standoffAnnotations||:||A list of standoff annotation types to be loaded. If left empty all annotations will be loaded.|
|resourceHeader||:||The GrAF resource header for the corpus.|
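The interaction between `loadStandoff` and `standoffAnnotations` can be illustrated with a short sketch. This only models the behaviour described in the table above and is not the plugin's code: the function `standoff_types_to_load` and the example type names are hypothetical.

```python
def standoff_types_to_load(available, load_standoff=True, standoff_annotations=()):
    """Return the annotation types that would be loaded for a document.

    available            -- types that exist for the document (illustrative)
    load_standoff        -- mirrors loadStandoff, which defaults to true
    standoff_annotations -- mirrors standoffAnnotations; empty means "all"
    """
    if not load_standoff:
        # Standoff loading is switched off entirely.
        return []
    if not standoff_annotations:
        # An empty list selects every available annotation type.
        return list(available)
    wanted = set(standoff_annotations)
    # Keep the original order of the available types.
    return [t for t in available if t in wanted]
```

With the made-up types `["s", "penn", "logical"]`, the defaults select all three, while `standoff_annotations=["penn"]` selects only `penn`.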
GrAF Load Standoff (1.0)
Adds a set of standoff annotations to a document. The following properties are supported:
|document||:||The document the annotations will be added to.|
|resourceHeader URL||:||The GrAF resource header for the corpus.|
|sourceUrl||:||The URL of the standoff annotation file.|
|standoffAsName||:||The annotation set to add the annotation to.|
GrAF Load All Standoff (1.0)
Loads all standoff annotations for a document. The following property is supported:
|document||:||The document to load annotations for.|
GrAF Save Standoff (1.0)
Saves annotations to a standoff XML file. The following properties are supported:
|annotationType||:||The suffix that will be added to the filename. By convention this is the file type defined in the resource header.|
|document||:||The document containing the annotations to save.|
|destination||:||The URL of the standoff annotation file to create.|
|inputASName||:||Set containing the annotations to be saved.|
|standoffTags||:||A list of annotation types to be saved. If this property is left empty all of the annotations in the input annotation set will be saved.|
|encoding||:||The character encoding to be used when writing the standoff annotation file. Defaults to UTF-8.|
|grafASName||:||The name of the annotation space to use.|
|grafASType||:||The pid (URI) for the annotation space.|
|grafDefaultASType||:||Default annotation space type if unknown annotation space names are encountered.|
|version||:||The value of the version attribute in the root element of the document. Defaults to 1.0.|
Save Document Content
Saves the content of the document to a text file. The following properties are supported:
|destination||:||Where the text file will be saved.|
|document||:||The document to be saved.|
|encoding||:||The character encoding to use. Defaults to UTF-8.|
There is no corresponding ANC Load Content processing resource. Because the content of an ANC document is stored as a text file, it can be loaded as a normal GATE document (remember that the text files are UTF-8).
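Outside GATE, the same UTF-8 convention applies when reading or writing the text of an ANC document. The sketch below is an illustration only: `load_content` and `save_content` are hypothetical helpers, not part of the ANC tools. It opens files with `newline=""` so that line endings are left untouched, on the assumption that standoff annotations refer to positions in the text and any newline translation could shift them.

```python
def load_content(path, encoding="utf-8"):
    """Read the text of an ANC document; the text files are UTF-8."""
    # newline="" disables newline translation, so the string matches the file.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


def save_content(text, destination, encoding="utf-8"):
    """Write document text to a file, defaulting to UTF-8."""
    # newline="" writes line endings exactly as they appear in the string.
    with open(destination, "w", encoding=encoding, newline="") as handle:
        handle.write(text)
```

Passing the encoding explicitly avoids depending on the platform default, which is not UTF-8 on every system.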
Detailed instructions on using the ANC resources in GATE can be found here.