The ANC Tool

The ANC Tool application is still in beta testing. Before processing the entire ANC, first process a small portion of the corpus to confirm that the program does what you need.

Standoff annotations to XML with inline annotations

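The ANC Tool can be used to generate files with inline annotations from the ANC files. The generated files may be XML files, or text files with part-of-speech tags formatted for MonoConc or WordSmith; each format has its own tab, described under Usage.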

If you want to use the ANC text files without the part-of-speech tags, but your application does not support UTF character encodings (e.g. MonoConc Pro), you can process the corpus with the ANC File Conversion program, which can change the character encoding and the line endings (linefeeds/carriage returns) used in the text files. However, if you change the character encoding, any character that is not present in the selected encoding may show up as a "garbage" character in your file. The only remedy is to choose a character encoding that can represent all the characters used in the corpus.
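
One way to check in advance whether a target encoding can represent a text is sketched below in Python. This is not part of the ANC File Conversion program; the function name unrepresentable_chars and the sample string are assumptions made for illustration.

```python
def unrepresentable_chars(text: str, encoding: str) -> set[str]:
    """Return the characters in `text` that `encoding` cannot represent."""
    missing = set()
    for ch in set(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.add(ch)
    return missing


# Hypothetical sample: an accented letter plus curly quotation marks.
sample = "na\u00efve \u201cquoted\u201d text"

# Latin-1 has the accented letter but not the curly quotes.
print(unrepresentable_chars(sample, "latin-1"))

# Windows-1252 can represent every character in the sample.
print(unrepresentable_chars(sample, "cp1252"))

# Converting anyway replaces each missing character with "?",
# which is one form the "garbage" characters can take.
print(sample.encode("latin-1", errors="replace").decode("latin-1"))
```

Running a check like this over a small portion of the corpus shows whether the chosen encoding will lose characters before the whole corpus is converted.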

Download

The ANC Tool can be downloaded from the ANC Downloads page.

Usage

The XML Tab


  1. Select an input directory containing the ANC files to process. The program will recurse through all directories rooted at the input directory and process all the ANC files found.
  2. Select an output directory. The XML files that are created will be placed in the output directory. If the Copy directory structure check box is selected, the directory structure of the input directory will be mirrored, with directories created as needed. Otherwise all files will be created directly in the output directory. The default is to copy the directory structure. It is possible to select the input directory as the output directory, but it is highly recommended that the input and output directories be separate.
  3. Select the annotations to include.
  4. Click the Process button.
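
The directory handling described in steps 1 and 2 can be pictured with a short Python sketch. It is not the ANC Tool's actual implementation; the function name plan_outputs, the "*.txt" file pattern, and the ".xml" output suffix are assumptions chosen for illustration.

```python
from pathlib import Path


def plan_outputs(input_dir, output_dir, copy_structure=True,
                 pattern="*.txt", suffix=".xml"):
    """Map each input file to the output path it would be written to.

    `pattern` and `suffix` are hypothetical; adjust them to the files
    actually present in your copy of the corpus.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    plan = {}
    # rglob walks every directory rooted at input_dir.
    for src in sorted(input_dir.rglob(pattern)):
        if not src.is_file():
            continue
        if copy_structure:
            # Mirror the input layout below the output directory.
            dest = output_dir / src.relative_to(input_dir)
        else:
            # Flat layout: every file goes directly into output_dir.
            dest = output_dir / src.name
        plan[src] = dest.with_suffix(suffix)
    return plan


def create_directories(plan):
    """Create the directories needed by the planned output files."""
    for dest in plan.values():
        dest.parent.mkdir(parents=True, exist_ok=True)
```

In this sketch the flat layout maps files with the same name in different subdirectories to the same output path, which is one reason mirroring the directory structure is the safer choice.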

The MonoConc Tab

  1. Select the input and output directories as above.
  2. Select the part of speech tags to include. The part of speech tags are the only annotations that can be included when generating text files.
  3. Select a separator character. This is the character that will be used to separate a word from its part of speech tag. The default character is the underscore. It is possible to use more than one character as the separator.
  4. Click the Process button.
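
The effect of the separator in step 3 can be illustrated with a minimal Python sketch. The function names and the tags shown are hypothetical examples, not the output of the ANC Tool itself.

```python
def join_tagged(tokens, separator="_"):
    """Render (word, tag) pairs as 'word<separator>tag', space-separated."""
    return " ".join(f"{word}{separator}{tag}" for word, tag in tokens)


def split_tagged(token, separator="_"):
    """Split 'word<separator>tag' at the LAST separator.

    Splitting from the right keeps words that themselves contain the
    separator intact.
    """
    word, tag = token.rsplit(separator, 1)
    return word, tag


# Hypothetical tokens with illustrative part-of-speech tags.
tokens = [("The", "DT"), ("dog", "NN"), ("barks", "VBZ")]

print(join_tagged(tokens))        # The_DT dog_NN barks_VBZ
print(join_tagged(tokens, "//"))  # The//DT dog//NN barks//VBZ
print(split_tagged("well_known_JJ"))  # ('well_known', 'JJ')
```

A multi-character separator such as "//" is less likely to collide with characters that occur inside words, which makes the tagged text easier to split again later.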

The WordSmith Tab

  1. Select the input and output directories as above.
  2. Select the part of speech tags to include. The part of speech tags are the only annotations that can be included when generating text files.
  3. Click the Process button.