Web Services

ELAN offers several ways to interact with web services. These web services are tools or applications that run on a web server, accept one or more resources as input, apply an algorithm and return the result as output. Some of the Recognizers in the Audio And Video Recognizers tab are available as a web service, in which case audio, video or any other type of sequential data is uploaded to be processed online. The web services available in this menu work with text rather than multimedia content. These tools can automate certain parts of the annotation process, such as tokenizing and part-of-speech tagging. They can be found by clicking Options > Web Services. WebLicht is described in the next section; support for TypeCraft is still experimental.

WebLicht

WebLicht (Web-Based Linguistic Chaining Tool) is an execution environment developed at Tuebingen University as part of the CLARIN infrastructure. Most of the tools in this environment perform NLP (Natural Language Processing) tasks on textual data, and most of them are tailored to language data in one of the well-described and well-resourced languages. The tools are encapsulated as web services and can be combined into processing chains.

To make use of the WebLicht service, go to Options > Web Services > WebLicht. In the dialog that opens, you can specify how to use WebLicht:

  • by selecting the type of input, either plain text or one or more selected tiers
  • and by selecting the type of service, either a preconfigured tool chain (recommended) or a single tool service (deprecated)

Figure 2.117. WebLicht service: upload plain text or the contents of a tier

Then click the Next button.

The interface of the second step depends on the choice between plain text and tier(s) in the first step.

  • In case of plain text a text area will be visible in which you can paste or type plain text.

    Figure 2.118. WebLicht service: text input

  • In case of tier selection a table will be shown in which one tier can be selected. Below the table, specify the content type: Sentence if the annotations on the selected tier contain sentences, or Word/Token if the annotations contain single words. There are some limitations on the tiers you can select for each type. Sentence tiers are expected to be a top-level tier or a symbolically associated dependent tier of a top-level tier. Token tiers are expected to be on a symbolic subdivision tier.

    Figure 2.119. WebLicht service: select a tier to upload

Then click Next.
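
The tier restrictions in the second step can be checked ahead of time outside ELAN. The following is a minimal sketch, not ELAN's own logic: it assumes the transcription is an EAF file in which tiers carry `TIER_ID`, `PARENT_REF` and `LINGUISTIC_TYPE_REF` attributes and linguistic types carry a `CONSTRAINTS` attribute. The function name `eligible_tiers` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EAF document used only for illustration.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="utt" LINGUISTIC_TYPE_REF="default"/>
  <TIER TIER_ID="translation" LINGUISTIC_TYPE_REF="assoc" PARENT_REF="utt"/>
  <TIER TIER_ID="words" LINGUISTIC_TYPE_REF="subdiv" PARENT_REF="utt"/>
  <TIER TIER_ID="pos" LINGUISTIC_TYPE_REF="assoc" PARENT_REF="words"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="default"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="assoc" CONSTRAINTS="Symbolic_Association"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="subdiv" CONSTRAINTS="Symbolic_Subdivision"/>
</ANNOTATION_DOCUMENT>"""


def eligible_tiers(eaf_xml):
    """Group tier names by the content type they could be uploaded as."""
    root = ET.fromstring(eaf_xml)
    # Map each linguistic type to its constraint (None means unconstrained).
    constraints = {
        lt.get("LINGUISTIC_TYPE_ID"): lt.get("CONSTRAINTS")
        for lt in root.iter("LINGUISTIC_TYPE")
    }
    # Map each tier to its parent tier (None means top-level).
    parents = {t.get("TIER_ID"): t.get("PARENT_REF") for t in root.iter("TIER")}

    result = {"Sentence": [], "Word/Token": []}
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        parent = parents[tier_id]
        constraint = constraints.get(tier.get("LINGUISTIC_TYPE_REF"))
        if parent is None:
            # Top-level tier: candidate for Sentence input.
            result["Sentence"].append(tier_id)
        elif constraint == "Symbolic_Association" and parents.get(parent) is None:
            # Symbolically associated child of a top-level tier.
            result["Sentence"].append(tier_id)
        elif constraint == "Symbolic_Subdivision":
            # Symbolic subdivision tier: candidate for Word/Token input.
            result["Word/Token"].append(tier_id)
    return result


print(eligible_tiers(SAMPLE_EAF))
```

In the sample, the `pos` tier is left out of both groups because it is a symbolic association of a tier that is not top-level.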

The third step depends on the choice between a tool chain and a single tool in the first step.

  • In case of a tool chain (see section WebLicht with tool chain file), the stored chain file can be specified here. Click the Select button and browse to the saved file. It is necessary to specify the type of content the first tool in the chain expects, plain text or TCF. In the lower text field the access key can be pasted.

    Figure 2.120. WebLicht service: select the tool chain file and paste the access key

  • In case of a single tool, a list of available tool services will be shown. There is no guarantee that a listed tool service still works this way, that is, through direct access without authentication.

    For plain text in the first step, the list contains several services that detect sentence boundaries and then tokenize these sentences. Select one of the tokenizer services. In case of successful processing the result will be two tiers, one for sentences and one for tokens. If you want to add Part of Speech and/or Lemma annotations, you can use the tiers produced in this step as the input for such services (part of speech taggers) in a second run. There is an option to specify the duration (in ms) of each sentence.

    Figure 2.121. WebLicht service: select a tokenizer

    For tier input, different services are available which can parse text, tag Parts of Speech, etc. Each service has a short description that specifies its function. Hovering over a service with the mouse will show a tooltip containing more information about the service. If the service you are looking for is not listed, you can manually specify its URL.

    Figure 2.122. WebLicht service: select a service

After configuration with a tool chain file or a single tool, click Finish to start processing. If the processing is successful, you will see a dialog stating that the operation is complete. Depending on the service you selected for processing, the tokenized sentences and/or part of speech tags will be added as children of the tier you selected for processing.

Figure 2.123. WebLicht service: result of the processing

Parser used: Berkeley NLP Parser
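
For the plain-text, single-tool case described above, the sentences carry no timing of their own, which is presumably why a fixed duration per sentence can be specified. The sketch below only illustrates the shape of such a result (a sentence tier with time slots and a token tier beneath it). It assumes that sentences are placed back to back starting at 0 ms. The function `layout_sentences` is hypothetical, and its splitting rules are a naive stand-in for the remote tokenizer, not what any WebLicht service actually does.

```python
import re


def layout_sentences(text, duration_ms=2000):
    """Split text into sentences and give each one a fixed-length time slot."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    rows = []
    for index, sentence in enumerate(sentences):
        begin = index * duration_ms
        rows.append({
            "begin_ms": begin,
            "end_ms": begin + duration_ms,
            "sentence": sentence,
            # Naive tokenization: split on whitespace only.
            "tokens": sentence.split(),
        })
    return rows


for row in layout_sentences("Hello world. How are you?", duration_ms=1500):
    print(row["begin_ms"], row["end_ms"], row["sentence"], row["tokens"])
```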

WebLicht with tool chain file

The recommended way to use WebLicht from within ELAN is by supplying a preconfigured tool chain file. This file can be created as follows:

  • go to the WebLicht web interface and click the Start WebLicht button. You'll be prompted to log in to WebLicht via the CLARIN Service Provider Federation. If your home organisation is not in the list, you can request a CLARIN account.
  • once logged in, create a tool chain, preferably using some of your own data, and run the tools to check the results
  • click the Download button to save the tool chain configuration. The file will have a name like chain_randomnumber_.xml; it may be practical to rename it to something more intuitive, especially if you plan to use more than one tool chain
  • make a note of the type of input the first tool in the chain expects; it should be Plain Text or TCF
  • to be able to run the tool chain from within ELAN, you'll need to generate a so-called API Key. ELAN will run the tool chain via WebLicht as a Service (WaaS), and that service requires an API key to authenticate the request. The API key can be generated on the WaaS home page and then stored locally. The key is valid for 3 months; after that, a new key has to be generated.
  • in the third step of running WebLicht in ELAN, all this information has to be provided: the tool chain file, the type of input and the API key (see Figure 2.120, “WebLicht service: select the tool chain file and paste the access key”)
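
The three pieces of information listed above can be sanity-checked before opening the dialog. This is a minimal sketch under stated assumptions, not part of ELAN or WaaS: the function `check_chain_setup` is hypothetical, the 3-month validity is approximated as 90 days, and the date on which the key was generated is assumed to have been noted by the user.

```python
from datetime import date, timedelta
from pathlib import Path
import xml.etree.ElementTree as ET

VALID_INPUT_TYPES = {"Plain Text", "TCF"}
KEY_LIFETIME = timedelta(days=90)  # assumption: "3 months" taken as 90 days


def check_chain_setup(chain_file, input_type, api_key, key_created, today=None):
    """Return a list of problems; an empty list means the setup looks usable."""
    problems = []
    today = today or date.today()

    path = Path(chain_file)
    if not path.is_file():
        problems.append(f"chain file not found: {chain_file}")
    else:
        try:
            ET.parse(str(path))  # only checks that the file is well-formed XML
        except ET.ParseError:
            problems.append("chain file is not well-formed XML")

    if input_type not in VALID_INPUT_TYPES:
        problems.append("input type should be Plain Text or TCF")

    if not api_key.strip():
        problems.append("API key is missing")
    elif today - key_created > KEY_LIFETIME:
        problems.append("API key is probably expired; generate a new one")

    return problems
```

A call such as `check_chain_setup("pos_chain.xml", "TCF", key, date(2024, 1, 15))` (with a hypothetical file name) returns the remaining problems as plain strings, so the result can be printed or logged before starting ELAN.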