Compiling proceedings for ingestion to the ACL Anthology

When compiling proceedings in ACL format, one has to set up the output format accordingly in config.toml:

out_format   = "acl"            # Output format

Two other pieces of information are needed for the output to be well-formed for ingestion:

anthology_id = "jeptalnrecital" # Anthology ID 
bilingual    = true             # Bilingual bibtex fields

The anthology_id (also known as the venue identifier) is provided by the Director of the ACL Anthology (see submission procedure) and is used in the file names of the generated PDF and Bib files. When enabled, the bilingual option adds a "language" field to the Bib files, indicating whether the paper is in English or in French.

Note that language detection is done by calling the Python langdetect library on the paper's title.
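To illustrate, title-based detection with langdetect can be sketched as follows. detect and DetectorFactory belong to langdetect's public API; the helper paper_language and its detector parameter are assumptions made for this example and do not reflect taln2x's actual code.

```python
from typing import Callable, Optional


def paper_language(title: str, detector: Optional[Callable[[str], str]] = None) -> str:
    """Return the language code guessed from a paper title (hypothetical helper)."""
    if detector is None:
        # Imported lazily so that a different detector can be supplied instead.
        from langdetect import DetectorFactory, detect

        # langdetect is non-deterministic by default; a fixed seed makes runs reproducible.
        DetectorFactory.seed = 0
        detector = detect
    # langdetect returns ISO 639-1 codes such as "fr" or "en".
    return detector(title)
```

With langdetect installed, paper_language can be called on a title directly. Since a title is a short text, the guess may occasionally be wrong, so it is worth checking the "language" fields of the generated Bib files.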

Overview of the process

When compiling proceedings in ACL format, taln2x builds a directory adhering to the following structure (example taken from the ACL Anthology documentation):

  meta                               Conference information
    semeval-2018.bib                 Bib entries (all papers)
    semeval-2018.pdf                 PDF of whole proceedings
      2018.semeval-1.0.bib           BibTeX entry for volume
      2018.semeval-1.1.bib           BibTeX entry for paper 1
      2018.semeval-1.2.bib           etc.
      2018.semeval-1.0.pdf           PDF of frontmatter
      2018.semeval-1.1.pdf           PDF for paper 1
      2018.semeval-1.2.pdf           etc.

In short, this directory contains PDF and Bib files for both the full proceedings and the individual articles. If the proceedings were originally compiled with taln2x, you can reuse the same project: update config.toml and re-run taln2x. The proceedings in ACL format will also appear under out/.
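The file names in the listing above follow a regular pattern built from the year, the venue identifier, the volume and the paper number. The sketch below reproduces that pattern as inferred from the example; the function names are hypothetical and this is not taln2x's own code.

```python
def paper_file_name(year: int, anthology_id: str, volume: str, paper: int, ext: str) -> str:
    """Build a per-paper name such as 2018.semeval-1.2.pdf.

    Paper number 0 designates the volume entry (bib) or the frontmatter (pdf).
    """
    return f"{year}.{anthology_id}-{volume}.{paper}.{ext}"


def proceedings_file_name(year: int, anthology_id: str, ext: str) -> str:
    """Build the name of a whole-proceedings file such as semeval-2018.pdf."""
    return f"{anthology_id}-{year}.{ext}"


names = [paper_file_name(2018, "semeval", "1", n, "pdf") for n in range(3)]
print(names)
# ['2018.semeval-1.0.pdf', '2018.semeval-1.1.pdf', '2018.semeval-1.2.pdf']
print(proceedings_file_name(2018, "semeval", "bib"))
# semeval-2018.bib
```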

In addition to these PDF and Bib files, a plain-text file named meta is added, of the following form:

abbrev SemEval
volume 1
title 12th International Workshop on Semantic Evaluation
booktitle Proceedings of the 12th International Workshop on Semantic Evaluation
shortbooktitle Proceedings of SemEval
month January
year 2018
sig siglex
chairs Marianna Apidianaki
chairs Mohammad, Saif M.
chairs Jonathan May
chairs Ekaterina Shutova
chairs Steven Bethard
chairs Marine Carpuat
location Berlin, Germany
publisher Association for Computational Linguistics

This file is compiled by slightly reformatting the pieces of information contained in event.yml.
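The meta format is one "key value" pair per line, with a key repeated when it has several values (as with chairs above). A minimal sketch of such a rendering step is given below. The dictionary stands in for the parsed content of event.yml, whose actual key names are not shown here, so both the keys and the render_meta helper are assumptions for illustration.

```python
def render_meta(event: dict) -> str:
    """Render 'key value' lines for a meta file from event metadata (hypothetical keys)."""
    lines = []
    for key, value in event.items():
        # A list (e.g. several chairs) yields one line per item, repeating the key.
        values = value if isinstance(value, list) else [value]
        lines.extend(f"{key} {item}" for item in values)
    return "\n".join(lines) + "\n"


# Stand-in for the parsed content of event.yml (keys chosen to mirror the meta file).
event = {
    "abbrev": "SemEval",
    "volume": 1,
    "year": 2018,
    "chairs": ["Marianna Apidianaki", "Jonathan May"],
    "location": "Berlin, Germany",
}
print(render_meta(event), end="")
```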
