Arguments-enhanced Information Retrieval

A case study in the Decide Madrid database

ArgIR is a tool for the annotation and retrieval of argumentative information from textual content, presented as a case study on the Decide Madrid database.

We present a tool that not only retrieves argumentative information, but also lets users annotate new arguments and/or validate existing ones (in terms of their topical relevance and rhetorical quality). The search runs on Apache Lucene, and the results (proposals and comments) are re-ranked according to their level of controversy or to the number and quality of the arguments they contain.
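
The two-stage ranking can be pictured with a short sketch. The snippet below is illustrative only, not code from this repository: the index path, the field names text and proposal_id, the query string, and the 0.7/0.3 score weighting are all assumptions, and argumentScore is a hypothetical stub standing in for the argument counts and quality labels produced by argrecsys/arg-miner.

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.DirectoryReader;
  import org.apache.lucene.queryparser.classic.QueryParser;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.Query;
  import org.apache.lucene.search.ScoreDoc;
  import org.apache.lucene.search.TopDocs;
  import org.apache.lucene.store.FSDirectory;

  import java.nio.file.Paths;
  import java.util.ArrayList;
  import java.util.List;

  public class ArgRerankSketch {

      public static void main(String[] args) throws Exception {
          // Open a Lucene index of proposals (path and field names are assumptions).
          try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("lucene-index")))) {
              IndexSearcher searcher = new IndexSearcher(reader);
              Query query = new QueryParser("text", new StandardAnalyzer()).parse("carril bici");

              // Stage 1: plain lexical retrieval with Lucene.
              TopDocs top = searcher.search(query, 50);

              // Stage 2: blend the lexical score with an argument-based score.
              List<double[]> scored = new ArrayList<>();   // each entry: [docId, combinedScore]
              for (ScoreDoc hit : top.scoreDocs) {
                  Document doc = searcher.doc(hit.doc);
                  double argScore = argumentScore(doc.get("proposal_id"));
                  scored.add(new double[] { hit.doc, 0.7 * hit.score + 0.3 * argScore });
              }
              scored.sort((a, b) -> Double.compare(b[1], a[1]));
              for (double[] s : scored) {
                  System.out.printf("doc=%d score=%.3f%n", (int) s[0], s[1]);
              }
          }
      }

      // Hypothetical stub: in the real tool this would reflect the number and
      // quality of the arguments mined for the proposal by arg-miner.
      static double argumentScore(String proposalId) {
          return proposalId == null ? 0.0 : 1.0;
      }
  }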

This project takes advantage of the arguments previously extracted from the citizen proposals of the Decide Madrid platform in the argrecsys/arg-miner repository.

Papers

This work (v1.0) was presented as a long paper at CIRCLE 2022 (Joint Conference of the Information Retrieval Communities in Europe), hosted by the Université de Toulouse, France, on 4-7 July 2022. The paper can be found here.

Screenshots

Argument-enhanced Information Retrieval tool: allows the retrieval of argumentative information from textual content.

[Screenshot: arg-ir-gui-main]

Arguments Annotation form: allows manual annotation and validation of arguments.

[Screenshot: arg-ir-gui-annotation]

Annotation and validation

The tool allows you to annotate/edit arguments, as well as validate their relevance and quality. Below is an example of the generated validation file.

| proposal_id | argument_id      | relevance     | quality      | timestamp           | username      |
|-------------|------------------|---------------|--------------|---------------------|---------------|
| 7           | 7-85675-1-1      | VERY_RELEVANT | SUFFICIENT   | 10/3/2022 20:53:00  | andres.segura |
| 1419        | 1419-30381-1-1   | RELEVANT      | SUFFICIENT   | 17/02/2022 23:04    | andres.segura |
| 2576        | 2576-0-1-1       | VERY_RELEVANT | HIGH_QUALITY | 16/02/2022 17:31    | andres.segura |
| 10996       | 10996-0-1-1      | VERY_RELEVANT | HIGH_QUALITY | 24/02/2022 20:12    | andres.segura |
| 26787       | 26787-204339-1-1 | NOT_RELEVANT  | LOW_QUALITY  | 2022-03-09 16:39:43 | andres.segura |
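
A minimal sketch of reading such a file back, assuming a tab-separated layout with the header row shown above; the file name validations.tsv is hypothetical and the real tool may write a different path or format.

  import java.nio.file.Files;
  import java.nio.file.Paths;
  import java.util.List;

  public class ValidationFileSketch {

      // Mirrors the columns of the validation file shown above.
      record Validation(String proposalId, String argumentId, String relevance,
                        String quality, String timestamp, String username) { }

      public static void main(String[] args) throws Exception {
          // File name is an assumption for illustration.
          List<String> lines = Files.readAllLines(Paths.get("validations.tsv"));
          for (String line : lines.subList(1, lines.size())) {   // skip the header row
              String[] f = line.split("\t");
              System.out.println(new Validation(f[0], f[1], f[2], f[3], f[4], f[5]));
          }
      }
  }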

Validation

As a preliminary offline evaluation, we used the developed tool to manually validate 20% of the arguments extracted by the simple syntactic pattern-based method. For the topical relevance metric, 8.6% of the arguments were labeled as spam, 36.9% as not relevant, 39.9% as relevant, and 14.6% as very relevant; for the rhetorical quality metric, 42.3% of the arguments were of low quality, 40.6% of sufficient quality, and 17.1% of high quality. Although these results are modest, they are acceptable as baseline values, given that they were obtained with a heuristic method that requires neither training data nor parameter tuning.

Dependencies

The implemented solutions depend on or make use of the following libraries and .jar files:

Execution and Use

The project has an executable package in the \jar folder, called ArgumentIR.jar. To run the tool from the Command Prompt (CMD), execute the following commands:

  cd "arg-ir-tool\jar\"
  java -jar ArgumentIR.jar

Documentation

Please read the contributing and code of conduct documentation.

Authors

Created on Jan 25, 2022
Created by:

License

This project is licensed under the terms of the Apache License 2.0.

Acknowledgements

This work was supported by the Spanish Ministry of Science and Innovation (PID2019-108965GB-I00).