This code mainly shows how we annotate dialogues from AMI and SAMSum using DialoGPT.
conda create -n tfs python=3.7
conda activate tfs
pip install -r requirements.txt
Firstly, run one of the following commands. You will get a dir loss/samsum/bpe or loss/ami/bpe that stores three files: train_loss.json, valid_loss.json and test_loss.json (a minimal sketch of the loss extraction follows the commands).
- For SAMSum:
python get_loss.py -d samsum
- For AMI:
python get_loss.py -d ami
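The loss extraction itself is done inside get_loss.py. A minimal sketch of the idea, assuming the Hugging Face transformers API and the microsoft/DialoGPT-small checkpoint (the repository may use a different model size and a different data loop), looks roughly like this:

```python
# Sketch only: per-BPE-token losses from DialoGPT (see get_loss.py for the real code).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")  # assumed checkpoint
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
model.eval()

def bpe_losses(text):
    """Return (BPE token, cross-entropy loss) pairs; the first token has no loss."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Token t is predicted from tokens < t, so shift logits and labels by one.
    shift_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shift_labels = ids[:, 1:].reshape(-1)
    losses = torch.nn.functional.cross_entropy(shift_logits, shift_labels, reduction="none")
    tokens = tokenizer.convert_ids_to_tokens(ids[0])
    return list(zip(tokens[1:], losses.tolist()))
```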
Secondly, we recover word-level losses; you will get a dir loss/samsum/word or loss/ami/word.
Note that DialoGPT uses BPE to tokenize texts, so losses are computed at the sub-word level. We recover the word-level loss by averaging the losses of the sub-words that make up each word (see the sketch after the commands below).
- For SAMSum:
python recover_word_loss.py -d samsum
- For AMI:
python recover_word_loss.py -d ami
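The averaging is implemented in recover_word_loss.py. A minimal sketch of the idea, assuming GPT-2 style BPE pieces where a leading 'Ġ' marks the start of a new word (the function name and signature below are made up for illustration):

```python
# Sketch only: merge BPE pieces back into words and average their losses.
def word_level_losses(bpe_tokens, bpe_losses):
    words, losses = [], []
    for piece, loss in zip(bpe_tokens, bpe_losses):
        if piece.startswith("Ġ") or not words:
            # A leading 'Ġ' (GPT-2/DialoGPT convention) starts a new word.
            words.append(piece.lstrip("Ġ"))
            losses.append([loss])
        else:
            # Continuation piece: attach it to the current word.
            words[-1] += piece
            losses[-1].append(loss)
    return [(word, sum(ls) / len(ls)) for word, ls in zip(words, losses)]
```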
Thirdly, run one of the following commands. You will get a dir rep/samsum or rep/ami that stores three files: train_rep.json, valid_rep.json and test_rep.json (a sketch of the representation extraction follows the commands).
- For SAMSum:
python get_representation_samsum.py
- For AMI:
python get_representation_ami.py
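The two scripts differ only in how they read the datasets. Conceptually, an utterance representation can be obtained by pooling DialoGPT hidden states, roughly as below; mean pooling over the last hidden layer and the checkpoint name are assumptions, not necessarily what the scripts do:

```python
# Sketch only: one possible way to build utterance representations with DialoGPT.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")  # assumed checkpoint
model = AutoModel.from_pretrained("microsoft/DialoGPT-small")
model.eval()

def utterance_representation(utterance):
    """Mean-pool the last hidden layer over all tokens of the utterance."""
    ids = tokenizer(utterance, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids).last_hidden_state   # (1, seq_len, hidden_size)
    return hidden.mean(dim=1).squeeze(0)        # (hidden_size,)
```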
Fourthly, run one of the following commands. You will get a dir rep/samsum/sim or rep/ami/sim that stores three files: train_sim.json, valid_sim.json and test_sim.json (a sketch of the similarity computation follows the commands).
- For SAMSum:
python cosine_sim.py -d samsum
- For AMI:
python cosine_sim.py -d ami
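The similarity computation lives in cosine_sim.py; which pairs of representations it compares is dataset-specific, but the core operation is ordinary cosine similarity, e.g.:

```python
# Sketch only: pairwise cosine similarity between utterance representations.
import torch
import torch.nn.functional as F

def pairwise_cosine_sim(reps):
    """reps: (num_utterances, hidden_size) tensor -> (n, n) similarity matrix."""
    normed = F.normalize(reps, dim=-1)
    return normed @ normed.T
```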
Finally, run one of the following commands. You will get a dir data/samsum/final or data/ami/final that stores the final output files.
- For SAMSum:
python annotate.py -d samsum
- For AMI:
python annotate.py -d ami