C.9 Mimicry: Function word, Content word, BERT, Moving

agshruti12 edited this page Dec 6, 2023 · 1 revision

1. Feature Name

C.9 Mimicry: Function word, Content word, BERT, Moving

2. Literature Source (Serial Number, link)

3. Description of how the feature is computed (In Layman’s terms)

Function word mimicry:

Step 1: Extract function words from a message. Dictionary-based.
Step 2: Find the function words that also occur in the previous turn. These are the accommodated function words.
Step 3: Count the total number of accommodated function words.
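
The three steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `FUNCTION_WORDS` set is a tiny hypothetical stand-in for the real dictionary, the regex tokenizer is an assumption, and the count here is over tokens in the current message (repeats counted), which the description above does not specify.

```python
import re

# Hypothetical, deliberately tiny function-word dictionary (illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "but",
                  "i", "you", "we", "it", "is", "was", "that"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def function_words(message):
    """Step 1: keep only the tokens found in the function-word dictionary."""
    return [w for w in tokenize(message) if w in FUNCTION_WORDS]


def function_word_mimicry(current, previous):
    """Steps 2-3: count function words in `current` that also occur in `previous`."""
    previous_set = set(function_words(previous))
    accommodated = [w for w in function_words(current) if w in previous_set]
    return len(accommodated)


score = function_word_mimicry("I think the plan is good", "The plan is to leave")
```

In this example the current message contains the function words "i", "the", and "is", of which "the" and "is" also appear in the previous turn.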

Content word mimicry:

Step 1: Extract content words from a message. A content word is any word that is not a function word.
Step 2: Find the content words that also occur in the previous turn. These are the accommodated content words.
Step 3: Calculate the frequency of each content word across the whole dataset.
Step 4: Weight each accommodated content word by the inverse of its dataset frequency (1 / frequency), then sum the weights.
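
A minimal sketch of the content word steps, under stated assumptions: `FUNCTION_WORDS` is a small hypothetical dictionary, the tokenizer is a simple regex, "frequency" is taken to be the raw count of a word across all messages, and words with no recorded frequency are skipped. The actual implementation may differ on each of these points.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word dictionary (illustration only).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "but", "i",
                  "you", "we", "it", "is", "was", "that", "should"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def content_words(message):
    """Step 1: a content word is any token that is not a function word."""
    return [w for w in tokenize(message) if w not in FUNCTION_WORDS]


def content_word_frequencies(messages):
    """Step 3: count each content word across the whole dataset."""
    return Counter(w for m in messages for w in content_words(m))


def content_word_mimicry(current, previous, freq):
    """Steps 2 and 4: sum 1 / frequency over the accommodated content words."""
    previous_set = set(content_words(previous))
    return sum(1.0 / freq[w]
               for w in content_words(current)
               if w in previous_set and freq[w] > 0)


messages = ["We should ship the budget report",
            "The budget report is late",
            "Budget talk"]
freq = content_word_frequencies(messages)
score = content_word_mimicry(messages[1], messages[0], freq)
```

Here "budget" (frequency 3) and "report" (frequency 2) are the accommodated content words, so the score is 1/3 + 1/2. Rare shared words contribute more than common ones.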

BERT mimicry:

Step 1: Compute a BERT sentence vector for each chat.
Step 2: Compute the cosine similarity between the current chat's vector and the previous chat's vector.
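
The similarity step can be sketched as below. The sentence vectors are assumed to have been produced already by a BERT encoder; the short lists used here are made-up placeholders, not real embeddings. Assigning 0.0 to the first chat (which has no previous chat) is also an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def bert_mimicry(vectors):
    """Score each chat against the one before it, in conversation order."""
    if not vectors:
        return []
    scores = [0.0]  # assumption: the first chat has nothing to mimic
    for previous, current in zip(vectors, vectors[1:]):
        scores.append(cosine_similarity(current, previous))
    return scores


# Placeholder 2-d vectors standing in for BERT sentence embeddings.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = bert_mimicry(vectors)
```

Identical vectors give a similarity of 1 and orthogonal vectors give 0, so a higher score means the current chat is semantically closer to the previous one.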

Moving mimicry:

Step 1: Compute the BERT mimicry score for the current chat.
Step 2: Compute the average of all BERT mimicry scores computed so far, including the score from Step 1.
Step 3: Store the Step 1 score for future calculations.
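
Moving mimicry amounts to a running mean over the BERT mimicry scores. The sketch below assumes the scores arrive as a list in conversation order and that every score, including the one for the first chat, enters the average. The function name `moving_mimicry` is hypothetical.

```python
def moving_mimicry(bert_scores):
    """Return the running average of the BERT mimicry scores seen so far."""
    moving = []
    total = 0.0
    for count, score in enumerate(bert_scores, start=1):
        total += score                 # Step 3: keep the score for later averages
        moving.append(total / count)   # Step 2: mean of all scores up to this chat
    return moving


averages = moving_mimicry([0.0, 1.0, 0.5])
```

Keeping a running total avoids re-summing the full history at every chat, so the computation is linear in the number of chats.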

4. Algorithms used (KNN, Logistic Regression etc.)

None

5. ML Inputs/Features

None

6. Statistical concepts used

None

7. Pages of the literature to be referred to for details

Page 11

8. Any tweaks/changes/adaptions made from the original source

None