Named Entity Recognition in aspect-opinion extraction using dependency rule matching

2024/10/14 14:20:15

Using spaCy, I extract aspect-opinion pairs from a text, based on grammar rules that I defined. The rules are based on POS tags and dependency tags, which are obtained from token.pos_ and token.dep_. Below is an example of one of the grammar rules. If I pass the sentence "Japan is cool", it returns [('Japan', 'cool', 0.3182)], where the value represents the polarity of "cool".

However, I don't know how to make it recognise named entities. For example, if I pass "Air France is cool", I want to get [('Air France', 'cool', 0.3182)], but what I currently get is [('France', 'cool', 0.3182)].

I checked the spaCy online documentation and I know how to extract named entities (doc.ents), but I want to know what a possible workaround is to make my extractor work. Please note that I don't want a forced measure such as concatenating strings (AirFrance, Air_France, etc.).

Thank you!

import spacy
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nlp = spacy.load("en_core_web_lg")  # model version 2.2.5 in my setup
sid = SentimentIntensityAnalyzer()  # needs nltk.download('vader_lexicon') once

review_body = "Air France is cool."
doc = nlp(review_body)

rule3_pairs = []

for token in doc:
    children = token.children
    A = "999999"  # placeholder: no aspect found yet
    M = "999999"  # placeholder: no opinion word found yet
    add_neg_pfx = False

    for child in children:
        if(child.dep_ == "nsubj" and not child.is_stop): # nsubj is nominal subject
            A = child.text

        if(child.dep_ == "acomp" and not child.is_stop): # acomp is adjectival complement
            M = child.text

        # example - 'this could have been better' -> (this, not better)
        if(child.dep_ == "aux" and child.tag_ == "MD"): # MD is modal auxiliary
            neg_prefix = "not"
            add_neg_pfx = True

        if(child.dep_ == "neg"): # neg is negation
            neg_prefix = child.text
            add_neg_pfx = True

    if (add_neg_pfx and M != "999999"):
        M = neg_prefix + " " + M

    if(A != "999999" and M != "999999"):
        rule3_pairs.append((A, M, sid.polarity_scores(M)['compound']))

Result

rule3_pairs
>>> [('France', 'cool', 0.3182)]

Desired output

rule3_pairs
>>> [('Air France', 'cool', 0.3182)]
Answer

It's very easy to integrate entities into your extractor. For every child with the nsubj dependency (the "A" child), check whether it is the root (the syntactic head) of some named entity; if it is, use the whole entity text as your aspect instead of the single token.
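To see why the root lookup works, here is a minimal library-free sketch. The Token dataclass, the hand-annotated parse of "Air France is cool." and the helper names span_root and entity_aware_subjects are assumptions made up for illustration; they only mimic what spaCy's ent.root and doc.ents would provide and are not spaCy output.

```python
from dataclasses import dataclass


@dataclass
class Token:
    i: int      # position of the token in the sentence
    text: str
    dep: str    # dependency label
    head: int   # index of the syntactic head (the sentence root points to itself)


# Hand-annotated parse of "Air France is cool." (assumed, not produced by spaCy)
tokens = [
    Token(0, "Air", "compound", 1),
    Token(1, "France", "nsubj", 2),
    Token(2, "is", "ROOT", 2),
    Token(3, "cool", "acomp", 2),
    Token(4, ".", "punct", 2),
]

# Entity spans as (start, end) token offsets, end exclusive
entities = [(0, 2)]


def span_root(tokens, start, end):
    """Return the token in tokens[start:end] whose head lies outside the span."""
    for tok in tokens[start:end]:
        if tok.head == tok.i or not (start <= tok.head < end):
            return tok
    return tokens[start]  # fallback for a malformed parse


def entity_aware_subjects(tokens, entities):
    # Map the index of each entity's root token to the full entity text.
    ner_heads = {
        span_root(tokens, s, e).i: " ".join(t.text for t in tokens[s:e])
        for s, e in entities
    }
    # For each nominal subject, prefer the whole entity over the single token.
    return [ner_heads.get(t.i, t.text) for t in tokens if t.dep == "nsubj"]


print(entity_aware_subjects(tokens, entities))  # ['Air France']
print(entity_aware_subjects(tokens, []))        # ['France']
```

The dependency rule only ever sees "France" because that token carries the nsubj label, while "Air" hangs off it as a compound. Since "France" is also the root of the entity span, keying the lookup on the root is enough to recover the full name.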

Here is the complete code:

!python -m spacy download en_core_web_lg

import nltk
nltk.download('vader_lexicon')

import spacy
nlp = spacy.load("en_core_web_lg")

from nltk.sentiment.vader import SentimentIntensityAnalyzer
sid = SentimentIntensityAnalyzer()


def find_sentiment(doc):
    # find roots of all entities in the text
    ner_heads = {ent.root.idx: ent for ent in doc.ents}
    rule3_pairs = []
    for token in doc:
        children = token.children
        A = "999999"
        M = "999999"
        add_neg_pfx = False
        for child in children:
            if(child.dep_ == "nsubj" and not child.is_stop): # nsubj is nominal subject
                if child.idx in ner_heads:
                    A = ner_heads[child.idx].text
                else:
                    A = child.text
            if(child.dep_ == "acomp" and not child.is_stop): # acomp is adjectival complement
                M = child.text
            # example - 'this could have been better' -> (this, not better)
            if(child.dep_ == "aux" and child.tag_ == "MD"): # MD is modal auxiliary
                neg_prefix = "not"
                add_neg_pfx = True
            if(child.dep_ == "neg"): # neg is negation
                neg_prefix = child.text
                add_neg_pfx = True
        if (add_neg_pfx and M != "999999"):
            M = neg_prefix + " " + M
        if(A != "999999" and M != "999999"):
            rule3_pairs.append((A, M, sid.polarity_scores(M)['compound']))
    return rule3_pairs


print(find_sentiment(nlp("Air France is cool.")))
print(find_sentiment(nlp("I think Gabriel García Márquez is not boring.")))
print(find_sentiment(nlp("They say Central African Republic is really great. ")))

The output of this code will be what you need:

[('Air France', 'cool', 0.3182)]
[('Gabriel García Márquez', 'not boring', 0.2411)]
[('Central African Republic', 'great', 0.6249)]

Enjoy!
