NLP strategies to handle disinformation

The growing prevalence of misinformation and disinformation poses a serious threat to democratic society. Because it is often spread by political lobby groups or corporations to manipulate public discourse, individuals can find it difficult to determine whether the information they encounter is true or false. Disinformation also occurs in scientific contexts, where it can directly affect the work of researchers. Explore different methods for analyzing a potentially misleading text passage.

Disinformation is defined by two key factors: the intention to mislead and the inaccuracy of the information. The intention is often driven by the pursuit of profit or influence, with the goal of discrediting competitors. With the Four Shades of Life Sciences Intention Classifier, we explore whether language models can learn to recognize these intentions in different text genres.

Four Shades of Life Sciences Intention Classifier

The central hypothesis of Project AQUAS is that the motives behind disinformation—such as gaining attention, financial profit, or political influence—shape both the syntax and semantics of the texts used to spread it.

By applying machine learning techniques, a language model can be trained to recognize the distinctive features of such texts, including specific terminology and stylistic patterns. To enable this, we have compiled a dataset of life science documents categorized into four groups, referred to as the "four shades of life sciences":

  1. Scientific text style
  2. Vernacular text style
  3. Disinformative text style
  4. Alternative scientific text style

Based on this dataset, several models have been fine-tuned to classify similar language styles.

You can choose between traditional bag-of-words methods (Random Forest, Support Vector Machine, and Logistic Regression) and transformer-based language models, including BERT, BioBERT, and SPECTER.
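
As a rough illustration of the bag-of-words route, the sketch below counts word occurrences and fits a small multinomial logistic regression by stochastic gradient descent, using only the Python standard library. The four training sentences, the label names, and all function names are invented for this example and are not taken from the AQUAS dataset or code. The bias term is left out for brevity, and a real setup would more likely rely on an established machine learning library and far more data.

```python
import math
import re
from collections import Counter

# Hypothetical label names mirroring the four text styles listed above.
LABELS = ["scientific", "vernacular", "disinformative", "alternative_scientific"]

# Invented toy examples, one per class; NOT taken from the AQUAS dataset.
TRAIN = [
    ("randomized controlled trial measured significant reduction in blood pressure", 0),
    ("my doctor told me to drink more water and sleep well", 1),
    ("shocking secret cure they hide from you buy now", 2),
    ("homeopathic remedy restores energy balance of the body", 3),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(texts):
    """Map every distinct training token to a feature index."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocab):
    """Bag of words as a sparse {feature index: count} dict; unknown words are dropped."""
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: n for tok, n in counts.items()}


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, features):
    """Class probabilities for one sparse feature vector."""
    scores = [sum(row[j] * n for j, n in features.items()) for row in weights]
    return softmax(scores)


def train(samples, vocab, n_classes, epochs=100, lr=0.1):
    """Fit softmax regression (no bias term) with plain stochastic gradient descent."""
    weights = [[0.0] * len(vocab) for _ in range(n_classes)]
    data = [(vectorize(text, vocab), label) for text, label in samples]
    for _ in range(epochs):
        for features, label in data:
            probs = predict_proba(weights, features)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss with respect to the class score.
                error = probs[k] - (1.0 if k == label else 0.0)
                for j, n in features.items():
                    weights[k][j] -= lr * error * n
    return weights


def classify(text, weights, vocab):
    """Return the most probable label and the full probability list."""
    probs = predict_proba(weights, vectorize(text, vocab))
    best = max(range(len(LABELS)), key=lambda k: probs[k])
    return LABELS[best], probs


VOCAB = build_vocab(text for text, _ in TRAIN)
WEIGHTS = train(TRAIN, VOCAB, len(LABELS))

if __name__ == "__main__":
    label, probs = classify("A randomized trial on blood pressure", WEIGHTS, VOCAB)
    print(label, [round(p, 3) for p in probs])
```

Because word order is discarded, such a model can only pick up on vocabulary. This is one reason to compare it with transformer-based models, which also take context into account.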





Wikifier

Currently, automatically validating facts in full texts remains a challenging task.

However, Named-entity Recognition (NER) can be used to identify key terms, which can then be linked to external knowledge bases.

By using the entity linking service Wikifier, we can retrieve corresponding records from Wikidata and Wikipedia to help verify the information presented.

Try annotating the questionable text using Wikifier to retrieve relevant knowledge base entries.

Wikifier was developed by Janez Brank as a tool for linking entities mentioned in a text to Wikipedia and Wikidata.

Please note that the annotated text is truncated after 1,000 characters.
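
For readers who prefer to call the service from their own scripts, here is a minimal sketch in Python using only the standard library. The endpoint URL, the parameter names (`userKey`, `text`, `lang`), and the response fields (`annotations`, `title`, `url`, `wikiDataItemId`) are assumptions based on the public Wikifier documentation and should be verified there. The helper names, the `YOUR_USER_KEY` placeholder, and the sample response are hypothetical. The 1,000-character limit mirrors the truncation mentioned above.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; check the Wikifier documentation before relying on it.
WIKIFIER_URL = "https://www.wikifier.org/annotate-article"
MAX_CHARS = 1000  # the annotated text is truncated after 1,000 characters


def build_request_data(text, user_key, lang="en"):
    """URL-encode the form fields for a POST request (parameter names are assumed)."""
    fields = {"userKey": user_key, "text": text[:MAX_CHARS], "lang": lang}
    return urllib.parse.urlencode(fields).encode("utf-8")


def extract_entities(response):
    """Collect title, Wikipedia URL, and Wikidata ID from a parsed JSON response."""
    entities = []
    for annotation in response.get("annotations", []):
        entities.append({
            "title": annotation.get("title"),
            "wikipedia_url": annotation.get("url"),
            "wikidata_id": annotation.get("wikiDataItemId"),
        })
    return entities


def wikify(text, user_key, lang="en", timeout=30):
    """Send the text to Wikifier and return the linked entities (needs network access)."""
    request = urllib.request.Request(
        WIKIFIER_URL, data=build_request_data(text, user_key, lang)
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_entities(json.loads(reply.read().decode("utf-8")))


# Hypothetical response fragment with a placeholder Wikidata ID,
# used here only to show the parsing step offline.
SAMPLE_RESPONSE = {
    "annotations": [
        {
            "title": "Vaccine",
            "url": "https://en.wikipedia.org/wiki/Vaccine",
            "wikiDataItemId": "Q000000",
        },
    ]
}

if __name__ == "__main__":
    print(extract_entities(SAMPLE_RESPONSE))
    # With a registered key: wikify("Vaccines train the immune system.", "YOUR_USER_KEY")
```

Keeping request building and response parsing in separate functions makes it possible to check both steps without network access. The retrieved Wikipedia and Wikidata entries still have to be compared with the claims in the text by a human reader.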

