
Natural Language Processing

All reports

Name | Organization | Description | File | Tags | Year | Type | Link | Status | Self reference

Humboldt-Universität zu Berlin

A report on how deep learning transformer models can classify grooming attempts. The authors created a dataset that was later used by Viktor Bowallius and David Eklund in the report Grooming detection of chat segments using transformer models, where an F1 score of 0.98 was achieved. A minimal sketch of this style of transformer fine-tuning follows this entry.

Natural Language Processing, Clustering/Classification
2021
Research (peer reviewed)
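The listing above does not spell out the report's pipeline, so the following is only a minimal sketch of the general approach it describes: fine-tuning a pretrained transformer on labelled chat segments and scoring with F1. The bert-base-uncased backbone, the CSV file names, and the text/label column names are assumptions for illustration, not details taken from the report.

```python
# Minimal sketch (assumptions: "bert-base-uncased" backbone, train.csv/test.csv
# files with "text" and "label" columns, label 1 = grooming). The report's
# actual model, data, and hyperparameters may differ.
from datasets import load_dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-uncased"  # assumed backbone, not necessarily the one used

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Hypothetical dataset of chat segments with binary grooming labels.
data = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    return {"f1": f1_score(labels, preds)}  # the metric cited in the listing

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="grooming-clf", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,   # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # includes the F1 score on the held-out split
```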

This paper presents an automated tool for analyzing child sexual abuse reports. By automating the analysis of abuse complaints, the tool significantly reduces analysts' exposure to harmful content, categorizing each report along three dimensions: Subject, Degree of Criminality, and Damage. A generic multi-output classification sketch follows this entry.

Criminal investigation, Natural Language Processing
2023
Research (peer reviewed)
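The listing does not say which model the authors' tool uses, so the following is a generic sketch of categorizing free-text reports along three dimensions, with one predicted output per dimension. The TF-IDF + logistic-regression pipeline and the placeholder texts and labels are assumptions for illustration, not the paper's actual method or data.

```python
# Generic multi-output text classification: one prediction per dimension
# (Subject, Degree of Criminality, Damage). All data below is placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

reports = [
    "placeholder report text A",
    "placeholder report text B",
    "placeholder report text C",
]
# One integer label per dimension: [subject, degree_of_criminality, damage].
labels = [
    [0, 2, 1],
    [1, 0, 0],
    [0, 1, 2],
]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(reports, labels)

# Each new report gets one prediction per dimension, so complaints can be
# triaged without anyone reading the raw content first.
print(clf.predict(["placeholder report text D"]))
```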

Cornell University

This paper proposes an approach to detecting online sexual predatory chats and abusive language using the open-source pretrained Llama 2 7B-parameter model, recently released by Meta GenAI. The LLM is fine-tuned on datasets with different sizes, imbalance degrees, and languages (English, Roman Urdu, and Urdu). Because it builds on the power of LLMs, the approach is generic and automated: unlike conventional methods in this domain, it requires no manual search for a synergy between feature extraction and classifier design. Experimental results show strong performance, with the approach performing proficiently and consistently across three distinct datasets in five sets of experiments. The outcomes indicate that the proposed method can be implemented in real-world applications (even for non-English languages) to flag sexual predators, offensive or toxic content, hate speech, and discriminatory language in online discussions and comments, helping maintain respectful internet and digital communities. A minimal fine-tuning sketch follows this entry.

Machine learning, Natural Language Processing
2023
Research (peer reviewed)
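The abstract above names the base model (Llama 2 7B) but not the exact fine-tuning recipe, so the following sketch uses one common setup: LoRA adapters on a sequence-classification head via the Hugging Face transformers and PEFT libraries. The dataset files, hyperparameters, and the adapter approach itself are assumptions for illustration, and access to meta-llama/Llama-2-7b-hf requires accepting Meta's license on the Hub.

```python
# Minimal sketch: fine-tuning Llama 2 7B as a binary classifier for
# predatory/abusive chat. LoRA, the CSV files, and all hyperparameters are
# assumptions for illustration, not the paper's documented recipe.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

BASE = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=2, torch_dtype=torch.bfloat16)
model.config.pad_token_id = tokenizer.pad_token_id

# LoRA keeps the 7B base frozen and trains small adapter matrices instead.
model = get_peft_model(model, LoraConfig(
    task_type="SEQ_CLS", r=16, lora_alpha=32, lora_dropout=0.05))

# Hypothetical chat dataset with "text" and "label" (1 = predatory/abusive).
data = load_dataset("csv", data_files={"train": "chats_train.csv",
                                       "test": "chats_test.csv"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-predator-clf",
                           per_device_train_batch_size=4,
                           gradient_accumulation_steps=8,
                           learning_rate=2e-4,
                           num_train_epochs=1),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```

LoRA is used here only because it makes 7B-scale fine-tuning feasible on a single GPU; full fine-tuning or another parameter-efficient method would fit the same training loop.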