englishtea21/spammers-hunter

Telegram Anti-Spam Bot

This project is a Telegram bot that protects groups from spam using a classification model based on DistilBERT (multilingual, cased) from the Hugging Face transformers package. The model is trained to identify spam messages in English, Russian, and other languages, so detection works across a range of chat contexts.
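As a rough illustration of the classification step, the sketch below turns the output of a transformers text-classification pipeline into a single spam probability. It assumes a binary classifier whose spam class is named `LABEL_1`; that label name, the `spam_probability` helper, and the checkpoint path in the comment are hypothetical and not taken from this repository.

```python
from typing import Callable, Dict, List

# Shape of what a transformers text-classification pipeline returns for one
# input string: a list holding one dict such as {"label": "...", "score": 0.97}.
Classifier = Callable[[str], List[Dict[str, object]]]

# Assumption: the real label names depend on the fine-tuned model's config.
SPAM_LABEL = "LABEL_1"


def spam_probability(text: str, classifier: Classifier) -> float:
    """Return the probability that `text` is spam, given a top-1 prediction."""
    top = classifier(text)[0]
    score = float(top["score"])
    # The pipeline reports only the winning label's score, so for a binary
    # model the spam probability is 1 - score when the other class wins.
    return score if top["label"] == SPAM_LABEL else 1.0 - score


# With a real model the classifier could be built like this (path is hypothetical):
#   from transformers import pipeline
#   classifier = pipeline("text-classification", model="path/to/checkpoint")
```

Passing the classifier in as an argument keeps the helper easy to test with a stub and avoids loading the model more than once.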

Try It Out

Users can try out this project here: Try the Project

Features

  • Spam Detection: Identifies and handles spam messages in Telegram groups.
  • Customizable: Adjust spam detection parameters and manage blocked users.
  • Efficient: Leverages a pre-trained DistilBERT model for fast and accurate classification.
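To make the "customizable" point concrete, here is a minimal sketch of how per-chat settings might map a spam probability to an action. The `SpamPolicy` fields, the default values, and the action names are all illustrative assumptions; the bot's actual parameters may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamPolicy:
    """Hypothetical per-chat settings; field names are illustrative only."""
    spam_threshold: float = 0.8   # minimum spam probability to act on
    strikes_before_ban: int = 3   # spam messages tolerated before a ban


def choose_action(spam_prob: float, prior_strikes: int, policy: SpamPolicy) -> str:
    """Map a spam probability and a user's history to 'allow', 'mute' or 'ban'."""
    if spam_prob < policy.spam_threshold:
        return "allow"
    # The current message counts as one more strike against the user.
    if prior_strikes + 1 >= policy.strikes_before_ban:
        return "ban"
    return "mute"
```

A stricter chat could use, for example, `SpamPolicy(spam_threshold=0.6, strikes_before_ban=1)` to ban on the first message that looks like spam.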

Datasets Used

The model was trained on the following datasets:

These datasets contain a variety of spam messages, including SMS spam, and feature numerous emojis and special characters, aligning with the context of Telegram chats.

Model Training

  • Sequence Length: Training accounted for the hard limits on sequence length, in particular Telegram's maximum message length (4096 characters).
  • Special Characters: The training data includes a large number of emojis and special characters to ensure robustness in the Telegram environment.
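The sequence-length constraint can be sketched as follows. This is not the project's preprocessing code: the `encode_message` helper is hypothetical, the 512-token cap is DistilBERT's standard maximum input length, and the character clip only approximates Telegram's own counting.

```python
from typing import Any, Callable

TELEGRAM_MAX_CHARS = 4096   # Telegram's limit for a single text message
MODEL_MAX_TOKENS = 512      # DistilBERT's maximum input length in tokens


def encode_message(text: str, tokenizer: Callable[..., Any],
                   max_tokens: int = MODEL_MAX_TOKENS) -> Any:
    """Clip a message to Telegram's limit, then tokenize with truncation."""
    # Python slices by code point, so this is an approximation of how
    # Telegram measures message length.
    clipped = text[:TELEGRAM_MAX_CHARS]
    # `truncation` and `max_length` are standard Hugging Face tokenizer
    # arguments; tokens beyond `max_tokens` are dropped.
    return tokenizer(clipped, truncation=True, max_length=max_tokens)


# A real tokenizer could be loaded like this (model name is an assumption
# based on the "multilingual cased" description):
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
```

Because a full-length Telegram message can exceed 512 tokens, truncation means the classifier sees only the start of very long messages.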

Database Model

The project includes a database model with the following objects:

  • Anti-Spam Chats: Represents chats where spam protection is active.
  • Anti-Spam Chat Admins: Information about users managing spam protection settings in chats.
  • Banned Users: List of users banned for spam.
  • Muted Users: List of users temporarily muted from sending messages.

The project was initially tested with PostgreSQL, with the schema built through SQLAlchemy.

Libraries Used

  • Telegram Bot Framework:

    • aiogram==3.12.0
    • aioschedule==0.5.2
    • aiosignal==1.3.1
    • async-timeout==4.0.3
  • Machine Learning:

    • transformers==4.44.0
    • torch==2.4.0
  • Configuration and Database:

    • pyyaml==6.0.2
    • python-dotenv==1.0.1
    • sqlalchemy==2.0.32
    • aiosqlite==0.20.0
    • asyncpg==0.29.0
    • dogpile.cache==1.3.3
  • Timezone Handling:

    • pytz==2024.1

License

This project is licensed under the terms specified in the LICENSE.MD file.

