GitHub - QData/TextAttack: TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

TL;DR

TextAttack is a Python framework designed for adversarial attacks, data augmentation, and model training in Natural Language Processing (NLP). It allows users to understand NLP models, research and develop adversarial attacks, augment datasets, and train models. The framework supports various functionalities through its command-line interface and Python library, including running attacks, augmenting text, training models, and benchmarking different methods.

  • TextAttack is a Python framework for adversarial attacks, data augmentation, and model training in NLP.
  • It provides tools to understand NLP models, develop adversarial attacks, augment datasets, and train models.
  • The framework supports command-line interface and Python library for various functionalities.

[Introduction to TextAttack]

TextAttack is a Python framework that facilitates adversarial attacks, data augmentation, and model training in NLP. It offers tools to understand NLP models by running adversarial attacks, develop new attacks, augment datasets to improve model generalization, and train NLP models. The framework also provides access to the TextAttack Model Zoo, which contains pre-trained models. For assistance and updates, users can join the TextAttack Slack channel.

[Setup and Installation]

To install TextAttack, Python 3.6+ is required, and a CUDA-compatible GPU is recommended for faster performance. The package can be installed via pip using the command pip install textattack. TextAttack downloads files, including pretrained models and datasets, to ~/.cache/textattack/ by default, but the cache path can be changed using the TA_CACHE_DIR environment variable.
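The cache-path behavior can be pictured with a short Python sketch. The `TA_CACHE_DIR` variable and the `~/.cache/textattack/` default come from the source above; the `resolve_cache_dir` helper itself is invented here for illustration and is not TextAttack's actual implementation.

```python
import os

def resolve_cache_dir():
    """Illustrative sketch: honor TA_CACHE_DIR if set, otherwise fall back
    to the default ~/.cache/textattack/ location described above."""
    return os.environ.get(
        "TA_CACHE_DIR",
        os.path.join(os.path.expanduser("~"), ".cache", "textattack"),
    )

# With TA_CACHE_DIR unset, files land under the home directory default.
default_dir = resolve_cache_dir()
```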

[Usage and Main Features]

TextAttack's main features are accessible through the textattack command. Common commands include textattack attack <args> and textattack augment <args>. Detailed information about each command can be accessed using textattack --help or textattack attack --help. The examples/ folder contains scripts for training models, running attacks, and augmenting CSV files. The documentation website provides walkthroughs for basic usage, including building custom transformations and constraints.

[Running Attacks]

The easiest way to run an attack is through the command-line interface using textattack attack. The --parallel option can distribute the attack across multiple GPUs for improved performance. Examples include running TextFooler on BERT trained on the MR sentiment classification dataset and DeepWordBug on DistilBERT trained on the Quora Question Pairs dataset. Users can also use the --interactive option to attack samples inputted by the user.

[Attacks and Papers Implemented]

TextAttack includes attack recipes that implement attacks from the literature; these can be listed using textattack list attack-recipes. To run an attack recipe, use textattack attack --recipe [recipe_name]. The framework supports various goal functions, constraints, and transformations for different attack types, including attacks on classification and sequence-to-sequence models.

[Recipe Usage Examples]

Examples of testing attacks from the literature include running TextFooler against BERT fine-tuned on SST-2 and seq2sick against T5 fine-tuned for English-German translation. These can be executed via the command line using specific configurations for the model, recipe, and number of examples.

[Augmenting Text]

TextAttack provides tools for data augmentation, utilizing the textattack.Augmenter class with transformations and constraints. Built-in recipes include wordnet, embedding, charswap, eda, checklist, clare, back_trans, and back_transcription. The command-line interface, textattack augment <args>, allows users to augment text from a CSV file, specifying the input column, augmentation recipe, percentage of words to swap, and number of transformations per example.

[Augmentation Command-Line Interface]

The textattack augment command takes an input CSV file, a text column to augment, the percentage of words to change per augmentation, and the number of augmentations per input example. It outputs a CSV file with the augmented examples. For instance, the command textattack augment --input-csv examples.csv --output-csv output.csv --input-column text --recipe embedding --pct-words-to-swap .1 --transformations-per-example 2 --exclude-original augments the text column by altering 10% of each example's words, generates two augmentations per original input, and excludes the original inputs from the output CSV.
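The flag arithmetic can be sketched in plain Python. This is a toy stand-in, not the real tool: `augment_example` is an invented helper, and uppercasing a word stands in for a real transformation such as the embedding-based synonym swap.

```python
import random

def augment_example(text, pct_words_to_swap=0.1, transformations_per_example=2, seed=0):
    """Toy mimic of the augment flags: change ~pct of words in each copy and
    emit N augmented copies per input. (Illustrative only.)"""
    rng = random.Random(seed)
    words = text.split()
    # --pct-words-to-swap .1 means 10% of words, but always at least one.
    n_swaps = max(1, round(pct_words_to_swap * len(words)))
    outputs = []
    for _ in range(transformations_per_example):
        new_words = list(words)
        for i in rng.sample(range(len(words)), n_swaps):
            new_words[i] = new_words[i].upper()  # stand-in for a synonym swap
        outputs.append(" ".join(new_words))
    return outputs

# 9 words -> 10% rounds to 1 swapped word; two augmented copies are produced.
augmented = augment_example("it is a great movie with a strong cast", 0.1, 2)
```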

[Augmentation Python Interface]

In addition to the command-line interface, text augmentation can be performed dynamically in Python by importing the Augmenter class. All Augmenter objects implement augment and augment_many to generate augmentations of a string or a list of strings. Custom augmenters can be created from scratch by importing transformations and constraints from textattack.transformations and textattack.constraints.
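The augment/augment_many interface described above can be mimicked in a few lines. This sketch only mirrors the method shapes; the real Augmenter lives in textattack.augmentation and applies its transformations under constraints, which the toy version omits.

```python
class ToyAugmenter:
    """Minimal mimic of the Augmenter interface: augment() takes one string,
    augment_many() takes a list of strings. (Sketch only.)"""

    def __init__(self, transformation):
        self.transformation = transformation

    def augment(self, text):
        # Return a list of augmented variants of one input string.
        return [self.transformation(text)]

    def augment_many(self, texts):
        # Apply augment() to each string in the list.
        return [self.augment(t) for t in texts]

# A trivial "transformation": swap the first two characters.
swap_first_two = lambda s: s[1] + s[0] + s[2:] if len(s) > 1 else s
aug = ToyAugmenter(swap_first_two)
```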

[Prompt Augmentation]

Prompts can be augmented, and responses can be generated using large language models (LLMs). The augmentation is performed using the same Augmenter as above. Responses can be generated using a user's own LLM, a HuggingFace LLM, or an OpenAI LLM.
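The pattern can be sketched as: generate prompt variants, then map a generation function over them. Everything here is a placeholder: the templates and the `llm()` stub are invented, and in practice the stub would be replaced by a call to your own model, a HuggingFace LLM, or an OpenAI LLM.

```python
def augment_prompt(prompt, templates):
    """Toy prompt augmentation: produce variants of one prompt from a list of
    templates. (Placeholder logic, not the TextAttack API.)"""
    return [t.format(prompt=prompt) for t in templates]

def llm(prompt):
    # Stand-in for a real LLM call.
    return f"response to: {prompt}"

variants = augment_prompt(
    "Summarize the article.",
    ["{prompt}", "Please {prompt}", "{prompt} Keep it brief."],
)
responses = [llm(v) for v in variants]
```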

[Training Models]

TextAttack offers model training code via textattack train to train LSTMs, CNNs, and transformer models. Datasets are automatically loaded using the datasets package. Examples include training the default LSTM for 50 epochs on the Yelp Polarity dataset and fine-tuning bert-base on the CoLA dataset for 5 epochs.

[Exploring Datasets]

To examine datasets, use textattack peek-dataset. This command prints statistics about the inputs and outputs from the dataset. For example, textattack peek-dataset --dataset-from-huggingface snli displays information about the SNLI dataset loaded via the datasets package.
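The kind of summary such a command prints can be approximated in a few lines of Python. The fields below are illustrative, not the command's exact output format.

```python
from collections import Counter
from statistics import mean

def peek(dataset):
    """Rough mimic of a dataset peek: summarize input lengths and the label
    distribution of (text, label) pairs. (Illustrative fields only.)"""
    lengths = [len(text.split()) for text, _ in dataset]
    labels = Counter(label for _, label in dataset)
    return {
        "examples": len(dataset),
        "mean_words": mean(lengths),
        "min_words": min(lengths),
        "max_words": max(lengths),
        "labels": dict(labels),
    }

stats = peek([("a fine film", 1), ("truly awful", 0), ("not bad at all", 1)])
```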

[Listing Functional Components]

The command textattack list can be used to list various components within TextAttack, such as pretrained models or available search methods. This helps users keep track of the different elements available in the framework.

[Design and Model Agnosticism]

TextAttack is model-agnostic, allowing it to analyze any model that outputs IDs, tensors, or strings. It includes pre-trained models for common NLP tasks to facilitate easier use and fair comparison of attacks.

[Built-in Models and Datasets]

TextAttack comes with built-in models and datasets, and the command-line interface automatically matches the correct dataset to the correct model. It includes 82 pre-trained models for the nine GLUE tasks, as well as common datasets for classification, translation, and summarization. A list of available pretrained models and their validation accuracies is available at textattack/models/README.md.

[HuggingFace Support]

TextAttack provides built-in support for transformers pretrained models and datasets from the datasets package. An example of loading and attacking a pre-trained model and dataset is textattack attack --model-from-huggingface distilbert-base-uncased-finetuned-sst-2-english --dataset-from-huggingface glue^sst2 --recipe deepwordbug --num-examples 10.

[Loading Models and Datasets from Files]

Attacks can be performed on local models or dataset samples by loading them from files. To attack a pre-trained model, create a file that loads the model and tokenizer as variables model and tokenizer. The tokenizer must be able to transform string inputs to lists or tensors of IDs using a method called encode, and the model must take inputs via the __call__ method.
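A file like that might look as follows. The module-level model and tokenizer variables, the tokenizer's encode method, and the model's __call__ are the contract described above; the toy internals are purely illustrative and stand in for a real model and tokenizer.

```python
# Sketch of a file that could be passed to --model-from-file.

class ToyTokenizer:
    def __init__(self, vocab):
        self.vocab = vocab

    def encode(self, text):
        # Map a string to a list of integer IDs (0 = unknown token).
        return [self.vocab.get(w, 0) for w in text.lower().split()]

class ToyModel:
    def __call__(self, id_lists):
        # Emit one (negative, positive) score pair per encoded input.
        return [(0.0, float(sum(ids))) for ids in id_lists]

tokenizer = ToyTokenizer({"good": 1, "bad": 2})
model = ToyModel()
scores = model([tokenizer.encode("good bad movie")])
```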

[Custom Models and Datasets]

To experiment with a custom model, create a file (e.g., my_model.py) that loads the model and tokenizer. Then, run an attack with the argument --model-from-file my_model.py. Similarly, a dataset can be loaded from a file (e.g., my_dataset.py) by creating an iterable of pairs and using the argument --dataset-from-file my_dataset.py.
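A my_dataset.py-style file can be as simple as a module-level iterable of pairs. The variable name dataset and the example rows below are assumptions for illustration; the source only specifies that the file provides an iterable of pairs.

```python
# Sketch of a file that could be passed to --dataset-from-file: the module
# exposes an iterable of (input, label) pairs.

dataset = [
    ("an engaging, well-acted drama", 1),
    ("a dull and lifeless remake", 0),
    ("the plot never quite comes together", 0),
]

# Any consumer simply iterates over the pairs:
texts = [text for text, label in dataset]
labels = [label for text, label in dataset]
```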

[Dataset via AttackedText Class]

The AttackedText object maintains both a list of tokens and the original text, with punctuation, to allow for word replacement after a sequence has been tokenized. This object is used in favor of a list of words or raw text.
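The idea, keeping the original string alongside a word list so a word can be replaced by index without losing punctuation, can be sketched minimally. This toy class only mirrors the concept; the real AttackedText offers far more.

```python
import re

class ToyAttackedText:
    """Minimal mimic of the AttackedText idea: hold the original text and a
    parallel word list, so words can be swapped by index while punctuation
    and spacing survive. (Sketch only.)"""

    def __init__(self, text):
        self.text = text
        self.words = re.findall(r"\w+", text)

    def replace_word_at_index(self, i, new_word):
        # Rebuild the full string, swapping only the i-th word and leaving
        # all surrounding punctuation intact.
        count = -1
        def sub(match):
            nonlocal count
            count += 1
            return new_word if count == i else match.group(0)
        return ToyAttackedText(re.sub(r"\w+", sub, self.text))

t = ToyAttackedText("A great, memorable film!")
u = t.replace_word_at_index(1, "terrible")
```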

[Attacks and Designing New Attacks]

An attack consists of a goal function, constraints, a transformation, and a search method. The attack attempts to perturb an input text such that the model output fulfills the goal function and the perturbation adheres to the constraints. This modular design unifies adversarial attack methods and enables easy assembly of attacks from the literature.

[Goal Functions, Constraints, Transformations, and Search Methods]

A GoalFunction takes an AttackedText object, scores it, and determines whether the attack has succeeded. A Constraint takes a current AttackedText and a list of transformed AttackedTexts, returning a boolean indicating whether the constraint is met. A Transformation takes an AttackedText and returns a list of possible transformed AttackedTexts. A SearchMethod takes an initial GoalFunctionResult and returns a final GoalFunctionResult, using the get_transformations function to find possible transformations.
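How the four components fit together can be shown end to end with toy versions of each. Everything here is simplified for illustration: the word-counting "model", the antonym table, and the greedy loop are stand-ins for TextAttack's actual classes, but the wiring (search queries transformations, filters by constraints, checks the goal function) follows the design described above.

```python
# Toy end-to-end wiring of goal function, constraints, transformation, and
# search, against a "classifier" that counts positive words.

POSITIVE = {"great", "good", "fine"}
ANTONYMS = {"great": "awful", "good": "bad", "fine": "poor"}

def model(words):                       # stand-in classifier
    return 1 if sum(w in POSITIVE for w in words) > 0 else 0

def goal_function(words):               # succeed when the label flips to 0
    return model(words) == 0

def transformation(words):              # propose single-word antonym swaps
    out = []
    for i, w in enumerate(words):
        if w in ANTONYMS:
            out.append(words[:i] + [ANTONYMS[w]] + words[i + 1:])
    return out

def constraint(original, candidate):    # allow at most 2 words to differ
    return sum(a != b for a, b in zip(original, candidate)) <= 2

def greedy_search(words):               # minimal search method
    current = words
    for _ in range(len(words)):
        if goal_function(current):
            return current
        candidates = [c for c in transformation(current) if constraint(words, c)]
        if not candidates:
            break
        current = candidates[0]
    return current if goal_function(current) else None

result = greedy_search("a great and good film".split())
```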

[Benchmarking Attacks]

It is recommended not to directly compare Attack Recipes out of the box due to variations in constraint setups. An increase in attack success rate could result from an improved search or transformation method or a less restrictive search space. Benchmarking scripts and results are available on the TextAttack-Search-Benchmark Github.

[Quality of Generated Adversarial Examples]

It is crucial to be mindful of the quality of generated adversarial examples in natural language. Human surveys indicate that preserving semantics requires significantly increasing the minimum cosine similarities between the embeddings of swapped words and between the sentence encodings of original and perturbed sentences. Reevaluation results are available on the Reevaluating-NLP-Adversarial-Examples Github.
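The metric those thresholds apply to is plain cosine similarity between embedding vectors. The three-dimensional "embeddings" below are made up for illustration; real word and sentence encoders produce much higher-dimensional vectors, but the threshold logic is the same.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical word embeddings: a stricter minimum similarity rejects loosely
# related swaps and keeps only near-synonyms.
good, great, bad = [1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [-0.8, -0.7, 0.3]
sim = cosine_similarity(good, great)
```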

[Multi-lingual Support]

TextAttack supports multiple languages. Example code for attacking French-BERT can be found at https://github.com/QData/TextAttack/blob/master/examples/attack/attack_camembert.py, and a tutorial notebook is available at https://textattack.readthedocs.io/en/latest/2notebook/Example_4_CamemBERT.html.

[Contributing and Citing TextAttack]

Suggestions and contributions are welcome. TextAttack is currently in an "alpha" stage, and contributions are encouraged. If TextAttack is used for research, it should be cited using the provided BibTeX entry.

Date: 1/17/2026 Source: github.com