AMULETY CLI Tutorial

Introduction

This tutorial demonstrates how to use the AMULETY command line interface (CLI) to translate and embed both BCR (B-cell receptor) and TCR (T-cell receptor) sequences.

AMULETY supports a wide range of embedding models for different immune receptor types. For the full list of supported models, see the Usage documentation page.

Installation

Before getting started, install AMULETY through conda or pip. The conda installation also installs the IgBLAST dependency; with pip, IgBLAST must be installed separately.

Install AMULETY through conda:

[ ]:
conda install -c conda-forge -c bioconda amulety --strict-channel-priority

To verify the installation and print the help message, run:

[1]:
! amulety --help

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

                                                                                
 Usage: amulety [OPTIONS] COMMAND [ARGS]...                                     
                                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.      │
│ --show-completion             Show completion for the current shell, to copy │
│                               it or customize the installation.              │
│ --help                        Show this message and exit.                    │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ translate-igblast   Translates nucleotide sequences to amino acid sequences  │
│                     using IgBlast.                                           │
│ embed               Embeds sequences from an AIRR rearrangement file using   │
│                     the specified model. It returns the                      │
│ check-deps          Check if optional embedding dependencies and tools are   │
│                     installed.                                               │
╰──────────────────────────────────────────────────────────────────────────────╯
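
Because a pip installation does not bring in IgBLAST, it can help to confirm that both executables are on the PATH before continuing. The sketch below is one possible check, not part of AMULETY: the helper name find_tools is hypothetical, and it assumes the IgBLAST nucleotide executable is named igblastn, which is its usual name but may differ on your system.

```python
import shutil


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "igblastn" is an assumption (the usual IgBLAST executable name); adjust if needed.
    for tool, path in find_tools(["amulety", "igblastn"]).items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

The check-deps command listed in the help message above covers the optional embedding dependencies and tools from within AMULETY itself.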

Downloading example data and reference database

The following commands download an example AIRR-format file of BCR sequences and the IgBLAST reference database.

[ ]:
# Create tutorial directory and download example data
! mkdir -p tutorial
! wget -P tutorial https://zenodo.org/records/17186858/files/AIRR_subject1_FNA_d0_1_Y1.tsv

# Download and extract IgBlast reference database
! wget -P tutorial -c https://github.com/nf-core/test-datasets/raw/airrflow/database-cache/igblast_base.zip
! unzip tutorial/igblast_base.zip -d tutorial
! rm tutorial/igblast_base.zip
--2025-09-24 13:00:01--  https://zenodo.org/records/17186858/files/AIRR_subject1_FNA_d0_1_Y1.tsv
Resolving zenodo.org (zenodo.org)... 188.185.45.92, 188.185.43.25, 188.185.48.194, ...
Connecting to zenodo.org (zenodo.org)|188.185.45.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 479753 (469K) [application/octet-stream]
Saving to: 'tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv.2'

AIRR_subject1_FNA_d 100%[===================>] 468.51K   578KB/s    in 0.8s

2025-09-24 13:00:02 (578 KB/s) - 'tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv.2' saved [479753/479753]

--2025-09-24 13:00:03--  https://github.com/nf-core/test-datasets/raw/airrflow/database-cache/igblast_base.zip
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip [following]
--2025-09-24 13:00:03--  https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1204742 (1.1M) [application/zip]
Saving to: 'tutorial/igblast_base.zip'

igblast_base.zip    100%[===================>]   1.15M  --.-KB/s    in 0.1s

2025-09-24 13:00:03 (7.89 MB/s) - 'tutorial/igblast_base.zip' saved [1204742/1204742]

Archive:  tutorial/igblast_base.zip
   creating: tutorial/igblast_base/
   creating: tutorial/igblast_base/internal_data/
   creating: tutorial/igblast_base/internal_data/rhesus_monkey/
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.pin
   creating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Repository
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Entries
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Root
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.pdm.imgt
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.pdm.kabat
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsd
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nin
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nog
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsq
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nhr
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsi
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psq
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nhr
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.phr
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.pog
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nog
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsd
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsi
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nhr
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsd
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsi
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psd
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsq
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsq
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.ndm.kabat
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nin
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nog
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psi
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nin
  inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.ndm.imgt
   creating: tutorial/igblast_base/internal_data/human/
  inflating: tutorial/igblast_base/internal_data/human/human.ndm.imgt
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsq
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsd
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psq
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.pog
  inflating: tutorial/igblast_base/internal_data/human/human_V.pog
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nog
  inflating: tutorial/igblast_base/internal_data/human/human_V.psd
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.phr
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsi
  inflating: tutorial/igblast_base/internal_data/human/human_V.nsi
  inflating: tutorial/igblast_base/internal_data/human/human_V.nsq
  inflating: tutorial/igblast_base/internal_data/human/human_V.pin
  inflating: tutorial/igblast_base/internal_data/human/human.ndm.kabat
  inflating: tutorial/igblast_base/internal_data/human/human_V.nin
  inflating: tutorial/igblast_base/internal_data/human/human_V.nhr
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psi
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psd
  inflating: tutorial/igblast_base/internal_data/human/human_V.nog
  inflating: tutorial/igblast_base/internal_data/human/human_V.psi
  inflating: tutorial/igblast_base/internal_data/human/human_V.phr
  inflating: tutorial/igblast_base/internal_data/human/human_V.psq
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nin
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nhr
  inflating: tutorial/igblast_base/internal_data/human/human.pdm.kabat
  inflating: tutorial/igblast_base/internal_data/human/human.pdm.imgt
  inflating: tutorial/igblast_base/internal_data/human/human_V.nsd
  inflating: tutorial/igblast_base/internal_data/human/human_TR_V.pin
   creating: tutorial/igblast_base/internal_data/rat/
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nhr
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nin
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nog
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.pin
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsd
  inflating: tutorial/igblast_base/internal_data/rat/rat.pdm.kabat
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.psq
  inflating: tutorial/igblast_base/internal_data/rat/rat.ndm.imgt
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.pog
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsq
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.phr
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.psi
  inflating: tutorial/igblast_base/internal_data/rat/rat.ndm.kabat
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.psd
  inflating: tutorial/igblast_base/internal_data/rat/rat.pdm.imgt
  inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsi
   creating: tutorial/igblast_base/internal_data/mouse/
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsi
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nin
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsd
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nog
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nhr
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.pog
  inflating: tutorial/igblast_base/internal_data/mouse/mouse.pdm.kabat
  inflating: tutorial/igblast_base/internal_data/mouse/mouse.ndm.kabat
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.pin
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsq
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.pin
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psq
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.phr
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psq
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsd
  inflating: tutorial/igblast_base/internal_data/mouse/mouse.ndm.imgt
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsq
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psi
  inflating: tutorial/igblast_base/internal_data/mouse/mouse.pdm.imgt
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.phr
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nog
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.pog
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nhr
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psd
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psi
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nin
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psd
  inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsi
  inflating: tutorial/igblast_base/internal_data/readme
   creating: tutorial/igblast_base/internal_data/rabbit/
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psq
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.ndm.imgt
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.pdm.imgt
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.pdm.kabat
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsq
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.phr
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nin
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psi
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsi
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nog
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.ndm.kabat
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nhr
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.pin
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.pog
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsd
  inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psd
   creating: tutorial/igblast_base/fasta/
  inflating: tutorial/igblast_base/fasta/imgt_human_tr_d.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_ig_d.fasta
  inflating: tutorial/igblast_base/fasta/imgt_aa_human_ig_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_aa_human_tr_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_d.fasta
  inflating: tutorial/igblast_base/fasta/imgt_aa_mouse_ig_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_tr_c.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_ig_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_j.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_c.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_tr_j.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_j.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_ig_c.fasta
  inflating: tutorial/igblast_base/fasta/imgt_aa_mouse_tr_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_c.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_d.fasta
  inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_v.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_ig_j.fasta
  inflating: tutorial/igblast_base/fasta/imgt_human_tr_v.fasta
   creating: tutorial/igblast_base/optional_file/
  inflating: tutorial/igblast_base/optional_file/human_gl.aux
  inflating: tutorial/igblast_base/optional_file/human_gl.aux.testonly
  inflating: tutorial/igblast_base/optional_file/rabbit_gl.aux
  inflating: tutorial/igblast_base/optional_file/mouse_gl.aux
  inflating: tutorial/igblast_base/optional_file/readme
  inflating: tutorial/igblast_base/optional_file/rat_gl.aux
  inflating: tutorial/igblast_base/optional_file/rhesus_monkey_gl.aux
   creating: tutorial/igblast_base/database/
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.ntf
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nhr
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.pin
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pjs
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.ntf
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nos
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.ntf
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.not
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nin
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.njs
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nhr
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nos
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nog
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nos
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nin
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nos
  inflating: tutorial/igblast_base/database/mouse_gl_V.pog
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.njs
  inflating: tutorial/igblast_base/database/mouse_gl_V.nsd
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.not
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nin
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nog
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.ptf
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nos
  inflating: tutorial/igblast_base/database/ncbi_human_c_genes.tar
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nto
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pto
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.nos
  inflating: tutorial/igblast_base/database/mouse_gl_V.nsq
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nhr
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.ntf
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.ndb
  inflating: tutorial/igblast_base/database/mouse_gl_D.nsd
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.ndb
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pdb
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nsq
 extracting: tutorial/igblast_base/database/imgt_mouse_tr_d.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nto
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pot
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.ndb
  inflating: tutorial/igblast_base/database/mouse_gl_D.nsi
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nos
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.not
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pog
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.njs
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.njs
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pto
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nto
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nos
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.ndb
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nin
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsd
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nin
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pos
  inflating: tutorial/igblast_base/database/mouse_gl_J.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nhr
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.nin
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.not
  inflating: tutorial/igblast_base/database/mouse_gl_D.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nsq
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pdb
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nog
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.ntf
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.ntf
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.not
  inflating: tutorial/igblast_base/database/mouse_gl_J.nsi
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.not
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.nog
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.psq
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pot
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.njs
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.nto
  inflating: tutorial/igblast_base/database/mouse_gl_V.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nog
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.not
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.ntf
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pin
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nsq
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pos
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.not
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nos
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.ndb
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.ndb
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nos
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nto
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nog
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nto
  inflating: tutorial/igblast_base/database/mouse_gl_J.nsd
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nin
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.ntf
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.phr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nto
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.not
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nos
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nto
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nog
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsi
  inflating: tutorial/igblast_base/database/mouse_gl_V.phr
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nin
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nin
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.psq
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pot
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nin
  inflating: tutorial/igblast_base/database/mouse_gl_V.pin
  inflating: tutorial/igblast_base/database/imgt_human_ig_v.nsq
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nhr
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pog
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.njs
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nsq
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nsq
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.psq
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.ptf
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pin
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.ntf
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nhr
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.phr
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.phr
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.ndb
  inflating: tutorial/igblast_base/database/mouse_gl_V.nsi
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nsq
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.ntf
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nto
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nos
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.not
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pto
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.pog
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nog
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.psq
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nin
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.njs
 extracting: tutorial/igblast_base/database/mouse_gl_J.nsq
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.ntf
  inflating: tutorial/igblast_base/database/mouse_gl_D.nog
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.not
 extracting: tutorial/igblast_base/database/imgt_human_tr_d.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.not
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.nto
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.ntf
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pjs
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nin
   creating: tutorial/igblast_base/database/airr/
  inflating: tutorial/igblast_base/database/airr/airr_c_human.tar
  inflating: tutorial/igblast_base/database/airr/airr_c_mouse.tar
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.not
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nos
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.njs
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsd
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nsq
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.njs
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nog
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pin
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nto
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nto
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nhr
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nos
  inflating: tutorial/igblast_base/database/rhesus_monkey_VJ.tar
  inflating: tutorial/igblast_base/database/mouse_gl_VDJ.tar
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.nto
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.ptf
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nog
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.not
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.phr
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.ntf
  inflating: tutorial/igblast_base/database/imgt_human_tr_v.ntf
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nto
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsi
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nsq
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nos
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pdb
  inflating: tutorial/igblast_base/database/mouse_gl_V.nhr
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.phr
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nhr
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pog
  inflating: tutorial/igblast_base/database/mouse_gl_V.psq
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.psd
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.ptf
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nog
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pdb
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nhr
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.njs
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.ndb
  inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pjs
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pog
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.not
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nin
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pos
  inflating: tutorial/igblast_base/database/imgt_human_tr_d.ntf
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pos
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nhr
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pto
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pot
  inflating: tutorial/igblast_base/database/mouse_gl_V.nin
  inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pjs
  inflating: tutorial/igblast_base/database/mouse_gl_D.nin
  inflating: tutorial/igblast_base/database/mouse_gl_J.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.ndb
  inflating: tutorial/igblast_base/database/mouse_gl_D.nhr
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nog
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nsq
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nto
  inflating: tutorial/igblast_base/database/imgt_human_ig_j.ntf
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nsq
  inflating: tutorial/igblast_base/database/imgt_human_tr_j.nto
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.psq
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.not
  inflating: tutorial/igblast_base/database/imgt_human_ig_d.nhr
  inflating: tutorial/igblast_base/database/imgt_human_ig_c.nos
  inflating: tutorial/igblast_base/database/mouse_gl_V.psd
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nog
  inflating: tutorial/igblast_base/database/imgt_human_tr_c.nhr
  inflating: tutorial/igblast_base/database/rhesus_monkey_V.psi
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nsq
  inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pin
  inflating: tutorial/igblast_base/database/mouse_gl_J.nhr
  inflating: tutorial/igblast_base/database/rhesus_monkey_J.nin
  inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.ndb
  inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.njs
  inflating: tutorial/igblast_base/database/mouse_gl_V.psi
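
After extraction, a quick check that the reference directory has the expected layout can catch an interrupted download early. The following is a minimal sketch: the helper name missing_subdirs is hypothetical, and the list of folders is taken from the unzip listing above rather than from any documented AMULETY requirement.

```python
from pathlib import Path

# Top-level folders that appear in the unzip listing of igblast_base.zip.
EXPECTED_SUBDIRS = ("database", "fasta", "internal_data", "optional_file")


def missing_subdirs(root, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("tutorial/igblast_base")
    print("missing:", missing if missing else "none")
```

An empty result means all four folders are present under tutorial/igblast_base.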

Translating nucleotides to amino acid sequences

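The inputs to the embedding models are AIRR format files with immune receptor amino acid sequences. If the AIRR file only contains nucleotide sequences, the amulety translate-igblast command can translate them. In the command below, -i is the input AIRR file, -o is the output directory, and -r is the IgBLAST reference database directory: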

[ ]:
! amulety translate-igblast -i tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv -o tutorial -r tutorial/igblast_base

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:28:42,713 - INFO - Converting AIRR table to FastA for IgBlast translation...
2025-09-24 11:28:42,720 - INFO - Calling IgBlast for running translation...
2025-09-24 11:28:44,404 - INFO - Saved the translations in the dataframe (sequence_aa contains the full translation and sequence_vdj_aa contains the VDJ translation).
2025-09-24 11:28:44,407 - INFO - Took 1.69 seconds
2025-09-24 11:28:44,408 - INFO - Saved the translations in tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv file.
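
Before embedding, it can be worth confirming that the translation columns were actually populated. The sketch below uses only the Python standard library and relies on the two column names reported in the log above (sequence_aa and sequence_vdj_aa). The helper name count_translated is hypothetical, and the two-row inline table with its sequences is a made-up stand-in for the translated file.

```python
import csv
import io


def count_translated(tsv_handle, columns=("sequence_aa", "sequence_vdj_aa")):
    """Return (total rows, {column: rows with a non-empty value})."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    counts = {c: 0 for c in columns}
    total = 0
    for row in reader:
        total += 1
        for c in columns:
            # Treat absent or whitespace-only cells as untranslated.
            if (row[c] or "").strip():
                counts[c] += 1
    return total, counts


# Made-up stand-in for a translated AIRR table (illustrative values only).
demo = io.StringIO(
    "sequence_id\tsequence_aa\tsequence_vdj_aa\n"
    "seq1\tQVQLVQSG\tQVQLVQSG\n"
    "seq2\tEVQLLESG\t\n"
)
total, counts = count_translated(demo)
print(total, counts)
```

To check the real output, open tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv with open(path, newline="") and pass the file handle instead of the demo string.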

Embedding sequences

Now we are ready to embed the sequences using various models. AMULETY uses a unified embed command that supports all available models.

To print the help message for the embed command, run:

[5]:
! amulety embed --help

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

                                                                                
 Usage: amulety embed [OPTIONS]                                                 
                                                                                
 Embeds sequences from an AIRR rearrangement file using the specified model. It
 returns the

 Example usage:
 amulety embed --chain HL --model antiberta2 --output-file-path out.pt
 airr_rearrangement.tsv

╭─ Options ────────────────────────────────────────────────────────────────────╮
│ *  --input-airr                 TEXT     The path to the input data file.    │
│                                          The data file should be in AIRR     │
│                                          format.                             │
│                                          [default: None]                     │
│                                          [required]                          │
│ *  --chain                      TEXT     Input sequences. For BCR: H=Heavy,  │
│                                          L=Light, HL=Heavy-Light pairs,      │
│                                          LH=Light-Heavy pairs, H+L=Both      │
│                                          chains separately. For TCR:         │
│                                          H=Beta/Delta, L=Alpha/Gamma,        │
│                                          HL=Beta-Alpha/Delta-Gamma pairs,    │
│                                          LH=Alpha-Beta/Gamma-Delta pairs,    │
│                                          H+L=Both chains separately.         │
│                                          [default: None]                     │
│                                          [required]                          │
│ *  --model                      TEXT     The embedding model to use. BCR:    │
│                                          ['ablang', 'antiberta2',            │
│                                          'antiberty', 'balm-paired']. TCR:   │
│                                          ['tcr-bert', 'tcrt5']. Immune (BCR  │
│                                          & TCR): ['immune2vec']. Protein:    │
│                                          ['esm2', 'prott5', 'custom']. Use   │
│                                          'custom' for fine-tuned models with │
│                                          --model-path,                       │
│                                          --embedding-dimension, and          │
│                                          --max-length parameters.            │
│                                          [default: None]                     │
│                                          [required]                          │
│ *  --output-file-path           TEXT     The path where the generated        │
│                                          embeddings will be saved. The file  │
│                                          extension should be .csv, or .tsv.  │
│                                          for a dataframe, .pt for a pickled  │
│                                          torch object, or .h5ad for an       │
│                                          anndata object.                     │
│                                          [default: None]                     │
│                                          [required]                          │
│    --cache-dir                  TEXT     Cache dir for storing the           │
│                                          pre-trained model weights.          │
│                                          [default: /tmp/amulety]             │
│    --sequence-col               TEXT     The name of the column containing   │
│                                          the amino acid sequences to embed.  │
│                                          [default: sequence_vdj_aa]          │
│    --cell-id-col                TEXT     The name of the column containing   │
│                                          the single-cell barcode.            │
│                                          [default: cell_id]                  │
│    --batch-size                 INTEGER  The batch size of sequences to      │
│                                          embed.                              │
│                                          [default: 50]                       │
│    --model-path                 TEXT     Path to custom model (HuggingFace   │
│                                          model name or local path). Required │
│                                          for 'custom' model.                 │
│                                          [default: None]                     │
│    --embedding-dimension        INTEGER  Embedding dimension for custom      │
│                                          model. Required for 'custom' model. │
│                                          [default: None]                     │
│    --max-length                 INTEGER  Maximum sequence length for custom  │
│                                          model. Required for 'custom' model. │
│                                          [default: None]                     │
│    --duplicate-col              TEXT     The name of the numeric column used │
│                                          to select the best chain when       │
│                                          multiple chains of the same type    │
│                                          exist per cell. Default:            │
│                                          'duplicate_count'. Custom columns   │
│                                          must be numeric and user-defined.   │
│                                          [default: duplicate_count]          │
│    --installation-path          TEXT     Custom path to Immune2Vec           │
│                                          installation directory. Only        │
│                                          applies to 'immune2vec' model.      │
│                                          [default: None]                     │
│    --residue-level                       If True, returns residue-level      │
│                                          embeddings of dimension sequence    │
│                                          length x embedding dimension (L x   │
│                                          D) instead of sequence-level (1 x   │
│                                          D).                                 │
│    --help                                Show this message and exit.         │
╰──────────────────────────────────────────────────────────────────────────────╯
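
When scripting several runs, it may help to assemble the command from Python and catch invalid values before AMULETY starts. This is a minimal sketch: it uses only the options, chain codes, and output extensions listed in the help message above, and the helper name build_embed_command is hypothetical rather than part of AMULETY.

```python
import shlex
from pathlib import Path

# Values copied from the `amulety embed --help` text above.
VALID_CHAINS = {"H", "L", "HL", "LH", "H+L"}
VALID_EXTENSIONS = {".csv", ".tsv", ".pt", ".h5ad"}


def build_embed_command(input_airr, chain, model, output_file_path, batch_size=50):
    """Assemble an `amulety embed` argument list from the required options."""
    if chain not in VALID_CHAINS:
        raise ValueError(f"unsupported --chain value: {chain!r}")
    if Path(output_file_path).suffix not in VALID_EXTENSIONS:
        raise ValueError(f"unsupported output extension: {output_file_path!r}")
    return [
        "amulety", "embed",
        "--input-airr", str(input_airr),
        "--chain", chain,
        "--model", model,
        "--batch-size", str(batch_size),
        "--output-file-path", str(output_file_path),
    ]


cmd = build_embed_command(
    "tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv",
    chain="H",
    model="antiberty",
    output_file_path="tutorial/test_embedding.pt",
    batch_size=2,
)
print(shlex.join(cmd))  # shell-ready string for logging or copy-paste
```

The returned list can be passed directly to subprocess.run(cmd, check=True). Model names are not validated here, so the CLI remains the authority on which models are available.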

BCR embedding examples

Let’s demonstrate embedding BCR sequences using different models:

AntiBERTy (BCR-specific model)

[4]:
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model antiberty --batch-size 2 --output-file-path tutorial/test_embedding.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:28:55,583 - INFO - Detected single-cell data format
2025-09-24 11:28:55,585 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:28:55,586 - INFO - Removed 102 sequences not matching H chain
2025-09-24 11:29:02,850 - INFO - AntiBERTy loaded. Size: 26.03 M
2025-09-24 11:29:02,850 - INFO - Batch 1/48
2025-09-24 11:29:02,887 - INFO - Batch 2/48
2025-09-24 11:29:02,912 - INFO - Batch 3/48
2025-09-24 11:29:02,933 - INFO - Batch 4/48
2025-09-24 11:29:02,955 - INFO - Batch 5/48
2025-09-24 11:29:02,976 - INFO - Batch 6/48
2025-09-24 11:29:02,997 - INFO - Batch 7/48
2025-09-24 11:29:03,017 - INFO - Batch 8/48
2025-09-24 11:29:03,037 - INFO - Batch 9/48
2025-09-24 11:29:03,059 - INFO - Batch 10/48
2025-09-24 11:29:03,079 - INFO - Batch 11/48
2025-09-24 11:29:03,099 - INFO - Batch 12/48
2025-09-24 11:29:03,119 - INFO - Batch 13/48
2025-09-24 11:29:03,140 - INFO - Batch 14/48
2025-09-24 11:29:03,161 - INFO - Batch 15/48
2025-09-24 11:29:03,181 - INFO - Batch 16/48
2025-09-24 11:29:03,202 - INFO - Batch 17/48
2025-09-24 11:29:03,222 - INFO - Batch 18/48
2025-09-24 11:29:03,243 - INFO - Batch 19/48
2025-09-24 11:29:03,290 - INFO - Batch 20/48
2025-09-24 11:29:03,312 - INFO - Batch 21/48
2025-09-24 11:29:03,332 - INFO - Batch 22/48
2025-09-24 11:29:03,352 - INFO - Batch 23/48
2025-09-24 11:29:03,373 - INFO - Batch 24/48
2025-09-24 11:29:03,393 - INFO - Batch 25/48
2025-09-24 11:29:03,414 - INFO - Batch 26/48
2025-09-24 11:29:03,433 - INFO - Batch 27/48
2025-09-24 11:29:03,452 - INFO - Batch 28/48
2025-09-24 11:29:03,472 - INFO - Batch 29/48
2025-09-24 11:29:03,492 - INFO - Batch 30/48
2025-09-24 11:29:03,514 - INFO - Batch 31/48
2025-09-24 11:29:03,534 - INFO - Batch 32/48
2025-09-24 11:29:03,554 - INFO - Batch 33/48
2025-09-24 11:29:03,575 - INFO - Batch 34/48
2025-09-24 11:29:03,594 - INFO - Batch 35/48
2025-09-24 11:29:03,614 - INFO - Batch 36/48
2025-09-24 11:29:03,635 - INFO - Batch 37/48
2025-09-24 11:29:03,657 - INFO - Batch 38/48
2025-09-24 11:29:03,680 - INFO - Batch 39/48
2025-09-24 11:29:03,700 - INFO - Batch 40/48
2025-09-24 11:29:03,721 - INFO - Batch 41/48
2025-09-24 11:29:03,743 - INFO - Batch 42/48
2025-09-24 11:29:03,763 - INFO - Batch 43/48
2025-09-24 11:29:03,783 - INFO - Batch 44/48
2025-09-24 11:29:03,804 - INFO - Batch 45/48
2025-09-24 11:29:03,825 - INFO - Batch 46/48
2025-09-24 11:29:03,845 - INFO - Batch 47/48
2025-09-24 11:29:03,866 - INFO - Batch 48/48
2025-09-24 11:29:03,879 - INFO - Took 1.03 seconds
2025-09-24 11:29:03,880 - INFO - Generated embeddings with dimensions torch.Size([95, 512])
2025-09-24 11:29:03,880 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:29:03,881 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:29:03,885 - INFO - Saved embedding at tutorial/test_embedding.pt

AntiBERTa2 (BCR-specific model)

[10]:
# Embed heavy chains only using AntiBERTa2
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model antiberta2 --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_antiberta2.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:44:05,686 - INFO - Detected single-cell data format
2025-09-24 11:44:05,688 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:44:05,688 - INFO - Removed 102 sequences not matching H chain
tokenizer_config.json: 100%|████████████████████| 116/116 [00:00<00:00, 339kB/s]
vocab.txt: 100%|█████████████████████████████| 80.0/80.0 [00:00<00:00, 1.56MB/s]
special_tokens_map.json: 100%|█████████████████| 124/124 [00:00<00:00, 1.24MB/s]
config.json: 100%|█████████████████████████████| 575/575 [00:00<00:00, 1.76MB/s]
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
2025-09-24 11:44:08,458 - WARNING - Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
model.safetensors: 100%|█████████████████████| 811M/811M [00:20<00:00, 40.2MB/s]
RoFormerForMaskedLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
2025-09-24 11:44:29,008 - INFO - AntiBERTa2 loaded. Size: 202.642462 M
2025-09-24 11:44:29,009 - INFO - Batch 1/48.
2025-09-24 11:44:30,501 - INFO - Batch 2/48.
2025-09-24 11:44:30,743 - INFO - Batch 3/48.
2025-09-24 11:44:30,963 - INFO - Batch 4/48.
2025-09-24 11:44:31,181 - INFO - Batch 5/48.
2025-09-24 11:44:31,400 - INFO - Batch 6/48.
2025-09-24 11:44:31,614 - INFO - Batch 7/48.
2025-09-24 11:44:31,839 - INFO - Batch 8/48.
2025-09-24 11:44:32,061 - INFO - Batch 9/48.
2025-09-24 11:44:32,275 - INFO - Batch 10/48.
2025-09-24 11:44:32,496 - INFO - Batch 11/48.
2025-09-24 11:44:32,714 - INFO - Batch 12/48.
2025-09-24 11:44:32,926 - INFO - Batch 13/48.
2025-09-24 11:44:33,146 - INFO - Batch 14/48.
2025-09-24 11:44:33,370 - INFO - Batch 15/48.
2025-09-24 11:44:33,588 - INFO - Batch 16/48.
2025-09-24 11:44:33,812 - INFO - Batch 17/48.
2025-09-24 11:44:34,033 - INFO - Batch 18/48.
2025-09-24 11:44:34,255 - INFO - Batch 19/48.
2025-09-24 11:44:34,474 - INFO - Batch 20/48.
2025-09-24 11:44:34,692 - INFO - Batch 21/48.
2025-09-24 11:44:34,916 - INFO - Batch 22/48.
2025-09-24 11:44:35,129 - INFO - Batch 23/48.
2025-09-24 11:44:35,391 - INFO - Batch 24/48.
2025-09-24 11:44:35,613 - INFO - Batch 25/48.
2025-09-24 11:44:35,832 - INFO - Batch 26/48.
2025-09-24 11:44:36,059 - INFO - Batch 27/48.
2025-09-24 11:44:36,281 - INFO - Batch 28/48.
2025-09-24 11:44:36,500 - INFO - Batch 29/48.
2025-09-24 11:44:36,714 - INFO - Batch 30/48.
2025-09-24 11:44:36,934 - INFO - Batch 31/48.
2025-09-24 11:44:37,151 - INFO - Batch 32/48.
2025-09-24 11:44:37,365 - INFO - Batch 33/48.
2025-09-24 11:44:37,590 - INFO - Batch 34/48.
2025-09-24 11:44:37,813 - INFO - Batch 35/48.
2025-09-24 11:44:38,037 - INFO - Batch 36/48.
2025-09-24 11:44:38,285 - INFO - Batch 37/48.
2025-09-24 11:44:38,503 - INFO - Batch 38/48.
2025-09-24 11:44:38,729 - INFO - Batch 39/48.
2025-09-24 11:44:38,949 - INFO - Batch 40/48.
2025-09-24 11:44:39,168 - INFO - Batch 41/48.
2025-09-24 11:44:39,391 - INFO - Batch 42/48.
2025-09-24 11:44:39,608 - INFO - Batch 43/48.
2025-09-24 11:44:39,838 - INFO - Batch 44/48.
2025-09-24 11:44:40,064 - INFO - Batch 45/48.
2025-09-24 11:44:40,283 - INFO - Batch 46/48.
2025-09-24 11:44:40,507 - INFO - Batch 47/48.
2025-09-24 11:44:40,735 - INFO - Batch 48/48.
2025-09-24 11:44:40,868 - INFO - Took 11.86 seconds
2025-09-24 11:44:40,872 - INFO - Generated embeddings with dimensions torch.Size([95, 1024])
2025-09-24 11:44:40,873 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:44:40,875 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:44:40,881 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_antiberta2.pt

AbLang (BCR-specific model with separate heavy/light models)

[11]:
# Embed both heavy and light chains separately using AbLang
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H+L --model ablang --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_ablang.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:45:05,420 - INFO - Detected single-cell data format
2025-09-24 11:45:05,421 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:45:06,418 - INFO - AbLang heavy chain model loaded
2025-09-24 11:45:06,418 - INFO - Batch 1/99
2025-09-24 11:45:06,500 - INFO - Batch 2/99
2025-09-24 11:45:06,570 - INFO - Batch 3/99
2025-09-24 11:45:06,642 - INFO - Batch 4/99
2025-09-24 11:45:06,712 - INFO - Batch 5/99
2025-09-24 11:45:06,790 - INFO - Batch 6/99
2025-09-24 11:45:06,864 - INFO - Batch 7/99
2025-09-24 11:45:06,939 - INFO - Batch 8/99
2025-09-24 11:45:07,011 - INFO - Batch 9/99
2025-09-24 11:45:07,081 - INFO - Batch 10/99
2025-09-24 11:45:07,153 - INFO - Batch 11/99
2025-09-24 11:45:07,224 - INFO - Batch 12/99
2025-09-24 11:45:07,298 - INFO - Batch 13/99
2025-09-24 11:45:07,372 - INFO - Batch 14/99
2025-09-24 11:45:07,447 - INFO - Batch 15/99
2025-09-24 11:45:07,521 - INFO - Batch 16/99
2025-09-24 11:45:07,594 - INFO - Batch 17/99
2025-09-24 11:45:07,672 - INFO - Batch 18/99
2025-09-24 11:45:07,748 - INFO - Batch 19/99
2025-09-24 11:45:07,823 - INFO - Batch 20/99
2025-09-24 11:45:07,902 - INFO - Batch 21/99
2025-09-24 11:45:07,981 - INFO - Batch 22/99
2025-09-24 11:45:08,057 - INFO - Batch 23/99
2025-09-24 11:45:08,137 - INFO - Batch 24/99
2025-09-24 11:45:08,221 - INFO - Batch 25/99
2025-09-24 11:45:08,307 - INFO - Batch 26/99
2025-09-24 11:45:08,392 - INFO - Batch 27/99
2025-09-24 11:45:08,476 - INFO - Batch 28/99
2025-09-24 11:45:08,553 - INFO - Batch 29/99
2025-09-24 11:45:08,630 - INFO - Batch 30/99
2025-09-24 11:45:08,707 - INFO - Batch 31/99
2025-09-24 11:45:08,783 - INFO - Batch 32/99
2025-09-24 11:45:08,859 - INFO - Batch 33/99
2025-09-24 11:45:08,935 - INFO - Batch 34/99
2025-09-24 11:45:09,012 - INFO - Batch 35/99
2025-09-24 11:45:09,087 - INFO - Batch 36/99
2025-09-24 11:45:09,161 - INFO - Batch 37/99
2025-09-24 11:45:09,237 - INFO - Batch 38/99
2025-09-24 11:45:09,311 - INFO - Batch 39/99
2025-09-24 11:45:09,389 - INFO - Batch 40/99
2025-09-24 11:45:09,467 - INFO - Batch 41/99
2025-09-24 11:45:09,545 - INFO - Batch 42/99
2025-09-24 11:45:09,621 - INFO - Batch 43/99
2025-09-24 11:45:09,699 - INFO - Batch 44/99
2025-09-24 11:45:09,775 - INFO - Batch 45/99
2025-09-24 11:45:09,851 - INFO - Batch 46/99
2025-09-24 11:45:09,927 - INFO - Batch 47/99
2025-09-24 11:45:10,004 - INFO - Batch 48/99
2025-09-24 11:45:10,082 - INFO - Batch 49/99
2025-09-24 11:45:10,155 - INFO - Batch 50/99
2025-09-24 11:45:10,234 - INFO - Batch 51/99
2025-09-24 11:45:10,307 - INFO - Batch 52/99
2025-09-24 11:45:10,380 - INFO - Batch 53/99
2025-09-24 11:45:10,455 - INFO - Batch 54/99
2025-09-24 11:45:10,525 - INFO - Batch 55/99
2025-09-24 11:45:10,599 - INFO - Batch 56/99
2025-09-24 11:45:10,672 - INFO - Batch 57/99
2025-09-24 11:45:10,744 - INFO - Batch 58/99
2025-09-24 11:45:10,817 - INFO - Batch 59/99
2025-09-24 11:45:10,893 - INFO - Batch 60/99
2025-09-24 11:45:10,967 - INFO - Batch 61/99
2025-09-24 11:45:11,042 - INFO - Batch 62/99
2025-09-24 11:45:11,112 - INFO - Batch 63/99
2025-09-24 11:45:11,197 - INFO - Batch 64/99
2025-09-24 11:45:11,305 - INFO - Batch 65/99
2025-09-24 11:45:11,380 - INFO - Batch 66/99
2025-09-24 11:45:11,448 - INFO - Batch 67/99
2025-09-24 11:45:11,521 - INFO - Batch 68/99
2025-09-24 11:45:11,597 - INFO - Batch 69/99
2025-09-24 11:45:11,660 - INFO - Batch 70/99
2025-09-24 11:45:11,734 - INFO - Batch 71/99
2025-09-24 11:45:11,810 - INFO - Batch 72/99
2025-09-24 11:45:11,883 - INFO - Batch 73/99
2025-09-24 11:45:11,955 - INFO - Batch 74/99
2025-09-24 11:45:12,028 - INFO - Batch 75/99
2025-09-24 11:45:12,102 - INFO - Batch 76/99
2025-09-24 11:45:12,177 - INFO - Batch 77/99
2025-09-24 11:45:12,252 - INFO - Batch 78/99
2025-09-24 11:45:12,325 - INFO - Batch 79/99
2025-09-24 11:45:12,399 - INFO - Batch 80/99
2025-09-24 11:45:12,472 - INFO - Batch 81/99
2025-09-24 11:45:12,546 - INFO - Batch 82/99
2025-09-24 11:45:12,627 - INFO - Batch 83/99
2025-09-24 11:45:12,704 - INFO - Batch 84/99
2025-09-24 11:45:12,780 - INFO - Batch 85/99
2025-09-24 11:45:12,853 - INFO - Batch 86/99
2025-09-24 11:45:12,926 - INFO - Batch 87/99
2025-09-24 11:45:12,998 - INFO - Batch 88/99
2025-09-24 11:45:13,073 - INFO - Batch 89/99
2025-09-24 11:45:13,144 - INFO - Batch 90/99
2025-09-24 11:45:13,211 - INFO - Batch 91/99
2025-09-24 11:45:13,286 - INFO - Batch 92/99
2025-09-24 11:45:13,358 - INFO - Batch 93/99
2025-09-24 11:45:13,424 - INFO - Batch 94/99
2025-09-24 11:45:13,495 - INFO - Batch 95/99
2025-09-24 11:45:13,566 - INFO - Batch 96/99
2025-09-24 11:45:13,638 - INFO - Batch 97/99
2025-09-24 11:45:13,709 - INFO - Batch 98/99
2025-09-24 11:45:13,779 - INFO - Batch 99/99
2025-09-24 11:45:13,817 - INFO - AbLang embedding completed. Took 7.4 seconds
2025-09-24 11:45:13,828 - INFO - Generated embeddings with dimensions torch.Size([197, 768])
2025-09-24 11:45:13,829 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:45:13,831 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:45:13,838 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_ablang.pt
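
The batch counts and embedding sizes in these logs can be cross-checked with a little arithmetic. The sketch below assumes that sequences are split into consecutive chunks of --batch-size, which is an inference from the logged numbers rather than a statement about AMULETY's internals. It also assumes that the 102 rows removed under --chain H are the ones retained under --chain H+L, which is consistent with 95 + 102 = 197.

```python
import math


def n_batches(n_sequences, batch_size):
    """Number of chunks needed to cover n_sequences (the last one may be partial)."""
    return math.ceil(n_sequences / batch_size)


heavy = 95            # rows embedded with --chain H (torch.Size([95, ...]))
removed_for_h = 102   # rows logged as "not matching H chain"
all_chains = heavy + removed_for_h  # rows embedded with --chain H+L

print(all_chains)                # 197
print(n_batches(heavy, 2))       # 48
print(n_batches(all_chains, 2))  # 99
```

These values match the "Batch 48/48" and "Batch 99/99" lines and the torch.Size([197, 768]) reported for AbLang.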

BALM-paired model (BCR paired chains)

BALM-paired is a BCR-specific model trained on paired heavy and light chains. To embed the heavy and light chains as one concatenated sequence, pass the --chain HL option.

[13]:
# Embed heavy-light chain pairs using BALM-paired
# The model will be automatically downloaded on first use
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain HL --model balm-paired --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_balm_paired.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:51:36,752 - INFO - Detected single-cell data format
2025-09-24 11:51:36,754 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:02:54,987 - INFO - Model size: 303.92M
Batch 1/48

Batch 2/48

Batch 3/48

Batch 4/48

Batch 5/48

Batch 6/48

Batch 7/48

Batch 8/48

Batch 9/48

Batch 10/48

Batch 11/48

Batch 12/48

Batch 13/48

Batch 14/48

Batch 15/48

Batch 16/48

Batch 17/48

Batch 18/48

Batch 19/48

Batch 20/48

Batch 21/48

Batch 22/48

Batch 23/48

Batch 24/48

Batch 25/48

Batch 26/48

Batch 27/48

Batch 28/48

Batch 29/48

Batch 30/48

Batch 31/48

Batch 32/48

Batch 33/48

Batch 34/48

Batch 35/48

Batch 36/48

Batch 37/48

Batch 38/48

Batch 39/48

Batch 40/48

Batch 41/48

Batch 42/48

Batch 43/48

Batch 44/48

Batch 45/48

Batch 46/48

Batch 47/48

Batch 48/48

2025-09-24 12:03:21,260 - INFO - Took 26.27 seconds
2025-09-24 12:03:21,266 - INFO - Generated embeddings with dimensions torch.Size([95, 1024])
2025-09-24 12:03:21,267 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:03:21,270 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:03:21,273 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_balm_paired.pt
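
Paired embedding needs one heavy and one light chain per cell. According to the help text, --duplicate-col names the numeric column used to pick the best chain when a cell has several chains of the same type. The sketch below illustrates that idea for BCR data only and is not AMULETY's implementation. It assumes the standard AIRR locus values (IGH for heavy, IGK or IGL for light), keeps the chain with the highest count, and drops cells that lack either chain; the function name and the example rows are made up.

```python
def pair_best_chains(rows, duplicate_col="duplicate_count"):
    """Return {cell_id: (heavy_aa, light_aa)}, keeping the top-count chain per type."""
    best = {}  # (cell_id, chain_type) -> best row seen so far
    for row in rows:
        chain_type = "H" if row["locus"] == "IGH" else "L"
        key = (row["cell_id"], chain_type)
        # On ties, the first row encountered is kept.
        if key not in best or row[duplicate_col] > best[key][duplicate_col]:
            best[key] = row

    pairs = {}
    for (cell_id, chain_type), row in best.items():
        # Only cells with both a heavy and a light chain yield a pair.
        if chain_type == "H" and (cell_id, "L") in best:
            light = best[(cell_id, "L")]
            pairs[cell_id] = (row["sequence_vdj_aa"], light["sequence_vdj_aa"])
    return pairs


# Made-up rows: cell1 has two light chains, cell2 has no light chain.
rows = [
    {"cell_id": "cell1", "locus": "IGH", "duplicate_count": 10, "sequence_vdj_aa": "QVQLVQ"},
    {"cell_id": "cell1", "locus": "IGK", "duplicate_count": 3, "sequence_vdj_aa": "DIQMTQ"},
    {"cell_id": "cell1", "locus": "IGK", "duplicate_count": 7, "sequence_vdj_aa": "EIVLTQ"},
    {"cell_id": "cell2", "locus": "IGH", "duplicate_count": 5, "sequence_vdj_aa": "EVQLVE"},
]
pairs = pair_best_chains(rows)
print(pairs)  # {'cell1': ('QVQLVQ', 'EIVLTQ')}
```

How the two sequences are joined before being passed to the model (for example, which separator is used) is left out on purpose, since the tutorial does not specify it.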

Protein Language Models

Next, we embed the same dataset with a general protein language model.

ESM2 (Protein language model)

[5]:
# Embed heavy chains only using ESM2
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model esm2 --batch-size 1 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_esm2.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 11:29:55,935 - INFO - Detected single-cell data format
2025-09-24 11:29:55,935 - INFO - Processing both BCR and TCR sequences from the file.
2025-09-24 11:29:55,936 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:29:55,936 - INFO - Removed 102 sequences not matching H chain
tokenizer_config.json: 100%|██████████████████| 95.0/95.0 [00:00<00:00, 157kB/s]
vocab.txt: 100%|█████████████████████████████| 93.0/93.0 [00:00<00:00, 1.33MB/s]
special_tokens_map.json: 100%|██████████████████| 125/125 [00:00<00:00, 448kB/s]
config.json: 100%|█████████████████████████████| 724/724 [00:00<00:00, 2.76MB/s]
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
2025-09-24 11:29:58,760 - WARNING - Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
model.safetensors: 100%|███████████████████| 2.61G/2.61G [01:08<00:00, 38.2MB/s]
2025-09-24 11:31:07,501 - INFO - ESM2 650M model size: 652.36 M
2025-09-24 11:31:07,501 - INFO - Batch 1/95.
2025-09-24 11:31:13,329 - INFO - Batch 2/95.
2025-09-24 11:31:14,066 - INFO - Batch 3/95.
2025-09-24 11:31:14,759 - INFO - Batch 4/95.
2025-09-24 11:31:15,492 - INFO - Batch 5/95.
2025-09-24 11:31:16,153 - INFO - Batch 6/95.
2025-09-24 11:31:16,798 - INFO - Batch 7/95.
2025-09-24 11:31:17,454 - INFO - Batch 8/95.
2025-09-24 11:31:18,110 - INFO - Batch 9/95.
2025-09-24 11:31:18,772 - INFO - Batch 10/95.
2025-09-24 11:31:19,412 - INFO - Batch 11/95.
2025-09-24 11:31:20,058 - INFO - Batch 12/95.
2025-09-24 11:31:20,715 - INFO - Batch 13/95.
2025-09-24 11:31:21,671 - INFO - Batch 14/95.
2025-09-24 11:31:22,346 - INFO - Batch 15/95.
2025-09-24 11:31:23,040 - INFO - Batch 16/95.
2025-09-24 11:31:23,723 - INFO - Batch 17/95.
2025-09-24 11:31:24,406 - INFO - Batch 18/95.
2025-09-24 11:31:25,055 - INFO - Batch 19/95.
2025-09-24 11:31:25,714 - INFO - Batch 20/95.
2025-09-24 11:31:26,358 - INFO - Batch 21/95.
2025-09-24 11:31:27,010 - INFO - Batch 22/95.
2025-09-24 11:31:27,664 - INFO - Batch 23/95.
2025-09-24 11:31:28,306 - INFO - Batch 24/95.
2025-09-24 11:31:28,956 - INFO - Batch 25/95.
2025-09-24 11:31:29,610 - INFO - Batch 26/95.
2025-09-24 11:31:30,291 - INFO - Batch 27/95.
2025-09-24 11:31:30,959 - INFO - Batch 28/95.
2025-09-24 11:31:31,616 - INFO - Batch 29/95.
2025-09-24 11:31:32,260 - INFO - Batch 30/95.
2025-09-24 11:31:32,915 - INFO - Batch 31/95.
2025-09-24 11:31:33,563 - INFO - Batch 32/95.
2025-09-24 11:31:34,215 - INFO - Batch 33/95.
2025-09-24 11:31:34,877 - INFO - Batch 34/95.
2025-09-24 11:31:35,533 - INFO - Batch 35/95.
2025-09-24 11:31:36,186 - INFO - Batch 36/95.
2025-09-24 11:31:36,835 - INFO - Batch 37/95.
2025-09-24 11:31:37,492 - INFO - Batch 38/95.
2025-09-24 11:31:38,145 - INFO - Batch 39/95.
2025-09-24 11:31:38,793 - INFO - Batch 40/95.
2025-09-24 11:31:39,455 - INFO - Batch 41/95.
2025-09-24 11:31:40,097 - INFO - Batch 42/95.
2025-09-24 11:31:40,755 - INFO - Batch 43/95.
2025-09-24 11:31:41,418 - INFO - Batch 44/95.
2025-09-24 11:31:42,113 - INFO - Batch 45/95.
2025-09-24 11:31:42,801 - INFO - Batch 46/95.
2025-09-24 11:31:43,463 - INFO - Batch 47/95.
2025-09-24 11:31:44,124 - INFO - Batch 48/95.
2025-09-24 11:31:44,776 - INFO - Batch 49/95.
2025-09-24 11:31:45,437 - INFO - Batch 50/95.
2025-09-24 11:31:46,097 - INFO - Batch 51/95.
2025-09-24 11:31:46,754 - INFO - Batch 52/95.
2025-09-24 11:31:47,421 - INFO - Batch 53/95.
2025-09-24 11:31:48,078 - INFO - Batch 54/95.
2025-09-24 11:31:48,740 - INFO - Batch 55/95.
2025-09-24 11:31:49,402 - INFO - Batch 56/95.
2025-09-24 11:31:50,067 - INFO - Batch 57/95.
2025-09-24 11:31:50,727 - INFO - Batch 58/95.
2025-09-24 11:31:51,390 - INFO - Batch 59/95.
2025-09-24 11:31:52,045 - INFO - Batch 60/95.
2025-09-24 11:31:52,711 - INFO - Batch 61/95.
2025-09-24 11:31:53,381 - INFO - Batch 62/95.
2025-09-24 11:31:54,035 - INFO - Batch 63/95.
2025-09-24 11:31:54,692 - INFO - Batch 64/95.
2025-09-24 11:31:55,358 - INFO - Batch 65/95.
2025-09-24 11:31:56,032 - INFO - Batch 66/95.
2025-09-24 11:31:56,698 - INFO - Batch 67/95.
2025-09-24 11:31:57,364 - INFO - Batch 68/95.
2025-09-24 11:31:58,028 - INFO - Batch 69/95.
2025-09-24 11:31:58,678 - INFO - Batch 70/95.
2025-09-24 11:31:59,360 - INFO - Batch 71/95.
2025-09-24 11:32:00,035 - INFO - Batch 72/95.
2025-09-24 11:32:00,710 - INFO - Batch 73/95.
2025-09-24 11:32:01,464 - INFO - Batch 74/95.
2025-09-24 11:32:02,132 - INFO - Batch 75/95.
2025-09-24 11:32:02,799 - INFO - Batch 76/95.
2025-09-24 11:32:03,452 - INFO - Batch 77/95.
2025-09-24 11:32:04,112 - INFO - Batch 78/95.
2025-09-24 11:32:04,775 - INFO - Batch 79/95.
2025-09-24 11:32:05,430 - INFO - Batch 80/95.
2025-09-24 11:32:06,089 - INFO - Batch 81/95.
2025-09-24 11:32:06,751 - INFO - Batch 82/95.
2025-09-24 11:32:07,423 - INFO - Batch 83/95.
2025-09-24 11:32:08,078 - INFO - Batch 84/95.
2025-09-24 11:32:08,731 - INFO - Batch 85/95.
2025-09-24 11:32:09,408 - INFO - Batch 86/95.
2025-09-24 11:32:10,059 - INFO - Batch 87/95.
2025-09-24 11:32:10,719 - INFO - Batch 88/95.
2025-09-24 11:32:11,371 - INFO - Batch 89/95.
2025-09-24 11:32:12,039 - INFO - Batch 90/95.
2025-09-24 11:32:12,692 - INFO - Batch 91/95.
2025-09-24 11:32:13,349 - INFO - Batch 92/95.
2025-09-24 11:32:14,010 - INFO - Batch 93/95.
2025-09-24 11:32:14,681 - INFO - Batch 94/95.
2025-09-24 11:32:15,338 - INFO - Batch 95/95.
2025-09-24 11:32:15,998 - INFO - Took 68.5 seconds
2025-09-24 11:32:16,012 - INFO - Generated embeddings with dimensions torch.Size([95, 1280])
2025-09-24 11:32:16,013 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:32:16,015 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:32:16,033 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_esm2.pt
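
The "Took ... seconds" lines give a rough feel for the relative cost of each model. The sketch below converts the logged numbers from this tutorial into sequences per second. Treat it as an indication only: the runs used different batch sizes (1 for ESM2, 2 for the others), the AbLang run embedded 197 sequences rather than 95, and the hardware is not stated.

```python
# (model, sequences embedded, seconds) copied from the logs in this tutorial.
runs = [
    ("antiberty", 95, 1.03),
    ("antiberta2", 95, 11.86),
    ("ablang", 197, 7.4),
    ("balm-paired", 95, 26.27),
    ("esm2", 95, 68.5),
]

# Throughput in sequences per second for each run.
throughput = {name: n_seq / seconds for name, n_seq, seconds in runs}

for name, rate in sorted(throughput.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} {rate:6.1f} seq/s")

fastest = max(throughput, key=throughput.get)
slowest = min(throughput, key=throughput.get)
```

On this small dataset the ranking follows model size as reported in the logs, with AntiBERTy (26.03 M parameters) fastest and ESM2 650M slowest.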

Custom/Fine-tuned models

To use a custom or fine-tuned model, hosted on HuggingFace or stored at a local path, set --model custom and provide the --model-path, --embedding-dimension, and --max-length options:

[14]:
# Example: Using a fine-tuned ESM2 model from HuggingFace
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model custom \
  --model-path "AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization" \
  --embedding-dimension 320 \
  --max-length 512 \
  --batch-size 2 \
  --output-file-path tutorial/custom_model_embeddings.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 12:30:43,593 - INFO - Detected single-cell data format
2025-09-24 12:30:43,595 - INFO - Processing both BCR and TCR sequences from the file.
2025-09-24 12:30:43,596 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:30:43,597 - INFO - Removed 102 sequences not matching H chain
Some weights of EsmForMaskedLM were not initialized from the model checkpoint at AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization and are newly initialized: ['lm_head.bias', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2025-09-24 12:30:46,532 - INFO - Model size: 7.84M
Batch 1/48

Batch 2/48

Batch 3/48

Batch 4/48

Batch 5/48

Batch 6/48

Batch 7/48

Batch 8/48

Batch 9/48

Batch 10/48

Batch 11/48

Batch 12/48

Batch 13/48

Batch 14/48

Batch 15/48

Batch 16/48

Batch 17/48

Batch 18/48

Batch 19/48

Batch 20/48

Batch 21/48

Batch 22/48

Batch 23/48

Batch 24/48

Batch 25/48

Batch 26/48

Batch 27/48

Batch 28/48

Batch 29/48

Batch 30/48

Batch 31/48

Batch 32/48

Batch 33/48

Batch 34/48

Batch 35/48

Batch 36/48

Batch 37/48

Batch 38/48

Batch 39/48

Batch 40/48

Batch 41/48

Batch 42/48

Batch 43/48

Batch 44/48

Batch 45/48

Batch 46/48

Batch 47/48

Batch 48/48

2025-09-24 12:30:51,159 - INFO - Took 4.63 seconds
2025-09-24 12:30:51,159 - INFO - Generated embeddings with dimensions torch.Size([95, 320])
2025-09-24 12:30:51,160 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:30:51,161 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:30:51,165 - INFO - Saved embedding at tutorial/custom_model_embeddings.pt
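
The batch counts in the log follow from the number of sequences and the --batch-size value: 95 heavy-chain sequences in batches of 2 give 48 batches, the last of which holds a single sequence. A minimal Python sketch of that arithmetic (num_batches is a hypothetical helper for illustration, not part of AMULETY):

```python
import math


def num_batches(n_sequences: int, batch_size: int) -> int:
    """Number of batches needed to cover n_sequences (the last may be partial)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(n_sequences / batch_size)


# 95 sequences with --batch-size 2, as in the custom-model run above
print(num_batches(95, 2))
```

The same formula accounts for the 25 and 50 batches reported by the TCR runs later in this tutorial (50 and 100 inputs with --batch-size 2).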

TCR embedding examples

AMULETY also supports TCR-specific models. To try them, first download the TCR example data:

[7]:
# Download TCR example data
! wget -P tutorial https://zenodo.org/records/17186858/files/AIRR_tcr_sample.tsv
--2025-09-24 11:35:16--  https://zenodo.org/records/17186858/files/AIRR_tcr_sample.tsv
Resolving zenodo.org (zenodo.org)... 188.185.45.92, 188.185.48.194, 188.185.43.25, ...
Connecting to zenodo.org (zenodo.org)|188.185.45.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 40915 (40K) [application/octet-stream]
Saving to: 'tutorial/AIRR_tcr_sample.tsv'

AIRR_tcr_sample.tsv 100%[===================>]  39.96K   166KB/s    in 0.2s

2025-09-24 11:35:17 (166 KB/s) - 'tutorial/AIRR_tcr_sample.tsv' saved [40915/40915]
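
Before embedding, it can help to see which chains the downloaded file contains. The sketch below tallies the locus column of an AIRR rearrangement TSV using only the Python standard library. It assumes the file carries the standard AIRR locus field; the helper name count_loci and the inline rows are made up for illustration. On real data you would pass open("tutorial/AIRR_tcr_sample.tsv", newline="") instead of the toy input.

```python
import csv
import io
from collections import Counter


def count_loci(handle) -> Counter:
    """Count rows per value of the AIRR 'locus' column (e.g. TRA, TRB)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["locus"] for row in reader)


# Hypothetical rows for illustration only; not taken from AIRR_tcr_sample.tsv.
toy_tsv = io.StringIO(
    "sequence_id\tcell_id\tlocus\n"
    "s1\tc1\tTRB\n"
    "s2\tc1\tTRA\n"
    "s3\tc2\tTRB\n"
)
loci = count_loci(toy_tsv)
print(loci)
```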

TCR-BERT (TCR-specific model)

[15]:
# Embed paired TCR beta/alpha chains using TCR-BERT (--chain HL)
# Note: the input must be TCR data in AIRR format
! amulety embed --input-airr tutorial/AIRR_tcr_sample.tsv --chain HL --model tcr-bert --batch-size 2 --output-file-path tutorial/tcr_embeddings_tcrbert.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 12:31:02,594 - INFO - Detected single-cell data format
2025-09-24 12:31:02,595 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:31:02,599 - INFO - Dropping 100 cells with missing heavy or light chain...
2025-09-24 12:31:02,600 - INFO - Loading TCR-BERT model for TCR embedding...
2025-09-24 12:31:04,618 - INFO - Successfully loaded TCR-BERT model
2025-09-24 12:31:04,619 - INFO - TCR-BERT model loaded. Size: 57.39 M
2025-09-24 12:31:04,619 - INFO - TCR-BERT Batch 1/25.
2025-09-24 12:31:04,663 - INFO - TCR-BERT Batch 2/25.
2025-09-24 12:31:04,689 - INFO - TCR-BERT Batch 3/25.
2025-09-24 12:31:04,712 - INFO - TCR-BERT Batch 4/25.
2025-09-24 12:31:04,735 - INFO - TCR-BERT Batch 5/25.
2025-09-24 12:31:04,756 - INFO - TCR-BERT Batch 6/25.
2025-09-24 12:31:04,780 - INFO - TCR-BERT Batch 7/25.
2025-09-24 12:31:04,802 - INFO - TCR-BERT Batch 8/25.
2025-09-24 12:31:04,827 - INFO - TCR-BERT Batch 9/25.
2025-09-24 12:31:04,849 - INFO - TCR-BERT Batch 10/25.
2025-09-24 12:31:04,872 - INFO - TCR-BERT Batch 11/25.
2025-09-24 12:31:04,895 - INFO - TCR-BERT Batch 12/25.
2025-09-24 12:31:04,917 - INFO - TCR-BERT Batch 13/25.
2025-09-24 12:31:04,940 - INFO - TCR-BERT Batch 14/25.
2025-09-24 12:31:04,961 - INFO - TCR-BERT Batch 15/25.
2025-09-24 12:31:04,984 - INFO - TCR-BERT Batch 16/25.
2025-09-24 12:31:05,006 - INFO - TCR-BERT Batch 17/25.
2025-09-24 12:31:05,028 - INFO - TCR-BERT Batch 18/25.
2025-09-24 12:31:05,052 - INFO - TCR-BERT Batch 19/25.
2025-09-24 12:31:05,074 - INFO - TCR-BERT Batch 20/25.
2025-09-24 12:31:05,097 - INFO - TCR-BERT Batch 21/25.
2025-09-24 12:31:05,119 - INFO - TCR-BERT Batch 22/25.
2025-09-24 12:31:05,143 - INFO - TCR-BERT Batch 23/25.
2025-09-24 12:31:05,166 - INFO - TCR-BERT Batch 24/25.
2025-09-24 12:31:05,188 - INFO - TCR-BERT Batch 25/25.
2025-09-24 12:31:05,211 - INFO - TCR-BERT embedding took 0.59 seconds
2025-09-24 12:31:05,212 - INFO - Generated embeddings with dimensions torch.Size([50, 768])
2025-09-24 12:31:05,212 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:31:05,213 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:31:05,215 - INFO - Saved embedding at tutorial/tcr_embeddings_tcrbert.pt
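
The log line "Dropping 100 cells with missing heavy or light chain" reflects that --chain HL needs both chains for every cell. The sketch below illustrates that kind of filter on toy data; it is not AMULETY's implementation. It assumes the standard AIRR cell_id and locus fields, and it assumes that TRB counts as the H chain and TRA as the L chain, as the H/beta wording in this tutorial suggests. The function name, the locus sets, and the rows are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from AIRR locus values to the H/L labels used by --chain.
H_LOCI = {"IGH", "TRB", "TRD"}
L_LOCI = {"IGK", "IGL", "TRA", "TRG"}


def paired_cells(rows):
    """Return the sorted cell_ids that have at least one H and one L chain."""
    seen = defaultdict(set)
    for row in rows:
        if row["locus"] in H_LOCI:
            seen[row["cell_id"]].add("H")
        elif row["locus"] in L_LOCI:
            seen[row["cell_id"]].add("L")
    return sorted(cell for cell, chains in seen.items() if chains == {"H", "L"})


# Toy input: only c1 has both a beta and an alpha chain.
rows = [
    {"cell_id": "c1", "locus": "TRB"},
    {"cell_id": "c1", "locus": "TRA"},
    {"cell_id": "c2", "locus": "TRB"},  # no alpha chain: dropped
    {"cell_id": "c3", "locus": "TRA"},  # no beta chain: dropped
]
print(paired_cells(rows))
```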

TCRT5 (TCR beta chain only)

[16]:
# Embed TCR beta chains using TCRT5 (only supports H/beta chains)
! amulety embed --input-airr tutorial/AIRR_tcr_sample.tsv --chain H --model tcrt5 --batch-size 2 --output-file-path tutorial/tcr_embeddings_tcrt5.pt

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

2025-09-24 12:31:11,221 - INFO - Detected single-cell data format
2025-09-24 12:31:11,221 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:31:11,222 - INFO - Removed 100 sequences not matching H chain
2025-09-24 12:31:11,222 - INFO - Loading TCRT5 model for TCR embedding...
tokenizer_config.json: 21.1kB [00:00, 23.3MB/s]
spiece.model: 100%|██████████████████████████| 238k/238k [00:00<00:00, 2.78MB/s]
added_tokens.json: 2.35kB [00:00, 16.2MB/s]
special_tokens_map.json: 2.64kB [00:00, 12.0MB/s]
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'TCRT5Tokenizer'.
The class this function is called from is 'T5Tokenizer'.
config.json: 100%|█████████████████████████████| 970/970 [00:00<00:00, 8.67MB/s]
model.safetensors: 100%|█████████████████████| 168M/168M [00:03<00:00, 46.0MB/s]
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_attentions` is. When `return_dict_in_generate` is not `True`, `output_attentions` is ignored.
  warnings.warn(
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_hidden_states` is. When `return_dict_in_generate` is not `True`, `output_hidden_states` is ignored.
  warnings.warn(
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_scores` is. When `return_dict_in_generate` is not `True`, `output_scores` is ignored.
  warnings.warn(
generation_config.json: 100%|██████████████████| 249/249 [00:00<00:00, 3.32MB/s]
2025-09-24 12:31:17,681 - INFO - TCRT5 Batch 1/50.
2025-09-24 12:31:17,707 - INFO - TCRT5 Batch 2/50.
2025-09-24 12:31:17,719 - INFO - TCRT5 Batch 3/50.
2025-09-24 12:31:17,730 - INFO - TCRT5 Batch 4/50.
2025-09-24 12:31:17,740 - INFO - TCRT5 Batch 5/50.
2025-09-24 12:31:17,753 - INFO - TCRT5 Batch 6/50.
2025-09-24 12:31:17,763 - INFO - TCRT5 Batch 7/50.
2025-09-24 12:31:17,773 - INFO - TCRT5 Batch 8/50.
2025-09-24 12:31:17,784 - INFO - TCRT5 Batch 9/50.
2025-09-24 12:31:17,795 - INFO - TCRT5 Batch 10/50.
2025-09-24 12:31:17,806 - INFO - TCRT5 Batch 11/50.
2025-09-24 12:31:17,817 - INFO - TCRT5 Batch 12/50.
2025-09-24 12:31:17,827 - INFO - TCRT5 Batch 13/50.
2025-09-24 12:31:17,837 - INFO - TCRT5 Batch 14/50.
2025-09-24 12:31:17,848 - INFO - TCRT5 Batch 15/50.
2025-09-24 12:31:17,860 - INFO - TCRT5 Batch 16/50.
2025-09-24 12:31:17,871 - INFO - TCRT5 Batch 17/50.
2025-09-24 12:31:17,882 - INFO - TCRT5 Batch 18/50.
2025-09-24 12:31:17,893 - INFO - TCRT5 Batch 19/50.
2025-09-24 12:31:17,904 - INFO - TCRT5 Batch 20/50.
2025-09-24 12:31:17,914 - INFO - TCRT5 Batch 21/50.
2025-09-24 12:31:17,924 - INFO - TCRT5 Batch 22/50.
2025-09-24 12:31:17,934 - INFO - TCRT5 Batch 23/50.
2025-09-24 12:31:17,944 - INFO - TCRT5 Batch 24/50.
2025-09-24 12:31:17,955 - INFO - TCRT5 Batch 25/50.
2025-09-24 12:31:17,966 - INFO - TCRT5 Batch 26/50.
2025-09-24 12:31:17,976 - INFO - TCRT5 Batch 27/50.
2025-09-24 12:31:17,987 - INFO - TCRT5 Batch 28/50.
2025-09-24 12:31:17,997 - INFO - TCRT5 Batch 29/50.
2025-09-24 12:31:18,008 - INFO - TCRT5 Batch 30/50.
2025-09-24 12:31:18,018 - INFO - TCRT5 Batch 31/50.
2025-09-24 12:31:18,028 - INFO - TCRT5 Batch 32/50.
2025-09-24 12:31:18,038 - INFO - TCRT5 Batch 33/50.
2025-09-24 12:31:18,049 - INFO - TCRT5 Batch 34/50.
2025-09-24 12:31:18,060 - INFO - TCRT5 Batch 35/50.
2025-09-24 12:31:18,070 - INFO - TCRT5 Batch 36/50.
2025-09-24 12:31:18,080 - INFO - TCRT5 Batch 37/50.
2025-09-24 12:31:18,090 - INFO - TCRT5 Batch 38/50.
2025-09-24 12:31:18,100 - INFO - TCRT5 Batch 39/50.
2025-09-24 12:31:18,111 - INFO - TCRT5 Batch 40/50.
2025-09-24 12:31:18,121 - INFO - TCRT5 Batch 41/50.
2025-09-24 12:31:18,131 - INFO - TCRT5 Batch 42/50.
2025-09-24 12:31:18,141 - INFO - TCRT5 Batch 43/50.
2025-09-24 12:31:18,150 - INFO - TCRT5 Batch 44/50.
2025-09-24 12:31:18,160 - INFO - TCRT5 Batch 45/50.
2025-09-24 12:31:18,171 - INFO - TCRT5 Batch 46/50.
2025-09-24 12:31:18,181 - INFO - TCRT5 Batch 47/50.
2025-09-24 12:31:18,191 - INFO - TCRT5 Batch 48/50.
2025-09-24 12:31:18,201 - INFO - TCRT5 Batch 49/50.
2025-09-24 12:31:18,212 - INFO - TCRT5 Batch 50/50.
2025-09-24 12:31:18,224 - INFO - TCRT5 embedding took 7.0 seconds
2025-09-24 12:31:18,225 - INFO - Generated embeddings with dimensions torch.Size([100, 256])
2025-09-24 12:31:18,226 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:31:18,226 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:31:18,228 - INFO - Saved embedding at tutorial/tcr_embeddings_tcrt5.pt

Checking dependencies

Some models require additional dependencies that are not installed by default. To check which ones are missing, run:

[1]:
# Check which optional dependencies are missing
! amulety check-deps

 █████  ███    ███ ██    ██ ██      ███████ ████████     ██    ██
██   ██ ████  ████ ██    ██ ██      ██         ██         ██  ██
███████ ██ ████ ██ ██    ██ ██      █████      ██          ████
██   ██ ██  ██  ██ ██    ██ ██      ██         ██           ██
██   ██ ██      ██  ██████  ███████ ███████    ██           ██

AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
 version 2.0

Checking AMULETY dependencies...

IgBlast (for translate-igblast command):
  IgBlast (igblastn) is available

Embedding model dependencies:
2025-09-24 12:51:20,234 - INFO - Available models: AntiBERTy, AbLang, TCR-BERT, TCRT5, ESM2, ProtT5
2025-09-24 12:51:20,234 - WARNING - Missing model dependencies: Immune2Vec
  1 dependencies are missing.
  AMULETY will raise ImportError with installation instructions when these models are used.

  To install missing dependencies:
    • Immune2Vec: git clone https://bitbucket.org/yaarilab/immune2vec_model.git && add to Python path

  Note: Models will provide detailed installation instructions when used.
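
A similar check can be done from Python before launching a long job. This sketch uses importlib.util.find_spec from the standard library to report which top-level modules cannot be found. The module names passed in are placeholders; the import names that AMULETY actually looks for are not stated in this tutorial, and missing_modules is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is a deliberately nonexistent placeholder.
missing = missing_modules(["json", "not_a_real_module_xyz"])
print(missing)
```

find_spec only locates the module without running its code, so the check stays cheap even for heavy packages.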