AMULETY CLI Tutorial
Introduction
This tutorial demonstrates how to use the AMULETY command-line interface (CLI) to translate and embed BCR (B-cell receptor) and TCR (T-cell receptor) sequences.
AMULETY supports a wide range of embedding models for different immune receptor types. For a full list of the supported models, please check the Usage documentation page.
Installation
Before getting started, install AMULETY through either conda or pip. The conda package installs the IgBLAST dependency automatically; with pip, IgBLAST must be installed separately.
Install AMULETY through conda:
conda install -c conda-forge -c bioconda amulety --strict-channel-priority
To verify the installation and print the help message, run:
! amulety --help
█████ ███ ███ ██ ██ ██ ███████ ████████ ██ ██
██ ██ ████ ████ ██ ██ ██ ██ ██ ██ ██
███████ ██ ████ ██ ██ ██ ██ █████ ██ ████
██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
██ ██ ██ ██ ██████ ███████ ███████ ██ ██
AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
version 2.0
Usage: amulety [OPTIONS] COMMAND [ARGS]...
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy │
│ it or customize the installation. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ translate-igblast Translates nucleotide sequences to amino acid sequences │
│ using IgBlast. │
│ embed Embeds sequences from an AIRR rearrangement file using │
│ the specified model. It returns the │
│ check-deps Check if optional embedding dependencies and tools are │
│ installed. │
╰──────────────────────────────────────────────────────────────────────────────╯
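For a pip installation, where IgBLAST has to be installed separately, it can help to confirm that the required executables are on the PATH before continuing. The sketch below is only an illustration: find_executables is a hypothetical helper, and igblastn is assumed to be the IgBLAST binary that matters here (it is the standard nucleotide IgBLAST executable, but the tutorial does not state which binary AMULETY calls). The amulety check-deps command listed above covers the optional embedding dependencies.

```python
import shutil


def find_executables(names=("amulety", "igblastn")):
    """Map each executable name to its full path, or None if it is not on PATH.

    The default names are an assumption for illustration, not taken from AMULETY.
    """
    return {name: shutil.which(name) for name in names}


# Report which tools were found in the current environment.
for name, path in find_executables().items():
    print(f"{name}: {path or 'NOT FOUND'}")
```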
Downloading example data and reference database
The following commands download an example AIRR-format file of BCR sequences and the IgBLAST reference database.
# Create tutorial directory and download example data
! mkdir -p tutorial
! wget -P tutorial https://zenodo.org/records/17186858/files/AIRR_subject1_FNA_d0_1_Y1.tsv
# Download and extract IgBlast reference database
! wget -P tutorial -c https://github.com/nf-core/test-datasets/raw/airrflow/database-cache/igblast_base.zip
! unzip tutorial/igblast_base.zip -d tutorial
! rm tutorial/igblast_base.zip
--2025-09-24 13:00:01-- https://zenodo.org/records/17186858/files/AIRR_subject1_FNA_d0_1_Y1.tsv
Resolving zenodo.org (zenodo.org)... 188.185.45.92, 188.185.43.25, 188.185.48.194, ...
Connecting to zenodo.org (zenodo.org)|188.185.45.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 479753 (469K) [application/octet-stream]
Saving to: 'tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv.2'
AIRR_subject1_FNA_d 100%[===================>] 468.51K 578KB/s in 0.8s
2025-09-24 13:00:02 (578 KB/s) - 'tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv.2' saved [479753/479753]
--2025-09-24 13:00:03-- https://github.com/nf-core/test-datasets/raw/airrflow/database-cache/igblast_base.zip
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip [following]
--2025-09-24 13:00:03-- https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1204742 (1.1M) [application/zip]
Saving to: 'tutorial/igblast_base.zip'
igblast_base.zip 100%[===================>] 1.15M --.-KB/s in 0.1s
2025-09-24 13:00:03 (7.89 MB/s) - 'tutorial/igblast_base.zip' saved [1204742/1204742]
Archive: tutorial/igblast_base.zip
creating: tutorial/igblast_base/
creating: tutorial/igblast_base/internal_data/
creating: tutorial/igblast_base/internal_data/rhesus_monkey/
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.pin
creating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Repository
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Entries
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/CVS/Root
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.pdm.imgt
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.pdm.kabat
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsd
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nin
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nog
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsq
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nhr
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsi
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psq
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nhr
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.phr
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.pog
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nog
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nsd
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsi
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nhr
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsd
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsi
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psd
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.nsq
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nsq
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.ndm.kabat
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nin
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_D.nog
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_V.psi
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey_J.nin
inflating: tutorial/igblast_base/internal_data/rhesus_monkey/rhesus_monkey.ndm.imgt
creating: tutorial/igblast_base/internal_data/human/
inflating: tutorial/igblast_base/internal_data/human/human.ndm.imgt
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsq
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsd
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psq
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.pog
inflating: tutorial/igblast_base/internal_data/human/human_V.pog
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nog
inflating: tutorial/igblast_base/internal_data/human/human_V.psd
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.phr
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nsi
inflating: tutorial/igblast_base/internal_data/human/human_V.nsi
inflating: tutorial/igblast_base/internal_data/human/human_V.nsq
inflating: tutorial/igblast_base/internal_data/human/human_V.pin
inflating: tutorial/igblast_base/internal_data/human/human.ndm.kabat
inflating: tutorial/igblast_base/internal_data/human/human_V.nin
inflating: tutorial/igblast_base/internal_data/human/human_V.nhr
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psi
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.psd
inflating: tutorial/igblast_base/internal_data/human/human_V.nog
inflating: tutorial/igblast_base/internal_data/human/human_V.psi
inflating: tutorial/igblast_base/internal_data/human/human_V.phr
inflating: tutorial/igblast_base/internal_data/human/human_V.psq
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nin
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.nhr
inflating: tutorial/igblast_base/internal_data/human/human.pdm.kabat
inflating: tutorial/igblast_base/internal_data/human/human.pdm.imgt
inflating: tutorial/igblast_base/internal_data/human/human_V.nsd
inflating: tutorial/igblast_base/internal_data/human/human_TR_V.pin
creating: tutorial/igblast_base/internal_data/rat/
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nhr
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nin
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nog
inflating: tutorial/igblast_base/internal_data/rat/rat_V.pin
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsd
inflating: tutorial/igblast_base/internal_data/rat/rat.pdm.kabat
inflating: tutorial/igblast_base/internal_data/rat/rat_V.psq
inflating: tutorial/igblast_base/internal_data/rat/rat.ndm.imgt
inflating: tutorial/igblast_base/internal_data/rat/rat_V.pog
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsq
inflating: tutorial/igblast_base/internal_data/rat/rat_V.phr
inflating: tutorial/igblast_base/internal_data/rat/rat_V.psi
inflating: tutorial/igblast_base/internal_data/rat/rat.ndm.kabat
inflating: tutorial/igblast_base/internal_data/rat/rat_V.psd
inflating: tutorial/igblast_base/internal_data/rat/rat.pdm.imgt
inflating: tutorial/igblast_base/internal_data/rat/rat_V.nsi
creating: tutorial/igblast_base/internal_data/mouse/
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsi
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nin
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsd
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nog
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nhr
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.pog
inflating: tutorial/igblast_base/internal_data/mouse/mouse.pdm.kabat
inflating: tutorial/igblast_base/internal_data/mouse/mouse.ndm.kabat
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.pin
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nsq
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.pin
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psq
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.phr
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psq
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsd
inflating: tutorial/igblast_base/internal_data/mouse/mouse.ndm.imgt
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsq
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psi
inflating: tutorial/igblast_base/internal_data/mouse/mouse.pdm.imgt
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.phr
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nog
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.pog
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nhr
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psd
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.psi
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.nin
inflating: tutorial/igblast_base/internal_data/mouse/mouse_V.psd
inflating: tutorial/igblast_base/internal_data/mouse/mouse_TR_V.nsi
inflating: tutorial/igblast_base/internal_data/readme
creating: tutorial/igblast_base/internal_data/rabbit/
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psq
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.ndm.imgt
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.pdm.imgt
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.pdm.kabat
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsq
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.phr
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nin
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psi
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsi
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nog
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit.ndm.kabat
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nhr
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.pin
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.pog
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.nsd
inflating: tutorial/igblast_base/internal_data/rabbit/rabbit_V.psd
creating: tutorial/igblast_base/fasta/
inflating: tutorial/igblast_base/fasta/imgt_human_tr_d.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_ig_d.fasta
inflating: tutorial/igblast_base/fasta/imgt_aa_human_ig_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_aa_human_tr_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_d.fasta
inflating: tutorial/igblast_base/fasta/imgt_aa_mouse_ig_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_tr_c.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_ig_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_j.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_c.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_tr_j.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_j.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_ig_c.fasta
inflating: tutorial/igblast_base/fasta/imgt_aa_mouse_tr_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_ig_c.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_d.fasta
inflating: tutorial/igblast_base/fasta/imgt_mouse_tr_v.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_ig_j.fasta
inflating: tutorial/igblast_base/fasta/imgt_human_tr_v.fasta
creating: tutorial/igblast_base/optional_file/
inflating: tutorial/igblast_base/optional_file/human_gl.aux
inflating: tutorial/igblast_base/optional_file/human_gl.aux.testonly
inflating: tutorial/igblast_base/optional_file/rabbit_gl.aux
inflating: tutorial/igblast_base/optional_file/mouse_gl.aux
inflating: tutorial/igblast_base/optional_file/readme
inflating: tutorial/igblast_base/optional_file/rat_gl.aux
inflating: tutorial/igblast_base/optional_file/rhesus_monkey_gl.aux
creating: tutorial/igblast_base/database/
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.ntf
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nhr
inflating: tutorial/igblast_base/database/rhesus_monkey_V.pin
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pjs
inflating: tutorial/igblast_base/database/imgt_human_ig_v.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.ntf
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nos
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.ntf
inflating: tutorial/igblast_base/database/imgt_human_ig_v.not
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nin
inflating: tutorial/igblast_base/database/imgt_human_tr_c.njs
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nhr
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nos
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nog
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nin
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nos
inflating: tutorial/igblast_base/database/imgt_human_ig_j.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nin
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nos
inflating: tutorial/igblast_base/database/mouse_gl_V.pog
inflating: tutorial/igblast_base/database/imgt_human_ig_v.njs
inflating: tutorial/igblast_base/database/mouse_gl_V.nsd
inflating: tutorial/igblast_base/database/imgt_human_tr_d.not
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nin
inflating: tutorial/igblast_base/database/imgt_human_ig_c.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nog
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.ptf
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nos
inflating: tutorial/igblast_base/database/ncbi_human_c_genes.tar
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nto
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pto
inflating: tutorial/igblast_base/database/imgt_human_tr_d.nos
inflating: tutorial/igblast_base/database/mouse_gl_V.nsq
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nhr
inflating: tutorial/igblast_base/database/imgt_human_tr_c.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.ntf
inflating: tutorial/igblast_base/database/imgt_human_ig_d.ndb
inflating: tutorial/igblast_base/database/mouse_gl_D.nsd
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.ndb
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pdb
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nsq
extracting: tutorial/igblast_base/database/imgt_mouse_tr_d.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nto
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pot
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.ndb
inflating: tutorial/igblast_base/database/mouse_gl_D.nsi
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nos
inflating: tutorial/igblast_base/database/imgt_human_ig_d.not
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pog
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.njs
inflating: tutorial/igblast_base/database/imgt_human_tr_d.njs
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pto
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nto
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nos
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.ndb
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nin
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nin
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsd
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nin
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pos
inflating: tutorial/igblast_base/database/mouse_gl_J.nog
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nhr
inflating: tutorial/igblast_base/database/imgt_human_tr_d.nin
inflating: tutorial/igblast_base/database/imgt_human_tr_c.not
inflating: tutorial/igblast_base/database/mouse_gl_D.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nsq
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pdb
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nog
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.ntf
inflating: tutorial/igblast_base/database/imgt_human_tr_j.ntf
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.not
inflating: tutorial/igblast_base/database/mouse_gl_J.nsi
inflating: tutorial/igblast_base/database/imgt_human_ig_c.not
inflating: tutorial/igblast_base/database/imgt_human_tr_d.nog
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.psq
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pot
inflating: tutorial/igblast_base/database/imgt_human_ig_c.njs
inflating: tutorial/igblast_base/database/imgt_human_tr_d.nto
inflating: tutorial/igblast_base/database/mouse_gl_V.nog
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nog
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.not
inflating: tutorial/igblast_base/database/imgt_human_ig_v.ntf
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pin
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nsq
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pos
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.not
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nos
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.ndb
inflating: tutorial/igblast_base/database/imgt_human_ig_j.ndb
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nos
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nto
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nog
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nto
inflating: tutorial/igblast_base/database/mouse_gl_J.nsd
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nin
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.ntf
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.phr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nto
inflating: tutorial/igblast_base/database/imgt_human_tr_j.not
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nos
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nto
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nog
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsi
inflating: tutorial/igblast_base/database/mouse_gl_V.phr
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nin
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nin
inflating: tutorial/igblast_base/database/rhesus_monkey_V.psq
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pot
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nin
inflating: tutorial/igblast_base/database/mouse_gl_V.pin
inflating: tutorial/igblast_base/database/imgt_human_ig_v.nsq
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nhr
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pog
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nin
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.njs
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nsq
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nsq
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.psq
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.ptf
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pin
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.ntf
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nhr
inflating: tutorial/igblast_base/database/rhesus_monkey_V.phr
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.phr
inflating: tutorial/igblast_base/database/imgt_human_tr_j.ndb
inflating: tutorial/igblast_base/database/mouse_gl_V.nsi
inflating: tutorial/igblast_base/database/imgt_human_tr_d.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nsq
inflating: tutorial/igblast_base/database/imgt_human_ig_d.ntf
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nto
inflating: tutorial/igblast_base/database/imgt_human_tr_d.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nos
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.not
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pto
inflating: tutorial/igblast_base/database/rhesus_monkey_V.pog
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nog
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.psq
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nin
inflating: tutorial/igblast_base/database/imgt_human_tr_j.njs
extracting: tutorial/igblast_base/database/mouse_gl_J.nsq
inflating: tutorial/igblast_base/database/imgt_human_tr_c.ntf
inflating: tutorial/igblast_base/database/mouse_gl_D.nog
inflating: tutorial/igblast_base/database/imgt_human_ig_j.not
extracting: tutorial/igblast_base/database/imgt_human_tr_d.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.not
inflating: tutorial/igblast_base/database/imgt_human_tr_v.nto
inflating: tutorial/igblast_base/database/imgt_human_ig_c.ntf
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pjs
inflating: tutorial/igblast_base/database/imgt_human_tr_v.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nin
creating: tutorial/igblast_base/database/airr/
inflating: tutorial/igblast_base/database/airr/airr_c_human.tar
inflating: tutorial/igblast_base/database/airr/airr_c_mouse.tar
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.not
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nos
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.njs
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nsd
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nsq
inflating: tutorial/igblast_base/database/imgt_human_tr_v.njs
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nog
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pin
inflating: tutorial/igblast_base/database/imgt_mouse_ig_j.nto
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nto
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nhr
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nos
inflating: tutorial/igblast_base/database/rhesus_monkey_VJ.tar
inflating: tutorial/igblast_base/database/mouse_gl_VDJ.tar
inflating: tutorial/igblast_base/database/imgt_human_ig_j.nto
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.ptf
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nog
inflating: tutorial/igblast_base/database/imgt_human_tr_v.not
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.phr
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.ntf
inflating: tutorial/igblast_base/database/imgt_human_tr_v.ntf
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nto
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsi
inflating: tutorial/igblast_base/database/imgt_mouse_tr_v.nin
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.nsq
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nos
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pdb
inflating: tutorial/igblast_base/database/mouse_gl_V.nhr
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.phr
inflating: tutorial/igblast_base/database/imgt_human_ig_d.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nhr
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pog
inflating: tutorial/igblast_base/database/mouse_gl_V.psq
inflating: tutorial/igblast_base/database/rhesus_monkey_V.psd
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.ptf
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.nog
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pdb
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nhr
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.njs
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.ndb
inflating: tutorial/igblast_base/database/imgt_aa_human_tr_v.pjs
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pog
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.not
inflating: tutorial/igblast_base/database/rhesus_monkey_V.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nin
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pos
inflating: tutorial/igblast_base/database/imgt_human_tr_d.ntf
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pos
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.nhr
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pto
inflating: tutorial/igblast_base/database/imgt_aa_mouse_tr_v.pot
inflating: tutorial/igblast_base/database/mouse_gl_V.nin
inflating: tutorial/igblast_base/database/imgt_aa_mouse_ig_v.pjs
inflating: tutorial/igblast_base/database/mouse_gl_D.nin
inflating: tutorial/igblast_base/database/mouse_gl_J.nin
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.ndb
inflating: tutorial/igblast_base/database/mouse_gl_D.nhr
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nog
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nsq
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nto
inflating: tutorial/igblast_base/database/imgt_human_ig_j.ntf
inflating: tutorial/igblast_base/database/imgt_mouse_tr_j.nsq
inflating: tutorial/igblast_base/database/imgt_human_tr_j.nto
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.psq
inflating: tutorial/igblast_base/database/imgt_mouse_ig_v.not
inflating: tutorial/igblast_base/database/imgt_human_ig_d.nhr
inflating: tutorial/igblast_base/database/imgt_human_ig_c.nos
inflating: tutorial/igblast_base/database/mouse_gl_V.psd
inflating: tutorial/igblast_base/database/imgt_mouse_tr_d.nog
inflating: tutorial/igblast_base/database/imgt_human_tr_c.nhr
inflating: tutorial/igblast_base/database/rhesus_monkey_V.psi
inflating: tutorial/igblast_base/database/imgt_mouse_ig_c.nsq
inflating: tutorial/igblast_base/database/imgt_aa_human_ig_v.pin
inflating: tutorial/igblast_base/database/mouse_gl_J.nhr
inflating: tutorial/igblast_base/database/rhesus_monkey_J.nin
inflating: tutorial/igblast_base/database/imgt_mouse_tr_c.ndb
inflating: tutorial/igblast_base/database/imgt_mouse_ig_d.njs
inflating: tutorial/igblast_base/database/mouse_gl_V.psi
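Before moving on, one may want to confirm that the archive was extracted where the next step expects it. The following is a minimal sketch, not part of AMULETY: missing_reference_dirs is a hypothetical helper, and the four folder names are simply the top-level directories that appear in the unzip listing above. Whether AMULETY needs all four of them is an assumption.

```python
from pathlib import Path

# Top-level folders that appear in the unzip listing of igblast_base.zip.
EXPECTED_SUBDIRS = ("database", "fasta", "internal_data", "optional_file")


def missing_reference_dirs(base_dir):
    """Return the expected subdirectories that are absent under base_dir."""
    base = Path(base_dir)
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


# An empty list means every expected folder is present.
print(missing_reference_dirs("tutorial/igblast_base"))
```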
Translating nucleotides to amino acid sequences
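The embedding models take AIRR-format files containing immune receptor amino acid sequences as input. If the AIRR file contains only nucleotide sequences, the amulety translate-igblast command translates them with IgBLAST. In the call below, -i is the input AIRR file, -o is the output directory, and -r is the IgBLAST reference directory downloaded in the previous step: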
! amulety translate-igblast -i tutorial/AIRR_subject1_FNA_d0_1_Y1.tsv -o tutorial -r tutorial/igblast_base
2025-09-24 11:28:42,713 - INFO - Converting AIRR table to FastA for IgBlast translation...
2025-09-24 11:28:42,720 - INFO - Calling IgBlast for running translation...
2025-09-24 11:28:44,404 - INFO - Saved the translations in the dataframe (sequence_aa contains the full translation and sequence_vdj_aa contains the VDJ translation).
2025-09-24 11:28:44,407 - INFO - Took 1.69 seconds
2025-09-24 11:28:44,408 - INFO - Saved the translations in tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv file.
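The log above names two new columns, sequence_aa and sequence_vdj_aa. A quick way to check how many rows actually received a translation is sketched below using only the Python standard library. count_translated is a hypothetical helper written for this tutorial, and treating an empty cell as "not translated" is an assumption about how failed translations are stored.

```python
import csv


def count_translated(tsv_path, columns=("sequence_aa", "sequence_vdj_aa")):
    """Count rows with a non-empty value in each translation column.

    Returns (total_rows, {column: non_empty_count}).
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # Fail early if the file was not produced by the translation step.
        missing = [c for c in columns if c not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"missing columns: {missing}")
        counts = {c: 0 for c in columns}
        total = 0
        for row in reader:
            total += 1
            for c in columns:
                # Short rows yield None for absent fields, hence the fallback.
                if (row[c] or "").strip():
                    counts[c] += 1
    return total, counts
```

Calling count_translated("tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv") returns the total number of rows together with the per-column counts, so a large gap between the two points to sequences that could not be translated.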
Embedding sequences
Now we are ready to embed the sequences using various models. AMULETY uses a unified embed command that supports all available models.
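When embed is called repeatedly, for example once per model, it can be convenient to assemble the arguments in Python. The sketch below is an illustration only: build_embed_command is a hypothetical helper, the option names and the allowed chain and extension values are copied from the amulety embed --help text, and whether the CLI itself rejects other values is not confirmed here.

```python
from pathlib import Path

# Values copied from the `amulety embed --help` text.
VALID_CHAINS = {"H", "L", "HL", "LH", "H+L"}
VALID_EXTENSIONS = {".csv", ".tsv", ".pt", ".h5ad"}


def build_embed_command(input_airr, chain, model, output_path):
    """Return the argument list for one `amulety embed` call."""
    if chain not in VALID_CHAINS:
        raise ValueError(f"unsupported chain: {chain!r}")
    if Path(output_path).suffix not in VALID_EXTENSIONS:
        raise ValueError(f"unsupported output extension: {output_path!r}")
    return [
        "amulety", "embed",
        "--input-airr", str(input_airr),
        "--chain", chain,
        "--model", model,
        "--output-file-path", str(output_path),
    ]


# Example: heavy-chain embeddings of the translated tutorial file.
cmd = build_embed_command(
    "tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv",
    chain="H",
    model="antiberta2",
    output_path="tutorial/antiberta2_H.tsv",  # hypothetical output name
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True); building a list rather than a single string avoids shell quoting problems with paths that contain spaces.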
To print the help message for the embed command, run:
! amulety embed --help
Usage: amulety embed [OPTIONS]
Embeds sequences from an AIRR rearrangement file using the specified model. It
returns the
Example usage:
amulety embed --chain HL --model antiberta2 --output-file-path out.pt
airr_rearrangement.tsv
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ * --input-airr TEXT The path to the input data file. │
│ The data file should be in AIRR │
│ format. │
│ [default: None] │
│ [required] │
│ * --chain TEXT Input sequences. For BCR: H=Heavy, │
│ L=Light, HL=Heavy-Light pairs, │
│ LH=Light-Heavy pairs, H+L=Both │
│ chains separately. For TCR: │
│ H=Beta/Delta, L=Alpha/Gamma, │
│ HL=Beta-Alpha/Delta-Gamma pairs, │
│ LH=Alpha-Beta/Gamma-Delta pairs, │
│ H+L=Both chains separately. │
│ [default: None] │
│ [required] │
│ * --model TEXT The embedding model to use. BCR: │
│ ['ablang', 'antiberta2', │
│ 'antiberty', 'balm-paired']. TCR: │
│ ['tcr-bert', 'tcrt5']. Immune (BCR │
│ & TCR): ['immune2vec']. Protein: │
│ ['esm2', 'prott5', 'custom']. Use │
│ 'custom' for fine-tuned models with │
│ --model-path, │
│ --embedding-dimension, and │
│ --max-length parameters. │
│ [default: None] │
│ [required] │
│ * --output-file-path TEXT The path where the generated │
│ embeddings will be saved. The file │
│ extension should be .csv, or .tsv. │
│ for a dataframe, .pt for a pickled │
│ torch object, or .h5ad for an │
│ anndata object. │
│ [default: None] │
│ [required] │
│ --cache-dir TEXT Cache dir for storing the │
│ pre-trained model weights. │
│ [default: /tmp/amulety] │
│ --sequence-col TEXT The name of the column containing │
│ the amino acid sequences to embed. │
│ [default: sequence_vdj_aa] │
│ --cell-id-col TEXT The name of the column containing │
│ the single-cell barcode. │
│ [default: cell_id] │
│ --batch-size INTEGER The batch size of sequences to │
│ embed. │
│ [default: 50] │
│ --model-path TEXT Path to custom model (HuggingFace │
│ model name or local path). Required │
│ for 'custom' model. │
│ [default: None] │
│ --embedding-dimension INTEGER Embedding dimension for custom │
│ model. Required for 'custom' model. │
│ [default: None] │
│ --max-length INTEGER Maximum sequence length for custom │
│ model. Required for 'custom' model. │
│ [default: None] │
│ --duplicate-col TEXT The name of the numeric column used │
│ to select the best chain when │
│ multiple chains of the same type │
│ exist per cell. Default: │
│ 'duplicate_count'. Custom columns │
│ must be numeric and user-defined. │
│ [default: duplicate_count] │
│ --installation-path TEXT Custom path to Immune2Vec │
│ installation directory. Only │
│ applies to 'immune2vec' model. │
│ [default: None] │
│ --residue-level If True, returns residue-level │
│ embeddings of dimension sequence │
│ length x embedding dimension (L x │
│ D) instead of sequence-level (1 x │
│ D). │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
BCR embedding examples
Let’s demonstrate embedding BCR sequences using different models:
AntiBERTy (BCR-specific model)
[4]:
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model antiberty --batch-size 2 --output-file-path tutorial/test_embedding.pt
2025-09-24 11:28:55,583 - INFO - Detected single-cell data format
2025-09-24 11:28:55,585 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:28:55,586 - INFO - Removed 102 sequences not matching H chain
2025-09-24 11:29:02,850 - INFO - AntiBERTy loaded. Size: 26.03 M
2025-09-24 11:29:02,850 - INFO - Batch 1/48
2025-09-24 11:29:02,887 - INFO - Batch 2/48
2025-09-24 11:29:02,912 - INFO - Batch 3/48
2025-09-24 11:29:02,933 - INFO - Batch 4/48
2025-09-24 11:29:02,955 - INFO - Batch 5/48
2025-09-24 11:29:02,976 - INFO - Batch 6/48
2025-09-24 11:29:02,997 - INFO - Batch 7/48
2025-09-24 11:29:03,017 - INFO - Batch 8/48
2025-09-24 11:29:03,037 - INFO - Batch 9/48
2025-09-24 11:29:03,059 - INFO - Batch 10/48
2025-09-24 11:29:03,079 - INFO - Batch 11/48
2025-09-24 11:29:03,099 - INFO - Batch 12/48
2025-09-24 11:29:03,119 - INFO - Batch 13/48
2025-09-24 11:29:03,140 - INFO - Batch 14/48
2025-09-24 11:29:03,161 - INFO - Batch 15/48
2025-09-24 11:29:03,181 - INFO - Batch 16/48
2025-09-24 11:29:03,202 - INFO - Batch 17/48
2025-09-24 11:29:03,222 - INFO - Batch 18/48
2025-09-24 11:29:03,243 - INFO - Batch 19/48
2025-09-24 11:29:03,290 - INFO - Batch 20/48
2025-09-24 11:29:03,312 - INFO - Batch 21/48
2025-09-24 11:29:03,332 - INFO - Batch 22/48
2025-09-24 11:29:03,352 - INFO - Batch 23/48
2025-09-24 11:29:03,373 - INFO - Batch 24/48
2025-09-24 11:29:03,393 - INFO - Batch 25/48
2025-09-24 11:29:03,414 - INFO - Batch 26/48
2025-09-24 11:29:03,433 - INFO - Batch 27/48
2025-09-24 11:29:03,452 - INFO - Batch 28/48
2025-09-24 11:29:03,472 - INFO - Batch 29/48
2025-09-24 11:29:03,492 - INFO - Batch 30/48
2025-09-24 11:29:03,514 - INFO - Batch 31/48
2025-09-24 11:29:03,534 - INFO - Batch 32/48
2025-09-24 11:29:03,554 - INFO - Batch 33/48
2025-09-24 11:29:03,575 - INFO - Batch 34/48
2025-09-24 11:29:03,594 - INFO - Batch 35/48
2025-09-24 11:29:03,614 - INFO - Batch 36/48
2025-09-24 11:29:03,635 - INFO - Batch 37/48
2025-09-24 11:29:03,657 - INFO - Batch 38/48
2025-09-24 11:29:03,680 - INFO - Batch 39/48
2025-09-24 11:29:03,700 - INFO - Batch 40/48
2025-09-24 11:29:03,721 - INFO - Batch 41/48
2025-09-24 11:29:03,743 - INFO - Batch 42/48
2025-09-24 11:29:03,763 - INFO - Batch 43/48
2025-09-24 11:29:03,783 - INFO - Batch 44/48
2025-09-24 11:29:03,804 - INFO - Batch 45/48
2025-09-24 11:29:03,825 - INFO - Batch 46/48
2025-09-24 11:29:03,845 - INFO - Batch 47/48
2025-09-24 11:29:03,866 - INFO - Batch 48/48
2025-09-24 11:29:03,879 - INFO - Took 1.03 seconds
2025-09-24 11:29:03,880 - INFO - Generated embeddings with dimensions torch.Size([95, 512])
2025-09-24 11:29:03,880 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:29:03,881 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:29:03,885 - INFO - Saved embedding at tutorial/test_embedding.pt
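The batch counter in the log follows from the number of retained sequences and the --batch-size value: 95 heavy-chain sequences with a batch size of 2 give 48 batches, the last one holding a single sequence. The short sketch below reproduces that arithmetic, assuming batches are formed by simple ceiling division (which is consistent with the logs in this tutorial, though the internal batching code is not shown here). The function name n_batches is illustrative.

```python
import math


def n_batches(n_sequences, batch_size):
    """Number of batches needed to cover n_sequences at the given batch size."""
    return math.ceil(n_sequences / batch_size)


print(n_batches(95, 2))  # 48, matching "Batch 48/48" in the AntiBERTy log
```

A larger --batch-size reduces the number of batches at the cost of more memory per batch.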
AntiBERTa2 (BCR-specific model)
[10]:
# Embed heavy chains using AntiBERTa2
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model antiberta2 --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_antiberta2.pt
2025-09-24 11:44:05,686 - INFO - Detected single-cell data format
2025-09-24 11:44:05,688 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:44:05,688 - INFO - Removed 102 sequences not matching H chain
tokenizer_config.json: 100%|████████████████████| 116/116 [00:00<00:00, 339kB/s]
vocab.txt: 100%|█████████████████████████████| 80.0/80.0 [00:00<00:00, 1.56MB/s]
special_tokens_map.json: 100%|█████████████████| 124/124 [00:00<00:00, 1.24MB/s]
config.json: 100%|█████████████████████████████| 575/575 [00:00<00:00, 1.76MB/s]
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
2025-09-24 11:44:08,458 - WARNING - Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
model.safetensors: 100%|█████████████████████| 811M/811M [00:20<00:00, 40.2MB/s]
RoFormerForMaskedLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
- If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
- If you are not the owner of the model architecture class, please contact the model code owner to update it.
2025-09-24 11:44:29,008 - INFO - AntiBERTa2 loaded. Size: 202.642462 M
2025-09-24 11:44:29,009 - INFO - Batch 1/48.
2025-09-24 11:44:30,501 - INFO - Batch 2/48.
2025-09-24 11:44:30,743 - INFO - Batch 3/48.
2025-09-24 11:44:30,963 - INFO - Batch 4/48.
2025-09-24 11:44:31,181 - INFO - Batch 5/48.
2025-09-24 11:44:31,400 - INFO - Batch 6/48.
2025-09-24 11:44:31,614 - INFO - Batch 7/48.
2025-09-24 11:44:31,839 - INFO - Batch 8/48.
2025-09-24 11:44:32,061 - INFO - Batch 9/48.
2025-09-24 11:44:32,275 - INFO - Batch 10/48.
2025-09-24 11:44:32,496 - INFO - Batch 11/48.
2025-09-24 11:44:32,714 - INFO - Batch 12/48.
2025-09-24 11:44:32,926 - INFO - Batch 13/48.
2025-09-24 11:44:33,146 - INFO - Batch 14/48.
2025-09-24 11:44:33,370 - INFO - Batch 15/48.
2025-09-24 11:44:33,588 - INFO - Batch 16/48.
2025-09-24 11:44:33,812 - INFO - Batch 17/48.
2025-09-24 11:44:34,033 - INFO - Batch 18/48.
2025-09-24 11:44:34,255 - INFO - Batch 19/48.
2025-09-24 11:44:34,474 - INFO - Batch 20/48.
2025-09-24 11:44:34,692 - INFO - Batch 21/48.
2025-09-24 11:44:34,916 - INFO - Batch 22/48.
2025-09-24 11:44:35,129 - INFO - Batch 23/48.
2025-09-24 11:44:35,391 - INFO - Batch 24/48.
2025-09-24 11:44:35,613 - INFO - Batch 25/48.
2025-09-24 11:44:35,832 - INFO - Batch 26/48.
2025-09-24 11:44:36,059 - INFO - Batch 27/48.
2025-09-24 11:44:36,281 - INFO - Batch 28/48.
2025-09-24 11:44:36,500 - INFO - Batch 29/48.
2025-09-24 11:44:36,714 - INFO - Batch 30/48.
2025-09-24 11:44:36,934 - INFO - Batch 31/48.
2025-09-24 11:44:37,151 - INFO - Batch 32/48.
2025-09-24 11:44:37,365 - INFO - Batch 33/48.
2025-09-24 11:44:37,590 - INFO - Batch 34/48.
2025-09-24 11:44:37,813 - INFO - Batch 35/48.
2025-09-24 11:44:38,037 - INFO - Batch 36/48.
2025-09-24 11:44:38,285 - INFO - Batch 37/48.
2025-09-24 11:44:38,503 - INFO - Batch 38/48.
2025-09-24 11:44:38,729 - INFO - Batch 39/48.
2025-09-24 11:44:38,949 - INFO - Batch 40/48.
2025-09-24 11:44:39,168 - INFO - Batch 41/48.
2025-09-24 11:44:39,391 - INFO - Batch 42/48.
2025-09-24 11:44:39,608 - INFO - Batch 43/48.
2025-09-24 11:44:39,838 - INFO - Batch 44/48.
2025-09-24 11:44:40,064 - INFO - Batch 45/48.
2025-09-24 11:44:40,283 - INFO - Batch 46/48.
2025-09-24 11:44:40,507 - INFO - Batch 47/48.
2025-09-24 11:44:40,735 - INFO - Batch 48/48.
2025-09-24 11:44:40,868 - INFO - Took 11.86 seconds
2025-09-24 11:44:40,872 - INFO - Generated embeddings with dimensions torch.Size([95, 1024])
2025-09-24 11:44:40,873 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:44:40,875 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:44:40,881 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_antiberta2.pt
AbLang (BCR-specific model with separate heavy/light models)
[11]:
# Embed both heavy and light chains separately using AbLang
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H+L --model ablang --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_ablang.pt
2025-09-24 11:45:05,420 - INFO - Detected single-cell data format
2025-09-24 11:45:05,421 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:45:06,418 - INFO - AbLang heavy chain model loaded
2025-09-24 11:45:06,418 - INFO - Batch 1/99
2025-09-24 11:45:06,500 - INFO - Batch 2/99
2025-09-24 11:45:06,570 - INFO - Batch 3/99
2025-09-24 11:45:06,642 - INFO - Batch 4/99
2025-09-24 11:45:06,712 - INFO - Batch 5/99
2025-09-24 11:45:06,790 - INFO - Batch 6/99
2025-09-24 11:45:06,864 - INFO - Batch 7/99
2025-09-24 11:45:06,939 - INFO - Batch 8/99
2025-09-24 11:45:07,011 - INFO - Batch 9/99
2025-09-24 11:45:07,081 - INFO - Batch 10/99
2025-09-24 11:45:07,153 - INFO - Batch 11/99
2025-09-24 11:45:07,224 - INFO - Batch 12/99
2025-09-24 11:45:07,298 - INFO - Batch 13/99
2025-09-24 11:45:07,372 - INFO - Batch 14/99
2025-09-24 11:45:07,447 - INFO - Batch 15/99
2025-09-24 11:45:07,521 - INFO - Batch 16/99
2025-09-24 11:45:07,594 - INFO - Batch 17/99
2025-09-24 11:45:07,672 - INFO - Batch 18/99
2025-09-24 11:45:07,748 - INFO - Batch 19/99
2025-09-24 11:45:07,823 - INFO - Batch 20/99
2025-09-24 11:45:07,902 - INFO - Batch 21/99
2025-09-24 11:45:07,981 - INFO - Batch 22/99
2025-09-24 11:45:08,057 - INFO - Batch 23/99
2025-09-24 11:45:08,137 - INFO - Batch 24/99
2025-09-24 11:45:08,221 - INFO - Batch 25/99
2025-09-24 11:45:08,307 - INFO - Batch 26/99
2025-09-24 11:45:08,392 - INFO - Batch 27/99
2025-09-24 11:45:08,476 - INFO - Batch 28/99
2025-09-24 11:45:08,553 - INFO - Batch 29/99
2025-09-24 11:45:08,630 - INFO - Batch 30/99
2025-09-24 11:45:08,707 - INFO - Batch 31/99
2025-09-24 11:45:08,783 - INFO - Batch 32/99
2025-09-24 11:45:08,859 - INFO - Batch 33/99
2025-09-24 11:45:08,935 - INFO - Batch 34/99
2025-09-24 11:45:09,012 - INFO - Batch 35/99
2025-09-24 11:45:09,087 - INFO - Batch 36/99
2025-09-24 11:45:09,161 - INFO - Batch 37/99
2025-09-24 11:45:09,237 - INFO - Batch 38/99
2025-09-24 11:45:09,311 - INFO - Batch 39/99
2025-09-24 11:45:09,389 - INFO - Batch 40/99
2025-09-24 11:45:09,467 - INFO - Batch 41/99
2025-09-24 11:45:09,545 - INFO - Batch 42/99
2025-09-24 11:45:09,621 - INFO - Batch 43/99
2025-09-24 11:45:09,699 - INFO - Batch 44/99
2025-09-24 11:45:09,775 - INFO - Batch 45/99
2025-09-24 11:45:09,851 - INFO - Batch 46/99
2025-09-24 11:45:09,927 - INFO - Batch 47/99
2025-09-24 11:45:10,004 - INFO - Batch 48/99
2025-09-24 11:45:10,082 - INFO - Batch 49/99
2025-09-24 11:45:10,155 - INFO - Batch 50/99
2025-09-24 11:45:10,234 - INFO - Batch 51/99
2025-09-24 11:45:10,307 - INFO - Batch 52/99
2025-09-24 11:45:10,380 - INFO - Batch 53/99
2025-09-24 11:45:10,455 - INFO - Batch 54/99
2025-09-24 11:45:10,525 - INFO - Batch 55/99
2025-09-24 11:45:10,599 - INFO - Batch 56/99
2025-09-24 11:45:10,672 - INFO - Batch 57/99
2025-09-24 11:45:10,744 - INFO - Batch 58/99
2025-09-24 11:45:10,817 - INFO - Batch 59/99
2025-09-24 11:45:10,893 - INFO - Batch 60/99
2025-09-24 11:45:10,967 - INFO - Batch 61/99
2025-09-24 11:45:11,042 - INFO - Batch 62/99
2025-09-24 11:45:11,112 - INFO - Batch 63/99
2025-09-24 11:45:11,197 - INFO - Batch 64/99
2025-09-24 11:45:11,305 - INFO - Batch 65/99
2025-09-24 11:45:11,380 - INFO - Batch 66/99
2025-09-24 11:45:11,448 - INFO - Batch 67/99
2025-09-24 11:45:11,521 - INFO - Batch 68/99
2025-09-24 11:45:11,597 - INFO - Batch 69/99
2025-09-24 11:45:11,660 - INFO - Batch 70/99
2025-09-24 11:45:11,734 - INFO - Batch 71/99
2025-09-24 11:45:11,810 - INFO - Batch 72/99
2025-09-24 11:45:11,883 - INFO - Batch 73/99
2025-09-24 11:45:11,955 - INFO - Batch 74/99
2025-09-24 11:45:12,028 - INFO - Batch 75/99
2025-09-24 11:45:12,102 - INFO - Batch 76/99
2025-09-24 11:45:12,177 - INFO - Batch 77/99
2025-09-24 11:45:12,252 - INFO - Batch 78/99
2025-09-24 11:45:12,325 - INFO - Batch 79/99
2025-09-24 11:45:12,399 - INFO - Batch 80/99
2025-09-24 11:45:12,472 - INFO - Batch 81/99
2025-09-24 11:45:12,546 - INFO - Batch 82/99
2025-09-24 11:45:12,627 - INFO - Batch 83/99
2025-09-24 11:45:12,704 - INFO - Batch 84/99
2025-09-24 11:45:12,780 - INFO - Batch 85/99
2025-09-24 11:45:12,853 - INFO - Batch 86/99
2025-09-24 11:45:12,926 - INFO - Batch 87/99
2025-09-24 11:45:12,998 - INFO - Batch 88/99
2025-09-24 11:45:13,073 - INFO - Batch 89/99
2025-09-24 11:45:13,144 - INFO - Batch 90/99
2025-09-24 11:45:13,211 - INFO - Batch 91/99
2025-09-24 11:45:13,286 - INFO - Batch 92/99
2025-09-24 11:45:13,358 - INFO - Batch 93/99
2025-09-24 11:45:13,424 - INFO - Batch 94/99
2025-09-24 11:45:13,495 - INFO - Batch 95/99
2025-09-24 11:45:13,566 - INFO - Batch 96/99
2025-09-24 11:45:13,638 - INFO - Batch 97/99
2025-09-24 11:45:13,709 - INFO - Batch 98/99
2025-09-24 11:45:13,779 - INFO - Batch 99/99
2025-09-24 11:45:13,817 - INFO - AbLang embedding completed. Took 7.4 seconds
2025-09-24 11:45:13,828 - INFO - Generated embeddings with dimensions torch.Size([197, 768])
2025-09-24 11:45:13,829 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:45:13,831 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:45:13,838 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_ablang.pt
BALM-paired model (BCR paired chains)
BALM-paired is a BCR-specific model trained on paired heavy and light chains. To embed concatenated heavy-light chain pairs with AMULETY, use the --chain HL option.
[13]:
# Embed heavy-light chain pairs using BALM-paired
# The model will be automatically downloaded on first use
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain HL --model balm-paired --batch-size 2 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_balm_paired.pt
2025-09-24 11:51:36,752 - INFO - Detected single-cell data format
2025-09-24 11:51:36,754 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:02:54,987 - INFO - Model size: 303.92M
Batch 1/48
Batch 2/48
Batch 3/48
Batch 4/48
Batch 5/48
Batch 6/48
Batch 7/48
Batch 8/48
Batch 9/48
Batch 10/48
Batch 11/48
Batch 12/48
Batch 13/48
Batch 14/48
Batch 15/48
Batch 16/48
Batch 17/48
Batch 18/48
Batch 19/48
Batch 20/48
Batch 21/48
Batch 22/48
Batch 23/48
Batch 24/48
Batch 25/48
Batch 26/48
Batch 27/48
Batch 28/48
Batch 29/48
Batch 30/48
Batch 31/48
Batch 32/48
Batch 33/48
Batch 34/48
Batch 35/48
Batch 36/48
Batch 37/48
Batch 38/48
Batch 39/48
Batch 40/48
Batch 41/48
Batch 42/48
Batch 43/48
Batch 44/48
Batch 45/48
Batch 46/48
Batch 47/48
Batch 48/48
2025-09-24 12:03:21,260 - INFO - Took 26.27 seconds
2025-09-24 12:03:21,266 - INFO - Generated embeddings with dimensions torch.Size([95, 1024])
2025-09-24 12:03:21,267 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:03:21,270 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:03:21,273 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_balm_paired.pt
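For paired modes such as HL, the help message notes that when a cell has several chains of the same type, one is selected using the numeric column given by --duplicate-col (duplicate_count by default). The sketch below illustrates that idea in plain Python. It is not AMULETY's implementation: it assumes that "best" means the highest count, it uses the AIRR locus field to tell chain types apart, and the rows and the function name pick_best_chains are made up for illustration.

```python
def pick_best_chains(rows, count_col="duplicate_count"):
    """Keep, for each (cell_id, locus) pair, the row with the largest count."""
    best = {}
    for row in rows:
        key = (row["cell_id"], row["locus"])
        # Replace the stored row only when the new one has a strictly higher count.
        if key not in best or row[count_col] > best[key][count_col]:
            best[key] = row
    return best


# Made-up example: one cell with two heavy chains and one light chain.
rows = [
    {"cell_id": "c1", "locus": "IGH", "duplicate_count": 12, "sequence_vdj_aa": "QVQL"},
    {"cell_id": "c1", "locus": "IGH", "duplicate_count": 3, "sequence_vdj_aa": "EVQL"},
    {"cell_id": "c1", "locus": "IGK", "duplicate_count": 7, "sequence_vdj_aa": "DIQM"},
]
best = pick_best_chains(rows)
print(best[("c1", "IGH")]["sequence_vdj_aa"])  # QVQL
```

If your data tracks abundance in a different column, passing its name to --duplicate-col plays the role of count_col here; per the help message, that column must be numeric.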
Protein Language Models
Next, we embed the same dataset with general protein language models.
ESM2 (Protein language model)
[5]:
# Embed heavy chains only using ESM2
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model esm2 --batch-size 1 --output-file-path tutorial/AIRR_subject1_FNA_d0_1_Y1_esm2.pt
2025-09-24 11:29:55,935 - INFO - Detected single-cell data format
2025-09-24 11:29:55,935 - INFO - Processing both BCR and TCR sequences from the file.
2025-09-24 11:29:55,936 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 11:29:55,936 - INFO - Removed 102 sequences not matching H chain
tokenizer_config.json: 100%|██████████████████| 95.0/95.0 [00:00<00:00, 157kB/s]
vocab.txt: 100%|█████████████████████████████| 93.0/93.0 [00:00<00:00, 1.33MB/s]
special_tokens_map.json: 100%|██████████████████| 125/125 [00:00<00:00, 448kB/s]
config.json: 100%|█████████████████████████████| 724/724 [00:00<00:00, 2.76MB/s]
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
2025-09-24 11:29:58,760 - WARNING - Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
model.safetensors: 100%|███████████████████| 2.61G/2.61G [01:08<00:00, 38.2MB/s]
2025-09-24 11:31:07,501 - INFO - ESM2 650M model size: 652.36 M
2025-09-24 11:31:07,501 - INFO - Batch 1/95.
2025-09-24 11:31:13,329 - INFO - Batch 2/95.
2025-09-24 11:31:14,066 - INFO - Batch 3/95.
2025-09-24 11:31:14,759 - INFO - Batch 4/95.
2025-09-24 11:31:15,492 - INFO - Batch 5/95.
2025-09-24 11:31:16,153 - INFO - Batch 6/95.
2025-09-24 11:31:16,798 - INFO - Batch 7/95.
2025-09-24 11:31:17,454 - INFO - Batch 8/95.
2025-09-24 11:31:18,110 - INFO - Batch 9/95.
2025-09-24 11:31:18,772 - INFO - Batch 10/95.
2025-09-24 11:31:19,412 - INFO - Batch 11/95.
2025-09-24 11:31:20,058 - INFO - Batch 12/95.
2025-09-24 11:31:20,715 - INFO - Batch 13/95.
2025-09-24 11:31:21,671 - INFO - Batch 14/95.
2025-09-24 11:31:22,346 - INFO - Batch 15/95.
2025-09-24 11:31:23,040 - INFO - Batch 16/95.
2025-09-24 11:31:23,723 - INFO - Batch 17/95.
2025-09-24 11:31:24,406 - INFO - Batch 18/95.
2025-09-24 11:31:25,055 - INFO - Batch 19/95.
2025-09-24 11:31:25,714 - INFO - Batch 20/95.
2025-09-24 11:31:26,358 - INFO - Batch 21/95.
2025-09-24 11:31:27,010 - INFO - Batch 22/95.
2025-09-24 11:31:27,664 - INFO - Batch 23/95.
2025-09-24 11:31:28,306 - INFO - Batch 24/95.
2025-09-24 11:31:28,956 - INFO - Batch 25/95.
2025-09-24 11:31:29,610 - INFO - Batch 26/95.
2025-09-24 11:31:30,291 - INFO - Batch 27/95.
2025-09-24 11:31:30,959 - INFO - Batch 28/95.
2025-09-24 11:31:31,616 - INFO - Batch 29/95.
2025-09-24 11:31:32,260 - INFO - Batch 30/95.
2025-09-24 11:31:32,915 - INFO - Batch 31/95.
2025-09-24 11:31:33,563 - INFO - Batch 32/95.
2025-09-24 11:31:34,215 - INFO - Batch 33/95.
2025-09-24 11:31:34,877 - INFO - Batch 34/95.
2025-09-24 11:31:35,533 - INFO - Batch 35/95.
2025-09-24 11:31:36,186 - INFO - Batch 36/95.
2025-09-24 11:31:36,835 - INFO - Batch 37/95.
2025-09-24 11:31:37,492 - INFO - Batch 38/95.
2025-09-24 11:31:38,145 - INFO - Batch 39/95.
2025-09-24 11:31:38,793 - INFO - Batch 40/95.
2025-09-24 11:31:39,455 - INFO - Batch 41/95.
2025-09-24 11:31:40,097 - INFO - Batch 42/95.
2025-09-24 11:31:40,755 - INFO - Batch 43/95.
2025-09-24 11:31:41,418 - INFO - Batch 44/95.
2025-09-24 11:31:42,113 - INFO - Batch 45/95.
2025-09-24 11:31:42,801 - INFO - Batch 46/95.
2025-09-24 11:31:43,463 - INFO - Batch 47/95.
2025-09-24 11:31:44,124 - INFO - Batch 48/95.
2025-09-24 11:31:44,776 - INFO - Batch 49/95.
2025-09-24 11:31:45,437 - INFO - Batch 50/95.
2025-09-24 11:31:46,097 - INFO - Batch 51/95.
2025-09-24 11:31:46,754 - INFO - Batch 52/95.
2025-09-24 11:31:47,421 - INFO - Batch 53/95.
2025-09-24 11:31:48,078 - INFO - Batch 54/95.
2025-09-24 11:31:48,740 - INFO - Batch 55/95.
2025-09-24 11:31:49,402 - INFO - Batch 56/95.
2025-09-24 11:31:50,067 - INFO - Batch 57/95.
2025-09-24 11:31:50,727 - INFO - Batch 58/95.
2025-09-24 11:31:51,390 - INFO - Batch 59/95.
2025-09-24 11:31:52,045 - INFO - Batch 60/95.
2025-09-24 11:31:52,711 - INFO - Batch 61/95.
2025-09-24 11:31:53,381 - INFO - Batch 62/95.
2025-09-24 11:31:54,035 - INFO - Batch 63/95.
2025-09-24 11:31:54,692 - INFO - Batch 64/95.
2025-09-24 11:31:55,358 - INFO - Batch 65/95.
2025-09-24 11:31:56,032 - INFO - Batch 66/95.
2025-09-24 11:31:56,698 - INFO - Batch 67/95.
2025-09-24 11:31:57,364 - INFO - Batch 68/95.
2025-09-24 11:31:58,028 - INFO - Batch 69/95.
2025-09-24 11:31:58,678 - INFO - Batch 70/95.
2025-09-24 11:31:59,360 - INFO - Batch 71/95.
2025-09-24 11:32:00,035 - INFO - Batch 72/95.
2025-09-24 11:32:00,710 - INFO - Batch 73/95.
2025-09-24 11:32:01,464 - INFO - Batch 74/95.
2025-09-24 11:32:02,132 - INFO - Batch 75/95.
2025-09-24 11:32:02,799 - INFO - Batch 76/95.
2025-09-24 11:32:03,452 - INFO - Batch 77/95.
2025-09-24 11:32:04,112 - INFO - Batch 78/95.
2025-09-24 11:32:04,775 - INFO - Batch 79/95.
2025-09-24 11:32:05,430 - INFO - Batch 80/95.
2025-09-24 11:32:06,089 - INFO - Batch 81/95.
2025-09-24 11:32:06,751 - INFO - Batch 82/95.
2025-09-24 11:32:07,423 - INFO - Batch 83/95.
2025-09-24 11:32:08,078 - INFO - Batch 84/95.
2025-09-24 11:32:08,731 - INFO - Batch 85/95.
2025-09-24 11:32:09,408 - INFO - Batch 86/95.
2025-09-24 11:32:10,059 - INFO - Batch 87/95.
2025-09-24 11:32:10,719 - INFO - Batch 88/95.
2025-09-24 11:32:11,371 - INFO - Batch 89/95.
2025-09-24 11:32:12,039 - INFO - Batch 90/95.
2025-09-24 11:32:12,692 - INFO - Batch 91/95.
2025-09-24 11:32:13,349 - INFO - Batch 92/95.
2025-09-24 11:32:14,010 - INFO - Batch 93/95.
2025-09-24 11:32:14,681 - INFO - Batch 94/95.
2025-09-24 11:32:15,338 - INFO - Batch 95/95.
2025-09-24 11:32:15,998 - INFO - Took 68.5 seconds
2025-09-24 11:32:16,012 - INFO - Generated embeddings with dimensions torch.Size([95, 1280])
2025-09-24 11:32:16,013 - INFO - Saving embedding as a pickled torch object.
2025-09-24 11:32:16,015 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 11:32:16,033 - INFO - Saved embedding at tutorial/AIRR_subject1_FNA_d0_1_Y1_esm2.pt
Custom/Fine-tuned models
To use a custom or fine-tuned model from Hugging Face or a local path, set --model custom and provide the --model-path, --embedding-dimension, and --max-length options:
[14]:
# Example: Using a fine-tuned ESM2 model from HuggingFace
! amulety embed --input-airr tutorial/AIRR_subject1_FNA_d0_1_Y1_translated.tsv --chain H --model custom \
--model-path "AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization" \
--embedding-dimension 320 \
--max-length 512 \
--batch-size 2 \
--output-file-path tutorial/custom_model_embeddings.pt
2025-09-24 12:30:43,593 - INFO - Detected single-cell data format
2025-09-24 12:30:43,595 - INFO - Processing both BCR and TCR sequences from the file.
2025-09-24 12:30:43,596 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:30:43,597 - INFO - Removed 102 sequences not matching H chain
Some weights of EsmForMaskedLM were not initialized from the model checkpoint at AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization and are newly initialized: ['lm_head.bias', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2025-09-24 12:30:46,532 - INFO - Model size: 7.84M
Batch 1/48
Batch 2/48
Batch 3/48
Batch 4/48
Batch 5/48
Batch 6/48
Batch 7/48
Batch 8/48
Batch 9/48
Batch 10/48
Batch 11/48
Batch 12/48
Batch 13/48
Batch 14/48
Batch 15/48
Batch 16/48
Batch 17/48
Batch 18/48
Batch 19/48
Batch 20/48
Batch 21/48
Batch 22/48
Batch 23/48
Batch 24/48
Batch 25/48
Batch 26/48
Batch 27/48
Batch 28/48
Batch 29/48
Batch 30/48
Batch 31/48
Batch 32/48
Batch 33/48
Batch 34/48
Batch 35/48
Batch 36/48
Batch 37/48
Batch 38/48
Batch 39/48
Batch 40/48
Batch 41/48
Batch 42/48
Batch 43/48
Batch 44/48
Batch 45/48
Batch 46/48
Batch 47/48
Batch 48/48
2025-09-24 12:30:51,159 - INFO - Took 4.63 seconds
2025-09-24 12:30:51,159 - INFO - Generated embeddings with dimensions torch.Size([95, 320])
2025-09-24 12:30:51,160 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:30:51,161 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:30:51,165 - INFO - Saved embedding at tutorial/custom_model_embeddings.pt
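According to the help message, --model-path, --embedding-dimension, and --max-length are all required when --model is custom. If you generate commands from a script, a small pre-flight check can catch a missing option before the model download starts. This is a minimal sketch under that assumption; missing_custom_options is a hypothetical helper and does not reflect how AMULETY validates its arguments.

```python
def missing_custom_options(model, model_path=None, embedding_dimension=None, max_length=None):
    """List the custom-model flags that still need a value (empty for other models)."""
    if model != "custom":
        return []
    supplied = {
        "--model-path": model_path,
        "--embedding-dimension": embedding_dimension,
        "--max-length": max_length,
    }
    return [flag for flag, value in supplied.items() if value is None]


# The values used in the cell above pass the check.
print(missing_custom_options(
    "custom",
    model_path="AmelieSchreiber/esm2_t6_8M_UR50D-finetuned-localization",
    embedding_dimension=320,
    max_length=512,
))  # []
```

In the run above, the embedding dimension of 320 passed on the command line matches the second dimension of the generated tensor, torch.Size([95, 320]).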
TCR embedding examples
AMULETY also supports TCR-specific models. We provide example TCR data that you can download to try them:
[7]:
# Download TCR example data
! wget -P tutorial https://zenodo.org/records/17186858/files/AIRR_tcr_sample.tsv
--2025-09-24 11:35:16-- https://zenodo.org/records/17186858/files/AIRR_tcr_sample.tsv
Resolving zenodo.org (zenodo.org)... 188.185.45.92, 188.185.48.194, 188.185.43.25, ...
Connecting to zenodo.org (zenodo.org)|188.185.45.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 40915 (40K) [application/octet-stream]
Saving to: 'tutorial/AIRR_tcr_sample.tsv'
AIRR_tcr_sample.tsv 100%[===================>] 39.96K 166KB/s in 0.2s
2025-09-24 11:35:17 (166 KB/s) - 'tutorial/AIRR_tcr_sample.tsv' saved [40915/40915]
TCR-BERT (TCR-specific model)
[15]:
# Embed TCR beta-alpha chain pairs using TCR-BERT
# Note: This assumes you have TCR data in AIRR format
! amulety embed --input-airr tutorial/AIRR_tcr_sample.tsv --chain HL --model tcr-bert --batch-size 2 --output-file-path tutorial/tcr_embeddings_tcrbert.pt
2025-09-24 12:31:02,594 - INFO - Detected single-cell data format
2025-09-24 12:31:02,595 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:31:02,599 - INFO - Dropping 100 cells with missing heavy or light chain...
2025-09-24 12:31:02,600 - INFO - Loading TCR-BERT model for TCR embedding...
2025-09-24 12:31:04,618 - INFO - Successfully loaded TCR-BERT model
2025-09-24 12:31:04,619 - INFO - TCR-BERT model loaded. Size: 57.39 M
2025-09-24 12:31:04,619 - INFO - TCR-BERT Batch 1/25.
2025-09-24 12:31:04,663 - INFO - TCR-BERT Batch 2/25.
2025-09-24 12:31:04,689 - INFO - TCR-BERT Batch 3/25.
2025-09-24 12:31:04,712 - INFO - TCR-BERT Batch 4/25.
2025-09-24 12:31:04,735 - INFO - TCR-BERT Batch 5/25.
2025-09-24 12:31:04,756 - INFO - TCR-BERT Batch 6/25.
2025-09-24 12:31:04,780 - INFO - TCR-BERT Batch 7/25.
2025-09-24 12:31:04,802 - INFO - TCR-BERT Batch 8/25.
2025-09-24 12:31:04,827 - INFO - TCR-BERT Batch 9/25.
2025-09-24 12:31:04,849 - INFO - TCR-BERT Batch 10/25.
2025-09-24 12:31:04,872 - INFO - TCR-BERT Batch 11/25.
2025-09-24 12:31:04,895 - INFO - TCR-BERT Batch 12/25.
2025-09-24 12:31:04,917 - INFO - TCR-BERT Batch 13/25.
2025-09-24 12:31:04,940 - INFO - TCR-BERT Batch 14/25.
2025-09-24 12:31:04,961 - INFO - TCR-BERT Batch 15/25.
2025-09-24 12:31:04,984 - INFO - TCR-BERT Batch 16/25.
2025-09-24 12:31:05,006 - INFO - TCR-BERT Batch 17/25.
2025-09-24 12:31:05,028 - INFO - TCR-BERT Batch 18/25.
2025-09-24 12:31:05,052 - INFO - TCR-BERT Batch 19/25.
2025-09-24 12:31:05,074 - INFO - TCR-BERT Batch 20/25.
2025-09-24 12:31:05,097 - INFO - TCR-BERT Batch 21/25.
2025-09-24 12:31:05,119 - INFO - TCR-BERT Batch 22/25.
2025-09-24 12:31:05,143 - INFO - TCR-BERT Batch 23/25.
2025-09-24 12:31:05,166 - INFO - TCR-BERT Batch 24/25.
2025-09-24 12:31:05,188 - INFO - TCR-BERT Batch 25/25.
2025-09-24 12:31:05,211 - INFO - TCR-BERT embedding took 0.59 seconds
2025-09-24 12:31:05,212 - INFO - Generated embeddings with dimensions torch.Size([50, 768])
2025-09-24 12:31:05,212 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:31:05,213 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:31:05,215 - INFO - Saved embedding at tutorial/tcr_embeddings_tcrbert.pt
TCRT5 (TCR beta chain only)
[16]:
# Embed TCR beta chains using TCRT5 (only supports H/beta chains)
! amulety embed --input-airr tutorial/AIRR_tcr_sample.tsv --chain H --model tcrt5 --batch-size 2 --output-file-path tutorial/tcr_embeddings_tcrt5.pt
AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
version 2.0
2025-09-24 12:31:11,221 - INFO - Detected single-cell data format
2025-09-24 12:31:11,221 - INFO - Single-cell AIRR data detected (all entries have cell_id).
2025-09-24 12:31:11,222 - INFO - Removed 100 sequences not matching H chain
2025-09-24 12:31:11,222 - INFO - Loading TCRT5 model for TCR embedding...
tokenizer_config.json: 21.1kB [00:00, 23.3MB/s]
spiece.model: 100%|██████████████████████████| 238k/238k [00:00<00:00, 2.78MB/s]
added_tokens.json: 2.35kB [00:00, 16.2MB/s]
special_tokens_map.json: 2.64kB [00:00, 12.0MB/s]
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'TCRT5Tokenizer'.
The class this function is called from is 'T5Tokenizer'.
config.json: 100%|█████████████████████████████| 970/970 [00:00<00:00, 8.67MB/s]
model.safetensors: 100%|█████████████████████| 168M/168M [00:03<00:00, 46.0MB/s]
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_attentions` is. When `return_dict_in_generate` is not `True`, `output_attentions` is ignored.
warnings.warn(
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_hidden_states` is. When `return_dict_in_generate` is not `True`, `output_hidden_states` is ignored.
warnings.warn(
/opt/anaconda3/envs/torchen/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:817: UserWarning: `return_dict_in_generate` is NOT set to `True`, but `output_scores` is. When `return_dict_in_generate` is not `True`, `output_scores` is ignored.
warnings.warn(
generation_config.json: 100%|██████████████████| 249/249 [00:00<00:00, 3.32MB/s]
2025-09-24 12:31:17,681 - INFO - TCRT5 Batch 1/50.
2025-09-24 12:31:17,707 - INFO - TCRT5 Batch 2/50.
2025-09-24 12:31:17,719 - INFO - TCRT5 Batch 3/50.
2025-09-24 12:31:17,730 - INFO - TCRT5 Batch 4/50.
2025-09-24 12:31:17,740 - INFO - TCRT5 Batch 5/50.
2025-09-24 12:31:17,753 - INFO - TCRT5 Batch 6/50.
2025-09-24 12:31:17,763 - INFO - TCRT5 Batch 7/50.
2025-09-24 12:31:17,773 - INFO - TCRT5 Batch 8/50.
2025-09-24 12:31:17,784 - INFO - TCRT5 Batch 9/50.
2025-09-24 12:31:17,795 - INFO - TCRT5 Batch 10/50.
2025-09-24 12:31:17,806 - INFO - TCRT5 Batch 11/50.
2025-09-24 12:31:17,817 - INFO - TCRT5 Batch 12/50.
2025-09-24 12:31:17,827 - INFO - TCRT5 Batch 13/50.
2025-09-24 12:31:17,837 - INFO - TCRT5 Batch 14/50.
2025-09-24 12:31:17,848 - INFO - TCRT5 Batch 15/50.
2025-09-24 12:31:17,860 - INFO - TCRT5 Batch 16/50.
2025-09-24 12:31:17,871 - INFO - TCRT5 Batch 17/50.
2025-09-24 12:31:17,882 - INFO - TCRT5 Batch 18/50.
2025-09-24 12:31:17,893 - INFO - TCRT5 Batch 19/50.
2025-09-24 12:31:17,904 - INFO - TCRT5 Batch 20/50.
2025-09-24 12:31:17,914 - INFO - TCRT5 Batch 21/50.
2025-09-24 12:31:17,924 - INFO - TCRT5 Batch 22/50.
2025-09-24 12:31:17,934 - INFO - TCRT5 Batch 23/50.
2025-09-24 12:31:17,944 - INFO - TCRT5 Batch 24/50.
2025-09-24 12:31:17,955 - INFO - TCRT5 Batch 25/50.
2025-09-24 12:31:17,966 - INFO - TCRT5 Batch 26/50.
2025-09-24 12:31:17,976 - INFO - TCRT5 Batch 27/50.
2025-09-24 12:31:17,987 - INFO - TCRT5 Batch 28/50.
2025-09-24 12:31:17,997 - INFO - TCRT5 Batch 29/50.
2025-09-24 12:31:18,008 - INFO - TCRT5 Batch 30/50.
2025-09-24 12:31:18,018 - INFO - TCRT5 Batch 31/50.
2025-09-24 12:31:18,028 - INFO - TCRT5 Batch 32/50.
2025-09-24 12:31:18,038 - INFO - TCRT5 Batch 33/50.
2025-09-24 12:31:18,049 - INFO - TCRT5 Batch 34/50.
2025-09-24 12:31:18,060 - INFO - TCRT5 Batch 35/50.
2025-09-24 12:31:18,070 - INFO - TCRT5 Batch 36/50.
2025-09-24 12:31:18,080 - INFO - TCRT5 Batch 37/50.
2025-09-24 12:31:18,090 - INFO - TCRT5 Batch 38/50.
2025-09-24 12:31:18,100 - INFO - TCRT5 Batch 39/50.
2025-09-24 12:31:18,111 - INFO - TCRT5 Batch 40/50.
2025-09-24 12:31:18,121 - INFO - TCRT5 Batch 41/50.
2025-09-24 12:31:18,131 - INFO - TCRT5 Batch 42/50.
2025-09-24 12:31:18,141 - INFO - TCRT5 Batch 43/50.
2025-09-24 12:31:18,150 - INFO - TCRT5 Batch 44/50.
2025-09-24 12:31:18,160 - INFO - TCRT5 Batch 45/50.
2025-09-24 12:31:18,171 - INFO - TCRT5 Batch 46/50.
2025-09-24 12:31:18,181 - INFO - TCRT5 Batch 47/50.
2025-09-24 12:31:18,191 - INFO - TCRT5 Batch 48/50.
2025-09-24 12:31:18,201 - INFO - TCRT5 Batch 49/50.
2025-09-24 12:31:18,212 - INFO - TCRT5 Batch 50/50.
2025-09-24 12:31:18,224 - INFO - TCRT5 embedding took 7.0 seconds
2025-09-24 12:31:18,225 - INFO - Generated embeddings with dimensions torch.Size([100, 256])
2025-09-24 12:31:18,226 - INFO - Saving embedding as a pickled torch object.
2025-09-24 12:31:18,226 - INFO - Saving sequence filtered metadata as TSV file.
2025-09-24 12:31:18,228 - INFO - Saved embedding at tutorial/tcr_embeddings_tcrt5.pt
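The batch counter in the TCRT5 log follows directly from the number of retained sequences and the --batch-size value: 100 beta-chain sequences at a batch size of 2 give 50 batches. The snippet below is a minimal sketch of that arithmetic. The helper name expected_batches is hypothetical and not part of AMULETY, and it assumes the final batch may be only partially filled.

```python
import math


def expected_batches(n_sequences: int, batch_size: int) -> int:
    """Return how many batches are needed to cover n_sequences.

    Hypothetical helper for illustration; assumes the last batch
    may contain fewer than batch_size sequences.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return math.ceil(n_sequences / batch_size)


# TCRT5 run above: 100 embedded sequences with --batch-size 2
print(expected_batches(100, 2))
```

A larger --batch-size lowers the number of batches at the cost of more memory per batch, so the value is usually chosen to fit the available hardware.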
Checking dependencies
Some embedding models require optional dependencies that are not installed by default. To check which of them are missing, run:
[1]:
# Check which optional dependencies are missing
! amulety check-deps
AMULETY: Adaptive imMUne receptor Language model Embedding tool for TCR and
antibodY
version 2.0
Checking AMULETY dependencies...
IgBlast (for translate-igblast command):
IgBlast (igblastn) is available
Embedding model dependencies:
2025-09-24 12:51:20,234 - INFO - Available models: AntiBERTy, AbLang, TCR-BERT, TCRT5, ESM2, ProtT5
2025-09-24 12:51:20,234 - WARNING - Missing model dependencies: Immune2Vec
1 dependencies are missing.
AMULETY will raise ImportError with installation instructions when these models are used.
To install missing dependencies:
• Immune2Vec: git clone https://bitbucket.org/yaarilab/immune2vec_model.git && add to Python path
Note: Models will provide detailed installation instructions when used.
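A similar check can be approximated from Python, for example at the top of a notebook before any embedding cell runs. The sketch below is an assumption-laden illustration, not AMULETY's own implementation: the helper names is_importable and is_on_path are hypothetical, and the module names torch and transformers are inferred from the log output in this tutorial rather than from a documented dependency list. The igblastn executable name is the one reported by check-deps.

```python
import importlib.util
import shutil


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def is_on_path(executable: str) -> bool:
    """Return True if the executable can be found on the system PATH."""
    return shutil.which(executable) is not None


# Assumed module names, chosen for illustration only
for module in ("torch", "transformers"):
    status = "found" if is_importable(module) else "missing"
    print(f"{module}: {status}")

# igblastn is the IgBlast executable named in the check-deps output
print(f"igblastn: {'found' if is_on_path('igblastn') else 'missing'}")
```

This only reports whether a module or executable can be located; amulety check-deps remains the authoritative check, since it knows which models need which packages.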