Date: 18.08.2025
TensorFlow-friendly causal LMs in 🤗 Transformers 4.x
This table lists decoder-only (causal LM) model families that have TensorFlow classes in Transformers 4.x and publish TensorFlow weights (`tf_model.h5`) on the Hugging Face Hub. If `tf_model.h5` is present for a checkpoint, you can load it with `TFAutoModelForCausalLM.from_pretrained(...)` without converting from PyTorch.
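As a minimal sketch (using `openai-community/gpt2` from the table below; any checkpoint that ships `tf_model.h5` loads the same way):

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

# Loads tf_model.h5 straight from the Hub; no PyTorch conversion involved.
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = TFAutoModelForCausalLM.from_pretrained("openai-community/gpt2")

inputs = tokenizer("TensorFlow causal LMs are", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```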
Model family | Hugging Face repo | TF weights present? (Conversion needed?) |
---|---|---|
OpenAI GPT (GPT‑1) | openai-community/openai-gpt | Yes — tf_model.h5 available → No conversion needed |
GPT‑2 (family) | openai-community/gpt2 | Yes — tf_model.h5 available → No conversion needed |
DistilGPT‑2 | distilbert/distilgpt2 | Yes — tf_model.h5 available → No conversion needed |
DialoGPT (GPT‑2 based) | microsoft/DialoGPT-medium | Yes — tf_model.h5 available → No conversion needed |
CTRL | Salesforce/ctrl | Yes — tf_model.h5 available → No conversion needed |
Transformer‑XL | transfo-xl/transfo-xl-wt103 | Yes — tf_model.h5 available → No conversion needed |
XLNet | xlnet/xlnet-base-cased | Yes — tf_model.h5 available → No conversion needed |
XLM (CLM variant) | FacebookAI/xlm-clm-ende-1024 | Yes — tf_model.h5 available → No conversion needed |
OPT (small) | facebook/opt-125m | Yes — tf_model.h5 available → No conversion needed |
OPT (2.7B) | facebook/opt-2.7b | Yes — tf_model.h5 available → No conversion needed |
GPT‑J‑6B | EleutherAI/gpt-j-6b | Yes — tf_model.h5 available → No conversion needed |
Notes
- Transformers v4.x only. TensorFlow classes (e.g., `TFAutoModelForCausalLM`, `TFGPT2LMHeadModel`) are part of the 4.x line. In Transformers v5, TF support has been removed. Pin `transformers<5` for TF usage.
- Per‑checkpoint variation. Some orgs publish TF weights for certain sizes but not others. Always check the target checkpoint’s Files & versions tab (or query it programmatically; see the first sketch after these notes).
- GPT‑2 `medium`/`large`/`xl` also provide TF weights (`tf_model.h5`).
- If there is no `tf_model.h5` but the family has a TF class: you can often load from PyTorch with `from_pt=True` and then `save_pretrained(...)` to create a local `tf_model.h5` (see the conversion sketch after these notes).
- Modern LLMs (e.g., LLaMA, Mistral, Mixtral, Gemma) generally do not ship TF classes in Transformers; they are PyTorch‑only in the library.
- Keras 3 compatibility: with TF ≥ 2.16, install `tf-keras~=2.16` and set `TF_USE_LEGACY_KERAS=1` before importing TensorFlow so that `transformers` 4.x works smoothly (see the environment sketch after these notes).
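To check a checkpoint programmatically rather than via the Files & versions tab, a small sketch assuming the `huggingface_hub` client is installed:

```python
from huggingface_hub import list_repo_files

def has_tf_weights(repo_id: str) -> bool:
    """Return True if the checkpoint publishes tf_model.h5 on the Hub."""
    return "tf_model.h5" in list_repo_files(repo_id)

print(has_tf_weights("openai-community/gpt2"))  # True, per the table above
```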
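For PyTorch‑only checkpoints whose architecture has a TF class, a one‑time conversion sketch; the repo id `your-org/pytorch-only-gpt2` is a hypothetical placeholder:

```python
from transformers import TFAutoModelForCausalLM

# from_pt=True converts PyTorch weights on the fly (requires torch to be installed).
# "your-org/pytorch-only-gpt2" is a placeholder, not a real Hub repo.
model = TFAutoModelForCausalLM.from_pretrained("your-org/pytorch-only-gpt2", from_pt=True)

# Persist the converted weights so future loads read a local tf_model.h5.
model.save_pretrained("./local-tf-checkpoint")
```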
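And a sketch of the Keras 3 workaround; the key point is that the environment variable must be set before TensorFlow is first imported:

```python
import os

# Must run before TensorFlow (or transformers) is imported anywhere in the process.
# Also pin the packages, e.g.: pip install "transformers<5" "tf-keras~=2.16"
os.environ["TF_USE_LEGACY_KERAS"] = "1"

import tensorflow as tf
from transformers import TFAutoModelForCausalLM  # TF classes now build against legacy Keras

print(tf.keras.__version__)  # should report legacy Keras 2.16.x, not Keras 3
```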