We provide a range of models on our Hugging Face profile:
Models based on the GPT-2 architecture (a loading sketch follows this list):
- the first version of the model in the small architecture (we recommend its successor, version two): radlab/polish-gpt2-small
- the second version of the model in the small architecture: radlab/polish-gpt2-small-v2
- the second version of the model in the medium architecture (the first version is no longer publicly available due to low accuracy): radlab/polish-gpt2-medium-v2
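A minimal sketch of loading one of these GPT-2 models with the Hugging Face transformers text-generation pipeline; the prompt and generation settings below are illustrative assumptions, not tuned recommendations.

```python
from transformers import pipeline

# Load the recommended second small GPT-2 model for Polish text generation.
generator = pipeline("text-generation", model="radlab/polish-gpt2-small-v2")

# Example Polish prompt; max_new_tokens is an arbitrary illustrative value.
print(generator("Sztuczna inteligencja to", max_new_tokens=40)[0]["generated_text"])
```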
Classification and extraction models (a usage sketch follows this list):
- a model for extracting answers to questions from any text: radlab/polish-qa-v2
- a model that detects the polarity of information in news articles, running in our playground and available on Hugging Face: radlab/polarity-3c
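A sketch of using both models through standard transformers pipelines. It assumes radlab/polish-qa-v2 is an extractive question-answering model and radlab/polarity-3c a sequence classifier; both assumptions follow from the descriptions above, and the example texts are illustrative.

```python
from transformers import pipeline

# Extractive QA: pull the answer span for a question from a given context.
qa = pipeline("question-answering", model="radlab/polish-qa-v2")
print(qa(question="Kto napisał 'Pana Tadeusza'?",
         context="Pan Tadeusz został napisany przez Adama Mickiewicza w latach 1832-1834."))

# Polarity detection on a news snippet (three classes, per the model name).
polarity = pipeline("text-classification", model="radlab/polarity-3c")
print(polarity("Firma ogłosiła rekordowe zyski w ostatnim kwartale."))
```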
Models based on the T5 architecture (a usage sketch follows this list):
- a model in the t5-base architecture for text cleaning: radlab/polish-denoiser-t5-base
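A sketch of running the denoiser as a text-to-text pipeline; whether the model expects a task prefix or any special input formatting is not confirmed here, and the noisy sample text is made up for illustration.

```python
from transformers import pipeline

# T5-based text cleaning: feed noisy text in, read the cleaned text out.
denoiser = pipeline("text2text-generation", model="radlab/polish-denoiser-t5-base")
noisy = "To   jest  pżykladowy teskt z   błędami ."
print(denoiser(noisy, max_new_tokens=64)[0]["generated_text"])
```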
Encoder models (a retrieve-and-rerank sketch follows this list):
- a bi-encoder for texts written in Polish (we recommend the newer version of this model, described below): radlab/polish-sts-v2
- a newer version of the bi-encoder with a mean pooling layer, which achieved much higher correlation during training: radlab/polish-bi-encoder-mean
- a cross-encoder for re-ranking: radlab/polish-cross-encoder
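A sketch of a retrieve-and-rerank setup combining the bi-encoder and the cross-encoder. It assumes both models are compatible with the sentence-transformers library; the query and passages are illustrative.

```python
from sentence_transformers import SentenceTransformer, CrossEncoder, util

# Bi-encoder: embed the query and candidate passages, rank by cosine similarity.
bi_encoder = SentenceTransformer("radlab/polish-bi-encoder-mean")
query = "Jakie modele udostępnia radlab?"
passages = [
    "Udostępniamy modele GPT-2 oraz pLLama dla języka polskiego.",
    "Dziś w Warszawie pada deszcz.",
]
scores = util.cos_sim(bi_encoder.encode(query), bi_encoder.encode(passages))[0]
print(scores)

# Cross-encoder: re-score the query-passage pairs jointly for finer ranking.
cross_encoder = CrossEncoder("radlab/polish-cross-encoder")
print(cross_encoder.predict([(query, p) for p in passages]))
```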
GenAI models (a chat usage sketch follows this list):
- radlab/pLLama3-8B-creator – a model that provides fairly short, specific answers to user queries.
- radlab/pLLama3-8B-chat – a chattier version that reflects the behavior of the original meta-llama/Meta-Llama-3-8B-Instruct.
- radlab/pLLama3-70B – probably the largest Polish-language model to date!
- radlab/pLLama3.1-8B-content – a model that, after SFT and DPO, provides short and concise answers.
- radlab/pLLama3.1-8B-chat – a more talkative version of the model (after SFT and DPO), ideal for chatting.
- radlab/pLLama3.1-8B-base-ft-16bit – the pLLama3.1-8B model directly after SFT with LoRA.
- radlab/pLLama-L31-adapters-MIX-SFT-DPO – an experimental model with transfer of adapter layers between models.
- radlab/pLLama3.2-1B – the pLLama3.2 model in the 1B architecture, after fine-tuning only.
- radlab/pLLama3.2-1B-DPO – the pLLama3.2 model in the 1B architecture, after fine-tuning and DPO.
- radlab/pLLama3.2-3B – the pLLama3.2 model in the 3B architecture, after fine-tuning only.
- radlab/pLLama3.2-3B-DPO – the pLLama3.2 model in the 3B architecture, after fine-tuning and DPO.
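A sketch of chatting with one of the pLLama models via the transformers chat-template API. The model ID comes from the list above, but the system prompt, sampling settings, dtype, and device placement are illustrative assumptions, not recommended settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "radlab/pLLama3-8B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-formatted prompt and generate a reply.
messages = [
    {"role": "system", "content": "Jesteś pomocnym asystentem."},
    {"role": "user", "content": "Opowiedz krótko o polskich modelach językowych."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```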
Other models derived from the training process:
- a fast tokenizer trained on a large volume of data (approx. 30 GB of Polish text): radlab/polish-fast-tokenizer (loading sketch below)
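A sketch of loading the fast tokenizer; it assumes the repository ships a standard tokenizer configuration loadable with AutoTokenizer.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("radlab/polish-fast-tokenizer")
print(tokenizer.tokenize("Przykładowe zdanie w języku polskim."))
```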
Based on word2vec vector models, we have developed a semantic similarity list.