Text resources

Resources available on the Hugging Face platform:

- A dataset for training models used in information retrieval and embedding building: radlab/polish-sts-dataset
- Data for pre-training or fine-tuning models on predominantly legal language, available as JSONL: radlab/legal-mc4-pl
- Data for training models, analogous to legal-mc4-pl, this time drawn from Polish Wikipedia: radlab/wikipedia-pl
- The kgr10 corpus from Wrocław University of Technology, available in JSONL text format, for model pre-training or fine-tuning: radlab/kgr10
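
Since some of these resources are described as JSONL files, the sketch below shows one way to read such a file once it has been downloaded. It is only an illustration: the field name `text` and the helper names `iter_jsonl` and `collect_texts` are assumptions, not something the dataset cards confirm, so check the actual record keys before relying on them.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Iterator, Optional, Union


def iter_jsonl(path: Union[str, Path]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def collect_texts(
    path: Union[str, Path],
    field: str = "text",  # assumed field name; the real key may differ
    limit: Optional[int] = None,
) -> list[str]:
    """Collect string values of `field`, skipping records that lack it."""
    texts: list[str] = []
    for record in iter_jsonl(path):
        value = record.get(field) if isinstance(record, dict) else None
        if isinstance(value, str):
            texts.append(value)
        if limit is not None and len(texts) >= limit:
            break
    return texts
```

With a locally saved file (the name `kgr10_sample.jsonl` is hypothetical), `collect_texts("kgr10_sample.jsonl", limit=5)` would return up to five text strings. Reading line by line keeps memory use low, which matters for pre-training corpora. The Hugging Face `datasets` library and its `load_dataset` function are the usual way to fetch a repository by its identifier instead of handling files by hand.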

Copyright © 2026 RadLab – OnePress theme by FameThemes