Paper – LLaMA

LLaMA is a collection of foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens using exclusively publicly available datasets.

Training Data

English CommonCrawl [67%]. Five CommonCrawl dumps, spanning 2017 to 2020, are preprocessed with the CCNet pipeline. This pipeline deduplicates the data at the line level, performs language identification with a fastText linear classifier to remove non-English pages, and filters low-quality content with an n-gram language model.
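To make the preprocessing steps concrete, here is a minimal sketch of what line-level deduplication followed by fastText language identification can look like. This is not the actual CCNet implementation: the normalization, hash choice, probability threshold, and the local model path `lid.176.bin` are all assumptions made for illustration.

```python
# Sketch of CCNet-style preprocessing: line-level dedup + language ID.
# Not the real CCNet code; thresholds and paths are illustrative assumptions.
import hashlib

import fasttext  # pip install fasttext; lid.176.bin is fastText's language-ID model

LID_MODEL = fasttext.load_model("lid.176.bin")  # assumed local path


def normalize(line: str) -> str:
    """Light normalization before hashing so near-identical lines collide."""
    return " ".join(line.lower().split())


def dedup_lines(lines, seen_hashes):
    """Drop lines whose normalized hash was already seen across documents."""
    for line in lines:
        h = hashlib.sha1(normalize(line).encode("utf-8")).digest()
        if h not in seen_hashes:
            seen_hashes.add(h)
            yield line


def is_english(text: str, threshold: float = 0.5) -> bool:
    """Keep a document only if fastText predicts English above a threshold."""
    labels, probs = LID_MODEL.predict(text.replace("\n", " "))  # predict() rejects newlines
    return labels[0] == "__label__en" and probs[0] >= threshold


def preprocess(documents):
    """documents: iterable of raw page texts -> cleaned English documents."""
    seen = set()
    for doc in documents:
        kept = list(dedup_lines(doc.splitlines(), seen))
        if not kept:
            continue
        cleaned = "\n".join(kept)
        if is_english(cleaned):
            yield cleaned
```

The quality-filtering step with an n-gram language model is omitted here; the sketch only covers the deduplication and language-identification stages described above.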