IBM Unveils Full 6.48 TB LLM Training Dataset
IBM recently announced the release of an open-source language model, the Granite 13B LLM, designed for enterprise applications. Armand Ruiz, IBM’s vice president of AI platform products, has now shared details of the extensive 6.48 TB dataset used to train Granite 13B. After thorough preprocessing, the dataset was reduced to 2.07 TB, representing a …