Osma Suominen: "@hisham If there are any "frag…"

@hisham
If there are any "fragments of your work" left in the output, they are very few and far between.

Take the recent Llama 3 8B model from Meta. It was trained on 15T tokens, around 100 terabytes (10^14 bytes) of text, including some written by you and me. The trained model can be downloaded as a set of files totalling around 16 GB (1.6 * 10^10 bytes). There's no way all that text can be compressed by nearly four orders of magnitude (a factor of roughly 6,000) while retaining the original works within.
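As a rough check of that arithmetic, here is a minimal Python sketch. It uses only the round figures quoted in this post (about 10^14 bytes of training text, about 1.6 * 10^10 bytes of model files); the variable names are illustrative, and both byte counts are estimates rather than measured values.

```python
import math

# Round figures quoted in the post; both are estimates, not measurements.
corpus_bytes = 1e14    # ~100 TB of training text
model_bytes = 1.6e10   # ~16 GB of downloadable model files

# How many bytes of training text correspond to each byte of model file.
ratio = corpus_bytes / model_bytes
orders = math.log10(ratio)

print(f"{ratio:,.0f}x, about {orders:.1f} orders of magnitude")
# prints: 6,250x, about 3.8 orders of magnitude
```

So under these assumptions the model files are roughly 6,000 times smaller than the text they were trained on.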

@mcpinson @mcc @WomanCorn

May 10, 2024, 05:49 AM · Fedilab

0 boosts · 2 favorites
