What are Text Tokens?

What are tokens?

We use OpenAI's large language models (LLMs) to analyze your data. These models process text by breaking it down into tokens, which are sequences of characters that commonly occur together in text. A token can be a whole word, part of a word, or a punctuation mark.

Are words and tokens the same thing?

No. The number of tokens per word varies with factors such as the language of the text and how long or common the words are. As a general rule of thumb for English text, 100 tokens correspond to roughly 75 words.

What is the maximum word count?

There is a limit of 8,000 tokens per file for analysis. Because the relationship between tokens and words is not fixed, we cannot provide an exact word count limit. In most cases, however, 8,000 tokens equates to approximately 6,000 words. For the same reason, any word count reported as being over the limit is an approximation.
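
If you would like a rough check on your own machine, the rule of thumb above can be turned into a short calculation. The Python sketch below is only an illustration: it assumes the 100-tokens-to-75-words ratio holds for your text, and the function names are made up for this example and are not part of MyRA. It counts whitespace-separated words rather than real tokens, so treat the result as an estimate.

```python
TOKEN_LIMIT = 8_000  # per-file limit stated above


def estimate_tokens(text: str) -> int:
    """Estimate tokens from the word count (assumes 100 tokens ~ 75 words)."""
    word_count = len(text.split())
    # Integer ceiling of word_count * 100 / 75, so we never underestimate.
    return (word_count * 100 + 74) // 75


def is_probably_within_limit(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if the estimated token count does not exceed the limit."""
    return estimate_tokens(text) <= limit


sample = "word " * 6_000                  # a made-up 6,000-word document
print(estimate_tokens(sample))            # 8000
print(is_probably_within_limit(sample))   # True
```

Text with many long, rare, or non-English words usually needs more tokens per word than this estimate suggests, so a file close to 6,000 words may still exceed the limit.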

Can I check my files before uploading them to MyRA?

Yes, you can find more information about tokens and use a tokenizer tool here.