Character n-gram tokenizers and filters.
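
As a rough illustration of what a character n-gram is, the sketch below emits every substring whose length falls within a given range. The names `char_ngrams`, `ngram_filter`, `min_gram`, and `max_gram` are hypothetical and chosen for this example only; they are not taken from any particular library. The sketch also assumes the usual split between the two roles: a tokenizer works on the raw input text, while a filter works on tokens already produced by an earlier stage.

```python
from typing import Iterable, List


def char_ngrams(text: str, min_gram: int = 1, max_gram: int = 2) -> List[str]:
    """Tokenizer-style sketch: all character n-grams of `text`.

    Emits substrings with lengths in [min_gram, max_gram], ordered by
    start position and then by length. Parameter names are illustrative.
    """
    if min_gram < 1 or min_gram > max_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    grams: List[str] = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # longer grams would run past the end of the text
            grams.append(text[start:end])
    return grams


def ngram_filter(tokens: Iterable[str], min_gram: int = 1, max_gram: int = 2) -> List[str]:
    """Filter-style sketch: apply char_ngrams to each incoming token."""
    out: List[str] = []
    for token in tokens:
        out.extend(char_ngrams(token, min_gram, max_gram))
    return out


if __name__ == "__main__":
    print(char_ngrams("abc", 1, 2))
    print(ngram_filter(["ab", "cd"], 2, 2))
```

In the filter form the n-grams never span a token boundary, because each token is processed independently. That is the main practical difference from running the tokenizer form over the whole input.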