Search Tokens
How to implement and use search tokens for efficient text search in MongoDB
Search tokens are a powerful mechanism in Orionjs that enables efficient, flexible text search in MongoDB without the overhead of full-text search. They work by preprocessing text fields into normalized tokens that can be indexed and queried efficiently.
Why Use Search Tokens?
- Simplicity: No need to create complex regex queries or text indexes
- Performance: Significantly faster than regex or text queries
- Flexibility: Combine text search with category filtering
- Normalized Search: Case-insensitive and accent-insensitive matching
- Prefix Matching: Find results that start with search terms
Implementation
1. Add Search Tokens Field to Your Schema
First, add a `searchTokens` field to your schema:
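A minimal sketch using Orionjs' plain-object schema style; the contact fields other than `searchTokens` are illustrative, and the exact schema DSL may vary by Orionjs version:

```ts
const ContactSchema = {
  _id: {type: 'ID'},
  name: {type: 'string'},
  email: {type: 'string'},
  category: {type: 'string'},
  // Normalized tokens generated from the fields above.
  // Optional so documents can exist before tokens are computed.
  searchTokens: {type: ['string'], optional: true}
}
```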
2. Create an Index on Search Tokens
Add an index on the `searchTokens` field in your repository:
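Assuming a collection created with `createCollection` from `@orion-js/mongodb`, the index can be declared alongside it (adapt this if your repository registers indexes differently, or if your Orionjs version uses different options):

```ts
import {createCollection} from '@orion-js/mongodb'

const Contacts = createCollection({
  name: 'contacts',
  schema: ContactSchema,
  indexes: [
    // Indexing searchTokens keeps token queries ($in / $all) fast
    {keys: {searchTokens: 1}, options: {}}
  ]
})
```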
3. Implement a Method to Generate Search Tokens
Add a method to generate search tokens from relevant fields:
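A sketch of a token generator. `getSearchTokens` and `shortenMongoId` are named on this page, but their import path and exact signatures (text values first, filter fields second) are assumptions here:

```ts
import {getSearchTokens, shortenMongoId} from '@orion-js/mongodb'

interface ContactDoc {
  _id: string
  name: string
  email: string
  category: string
  searchTokens?: string[]
}

function generateSearchTokens(contact: ContactDoc): string[] {
  return getSearchTokens(
    // Text fields to tokenize, plus a readable portion of the ID
    [contact.name, contact.email, shortenMongoId(contact._id)],
    // Fields used for category filtering (the shape of this
    // argument is an assumption)
    {category: contact.category}
  )
}
```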
4. Update Search Tokens When Creating or Updating Documents
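Whenever a document is inserted, or one of the tokenized fields changes, regenerate and store the tokens. A sketch using the collection from the previous steps and lodash's `isEqual` to skip redundant writes:

```ts
import isEqual from 'lodash/isEqual'

async function updateSearchTokens(contactId: string) {
  const contact = await Contacts.findOne(contactId)
  if (!contact) return

  const searchTokens = generateSearchTokens(contact)

  // Deep-compare so unchanged tokens don't trigger a write
  if (isEqual(searchTokens, contact.searchTokens)) return

  await Contacts.updateOne(contactId, {$set: {searchTokens}})
}
```

Call `updateSearchTokens` after every insert or update that touches `name`, `email`, or `category`.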
5. Query Using Search Tokens
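To search, convert the user's input into a token query with `getSearchQueryForTokens`. The import path and the optional second argument for category filtering are assumptions:

```ts
import {getSearchQueryForTokens} from '@orion-js/mongodb'

async function searchContacts(filter: string, category?: string) {
  // Produces a query over searchTokens, e.g. {searchTokens: {$all: [...]}}
  const query = getSearchQueryForTokens(filter, category ? {category} : undefined)
  return await Contacts.find(query).toArray()
}
```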
How Search Tokens Work
- Text Tokenization: Text fields are split into tokens, converted to lowercase, and normalized
- Prefix Generation: Additional tokens are created for prefixes to enable prefix searching
- Category Markers: Category fields are converted to tokens with prefixes to enable category filtering
- Query Building: The `getSearchQueryForTokens` function converts search terms into MongoDB queries (see the sketch after this list)
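For intuition, here is roughly what the generated tokens might look like. The exact token and prefix format is internal to the library; these values are hypothetical:

```ts
getSearchTokens(['José Pérez'], {category: 'sales'})
// Hypothetical output: lowercased, accent-stripped, prefix-expanded
// text tokens plus a prefixed marker token for the category field:
// ['j', 'jo', 'jos', 'jose', 'p', 'pe', 'per', 'pere', 'perez', 'category:sales']
```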
Best Practices
- Include Important Text Fields: Add all searchable text fields to the tokens
- Short MongoDB IDs: Use `shortenMongoId` to include readable portions of IDs
- Category Fields: Include fields used for filtering in the second argument of `getSearchTokens`
- Ensure Tokens are Updated: Always update search tokens when document fields change
- Checking Token Equality: Use a deep comparison like `isEqual` to avoid unnecessary updates
- Error Handling: Implement proper error handling for token updates
- Background Updates: Update tokens in the background to avoid blocking user operations (see the sketch below)
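The last three practices combine naturally: run the step 4 token update in the background and log failures rather than letting them block the caller. A minimal sketch:

```ts
function refreshSearchTokensInBackground(contactId: string) {
  // Fire-and-forget: the caller isn't blocked, and failures are
  // logged instead of thrown into the user-facing operation
  updateSearchTokens(contactId).catch(error => {
    console.error(`Failed to update search tokens for ${contactId}`, error)
  })
}
```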
Complete Example
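The original example for this section is not reproduced here; the following sketch assembles the pieces above under the same assumptions (helper import paths and signatures, illustrative field and collection names):

```ts
import {
  createCollection,
  getSearchTokens,
  getSearchQueryForTokens,
  shortenMongoId
} from '@orion-js/mongodb'
import isEqual from 'lodash/isEqual'

interface ContactDoc {
  _id: string
  name: string
  email: string
  category: string
  searchTokens?: string[]
}

const Contacts = createCollection({
  name: 'contacts',
  schema: {
    _id: {type: 'ID'},
    name: {type: 'string'},
    email: {type: 'string'},
    category: {type: 'string'},
    searchTokens: {type: ['string'], optional: true}
  },
  indexes: [{keys: {searchTokens: 1}, options: {}}]
})

function generateSearchTokens(contact: ContactDoc): string[] {
  return getSearchTokens(
    [contact.name, contact.email, shortenMongoId(contact._id)],
    {category: contact.category} // shape of this argument is assumed
  )
}

async function updateSearchTokens(contactId: string) {
  const contact = await Contacts.findOne(contactId)
  if (!contact) return
  const searchTokens = generateSearchTokens(contact)
  if (isEqual(searchTokens, contact.searchTokens)) return // skip no-op writes
  await Contacts.updateOne(contactId, {$set: {searchTokens}})
}

async function createContact(data: Omit<ContactDoc, '_id' | 'searchTokens'>) {
  // insertOne is assumed to return the new document's _id
  const contactId = await Contacts.insertOne(data)
  await updateSearchTokens(contactId)
  return contactId
}

async function searchContacts(filter: string, category?: string) {
  const query = getSearchQueryForTokens(filter, category ? {category} : undefined)
  return await Contacts.find(query).toArray()
}
```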
Performance Considerations
- Keep the number of tokens reasonable (< 100 per document)
- Consider sharding for very large collections
- For extremely complex search needs, consider using a dedicated search engine