Vector search has revolutionized the way we approach information retrieval, matching queries and documents by meaning rather than by exact keywords. However, implementing vector search from scratch can be a daunting task, especially for those new to the technology. Enter the
OpenSearch Neural Plugin – a game-changing tool that simplifies vector search implementation, making it accessible even to newcomers in the field.
The best part? This powerful plugin comes pre-packaged with vanilla OpenSearch, eliminating the need for complex installations or configurations.
In this three-part guide, we’ll walk you through the process of setting up vector and hybrid search using OpenSearch and its Neural Plugin, demonstrating how easy it can be to harness the power of advanced search technologies. Whether you’re a seasoned developer or just starting your journey with vector search, this tutorial will equip you with the knowledge to elevate your search capabilities to new heights.
Enable Cluster Settings
To begin our journey into advanced search capabilities, we must first configure our OpenSearch cluster to accommodate machine learning tasks. This crucial step involves modifying the cluster settings to optimize performance and ensure compatibility with the Neural Plugin. Execute the following command to adjust the cluster settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99"
      }
    }
  }
}
This configuration serves several important purposes:
It allows machine learning tasks to run on any node in the cluster, not just dedicated ML nodes, increasing flexibility and resource utilization. In production, you might want to allow machine learning tasks on certain nodes only.
It enables access control for machine learning models, enhancing security and governance.
It raises the native memory circuit-breaker threshold to 99%, so machine learning tasks are not rejected due to memory pressure while we experiment. In production you would likely keep a more conservative limit.
By implementing these settings, we lay the foundation for the subsequent steps in our vector and hybrid search implementation.
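If you prefer to script these calls rather than paste them into the Dev Tools console, the same request can be sent over the REST API. Below is a minimal sketch, assuming a local cluster at https://localhost:9200 secured with basic authentication and the demo self-signed certificate; adjust the URL and credentials to your environment.

# Minimal sketch: apply the ML Commons cluster settings from a script.
# Assumptions: local cluster at https://localhost:9200, basic auth with the
# admin user, demo self-signed certificate (hence verify=False).
import requests

OPENSEARCH_URL = "https://localhost:9200"
AUTH = ("admin", "<your-admin-password>")  # assumption: basic auth is enabled

settings = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
            }
        }
    }
}

# verify=False skips TLS verification for the demo certificate -- not for production use
response = requests.put(f"{OPENSEARCH_URL}/_cluster/settings", json=settings, auth=AUTH, verify=False)
response.raise_for_status()
print(response.json())  # should contain "acknowledged": true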
Create a Model Group (Optional)
While this step is optional, it is highly recommended for organizational purposes and streamlined access control. Creating a model group allows you to logically group related models, facilitating easier management and access control. Implement this step using the following API call:
POST /_plugins/_ml/model_groups/_register
{
  "name": "NLP_model_group",
  "description": "A model group for NLP models",
  "access_mode": "public"
}
This operation creates a dedicated group for Natural Language Processing (NLP) models. The resulting model group ID will be utilized in subsequent steps when initializing your model. By categorizing models into groups, you establish a structured approach to model management, which proves invaluable as your machine learning infrastructure grows in complexity.
The response should look like this:
{
  "model_group_id": "jpGt1pAB2ktECB55tBBC",
  "status": "CREATED"
}
Note down the model group ID; we will need it later.
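If you are scripting the setup, you can capture the ID directly from the response instead of copying it by hand. A minimal sketch, under the same assumptions as before (local cluster, basic auth, demo certificate):

# Minimal sketch: register the model group and keep the returned ID.
import requests

OPENSEARCH_URL = "https://localhost:9200"
AUTH = ("admin", "<your-admin-password>")  # assumption: basic auth is enabled

body = {
    "name": "NLP_model_group",
    "description": "A model group for NLP models",
    "access_mode": "public",
}

response = requests.post(
    f"{OPENSEARCH_URL}/_plugins/_ml/model_groups/_register",
    json=body, auth=AUTH, verify=False,
)
response.raise_for_status()

model_group_id = response.json()["model_group_id"]
print(model_group_id)  # e.g. "jpGt1pAB2ktECB55tBBC"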
Register a Model
With our cluster configured and model group established, we proceed to register our chosen model with OpenSearch. This step is critical as it makes the model available for use in our search operations. In this example, we’re utilizing a pre-trained model from the Hugging Face model hub:
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.1",
  "model_group_id": "jpGt1pAB2ktECB55tBBC",
  "model_format": "TORCH_SCRIPT"
}
This registration process accomplishes several objectives:
It specifies the exact model we intend to use for our vector embeddings.
It defines the version of the model, ensuring consistency and reproducibility.
It associates the model with our previously created model group.
It indicates the format of the model (in this case, TORCH_SCRIPT), which is crucial for proper execution.
Once registration completes, OpenSearch will provide a model ID. This identifier is of utmost importance and will be referenced in subsequent steps when configuring our ingest and search pipelines.
Note that this step downloads the model from Hugging Face to your machine. This is done asynchronously. Hence, the response we get back does not contain the model ID directly, but a task ID. The task downloads the model in the background, and we will get a model ID only when the task is finished.
The response of this step looks like this:
{
  "task_id": "j5Gw1pAB2ktECB55MRCC",
  "status": "CREATED"
}
With the task ID, we can now query OpenSearch for the status of the task:
GET "https://localhost:9200/_plugins/_ml/tasks/j5Gw1pAB2ktECB55MRCC"
When the model is downloaded and ready to use, the response looks like:
{
  "model_id": "kJGw1pAB2ktECB55NhDC",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "4uiHK3eFS-yYQbvnYJIJyw"
  ],
  "create_time": 1721588789549,
  "last_update_time": 1721588842941,
  "is_async": true
}
Now we finally have our model ID (in this case “kJGw1pAB2ktECB55NhDC”), which we can use to refer to the model in the next steps.
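Rather than re-running the GET request by hand, a small polling loop can wait for the task to finish and hand back the model ID. A minimal sketch, again assuming a local cluster with basic auth and the demo certificate; the task ID is the one returned by the register call:

# Minimal sketch: poll the ML task API until registration finishes, then read the model ID.
import time
import requests

OPENSEARCH_URL = "https://localhost:9200"
AUTH = ("admin", "<your-admin-password>")  # assumption: basic auth is enabled

def wait_for_task(task_id: str, timeout_s: int = 300, poll_s: int = 5) -> dict:
    """Poll the ML task until it reports COMPLETED (or fails / times out)."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        task = requests.get(
            f"{OPENSEARCH_URL}/_plugins/_ml/tasks/{task_id}",
            auth=AUTH, verify=False,
        ).json()
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state == "FAILED":
            raise RuntimeError(f"Task {task_id} failed: {task}")
        time.sleep(poll_s)
    raise TimeoutError(f"Task {task_id} did not complete within {timeout_s}s")

task = wait_for_task("j5Gw1pAB2ktECB55MRCC")  # task ID from the register step
model_id = task["model_id"]
print(model_id)  # e.g. "kJGw1pAB2ktECB55NhDC"

The same helper works unchanged for the deployment task in the next step, since deploying a model follows the identical asynchronous pattern.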
Deploy Model
The final step in our model setup process is to ensure that the model is properly deployed and ready for use. In this context, deployed means the model is loaded into memory. This step is also executed in an asynchronous way, so as before, the response of the deploy command will give us a task ID which we can then use to query the state of the deployment task.
Let’s deploy the model using the model ID of the previous step:
POST "https://localhost:9200/_plugins/_ml/models/kJGw1pAB2ktECB55NhDC/_deploy"
The immediate response looks like:
{
  "task_id": "kZG51pAB2ktECB55vxAe",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}
After waiting a bit, we use the task API to check whether our model is already deployed:
GET "https://localhost:9200/_plugins/_ml/tasks/kZG51pAB2ktECB55vxAe"
If the deployment is finished, the response should look like:
{
  "model_id": "kJGw1pAB2ktECB55NhDC",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "pMJ8x72HTM-w9t-a0U34Bw",
    "4uiHK3eFS-yYQbvnYJIJyw"
  ],
  "create_time": 1721589415708,
  "last_update_time": 1721589435911,
  "is_async": true
}
That’s it. Our model is now up and running and ready to do some AI magic. As a last step, we can use OpenSearch Dashboards to check our installation.
Open Dashboards, log in, and navigate to the Machine Learning plugin settings:
On the next screen we can see that our model is deployed correctly and ready to use:
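If you prefer to verify this without Dashboards, the ML Commons get-model endpoint reports the model's current state. A minimal sketch with the same assumptions as before; note that the exact response fields can vary between OpenSearch versions:

# Minimal sketch: check the model state over the REST API instead of Dashboards.
import requests

OPENSEARCH_URL = "https://localhost:9200"
AUTH = ("admin", "<your-admin-password>")  # assumption: basic auth is enabled

response = requests.get(
    f"{OPENSEARCH_URL}/_plugins/_ml/models/kJGw1pAB2ktECB55NhDC",
    auth=AUTH, verify=False,
)
response.raise_for_status()

# The response describes the model; a deployed model should report a state of
# "DEPLOYED" (the exact field name, e.g. "model_state", may differ between versions).
print(response.json())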
Which model should I use?
We have not yet talked about the actual model we used. Why did we pick this model, which model should you choose, and what other options are available?
Well, the short answer is: it depends. It depends on your use case, on your requirements, and on the nature of the ingested data. In the end, you need to test the supported models and see which one performs best for your case. I know this is not really satisfying, but it is the best answer I can give at this point in time.
What’s Next?
In the next article we will use our deployed model to create an ingest pipeline that will automatically create and store vector embeddings for any ingested document. In the last part of this series we will explore how to use lexical, vector and hybrid searches on the ingested documents.
Articles in this Series
Implementing Vector and Hybrid Search with OpenSearch and the Neural Plugin - Part 1
Implementing Vector and Hybrid Search with OpenSearch and the Neural Plugin - Part 2 (coming soon)
Implementing Vector and Hybrid Search with OpenSearch and the Neural Plugin - Part 3 (coming soon)